This endpoint generates a succinct version of the original text that relays the most important information.
Ideal use cases include, but are not limited to: news articles, blogs, chat transcripts, scientific articles, meeting notes, and any text that you would like to see a summary of!
The endpoint can:
- Summarize a single document
- Control output length
These features are highly experimental. Using them could lead to a substantial decrease in the model's overall performance. They are included based on user feedback, and our team is actively working on delivering a better solution. Because they are critical for some applications, we have exposed an experimental version. If you do try them out, we welcome your feedback.
- Ability to format chosen output
- Long document summaries
- Ability to provide additional instructions to focus the summary
We recommend using the playground for quick use cases, but for any repeated use we strongly recommend the API. An example is provided below.
In this example, we want to summarize a passage from a news article into its main point.
1. Set up
Install the SDK, if you haven't already.
$ pip install cohere
Next, set up the Cohere client.
import cohere
co = cohere.Client(api_key)
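The client setup above assumes an `api_key` variable already exists. One way to supply it is sketched below, assuming the key is stored in an environment variable. The name `COHERE_API_KEY` and the `load_api_key` helper are hypothetical choices for illustration, not something the SDK requires.

```python
import os


def load_api_key(env_var="COHERE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical convention for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With this helper, the client can be created as `co = cohere.Client(load_api_key())`, which keeps the key out of your source code.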
2. Create prompt
Store the document you want to summarize in a variable.
text ="""It's an exciting day for the development community. Cohere's state-of-the-art language AI is now available through Amazon SageMaker. This makes it easier for developers to deploy Cohere's pre-trained generation language model to Amazon SageMaker, an end-to-end machine learning (ML) service. Developers, data scientists, and business analysts use Amazon SageMaker to build, train, and deploy ML models quickly and easily using its fully managed infrastructure, tools, and workflows. At Cohere, the focus is on language. The company's mission is to enable developers and businesses to add language AI to their technology stack and build game-changing applications with it. Cohere helps developers and businesses automate a wide range of tasks, such as copywriting, named entity recognition, paraphrasing, text summarization, and classification. The company builds and continually improves its general-purpose large language models (LLMs), making them accessible via a simple-to-use platform. Companies can use the models out of the box or tailor them to their particular needs using their own custom data. Developers using SageMaker will have access to Cohere's Medium generation language model. The Medium generation model excels at tasks that require fast responses, such as question answering, copywriting, or paraphrasing. The Medium model is deployed in containers that enable low-latency inference on a diverse set of hardware accelerators available on AWS, providing different cost and performance advantages for SageMaker customers. """
3. Define model settings
The endpoint has a number of settings you can use to control the kind of output it generates. The full list is available in the API reference, but let’s look at a few:
model - You can choose between summarize-medium and summarize-xlarge. Generally, medium models are faster while larger models will perform better.
temperature - Ranges from 0 to 5. Controls the randomness of the output. Higher values tend to generate more creative outcomes and give you the opportunity to generate various summaries for the same input text. They might also produce more hallucinations. Use a higher value if, for example, you plan to select among various summaries afterwards.
length - You can choose between short, medium, and long. Short summaries are roughly up to 2 sentences long, medium summaries are between 3 and 5 sentences, and long summaries might have 6 or more sentences.
format - You can choose between paragraph and bullets. paragraph generates a coherent sequence of sentences, while bullets outputs the summary in bullet points.
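Since length and format each accept a small, fixed set of values, it can help to check them before making a request. The following is a minimal sketch, not part of the SDK: `build_settings` is a hypothetical helper, and the allowed values are the ones described in the settings above. Any other settings, such as temperature, are passed through unchecked.

```python
ALLOWED_LENGTHS = ("short", "medium", "long")
ALLOWED_FORMATS = ("paragraph", "bullets")


def build_settings(length="medium", format="paragraph", **extra):
    """Validate length and format, then return keyword arguments for the request.

    The parameter is named `format` to match the endpoint setting, even though
    it shadows the Python builtin inside this function.
    """
    if length not in ALLOWED_LENGTHS:
        raise ValueError(f"length must be one of {ALLOWED_LENGTHS}, got {length!r}")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {format!r}")
    # Extra settings (for example temperature) are forwarded as-is.
    return {"length": length, "format": format, **extra}
```

The returned dictionary can then be unpacked into the call, for example `co.summarize(text=text, **build_settings(length="short", format="bullets"))`.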
4. Generate the summary
Call the endpoint via the co.summarize() method, specifying the prompt and the rest of the model settings.
response = co.summarize(
    text=text,
    model='summarize-xlarge',
    length='medium',
    extractiveness='medium'
)
summary = response.summary
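The temperature setting is described as useful when you plan to choose among several summaries of the same text. A rough sketch of that idea follows. `generate_candidates` is a hypothetical helper, the default temperature and candidate count are arbitrary, and how you pick the best candidate is left to your application. It takes the client as an argument and relies only on `co.summarize()` and `response.summary` as used in this guide.

```python
def generate_candidates(co, text, n=3, temperature=1.0, **settings):
    """Request n summaries of the same text and return the distinct ones, in order."""
    candidates = []
    for _ in range(n):
        response = co.summarize(text=text, temperature=temperature, **settings)
        summary = response.summary
        if summary not in candidates:  # skip exact duplicates
            candidates.append(summary)
    return candidates
```

For example, `generate_candidates(co, text, model='summarize-xlarge', length='medium')` makes three requests and returns up to three distinct summaries. Note that each candidate is a separate API call.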
As with any work built on top of statistical large language models, there is a risk that the output contains facts not present in the original document. These hallucinations might be innocuous, in the sense that they enrich the summary with additional facts, but they can also contain inaccuracies.
The control parameters length and extractiveness have an impact on the final output, but are not absolute. For instance, a summary with low extractiveness can still contain a sentence taken verbatim from the original document, and a long summary can still be less than 6 sentences long.
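Because these controls are not absolute, you may want to inspect a summary after it is generated. The following is a rough sketch using only the Python standard library. `inspect_summary` and `split_sentences` are hypothetical helpers, and the regex-based sentence splitting is a naive approximation that will miscount text containing abbreviations or decimal numbers. It reports the sentence count and any summary sentences that appear verbatim in the source. It does not detect hallucinations.

```python
import re


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def inspect_summary(source, summary):
    """Return the summary's sentence count and the sentences copied verbatim from source."""
    sentences = split_sentences(summary)
    # A sentence counts as verbatim only if it is an exact substring of the source.
    verbatim = [s for s in sentences if s in source]
    return {"sentence_count": len(sentences), "verbatim": verbatim}
```

Calling `inspect_summary(text, summary)` after the request gives a quick signal about whether the result roughly matches the length and extractiveness you asked for.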