Summarize API
This guide uses the Summarize endpoint.
You can find the API reference for the endpoint here.
This endpoint generates a succinct version of the original text that relays the most important information.
Ideal use cases include, but are not limited to: news articles, blogs, chat transcripts, scientific articles, meeting notes, and any other text you would like to see summarized.
The endpoint can:
- Summarize a single document
- Control output length
Experimental Features
These features are highly experimental. Using them could lead to a substantial decrease in the model's overall performance. They are included based on user feedback, and our team is actively working on delivering a better solution. Because this functionality is critical for some applications, we have exposed an experimental version. If you do try it out, we welcome your feedback.
- Format chosen output
- Handle long documents
- Provide additional instructions to focus the summary
We recommend using the playground for quick experiments, but for any repeated use we strongly recommend the API. An example is provided below.
In this example, we want to summarize a passage from a news article into its main point.
1. Set up
First, let's install the SDK (the examples below are in Python, TypeScript, and Go):
pip install cohere
npm i -s cohere-ai
go get github.com/cohere-ai/cohere-go/v2
Import dependencies and set up the Cohere client.
import cohere
co = cohere.Client('Your API key')
import { CohereClient } from "cohere-ai";
const cohere = new CohereClient({
  token: "YOUR_API_KEY",
});
(async () => {
  const prediction = await cohere.generate({
    prompt: "hello",
    maxTokens: 10,
  });
  console.log("Received prediction", prediction);
})();
import cohereclient "github.com/cohere-ai/cohere-go/v2/client"
client := cohereclient.NewClient(cohereclient.WithToken("<YOUR_AUTH_TOKEN>"))
(The rest of the examples on this page will be in Python, but you can find more detailed setup instructions in the GitHub repositories for Python, TypeScript, and Go.)
2. Create prompt
Store the document you want to summarize in a variable.
text = """It's an exciting day for the development community. Cohere's state-of-the-art language AI is now available through Amazon SageMaker. This makes it easier for developers to deploy Cohere's pre-trained generation language model to Amazon SageMaker, an end-to-end machine learning (ML) service. Developers, data scientists, and business analysts use Amazon SageMaker to build, train, and deploy ML models quickly and easily using its fully managed infrastructure, tools, and workflows.
At Cohere, the focus is on language. The company's mission is to enable developers and businesses to add language AI to their technology stack and build game-changing applications with it. Cohere helps developers and businesses automate a wide range of tasks, such as copywriting, named entity recognition, paraphrasing, text summarization, and classification. The company builds and continually improves its general-purpose large language models (LLMs), making them accessible via a simple-to-use platform. Companies can use the models out of the box or tailor them to their particular needs using their own custom data.
Developers using SageMaker will have access to Cohere's Medium generation language model. The Medium generation model excels at tasks that require fast responses, such as question answering, copywriting, or paraphrasing. The Medium model is deployed in containers that enable low-latency inference on a diverse set of hardware accelerators available on AWS, providing different cost and performance advantages for SageMaker customers.
"""
3. Define model settings
The endpoint has a number of settings you can use to control the kind of output it generates. The full list is available in the API reference, but let’s look at a few:
model - You can choose between command and command-light. Generally, light models are faster, while larger models perform better.
temperature - This parameter ranges from 1 to 5 and controls the randomness of the output. Higher values tend to generate more creative outcomes and give you the opportunity to generate several different summaries for the same input text. They might also include more hallucinations and make the model less likely to ground its replies in the context you've provided when using retrieval-augmented generation. Use a higher value if, for example, you plan to select among several candidate summaries afterwards.
length - You can choose between short, medium, and long. short summaries are roughly up to two sentences long, medium between three and five, and long six or more sentences.
format - You can choose between paragraph and bullets. paragraph generates a coherent sequence of sentences, while bullets outputs the summary in bullet points.
extractiveness - This parameter can be set to low, medium, or high. Higher values produce summaries that stay closer to the original text, while lower values paraphrase more freely.
4. Generate the summary
Call the endpoint via the co.summarize() method, specifying the text and the rest of the model settings.
response = co.summarize(
    text=text,
    model='command',
    length='medium',
    extractiveness='medium'
)
summary = response.summary
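The other settings described in step 3 can be passed in the same call. Below is a minimal, illustrative helper (not part of the Cohere SDK) that validates the option values against the allowed choices listed above before building the keyword arguments for co.summarize:

```python
# Illustrative helper, not part of the Cohere SDK: collect and validate
# summarize settings, then pass the result to co.summarize(**params).
ALLOWED = {
    "length": {"short", "medium", "long"},
    "format": {"paragraph", "bullets"},
    "extractiveness": {"low", "medium", "high"},
}

def build_summarize_params(text, model="command", length="medium",
                           format="paragraph", extractiveness="medium",
                           temperature=None):
    """Return a keyword-argument dict for co.summarize, rejecting bad values."""
    chosen = {"length": length, "format": format, "extractiveness": extractiveness}
    for name, value in chosen.items():
        if value not in ALLOWED[name]:
            raise ValueError(
                f"{name} must be one of {sorted(ALLOWED[name])}, got {value!r}"
            )
    params = {"text": text, "model": model, **chosen}
    if temperature is not None:
        params["temperature"] = temperature
    return params

# Usage sketch: response = co.summarize(**build_summarize_params(text, format="bullets"))
```

Catching invalid values locally like this surfaces typos immediately instead of waiting for an API error response.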
5. Limitations
As with any work building atop statistical large language models, there is the risk that the output contains facts not present in the original document. These hallucinations might be innocuous, in the sense that they enrich the summary with additional facts, but they can also contain inaccuracies.
The control parameters length and extractiveness have an impact on the final output, but are not absolute. For instance, a low extractiveness summary can still contain a sentence taken verbatim from the original document, and a long summary can still be less than six sentences long.
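Because these controls are soft constraints, it can be useful to sanity-check the output yourself. Here is a minimal sketch using a naive regex-based sentence splitter (a production pipeline would likely use a proper sentence tokenizer):

```python
import re

def count_sentences(summary: str) -> int:
    """Naive sentence count: split on '.', '!' or '?' followed by whitespace."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", summary.strip()) if s]
    return len(sentences)

def within_expected_length(summary: str, length: str) -> bool:
    """Check a summary against the rough ranges documented for each setting."""
    n = count_sentences(summary)
    expected = {"short": n <= 2, "medium": 3 <= n <= 5, "long": n >= 6}
    return expected[length]
```

If a summary falls outside the expected range, you could retry the request, possibly with a higher temperature to obtain a different candidate.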