This endpoint generates realistic text conditioned on a given input.


The goal of text summarization is to condense the original text into a shorter version that retains the most important information. In this example, we want to summarize a passage from a news article into its main point.

1. Set up

Install the SDK, if you haven't already.

$ pip install cohere

Next, set up the Cohere client.

import cohere
api_key = "YOUR_API_KEY"  # placeholder; replace with your own API key
co = cohere.Client(api_key)
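
If you would rather not hard-code the key, one option is to read it from an environment variable. The following is a minimal sketch; the helper load_api_key and the variable name COHERE_API_KEY are assumptions made for illustration, not part of the SDK.

```python
import os


def load_api_key(var_name="COHERE_API_KEY", env=None):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage with the client from this step (requires the cohere package):
# co = cohere.Client(load_api_key())
```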

2. Create prompt

Create a prompt consisting of a few example passages and their summaries, ending with the passage you want summarized.

prompt = """Passage: Is Wordle getting tougher to solve? Players seem to be convinced that the game has gotten harder in recent weeks ever since The New York Times bought it from developer Josh Wardle in late January. The Times has come forward and shared that this likely isn't the case. That said, the NYT did mess with the back end code a bit, removing some offensive and sexual language, as well as some obscure words. There is a viral thread claiming that a confirmation bias was at play. One Twitter user went so far as to claim the game has gone to "the dusty section of the dictionary" to find its latest words.

TLDR: Wordle has not gotten more difficult to solve.
--
Passage: ArtificialIvan, a seven-year-old, London-based payment and expense management software company, has raised $190 million in Series C funding led by ARG Global, with participation from D9 Capital Group and Boulder Capital. Earlier backers also joined the round, including Hilton Group, Roxanne Capital, Paved Roads Ventures, Brook Partners, and Plato Capital.

TLDR: ArtificialIvan has raised $190 million in Series C funding.
--
Passage: The National Weather Service announced Tuesday that a freeze warning is in effect for the Bay Area, with freezing temperatures expected in these areas overnight. Temperatures could fall into the mid-20s to low 30s in some areas. In anticipation of the hard freeze, the weather service warns people to take action now.

TLDR:"""

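When the examples live in a list rather than in one long string, the same prompt layout can be assembled programmatically. The sketch below assumes the "Passage:" / "TLDR:" labels and the "--" separator used in this tutorial; the helper build_prompt and the short placeholder passages are hypothetical.

```python
def build_prompt(examples, new_passage, separator="--"):
    """Assemble a few-shot summarization prompt.

    examples: list of (passage, summary) pairs.
    new_passage: the passage the model should summarize.
    """
    blocks = [
        f"Passage: {passage}\n\nTLDR: {summary}\n{separator}"
        for passage, summary in examples
    ]
    # The final block stops at "TLDR:" so the model writes the summary.
    blocks.append(f"Passage: {new_passage}\n\nTLDR:")
    return "\n".join(blocks)


# Placeholder examples, for illustration only.
examples = [
    ("Short passage one.", "Summary one."),
    ("Short passage two.", "Summary two."),
]
prompt_sketch = build_prompt(examples, "A new passage to summarize.")
print(prompt_sketch)
```

Ending the prompt on an open "TLDR:" is what cues the model to produce the summary for the last passage.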

3. Define model settings

The Generate endpoint has a number of settings you can use to control the kind of output it generates. The full list is available in the API reference, but let’s look at a few:

  • model - One of small, medium, large, or xlarge. Generally, smaller models are faster, while larger models perform better.
  • max_tokens - The maximum length of text to be generated. One word contains approximately 2-3 tokens.
  • temperature - Ranges from 0 to 5. Controls the randomness of the output. Lower values tend to generate more “predictable” output, while higher values tend to generate more “creative” output. The sweet spot is typically between 0 and 1.
  • stop_sequences - A stop sequence cuts off your generation at the end of the sequence, which tells the model when to stop. Add your stop sequence at the end of each example in the prompt (see the prompt from step 2, which uses “--” as the stop sequence).
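
To make these settings concrete, the sketch below collects them in a dictionary and derives max_tokens from a target summary length, using the upper end of the 2-3 tokens-per-word estimate above. The helper estimate_max_tokens and all values shown are illustrative assumptions, not recommended defaults.

```python
import math


def estimate_max_tokens(target_words, tokens_per_word=3):
    """Rough token budget for a summary of about target_words words."""
    return math.ceil(target_words * tokens_per_word)


# Illustrative values only; tune them for your own use case.
settings = {
    "model": "xlarge",
    "max_tokens": estimate_max_tokens(15),  # 15 words * 3 tokens = 45
    "temperature": 0.8,
    "stop_sequences": ["--"],
}
print(settings)
```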

4. Generate text

Call the Generate endpoint via the co.generate() method, specifying the prompt and the rest of the model settings.

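response = co.generate(
    model="xlarge",          # example choice from the sizes listed above
    prompt=prompt,
    max_tokens=40,           # example value
    temperature=0.8,         # example value
    stop_sequences=["--"])   # matches the separator used in the prompt

summary = response.generations[0].text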
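
Because the generation is cut off at the end of the stop sequence, the returned text may still carry a trailing "--" and surrounding whitespace. A small post-processing step can tidy it up. This is a sketch under that assumption; clean_summary is a hypothetical helper, and the sample string is made up rather than real model output.

```python
def clean_summary(text, stop_sequence="--"):
    """Trim whitespace and a trailing stop sequence from generated text."""
    text = text.strip()
    if stop_sequence and text.endswith(stop_sequence):
        text = text[: -len(stop_sequence)].rstrip()
    return text


# Made-up sample text standing in for response.generations[0].text.
print(clean_summary(" A freeze warning is in effect for the Bay Area.\n--"))
```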