This endpoint generates realistic text conditioned on a given input.

The Generate endpoint produces text given an input, called a prompt. The prompt provides the context for the text we want the model to generate.

Prompt engineering is the practice of figuring out the optimal way to prompt a model for a particular task, so we can shape the output into what we want it to be.

In this example, we have a startup idea generator. We want the endpoint to generate a startup idea and its name, given an industry/vertical as the input.

1. Set up

Install the SDK, if you haven't already.

$ pip install cohere

Next, set up the Cohere client.

import cohere
co = cohere.Client(api_key)  # replace api_key with your actual API key

2. Create prompt

A basic prompt format that generally works well contains:

  • A short description of the overall context
  • A few examples of prompts and completions
prompt = """This program generates a startup idea and name given the industry.

Industry: Workplace
Startup Idea: A platform that generates slide deck contents automatically based on a given outline
Startup Name: Deckerize
--
Industry: Home Decor
Startup Idea: An app that calculates the best position of your indoor plants for your apartment
Startup Name: Planteasy
--
Industry: Healthcare
Startup Idea: A hearing aid for the elderly that automatically adjusts its levels and with a battery lasting a whole week
Startup Name: Hearspan
--
Industry: Education
Startup Idea: An online school that lets students mix and match their own curriculum based on their interests and goals
Startup Name: Prime Age
--
Industry: Productivity
Startup Idea:"""
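If you plan to reuse this pattern for other tasks, the prompt above can also be assembled programmatically. The helper below is an illustrative sketch (the function and variable names are our own, not part of the Cohere SDK), joining each few-shot example with the same `--` separator used as the stop sequence:

```python
# Hypothetical helper: assembles a few-shot prompt from (industry, idea, name)
# examples, separated by "--" so the same string can double as the stop sequence.
def build_prompt(description, examples, query_industry):
    sections = [description.strip()]
    for industry, idea, name in examples:
        sections.append(
            f"Industry: {industry}\n"
            f"Startup Idea: {idea}\n"
            f"Startup Name: {name}"
        )
    # The final section ends mid-pattern so the model completes it.
    sections.append(f"Industry: {query_industry}\nStartup Idea:")
    return "\n--\n".join(sections)

examples = [
    ("Workplace",
     "A platform that generates slide deck contents automatically based on a given outline",
     "Deckerize"),
    ("Home Decor",
     "An app that calculates the best position of your indoor plants for your apartment",
     "Planteasy"),
]
prompt = build_prompt(
    "This program generates a startup idea and name given the industry.",
    examples,
    "Productivity",
)
```

This keeps the examples as data, so adding or swapping examples doesn't require editing a long string literal.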

3. Define model settings

The Generate endpoint has a number of settings we can use to control the kind of output it generates. The full list is available in the API reference, but let’s look at a few:
  • model - Either medium or xlarge. Generally, smaller models are faster, while larger models perform better.
  • max_tokens - The maximum length of the text to be generated. One word contains approximately 2-3 tokens.
  • temperature - Ranges from 0 to 5. Controls the randomness of the output. Lower values tend to generate more "predictable" output, while higher values tend to generate more "creative" output. The sweet spot is typically between 0 and 1.
  • stop_sequences - A stop sequence cuts off the generation at the end of that sequence, effectively telling the model when to stop. Add your stop sequence at the end of each example in the prompt (refer to the prompt we created, which uses "--" as the stop sequence).
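Depending on the settings, the returned text may still contain the stop sequence itself at the end. A small post-processing step, sketched below (our own helper, not an SDK function), trims it off:

```python
# Illustrative helper: trim everything from the stop sequence onward,
# then strip surrounding whitespace.
def trim_stop_sequence(text, stop="--"):
    idx = text.find(stop)
    return text[:idx].strip() if idx != -1 else text.strip()

raw = ("A browser extension that blocks distracting sites during focus hours\n"
       "Startup Name: FocusGuard\n--")
clean = trim_stop_sequence(raw)
```

If the text contains no stop sequence, the helper simply returns it stripped of surrounding whitespace.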

4. Generate text

Call the Generate endpoint via the co.generate() method, specifying the prompt and the rest of the model settings.

response = co.generate(
    model='xlarge',
    prompt=prompt,
    max_tokens=40,
    temperature=0.6,
    stop_sequences=["--"])

startup_idea = response.generations[0].text
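Because the completion continues from "Startup Idea:", following the prompt's pattern it should also contain a "Startup Name:" line. A hedged way to split the two (the exact output format depends on the model, so treat this as a sketch):

```python
# Illustrative parser: split a completion like
# "<idea>\nStartup Name: <name>\n--" into its two parts.
def parse_generation(text, stop="--"):
    text = text.split(stop)[0]            # drop anything after the stop sequence
    idea, _, name = text.partition("Startup Name:")
    return idea.strip(), name.strip()

# Example with a made-up completion in the expected shape:
idea, name = parse_generation(
    "A tool that batches notifications into a daily digest\n"
    "Startup Name: Digestly\n--"
)
```

In practice you would call `parse_generation(response.generations[0].text)` and fall back gracefully (e.g. keep the raw text) if the "Startup Name:" marker is missing.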