Generate

This endpoint generates realistic text conditioned on a given input. See prompt engineering to learn how to shape the input in order to solve problems like text summarization and entity extraction.

Usage#

    Sample Response#

    Cohere version 2021-11-08 or higher:

    {
      "generations": [
        {
          "text": " Brumana, lived the country folk. Among them was Zhulan Noyan, an adventurer and immortal, one who possesses an innate power of changing forms and abilities, just like a person. His goal was to hunt down"
        }
      ]
    }

    No Cohere version specified (deprecated):

    {
      "text": " Brumana, lived the country folk. Among them was Zhulan Noyan, an adventurer and immortal, one who possesses an innate power of changing forms and abilities, just like a person. His goal was to hunt down"
    }

    Request#

    prompt#

    string
    Represents the prompt or text to be completed. Trailing whitespace will be trimmed. If your use case requires trailing whitespace, please contact ivan@cohere.ai.

    max_tokens (optional)#

    integer

    Defaults to 20. Denotes the number of tokens to predict per generation. See BPE Tokens for more details.

    Can only be set to 0 if return_likelihoods is set to ALL to get the likelihood of the prompt.

    model (optional)#

    string

    The size of model to generate with. Currently available models are small, medium, large, and xlarge (beta); defaults to xlarge. Small models are faster, while larger models will generally perform better. Finetuned models can also be supplied with their full ID.

    preset (optional)#

    string

    The ID of a custom playground preset. You can create presets in the playground. If you use a preset, all other parameters become optional, and any included parameters will override the preset's parameters.

    temperature (optional)#

    float

    Defaults to 0.75, min value of 0.0, max value of 5.0. A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations. See Temperature for more details.
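
    Conceptually, temperature divides the model's logits before the softmax. Below is a toy sketch of that standard technique for intuition; it is an illustration, not Cohere's internal code:

    import numpy as np

    def sample_distribution(logits, temperature):
        """Lower temperature sharpens the distribution; higher flattens it."""
        scaled = logits / max(temperature, 1e-6)  # guard against division by zero
        exp = np.exp(scaled - scaled.max())       # numerically stable softmax
        return exp / exp.sum()

    logits = np.array([2.0, 1.0, 0.5])
    print(sample_distribution(logits, 0.25))  # near-greedy
    print(sample_distribution(logits, 5.0))   # close to uniform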

    num_generations (optional)#

    integer

    Defaults to 1, min value of 1, max value of 5. Denotes the maximum number of generations that will be returned. Requires the Cohere-Version header to be set with a version of 2021-11-08 or higher.
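
    As a sketch (same assumed URL and placeholder key as the example above), requesting several generations at once looks like this; note the required version header:

    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder, as in the earlier sketch

    response = requests.post(
        "https://api.cohere.ai/generate",  # assumed endpoint URL
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Cohere-Version": "2021-11-08",  # required for num_generations
        },
        json={"prompt": "Once upon a time", "max_tokens": 20, "num_generations": 3},
    )

    for gen in response.json()["generations"]:
        print(gen["text"])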

    k (optional)#

    integer

    Defaults to 0 (disabled), which is the minimum value; the maximum value is 500. Ensures only the top k most likely tokens are considered for generation at each step.

    p (optional)#

    float

    Defaults to 0.75. Set to 1.0 or 0 to disable. If set to a probability 0.0 < p < 1.0, it ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are enabled, p acts after k.
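
    To make "p acts after k" concrete, here is a toy sketch of combined top-k and nucleus (top-p) filtering. It illustrates the general sampling technique, not Cohere's internal implementation:

    import numpy as np

    def filter_top_k_then_top_p(probs, k, p):
        """Keep the top-k tokens, then within those keep the smallest set whose
        cumulative probability reaches p, and renormalize."""
        order = np.argsort(probs)[::-1]              # token ids, most likely first
        if k > 0:
            order = order[:k]                        # k acts first
        kept = probs[order]
        if 0.0 < p < 1.0:
            cumulative = np.cumsum(kept / kept.sum())
            cutoff = np.searchsorted(cumulative, p) + 1  # then p acts on the survivors
            order, kept = order[:cutoff], kept[:cutoff]
        return order, kept / kept.sum()

    # Example over a 5-token vocabulary:
    probs = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
    ids, renormed = filter_top_k_then_top_p(probs, k=3, p=0.75)
    print(ids, renormed)  # only tokens 0 and 1 survive: their share of the top-3 mass reaches 0.75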

    frequency_penalty (optional)#

    float

    Defaults to 0.0, min value of 0.0, max value of 1.0. Can be used to reduce repetitiveness of generated tokens. The higher the value, the stronger a penalty is applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.

    presence_penalty (optional)#

    float

    Defaults to 0.0, min value of 0.0, max value of 1.0. Can be used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies.
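
    For intuition, the sketch below shows one common formulation of these two penalties. The exact math here is an assumption for illustration, not Cohere's documented behavior:

    import numpy as np

    def apply_penalties(logits, counts, frequency_penalty, presence_penalty):
        """Assumed formulation: frequency_penalty scales with how often a token
        has appeared; presence_penalty applies equally to any token seen at all."""
        logits = logits.copy()
        logits -= frequency_penalty * counts        # proportional to appearance count
        logits -= presence_penalty * (counts > 0)   # flat penalty if seen at least once
        return logits

    logits = np.array([2.0, 1.0, 0.5])
    counts = np.array([3, 1, 0])  # token 0 appeared 3 times, token 1 once, token 2 never
    print(apply_penalties(logits, counts, frequency_penalty=0.5, presence_penalty=0.5))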

    stop_sequences (optional)#

    array of string

    A stop sequence will cut off your generation at the end of the sequence. Providing multiple stop sequences in the array will cut the generation at the first stop sequence encountered in the generated text, if any.
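
    For example, a request body like the following sketch would cut each generation at the first occurrence of "\n\n" or "--":

    {
      "prompt": "Q: What color is the sky?\nA:",
      "max_tokens": 50,
      "stop_sequences": ["\n\n", "--"]
    }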

    return_likelihoods (optional)#

    string

    One of GENERATION|ALL|NONE, specifying whether and how token likelihoods are returned with the response. Defaults to NONE.

    If GENERATION is selected, the token likelihoods will only be provided for generated text.

    If ALL is selected, the token likelihoods will be provided both for the prompt and the generated text.

    logit_bias (optional) (experimental)#

    map from int to float

    Used to prevent the model from generating unwanted tokens or to incentivize it to include desired tokens. The format is {token_id: bias} where bias is a float between -10 and +10. Tokens can be obtained from text using Tokenize.

    For example, if the value {11: -10} is provided, the model will be very unlikely to include token 11 ("\n", the newline character) anywhere in the generated text. In contrast, {11: +10} will result in generations that contain almost nothing but that token. Values between -10 and +10 proportionally affect the likelihood of the token appearing in the generated text.

    Note: logit bias may not be supported for all finetune models.
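
    A sketch of the flow described above: look up token ids with the Tokenize endpoint, then bias them in a Generate request. The tokenize URL and its response shape are assumptions based on the reference above:

    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder, as in the earlier sketches
    headers = {"Authorization": f"Bearer {API_KEY}"}

    # Look up the token id(s) for the text to suppress (assumed URL and response shape).
    tokens = requests.post(
        "https://api.cohere.ai/tokenize",
        headers=headers,
        json={"text": "\n"},
    ).json()["tokens"]

    # Strongly discourage those tokens anywhere in the generation.
    response = requests.post(
        "https://api.cohere.ai/generate",
        headers=headers,
        json={
            "prompt": "Write a single unbroken paragraph about autumn:",
            "max_tokens": 50,
            "logit_bias": {str(t): -10 for t in tokens},  # JSON object keys must be strings
        },
    )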

    Response#

    The response is an object containing the following elements:

    generations#

    array of objects

    An array of objects with the following shape:

    {
      "text": string
    }

    If return_likelihoods is set to GENERATION or ALL, the objects returned will have the following shape:

    {
      "text": string,
      "likelihood": float,
      "token_likelihoods": [
        {
          "token": string,
          "likelihood": float
        }
      ]
    }

    The likelihood refers to the average log-likelihood of the entire specified string, which is useful for evaluating the performance of your model, especially if you've finetuned a custom model. Individual token likelihoods provide the log-likelihood of each token. The first token will not have a likelihood.
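
    As a final sketch (same assumed URL and placeholder key as above), here is how the likelihood fields might be read back out of a response requested with return_likelihoods set to ALL:

    import requests

    API_KEY = "YOUR_API_KEY"  # placeholder

    response = requests.post(
        "https://api.cohere.ai/generate",  # assumed endpoint URL
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Cohere-Version": "2021-11-08",
        },
        json={"prompt": "Once upon a time", "max_tokens": 10, "return_likelihoods": "ALL"},
    )

    for gen in response.json()["generations"]:
        print("average log-likelihood:", gen["likelihood"])
        for tok in gen["token_likelihoods"]:
            # The first token carries no likelihood, so read it defensively.
            print(repr(tok["token"]), tok.get("likelihood"))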