Generate
This endpoint generates realistic text conditioned on a given input. See prompt engineering to learn how to shape the input in order to solve problems like text summarization and entity extraction.
Usage#
Sample Response#
Cohere version 2021-11-08 or higher:
{"generations":[{"text": " Brumana, lived the country folk. Among them was Zhulan Noyan, an adventurer and immortal, one who possesses an innate power of changing forms and abilities, just like a person. His goal was to hunt down"}]}
No Cohere version specified (deprecated):
{"text": " Brumana, lived the country folk. Among them was Zhulan Noyan, an adventurer and immortal, one who possesses an innate power of changing forms and abilities, just like a person. His goal was to hunt down"}
Request#
prompt#
string The input text on which the generation is conditioned.
max_tokens#
integer Denotes the number of tokens to predict per generation. See BPE Tokens for more details. Can only be set to 0 if return_likelihoods is set to ALL, in order to get the likelihood of the prompt.
temperature (optional)#
float Defaults to 0.75, min value of 0.0, max value of 5.0. A non-negative float that tunes the degree of randomness in generation. Lower temperatures mean less random generations. See Temperature for more details.
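Conceptually, temperature divides the model's logits before the softmax, so values below 1.0 sharpen the distribution and values above flatten it. A minimal sketch of the standard technique (the API's internal implementation is not described on this page):

import math
import random

def sample_with_temperature(logits, temperature):
    # Divide each logit by the temperature (assumed > 0), then softmax and sample.
    # temperature < 1.0 sharpens the distribution (less random output);
    # temperature > 1.0 flattens it (more random output).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(range(len(probs)), weights=probs, k=1)[0]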
num_generations (optional)#
integer Defaults to 1, max value of 5. Denotes the maximum number of generations that will be returned. Requires the Cohere-Version header to be set to a version of 2021-11-08 or higher. Must be 1 for the xlarge (beta) model.
k (optional)#
integer Defaults to 0 (disabled), which is the minimum. Maximum value is 500. Ensures only the top k most likely tokens are considered for generation at each step.
p (optional)#
float Defaults to 0.75. Set to 1.0 or 0 to disable. If set to a probability 0.0 < p < 1.0, it ensures that only the most likely tokens, with total probability mass of p, are considered for generation at each step. If both k and p are enabled, p acts after k, as sketched below.
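A sketch of how the two filters compose, with p acting after k as stated above (the ordering is from this page; the remaining details are assumptions about the standard top-k and nucleus filtering techniques):

def filter_top_k_top_p(probs, k=0, p=0.75):
    # probs: list of (token, probability) pairs, assumed to sum to 1.
    candidates = sorted(probs, key=lambda tp: tp[1], reverse=True)
    if k > 0:              # top-k: keep only the k most likely tokens
        candidates = candidates[:k]
    if 0.0 < p < 1.0:      # top-p acts after k: keep the smallest prefix of
        kept, mass = [], 0.0  # tokens whose total probability mass reaches p
        for token, prob in candidates:
            kept.append((token, prob))
            mass += prob
            if mass >= p:
                break
        candidates = kept
    return candidates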
frequency_penalty (optional)#
float Defaults to 0.0, max value of 1.0. Can be used to reduce repetitiveness of generated tokens. The higher the value, the stronger the penalty applied to previously present tokens, proportional to how many times they have already appeared in the prompt or prior generation.
presence_penalty (optional)#
float Defaults to 0.0, max value of 1.0. Can be used to reduce repetitiveness of generated tokens. Similar to frequency_penalty, except that this penalty is applied equally to all tokens that have already appeared, regardless of their exact frequencies.
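A common way to realize these penalties, shown here as a sketch rather than Cohere's exact formula (which this page does not give), is to subtract them from a token's logit before sampling:

def penalize(logit, token, counts, frequency_penalty=0.0, presence_penalty=0.0):
    # counts maps each token to how many times it has appeared so far
    # in the prompt plus the generation.
    n = counts.get(token, 0)
    logit -= frequency_penalty * n                   # grows with the repetition count
    logit -= presence_penalty * (1 if n > 0 else 0)  # flat deduction, once seen at all
    return logit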
stop_sequences (optional)#
array of strings A stop sequence will cut off your generation at the end of the sequence. If multiple stop sequences are provided in the array, the generation is cut at the first stop sequence that appears in it, if any.
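The effect is equivalent to truncating the raw generation at the earliest stop sequence, keeping the sequence itself, as in this client-side sketch (an illustration, not the server's actual code):

def cut_at_stop_sequences(text, stop_sequences):
    # Find the earliest occurrence of any stop sequence and cut there,
    # keeping the stop sequence itself at the end of the generation.
    cut = len(text)
    for seq in stop_sequences:
        i = text.find(seq)
        if i != -1:
            cut = min(cut, i + len(seq))
    return text[:cut]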
return_likelihoods (optional)#
string One of GENERATION|ALL|NONE, specifying how and whether the token likelihoods are returned with the response. Defaults to NONE.
If GENERATION is selected, the token likelihoods are provided only for the generated text.
If ALL is selected, the token likelihoods are provided for both the prompt and the generated text.
Response#
The response is an object containing the following elements:
generations#
array of objects An array of generation objects, each containing the generated text (see the sample response above).
If return_likelihoods is set to GENERATION or ALL, the returned objects additionally include likelihood information. The likelihood refers to the average log-likelihood of the entire specified string, which is useful for evaluating the performance of your model, especially if you've finetuned a custom model. Individual token likelihoods provide the log-likelihood of each token. The first token will not have a likelihood.
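For example, the string-level likelihood can be recomputed from the individual token likelihoods by averaging them, skipping the first token, which carries none (a sketch assuming the response reports one log-likelihood value per token after the first):

def average_log_likelihood(token_log_likelihoods):
    # token_log_likelihoods: one log-likelihood per token, with None for
    # the first token, which carries no likelihood.
    values = [ll for ll in token_log_likelihoods if ll is not None]
    return sum(values) / len(values)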