Number of Generations

When you call the Generate endpoint, you can request multiple generations in a single call by setting the num_generations parameter.
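As a rough illustration, the request body for such a call might be assembled as below. This is only a sketch: num_generations comes from the text above, while the helper build_generate_request and the other field names (prompt, max_tokens, return_likelihoods) are assumptions made for this example and should be checked against the API reference. No network call is made.

```python
import json


def build_generate_request(prompt, num_generations, max_tokens):
    """Assemble a JSON-serializable body for a Generate call.

    Only num_generations is taken from the text; every other
    field name here is an assumption for illustration.
    """
    if num_generations < 1:
        raise ValueError("num_generations must be at least 1")
    return {
        "prompt": prompt,                    # assumed field name
        "num_generations": num_generations,  # number of outputs requested
        "max_tokens": max_tokens,            # assumed field name
        "return_likelihoods": "GENERATION",  # assumed field name and value
    }


body = build_generate_request(
    prompt="This curved gaming monitor delivers",
    num_generations=5,
    max_tokens=4,
)
print(json.dumps(body, indent=2))
```

Building the body separately from sending it keeps the example independent of any particular client library.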

Generating multiple outputs in a single API call

The model’s outputs will vary depending on the generation settings you have specified, such as temperature, top-k, and top-p.

Each generation comes with its own set of likelihood values, which consists of:

  • The likelihood of each generated token
  • The average likelihood of all generated tokens
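The relationship between the two values in the list above can be sketched as a plain arithmetic mean. The token strings and likelihood numbers below are made up for illustration and are not output from any model; the function name average_likelihood is likewise hypothetical.

```python
def average_likelihood(token_likelihoods):
    """Return the arithmetic mean of per-token likelihood values."""
    if not token_likelihoods:
        raise ValueError("at least one token likelihood is required")
    return sum(token_likelihoods) / len(token_likelihoods)


# Hypothetical (token, likelihood) pairs for one four-token generation.
tokens = [("a", -0.5), (" truly", -1.5), (" immersive", -1.0), (" experience", -1.0)]

avg = average_likelihood([likelihood for _, likelihood in tokens])
print(avg)
```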

The following is one example set of outputs that a Large model generates.

The input entered is “This curved gaming monitor delivers ...”

The outputs generated (with the maximum number of tokens set to 4), sorted by average token likelihood, are as follows:

-0.96   a truly immersive experience
-1.11   a virtually seamless view
-1.70   the ultimate viewing experience
-2.15   a 144Hz rapid
-2.44   a comfortable and stylish

You can use these outputs in a number of ways, for example, by selecting the one with the highest likelihood as the final output, or by presenting these as options in your application.
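Selecting the most likely output can be sketched as a simple sort. The example uses the five generations and average likelihoods listed above; the (likelihood, text) tuple layout is an assumption for illustration, not the shape of the actual API response.

```python
# (average likelihood, generated text) pairs from the example above,
# deliberately listed out of order.
generations = [
    (-2.15, "a 144Hz rapid"),
    (-0.96, "a truly immersive experience"),
    (-2.44, "a comfortable and stylish"),
    (-1.11, "a virtually seamless view"),
    (-1.70, "the ultimate viewing experience"),
]

# Values closer to zero indicate more likely text, so sort in descending order.
ranked = sorted(generations, key=lambda g: g[0], reverse=True)

best_likelihood, best_text = ranked[0]
print(best_text)  # a truly immersive experience

# Alternatively, keep the full ranked list to present as options.
options = [text for _, text in ranked]
```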