Model Parameter Now Optional
Our APIs no longer require a model to be specified. Each endpoint comes with great defaults. For more control, a model can still be specified by adding a `model` param to the request.
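As a minimal sketch of this behavior (the field names and default below are illustrative assumptions, not the documented API schema), omitting `model` falls back to the endpoint's default, while adding the param overrides it:

```python
# Illustrative request payloads; field names are assumptions, not the documented API.

# Default: no model specified -- the endpoint picks its own default.
default_request = {"prompt": "Write a product description for a coffee grinder."}

# Explicit: add a `model` param to override the default.
explicit_request = {**default_request, "model": "large"}

def chosen_model(payload, endpoint_default="medium"):
    """Mimic server-side selection: use the request's model if present."""
    return payload.get("model", endpoint_default)
```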
Updated Small, Medium, and Large Generation Models
Updated `small`, `medium`, and `large` models are more stable and resilient against abnormal inputs thanks to an FP16 quantization fix. We also fixed a bug in the generation presence and frequency penalties, which results in more effective penalties.
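To illustrate how presence and frequency penalties discourage repetition, here is a sketch of the common formulation (an assumption for illustration; the changelog does not specify Cohere's exact formula):

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, presence_penalty=0.0, frequency_penalty=0.0):
    """Penalize tokens that have already been generated.

    Common formulation (an assumption, not necessarily Cohere's exact one):
    a token's logit is reduced by presence_penalty once if it appeared at all,
    plus frequency_penalty for each time it appeared.
    """
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for tok, count in counts.items():
        if tok in adjusted:
            adjusted[tok] -= presence_penalty + frequency_penalty * count
    return adjusted
```

With both penalties active, a token generated twice is penalized more than one generated once, steering sampling toward unseen tokens.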
New Extremely Large Model!
Our new and improved `xlarge` has better generation quality and 4x faster prediction speed. This model now supports a maximum token length of 2048 tokens, as well as frequency and presence penalties.
New & Improved Generation and Representation Models
We've retrained our `small`, `medium`, and `large` generation and representation models. Updated representation models now support contexts up to 4096 tokens (previously 1024 tokens). We recommend keeping text lengths below 512 tokens for optimal performance; any text longer than 512 tokens is split into chunks, and the resulting embeddings of each chunk are averaged and returned.
Finetuning Available + Policy Updates
Finetuning is Generally Available
New & Improved Generation Models
We’ve shipped updated `small`, `medium`, and `large` generation models. You’ll find significant improvements in performance that come from our newly assembled high-quality dataset.
Classification Endpoint
Classification is now available via our classification endpoint. This endpoint is currently powered by our generation models (`small` and `medium`) and supports few-shot classification. We will be deprecating support for Choose Best by May 18th. To learn more about classification at Cohere, check out the docs here.
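As a sketch of what a few-shot classification request might carry (the field names below are illustrative assumptions, not the documented endpoint schema), each request pairs the inputs to classify with a handful of labeled examples:

```python
# Illustrative few-shot classification payload; field names are assumptions,
# not the documented endpoint schema.
classify_request = {
    "model": "medium",
    "inputs": ["The package arrived broken."],
    "examples": [
        {"text": "Loved it, works perfectly", "label": "positive"},
        {"text": "Terrible, want a refund", "label": "negative"},
        {"text": "Best purchase this year", "label": "positive"},
        {"text": "Stopped working after a day", "label": "negative"},
    ],
}

# The labeled examples define the label set the endpoint can predict from.
label_set = {ex["label"] for ex in classify_request["examples"]}
```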
Extremely Large (Beta) Release
Our biggest and most performant generation model is now available. Extremely Large (Beta) outperforms our previous `large` model on a variety of downstream tasks, including sentiment analysis, named entity recognition (NER), and common-sense reasoning, as measured by our internal benchmarks. You can access Extremely Large (Beta) as `xlarge-20220301`. While in beta, note that this model has a maximum token length of 1024 tokens and a maximum `num_generations` of 1.
Larger Representation Models
Representation models are now available in the sizes of `medium-20220217` and `large-20220217`, as well as an updated version of `small-20220217`. Our previous `small` model will remain available as `small-20211115`. In addition, the maximum token length per text has increased from 512 to 1024. We recommend keeping text lengths below 128 tokens for optimal performance; any text longer than 128 tokens is split into chunks, and the resulting embeddings of each chunk are averaged and returned.