Finetune Troubleshooting

In this post, we answer frequently asked questions about finetuning.

Troubleshooting Representation finetunes

Gathering enough data

While our Classify endpoint lets you build a classifier with as few as 5 examples per label, that classifier runs on our baseline model, which has not been trained for specific use cases. To finetune a model instead, your dataset must contain at least 250 labelled examples before training can start.

If you cannot find a relevant labelled dataset online, we suggest generating labelled examples with our Generate endpoint. Check out this sample preset, in which a user generates product feedback to finetune a product feedback classifier:

This is a screenshot of Cohere's Playground generating sample data for representation finetuning.

Data formatting best practices

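Ensure your data is in a two-column CSV file. The first column should be the sample text you'd like to classify or search, and the second column should be the label for that text. We recommend using a comma (,) as your delimiter. Text that itself contains a comma should be wrapped in double quotes so it is not split into extra columns.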

Here are a few example lines from a dataset that could be used to train a model that classifies headlines as positive, negative, and neutral with our Classify endpoint:

The major construction companies of Finland are operating in Russia,neutral
"$ESI on lows, down $1.50 to $2.50 BK a real possibility",negative
$SPY wouldn't be surprised to see a green close,positive

To pass data validation, ensure that:

  • There are at least 5 examples for each label in your dataset
  • Your dataset contains at least 250 unique examples in total (not 250 examples per label)
  • Your data is encoded in UTF-8
  • There are no duplicate examples (we automatically deduplicate your dataset, so duplicates do not count toward the 250-example minimum)
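
The checklist above can be approximated locally before uploading. The following is a minimal sketch, not Cohere's actual validator: the function name `validate_dataset` and its messages are hypothetical, it assumes a comma-delimited file with no header row, and it treats two rows as duplicates only when both text and label match exactly.

```python
import csv
import io
from collections import Counter

MIN_PER_LABEL = 5   # minimum examples per label, taken from the checklist above
MIN_TOTAL = 250     # minimum unique examples overall, taken from the checklist above


def validate_dataset(raw_bytes):
    """Return a list of problems found in a two-column CSV supplied as bytes.

    Hypothetical helper: an empty list means the local checks passed.
    """
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    problems = []
    seen = set()
    reader = csv.reader(io.StringIO(text, newline=""))
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            problems.append(f"row {row_number}: expected 2 columns, found {len(row)}")
            continue
        # Storing (text, label) pairs in a set drops exact duplicates.
        seen.add((row[0].strip(), row[1].strip()))

    if len(seen) < MIN_TOTAL:
        problems.append(f"only {len(seen)} unique examples, need at least {MIN_TOTAL}")

    label_counts = Counter(label for _, label in seen)
    for label, count in sorted(label_counts.items()):
        if count < MIN_PER_LABEL:
            problems.append(
                f"label {label!r} has {count} examples, need at least {MIN_PER_LABEL}"
            )
    return problems
```

To check a file on disk, read it in binary mode and pass the bytes in, for example `validate_dataset(open("headlines.csv", "rb").read())`, where `headlines.csv` is a placeholder file name.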

Formatting errors for classification tasks

Cohere's Classify endpoint returns a prediction for every class, and these predictions sum to 1. We do not currently support assigning multiple labels to a single text (known as multi-label classification). Each example text should be mapped to one label only.

Take this example below:

"How Robots Can Assist Students with Disabilities","technology,health"
"As Gas Prices Went Up, So Did the Hunt for Electric Vehicles","technology,economy"

Currently, we will process this data and train with two labels, "technology,health" and "technology,economy", instead of the three intended labels: technology, health, and economy.

In this case you will need to select one label for each headline.
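
Rows like these can be spotted before upload. Here is a small sketch that flags any row whose label field contains more than one label. The helper name `find_multilabel_rows` is hypothetical, and it assumes labels inside a field are separated by commas, as in the example above. It only reports the rows; picking the single label to keep is left to you.

```python
import csv
import io

# The two headlines from the example above.
sample = '''"How Robots Can Assist Students with Disabilities","technology,health"
"As Gas Prices Went Up, So Did the Hunt for Electric Vehicles","technology,economy"
'''


def find_multilabel_rows(csv_text, separator=","):
    """Return (row_number, text, labels) for rows whose label field holds several labels."""
    flagged = []
    for row_number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if len(row) != 2:
            continue  # not a text/label pair; out of scope for this check
        text, label = row
        parts = [part.strip() for part in label.split(separator)]
        if len(parts) > 1:
            flagged.append((row_number, text, parts))
    return flagged


for row_number, text, labels in find_multilabel_rows(sample):
    print(f"row {row_number}: choose one of {labels} for {text!r}")
```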

Formatting for search tasks

At this time, even if you intend to finetune a representation model for a search task with Cohere's Embed endpoint (rather than to predict a label), you still need to assign a label to each text for representation finetuning.

For example, if you are building a search engine for Hacker News posts and you want to either cluster similar posts or associate posts with a certain keyword, you would create a labelled dataset with the post titles mapped to the keyword. See a few sample lines labelled below:

post title,keyword
Advice for a new and inexperienced tech lead?,career
How do you deal with getting old and feeling lost?,personal
Best way to learn modern C++?,skills
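
A dataset like this can be written with Python's standard `csv` module, which quotes any title containing a comma so the file stays two columns wide. This is a sketch under assumptions: the `to_csv` helper and the fourth post are hypothetical, and because this page does not say whether a header row is expected, the sketch writes data rows only.

```python
import csv
import io

posts = [
    ("Advice for a new and inexperienced tech lead?", "career"),
    ("How do you deal with getting old and feeling lost?", "personal"),
    ("Best way to learn modern C++?", "skills"),
    # Hypothetical extra row, included to show a title that contains commas.
    ("Ask HN: Rust, Go, or Zig for a new project?", "skills"),
]


def to_csv(rows):
    """Serialize (title, keyword) pairs as comma-delimited text."""
    buffer = io.StringIO(newline="")
    # The default QUOTE_MINIMAL setting quotes only the fields that need it.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(posts))
```

When saving to disk, encode the result as UTF-8, for example `open("posts.csv", "w", encoding="utf-8", newline="")`, where `posts.csv` is a placeholder file name.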

If you are topic modelling and trying to find clusters, we recommend trying the baseline model. Check out our blog post on topic modelling Hacker News posts.

Troubleshooting auto evaluation metrics

When you view auto evaluation metrics during or after a finetune, you may find that the F1, Recall, and Precision metrics are missing. This can occur if your dataset is extremely imbalanced (e.g. a binary dataset with 95% positive labels and 5% negative labels) and the finetuned model fails to predict one of the labels at all. This is only a warning and does not prevent you from using the finetuned model.

To resolve this warning, try adding more examples for labels with less data.
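
To see why a label that is never predicted can leave a metric without a value, consider the textbook definition of precision: correct predictions of a label divided by all predictions of that label. The sketch below illustrates that general definition only and is not Cohere's evaluation code. The helper names `label_shares` and `precision` are hypothetical, and the data mirrors the 95%/5% example above.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of the dataset that each label accounts for."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def precision(y_true, y_pred, label):
    """Precision for one label, or None when the label is never predicted."""
    predicted = sum(1 for p in y_pred if p == label)
    if predicted == 0:
        return None  # 0 / 0 is undefined, so there is no value to report
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
    return correct / predicted


y_true = ["positive"] * 95 + ["negative"] * 5
y_pred = ["positive"] * 100  # a model that never predicts "negative"

print(label_shares(y_true))                     # shares of each label in the data
print(precision(y_true, y_pred, "positive"))    # defined: 95 correct out of 100 predicted
print(precision(y_true, y_pred, "negative"))    # None: "negative" was never predicted
```

Running `label_shares` on your own label column before training is a quick way to find the labels that need more examples.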

How long does it take to finetune?

Finetunes are completed sequentially: when you launch a finetune, it is added to the end of a queue. Depending on the length of our finetuning queue, a finetune may take anywhere from 1 hour to a day to complete.

Using your finetuned model

To use your finetuned model in our API or SDKs, you must call it by its model UUID and not by its name. To get the model UUID, select the model in the playground and click Export Code. Select the library you are using and copy the code to call the model.

This is a screenshot of how to locate the model path to call your finetune.

Restarting paused finetunes

All finetuned models are paused after 14 days of inactivity. To restart your model, select your model in the finetuned models panel and click on the Wake button, pictured below:

This is a screenshot of how to awaken a paused model.

Troubleshooting failed finetunes

Our engineers review every failed finetune and will attempt to rectify it without any action on your part. If we cannot resolve a failed finetune ourselves, we will reach out to you.

Please reach out to us or post in our co:mmunity Discord if you have unanswered questions about finetuning.