
Finetuning Generation Models

An Overview of Finetuning#

Finetuning is the process of taking a pre-trained Large Language Model (LLM) and customizing it with a dataset to excel at a specific task. Finetuning LLMs tends to produce some of the best-performing models in NLP across a wide range of tasks.

A baseline model comes pre-trained on a huge amount of text data. Finetuning builds on that by taking in your own training data and adapting the model to it. The result is a custom, finetuned model that produces outputs more attuned to the task you have at hand.

Finetuning uses training data to turn a baseline pre-trained model into a custom, finetuned model

In this article, we look at finetuning a generation model. See here for finetuning a representation model.

When to Finetune#

Finetuning a large language model is only required when you need to teach it something extremely niche, such as the different gaits of a horse or your company's unique knowledge base. Common knowledge, like the colour of the sky, does not require finetuning. Finetuning is also helpful for generating or understanding data in a specific writing style or format, and it may be helpful regardless of which of our endpoints you are using.

Data Input#

Our platform allows you to upload data directly or link to it. To link to data, you can use any URL that is publicly accessible. If you would like to link to data in a Google Cloud Storage or AWS bucket while keeping the files secured, you can use a signed URL. The easiest way to obtain a signed URL for GCS is to copy the download link in the web UI.
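If you prefer to generate a GCS signed URL programmatically rather than through the web UI, a minimal sketch using the google-cloud-storage Python package looks like the following; the bucket and object names here are hypothetical placeholders.

import datetime
from google.cloud import storage

client = storage.Client()                     # uses your default GCP credentials
bucket = client.bucket("my-training-data")    # hypothetical bucket name
blob = bucket.blob("finetune/train.txt")      # hypothetical object path

signed_url = blob.generate_signed_url(
    version="v4",
    expiration=datetime.timedelta(hours=24),  # how long the link remains valid
    method="GET",
)
print(signed_url)  # paste this URL into the data link field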

Separators#

When finetuning, there is an option to pass in a separator to denote a "unit" of training data. We recommend using a special string, such as --SEPARATOR--, to distinguish training examples from one another. When working with longform text, make sure the examples produced by splitting on the separator are not much shorter than the length of text you would like the finetuned model to generate.

For example, when finetuning a model to generate haikus, an example input .txt file might look like this (a short sketch for producing such a file programmatically follows the example):

visualizations
of computational graphs
as the thunder storms
--SEPARATOR--
when i die, bury
me under a v3-8
in europe west 4
--SEPARATOR--
i can make you cry
using just five syllables:
anisotropy
--SEPARATOR--
beneath the oak tree
gazing into the distance
watching tensors flow
--SEPARATOR--
these shenanigans
will not be the death of me
scatter_nd will
--SEPARATOR--
torch or tensorflow?
the answer is crystal clear:
it's obviously jax.
--SEPARATOR--
i have on my ribs
attention is all you need
tattooed in red ink
--SEPARATOR--
spin me in weight space
paint me in half precision
we're best in chaos
--SEPARATOR--
the one thing worse than
good ol' anisotropy:
off-by-one error
--SEPARATOR--
seventeen zero
six. zero three seven six
two; is all you need
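
A file like the one above can be produced programmatically. The following is a minimal Python sketch, assuming your examples live in a list; the output filename is arbitrary.

SEPARATOR = "--SEPARATOR--"

examples = [
    "visualizations\nof computational graphs\nas the thunder storms",
    "when i die, bury\nme under a v3-8\nin europe west 4",
    # ... the rest of your training examples
]

# Join the examples with the separator on its own line and write them to disk.
with open("haikus.txt", "w", encoding="utf-8") as f:
    f.write(f"\n{SEPARATOR}\n".join(examples))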

Data Size#

The following are the minimum data requirements for finetuning:

  • File size: minimum 100KB (we recommend at least 1MB)
  • Number of examples (only if you use separators): 200

In general, we recommend finetuning with as much data as possible for the best results.
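As a quick sanity check before uploading, a rough sketch along these lines can verify both requirements; it assumes a file named train.txt and the --SEPARATOR-- string.

import os

SEPARATOR = "--SEPARATOR--"
path = "train.txt"

# File size in kilobytes.
size_kb = os.path.getsize(path) / 1024

# Number of non-empty examples after splitting on the separator.
with open(path, encoding="utf-8") as f:
    examples = [ex for ex in f.read().split(SEPARATOR) if ex.strip()]

print(f"File size: {size_kb:.1f} KB (minimum 100 KB, at least 1 MB recommended)")
print(f"Examples:  {len(examples)} (minimum 200 when using separators)")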

Data Quality#

We recommend performing some common checks on data quality and removing the following (a rough filtering sketch appears after this list):

  • data with excessive spacing or newlines
  • highly repetitive data
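
The thresholds in the sketch below are arbitrary assumptions and should be tuned to your own data; it assumes the --SEPARATOR-- string.

import re

SEPARATOR = "--SEPARATOR--"

def is_clean(example: str) -> bool:
    # Drop examples with excessive spacing or runs of blank lines.
    if re.search(r" {4,}", example) or re.search(r"\n{3,}", example):
        return False
    # Drop highly repetitive examples: few unique lines relative to total lines.
    lines = [line.strip() for line in example.splitlines() if line.strip()]
    if lines and len(set(lines)) / len(lines) < 0.5:
        return False
    return True

with open("train.txt", encoding="utf-8") as f:
    examples = [ex.strip() for ex in f.read().split(SEPARATOR) if ex.strip()]

cleaned = [ex for ex in examples if is_clean(ex)]
with open("train_clean.txt", "w", encoding="utf-8") as f:
    f.write(f"\n{SEPARATOR}\n".join(cleaned))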

Finetuning a Generation Model: Step-by-step#

Finetuning a generation model consists of a few simple steps. Let’s go through them.

On the Cohere platform, go to the Dashboard and click on ‘Create Finetune’.

Creating a finetune

Choose the Baseline Model#

Choose ‘Generation (Generate)’ as the model type and select the size of your choice. There is a tradeoff—in general, bigger models exhibit better performance while smaller models are faster to finetune.

Upload Your Data#

Upload your data by clicking on ‘Choose a .txt file’. Your data should be in TXT format. If your data contains separators to distinguish between training examples, add the separator string in the ‘Data separator’ field (for example: --SEPARATOR--). If your data doesn’t contain separators, leave the field blank.

Choosing model and uploading data

Once done, click on ‘Preview data’.

Previewing data

Preview Your Data#

The preview window shows a few samples of your data after it has been split into examples. If you included a data separator, the data is split on the separator; otherwise, the split is done automatically.

If you are happy with how the samples look, click on ‘Review data’.

Reviewing data

Start Finetuning#

Now, everything is set for finetuning to begin. Click on ‘Start finetuning’ to proceed.

Starting finetuning

Monitor the Status#

You can view the status of the finetuning by going to the Dashboard.

Monitoring status

You can also view a more detailed log by hovering over your finetuning task and clicking on ‘View logs’.

Viewing logs

Here you can track the progress of the finetuning task.

Detailed logs

Once finetuning is completed, the status will be shown as ‘Ready’.
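
Once the finetune is ‘Ready’, you can call it like any other generation model. A minimal sketch using the Cohere Python SDK follows; the API key and model ID are placeholders, and the exact model ID to use is shown on your Dashboard.

import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

response = co.generate(
    model="your-finetuned-model-id",  # hypothetical placeholder; copy the ID from the Dashboard
    prompt="visualizations\n",
    max_tokens=30,
)
print(response.generations[0].text)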