Qiu et al. (2020) describe the history, technical aspects, and applications of pre-trained language models such as those that power the Cohere API. We recommend reading this survey and other language modeling research to learn what kinds of knowledge are encoded in language models and how to use their outputs responsibly in downstream tasks.
Language models might encode the following:
- Linguistic information such as subject-verb agreement, part-of-speech, and other simple syntactic structures (Liu et al., 2019; Hewitt et al., 2019).
- World knowledge, including relational and commonsense knowledge such as where famous individuals were born or the color of the sky, limited by what is contained in the training data.
- Social biases, such as stereotypes common on the internet or in Western culture (May et al., 2019).
The following factors may impact our language models’ performance.
Note: We use the Similarity function for demonstrations throughout the Model Limitations section. Given an Anchor text and a set of candidate Target texts, Similarity predicts the Target text that our language model identifies as most semantically similar to the Anchor text.
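The text above does not specify how Similarity is computed. As a rough illustration only, the sketch below assumes that each text is mapped to an embedding vector and that the Target with the highest cosine similarity to the Anchor is returned. The function names `cosine_similarity` and `most_similar_target` and the toy vectors are hypothetical and are not part of the Cohere API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_target(anchor_vec, target_vecs):
    """Return the label of the target whose vector is most similar to the anchor.

    target_vecs maps a target label to its embedding vector.
    """
    return max(
        target_vecs,
        key=lambda label: cosine_similarity(anchor_vec, target_vecs[label]),
    )


# Hypothetical toy embeddings; real embeddings would come from a language model.
anchor = [0.9, 0.1, 0.0]
targets = {
    "target A": [0.8, 0.2, 0.1],
    "target B": [0.0, 0.1, 0.9],
}

best = most_similar_target(anchor, targets)
print(best)
```

Under this assumption, every limitation described in this section shows up as a distortion in the embedding space: if the model represents a phrase poorly, the predicted Target can be a poor semantic match even though the arithmetic above is correct.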
Language: Due to the lack of available training data and evaluation datasets for the majority of the world’s languages, the model is unlikely to perform well on languages other than the dominant dialects of English (Joshi, 2020; Dodge, 2021).
Example: Our language models may fail to meaningfully represent non-English phrases and should not be used, for instance, to find good translations.
Socio-economic: The majority of publicly available data used to train the model is from wealthier individuals in more developed countries, and is largely Western-centric (Pew, 2021). As a result, performance will likely degrade on text about concepts, people, and places from other regions, especially those from the Global South.
Example: The models may prefer phrases and ideas associated with Western ideals or with wealthier and more technologically developed cultures.
Historical: At any point, the model will only represent the concepts, events, places, and people from data on which it was trained. Information from after the dataset was gathered will not be represented. For example, if an event occurred today, the model would not be able to return a meaningful representation of the event name. Additionally, the model may amplify outdated societal biases about groups of people.
Ungrounded: Model outputs are derived statistically rather than from any direct modeling of the meaning of words and phrases, and therefore they should not be interpreted as a grounded means for understanding text (Bender, 2020).
Example: The models may fail to perform simple reasoning tasks involving chronological, arithmetic, or other logical relationships.
Biases: Language models capture the hegemonic viewpoint, reflecting and magnifying biases that exist on the internet (Bender, 2021). As a result, marginalized groups can be harmed when models entrench existing stereotypes or produce demeaning portrayals (Crawford, 2017). Despite our ongoing efforts to mitigate these biases, we acknowledge that bias mitigation remains an open research area (Gonen, 2019).
Example: The models may associate gender, racial, and other identities with professions and concepts that are semantically unrelated to those identities. These associations are likely to reflect biases present in the historical data the model was trained on. Kurita et al. investigate this in more detail.