
Tokenize

This endpoint splits input text into smaller units called tokens using byte-pair encoding (BPE). To learn more about tokenization and BPE, see the tokens page.
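As a rough illustration of the idea behind BPE, the sketch below repeatedly merges the most frequent adjacent pair of symbols in a string. It is a toy example only: the helper names `most_frequent_pair` and `merge_pair` are made up for this illustration, and the endpoint's actual tokenizer relies on a pre-trained merge table and vocabulary that are not described on this page.

```python
from collections import Counter
from typing import List, Tuple


def most_frequent_pair(symbols: List[str]) -> Tuple[str, str]:
    """Return the adjacent pair of symbols that occurs most often."""
    pairs = Counter(zip(symbols, symbols[1:]))
    return pairs.most_common(1)[0][0]


def merge_pair(symbols: List[str], pair: Tuple[str, str]) -> List[str]:
    """Replace each non-overlapping occurrence of `pair` with one merged symbol."""
    merged: List[str] = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2  # skip both halves of the merged pair
        else:
            merged.append(symbols[i])
            i += 1
    return merged


# Start from single characters and apply three merge steps.
symbols = list("aaabdaaabac")
for _ in range(3):
    symbols = merge_pair(symbols, most_frequent_pair(symbols))
```

Each merge shortens the sequence while preserving the original text. A production tokenizer would then map every resulting symbol to an integer ID from its vocabulary, which is the form this endpoint returns.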

Usage

    Sample Response

    {
      "tokens": [34160, 974, 514, 34, 1420, 69]
    }

    Request

    text

    string

    The string to be tokenized.
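A minimal sketch of building such a request with the Python standard library follows. The URL, the bearer-token header, and the function name `build_tokenize_request` are assumptions made for illustration; this page specifies only the `text` field.

```python
import json
import urllib.request

# Placeholder URL: the real endpoint address is not given on this page.
TOKENIZE_URL = "https://api.example.com/tokenize"


def build_tokenize_request(text: str, api_key: str) -> urllib.request.Request:
    """Build a POST request whose JSON body carries the `text` field."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        TOKENIZE_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it keeps the payload easy to inspect without network access.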

    Response

    tokens

    array of tokens

    An array of tokens, where each token is an integer.
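The response shape described above can be checked on the client side with a short sketch like the one below. The function name `parse_tokens` is hypothetical, and the validation rules are an assumption based only on the statement that each token is an integer.

```python
import json
from typing import List


def parse_tokens(response_body: str) -> List[int]:
    """Extract the `tokens` array from a JSON response body and validate it."""
    payload = json.loads(response_body)
    tokens = payload.get("tokens") if isinstance(payload, dict) else None
    if not isinstance(tokens, list):
        raise ValueError("response has no 'tokens' array")
    for token in tokens:
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(token, int) or isinstance(token, bool):
            raise ValueError("every token must be an integer")
    return tokens


# The sample response shown on this page.
sample = '{"tokens": [34160, 974, 514, 34, 1420, 69]}'
token_ids = parse_tokens(sample)
```

Here `len(token_ids)` gives the number of tokens in the input, which is useful when checking text against a length limit.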