Sentiment Analysis

In this section, we show how to use the Choose Best endpoint to do sentiment classification for reviews of a bakery.

The problem we want to solve

For this tutorial, let's assume that we want to classify a set of reviews for a bakery into positive and negative reviews. We might for instance have a review like this:

'Simply best cookies in town. The hype everyone has is real. A bit too sweet so perfect to share with someone.'

that we want to automatically classify as a positive review.

Naturally, the same techniques that we'll use for this problem can be used for any other task where we want to classify a given text according to a fixed set of classes.

How the Choose Best endpoint works

Choose Best takes a query and a list of options, and returns scores that indicate how likely each option is to follow (or precede, depending on the request) the given query.

Using Choose Best for our task

In our case, we could pass a given review as the query and the different classes, e.g. Positive and Negative, as options to Choose Best. The scores we obtain from the endpoint then directly indicate how likely it is that a given review falls into each class.

A few design decisions regarding the query and the options can greatly affect the quality of the classification we get from the models when using Choose Best:

  • In general, differences in the formulation of the options that seem small to us can have a large impact on the output of the model. (See prompt engineering)
  • Try to make the combination of query and a given option sound like something that you would write yourself, or something that you might read online. In our sentiment analysis example this means that we do not just submit 'Positive' or 'Negative' as options but instead use the options 'This is a positive review:' and 'This is a negative review:'.
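The second point can be sketched in Python as a small helper that turns bare class labels into sentence-like options. The function name `make_options` and its default template are illustrative assumptions, not part of the API.

```python
def make_options(labels: list[str], template: str = "This is a {label} review:") -> list[str]:
    """Turn bare class labels into natural-sounding options (hypothetical helper)."""
    # Lowercase the label so it reads naturally in the middle of the sentence.
    return [template.format(label=label.lower()) for label in labels]


options = make_options(["Positive", "Negative"])
print(options)
```

Keeping the template in one place makes it easy to try alternative phrasings, since small changes in wording can shift the scores.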

Example calls to Choose Best

Putting all of the above together, we might then call the API with the following arguments:

  • Query: Simply best cookies in town. The hype everyone has is real. A bit too sweet so perfect to share with someone.
  • Option 1: This is a positive review:
  • Option 2: This is a negative review:
  • Mode: PREPEND_OPTION
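As a rough sketch, these arguments could be assembled into a JSON request body as follows. The field names `query`, `options`, and `mode` are assumptions made for illustration; the actual request schema should be taken from the API reference. The snippet only builds the body and does not send it.

```python
import json

review = (
    "Simply best cookies in town. The hype everyone has is real. "
    "A bit too sweet so perfect to share with someone."
)

# Assumed field names, not the documented schema.
payload = {
    "query": review,
    "options": ["This is a positive review:", "This is a negative review:"],
    # With PREPEND_OPTION, each option is scored as text preceding the query.
    "mode": "PREPEND_OPTION",
}

body = json.dumps(payload, indent=2)
print(body)
```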

The call then returns the following value:

{
  "scores": [
    5.57207,
    5.23814
  ]
}

which indicates that, as we would expect, our model thinks that 'This is a positive review:' is more likely than 'This is a negative review:' to precede the given review.
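Reading the predicted class off such a response amounts to picking the option with the highest score. A minimal sketch, assuming the scores come back in the same order as the options were submitted (as the example above suggests); the variable names are illustrative:

```python
import json

# Response body copied from the example above.
response_text = '{"scores": [5.57207, 5.23814]}'

# Labels listed in the same order as the options in the request (assumed).
labels = ["positive", "negative"]

scores = json.loads(response_text)["scores"]

# Pair each label with its score and keep the highest-scoring pair.
best_label, best_score = max(zip(labels, scores), key=lambda pair: pair[1])
print(best_label, best_score)
```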

Passing the following review to the model:

Had a chcocolate chip walnut cookie.IMO the cookie was way to sweet for palate.
Yes lots of chocolate chips but it was overwhelming along with the high sugar content.
At first it did not not looked like a regular cookie. It actually resembled a scone.
Aside from being too sweet, lets tall price.
At $4 a cookie this pretty expensive and certainly overprice.
So sweet was the cookie that i was only able to eat half of the cookie.
I dont think i would come back for cookies..
my wife also bought a double chocolate chip cookie which we have yet to eat...cookies at home never last this long..

returns the scores:

{
  "scores": [
    6.25480,
    6.26256
  ]
}

which tells us that, according to the model, this review is more likely to be negative than positive, albeit by a small margin.
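Because the margin between the two scores can be this small, it may be useful to report it alongside the predicted class and to flag close calls for manual review. The sketch below does this for both examples; the helper `classify_with_margin` and the threshold `min_margin=0.05` are hypothetical choices for illustration, not values recommended by the API.

```python
def classify_with_margin(scores, labels, min_margin=0.05):
    """Return (best label, margin to the runner-up, is_ambiguous).

    Hypothetical helper; min_margin is an arbitrary illustrative threshold.
    """
    # Sort (label, score) pairs from highest to lowest score.
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    (top_label, top_score), (_, second_score) = ranked[0], ranked[1]
    margin = top_score - second_score
    return top_label, margin, margin < min_margin


labels = ["positive", "negative"]
first = classify_with_margin([5.57207, 5.23814], labels)
second = classify_with_margin([6.25480, 6.26256], labels)
print(first)
print(second)
```

With these inputs the first review has a margin of roughly 0.33 and the second roughly 0.008, so only the second falls below the illustrative threshold.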