How does Spotify use AI to offer 248 million versions of its product?

#001

GOLD NUGGETS

Let's start the Spotify application and check its recommendation engine.
The home screen opens.

Cold Start

A potential problem that occurs when the system has a new user but no information about that user's preferences on which to base recommendations. The goal is to get some context about the user.

AI KNOWLEDGE

Origin - A car engine has difficulty starting when it is cold, but once it warms up, it runs smoothly.

Spotify's Approach to Tackle Cold Start

By choosing at least 3 artists, we give the system more context about us.
The home screen will be organized based on the selected artists.
The home screen is organized using a series of cards and shelves.
A card is a square image that represents a playlist, podcast episode, artist page, album, and so on.
 
A shelf is the row we use to group a series of cards.
Bookcase Analogy

Bookcase (Spotify Home)
Bookshelves (Shelves)
Books (Cards)
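
The bookcase analogy maps naturally onto a small data model. Here is a minimal sketch, assuming plain Python dataclasses; the class names, fields, and sample titles are made up for illustration and say nothing about how Spotify actually stores its home screen.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Card:
    """A 'book': a square image for a playlist, podcast episode, artist page, album, etc."""
    title: str
    kind: str  # hypothetical label, e.g. "playlist" or "album"


@dataclass
class Shelf:
    """A 'bookshelf': the row that groups a series of cards."""
    name: str
    cards: List[Card] = field(default_factory=list)


@dataclass
class Home:
    """The 'bookcase': the home screen as an ordered list of shelves."""
    shelves: List[Shelf] = field(default_factory=list)

    def card_count(self) -> int:
        # Total number of cards across all shelves.
        return sum(len(shelf.cards) for shelf in self.shelves)


# Made-up example content.
home = Home(shelves=[
    Shelf("Shelf 1", [Card("Playlist X", "playlist"), Card("Album Y", "album")]),
    Shelf("Shelf 2", [Card("Episode Z", "podcast episode")]),
])
```

Personalizing the home screen then comes down to two choices: which shelves to show and in what order, and which cards to put on each shelf.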

Tony Jebara - VP of Engineering, Head of ML at Spotify

What now? How can Spotify serve even more personalized content?

Classical Approach - A/B Testing

A simple controlled experiment in which two versions (A and B) of a single variable are compared. They are identical except for one variation that might affect a user's behavior. Version A might be the currently used version (control), while version B is modified in some respect (treatment). [1]

[Figure: control group and treatment group; the treatment switches shelves]

AI KNOWLEDGE

A/B Test Example - Blood Pressure

Control Group - Placebo
Treatment Group - Medicine
Compare Results
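
The "compare results" step is usually a statistical test. The sketch below assumes a binary outcome (for example, a click or no click) and uses a standard two-proportion z-test; the function name and the counts are made up for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    p_pool = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Made-up counts: 1000 users per group, successes are clicks.
z, p_value = two_proportion_z_test(success_a=100, n_a=1000, success_b=130, n_b=1000)
```

A small p-value suggests the difference between control and treatment is unlikely to be due to chance alone. The significance threshold (0.05 is a common convention) should be fixed before the experiment starts.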

Cost of A/B Testing

In this quest to maximize conversions, a cost is incurred: a sizable portion of your traffic is routed to a losing variant, directly reducing your business metrics (like sales or conversions).

 

Most classic A/B tests are, by design, forever in ‘exploration’ mode – after all, determining statistically significant results is their reason for existence, hence the perpetual exploration. In an A/B test, the focus is on discovering the exact conversion rate of variations.

Exploration?

The Exploration–Exploitation Dilemma

The best long-term strategy may involve short-term sacrifices.

 

Exploit

Make the best decision given current information

Explore

Gather more information to make a better decision in the future

 

AI KNOWLEDGE

Dilemma Examples

Online Advertising
Exploit: Show the most successful advert
Explore: Show a new advert

Restaurant Selection
Exploit: Go to your favorite restaurant
Explore: Try a new restaurant

Oil Drilling
Exploit: Drill at the best-known location
Explore: Drill at a new potential oil field

However, Spotify doesn't use this approach; it uses a more sophisticated one. Let me explain.
How can we add a twist to A/B testing: exploitation?

How can we produce faster results, since there is no need to wait for a single winning variation?

How can we have a 'smarter' version of A/B testing that uses machine learning algorithms to drive more traffic to variations that are performing well and less traffic to variations that are underperforming?

Multi-Armed Bandit

Goal - We want to identify the machine with the highest payout and exploit it, i.e., pull it more than the others.

Problem - How can we most efficiently identify the best machine to play and exploit it, while still exploring the many options in real time?

Solution - Multi-Armed Bandit algorithm 

Multi-Armed Bandit

Bandit - old slot machines that rob those who play them

Armed - the machines are used by pulling an arm

Multi - there are several of these machines

eCommerce

Limited money - eCommerce customers

Bandit - eCommerce product

Problem - Determine which product to show to which customer. [1]

Solution - MAB algorithm

GOLD NUGGETS

Summary - A/B Testing vs Multi-Armed Bandit (MAB)

MAB - Action & Reward

Action - Pulling the arm

Reward - Payout after pulling the arm
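
To make action and reward concrete, here is a minimal sketch of one common multi-armed bandit strategy, epsilon-greedy. It is an illustration only: the class name, the payout probabilities, and the choice of epsilon-greedy are assumptions, not something the sources attribute to Spotify.

```python
import random


class EpsilonGreedyBandit:
    """Explore a random arm with probability epsilon, otherwise exploit the best one."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how often each arm was pulled
        self.values = [0.0] * n_arms  # running average reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return max(range(len(self.values)), key=lambda a: self.values[a])  # exploit

    def update(self, arm, reward):
        # Incremental mean: new_avg = old_avg + (reward - old_avg) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(payout_probs, n_rounds=5000, seed=0):
    """Play simulated slot machines with made-up payout probabilities."""
    env = random.Random(seed + 1)
    bandit = EpsilonGreedyBandit(len(payout_probs), seed=seed)
    for _ in range(n_rounds):
        arm = bandit.select_arm()                                   # action
        reward = 1.0 if env.random() < payout_probs[arm] else 0.0   # reward
        bandit.update(arm, reward)
    return bandit


bandit = simulate([0.2, 0.5, 0.8])
```

Unlike an A/B test, the traffic split is not fixed: as the estimates improve, most pulls go to the arm that currently looks best, while a small share keeps exploring the others.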

Can we get more context?

Remember how Spotify does it? By choosing at least 3 artists, we give the system more context about us.

The multi-armed bandit algorithm outputs an action but doesn’t use any information about the state of the environment (context).

If you use a multi-armed bandit to choose whether to display cat images or dog images to the user of your website, you’ll make the same random decision even if you know something about the preferences of the user.

The contextual bandit extends the multi-armed bandit by making the decision conditional on the state of the environment.

Contextual Bandit

[Figure: multi-armed bandit (Action, Reward) next to contextual bandit (State, Action, Reward)]

Contextual Bandit in Spotify Song Selection Scenario

The context is information about the user: where they come from, previously visited pages of the site, device information, geolocation, etc.

An action is the system's choice of which song to display.
An outcome is whether the user clicked on the song or not.
A reward is binary: 0 if there is no click, 1 if there is a click.
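
The scenario above can be sketched as a tiny contextual bandit. This is a simplified illustration that assumes a single discrete context (the device), two made-up songs, invented click probabilities, and one epsilon-greedy estimate per (context, action) pair. A production system would use a learned model over many context features.

```python
import random
from collections import defaultdict


class ContextualEpsilonGreedy:
    """Keeps a separate click-rate estimate for every (context, action) pair."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.counts = defaultdict(int)
        self.values = defaultdict(float)
        self.rng = random.Random(seed)

    def best_action(self, context):
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def select_action(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return self.best_action(context)          # exploit, given the context

    def update(self, context, action, reward):
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical click probabilities per (context, song).
CLICK_PROB = {
    ("mobile", "workout_song"): 0.7, ("mobile", "focus_song"): 0.2,
    ("desktop", "workout_song"): 0.1, ("desktop", "focus_song"): 0.6,
}

env = random.Random(42)
policy = ContextualEpsilonGreedy(["workout_song", "focus_song"])
for _ in range(4000):
    context = env.choice(["mobile", "desktop"])   # state
    action = policy.select_action(context)        # which song to display
    reward = 1 if env.random() < CLICK_PROB[(context, action)] else 0  # click or not
    policy.update(context, action, reward)
```

A plain multi-armed bandit would settle on one song for everyone. Here the preferred song can differ per context, which is the whole point of adding the state.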

GOLD NUGGETS

However, Spotify didn't stop at the contextual bandit...

BaRT - Bandits for Recsplanations as Treatments

BaRT - Spotify's contextual bandit approach to performing exploration-exploitation with recsplanations. 
The goal - quickly help users find something they are going to enjoy listening to

Recsplanations (recommendations paired with explanations) are now a common way for a recommender to tell a user why they are being recommended a particular item.

BaRT learns and predicts satisfaction (e.g., click-through rate, consumption probability) for any combination of item, explanation, and context and, through careful logging and contextual bandit retraining, can learn from its mistakes in an online setting.
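
To get a feel for the idea, here is a toy sketch in the spirit of that description. It is not Spotify's implementation. The items, explanations, contexts, and satisfaction probabilities are all made up, and the choices of a small logistic model, epsilon-greedy selection, and logging the probability of each choice (the propensity) are assumptions made for illustration.

```python
import math
import random
from itertools import product

ITEMS = ["playlist_a", "playlist_b"]                    # hypothetical items
EXPLANATIONS = ["because_you_listened", "popular_now"]  # hypothetical explanations
ARMS = list(product(ITEMS, EXPLANATIONS))               # every (item, explanation) pair


def features(item, explanation, context):
    """Sparse binary features, including item-context and explanation-context pairs."""
    return [f"item={item}", f"expl={explanation}", f"ctx={context}",
            f"item={item}|ctx={context}", f"expl={explanation}|ctx={context}"]


class RewardModel:
    """Logistic regression over the features, trained online with gradient steps."""

    def __init__(self, learning_rate=0.05):
        self.weights = {}
        self.learning_rate = learning_rate

    def predict(self, item, explanation, context):
        score = sum(self.weights.get(f, 0.0) for f in features(item, explanation, context))
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, item, explanation, context, reward):
        error = reward - self.predict(item, explanation, context)
        for f in features(item, explanation, context):
            self.weights[f] = self.weights.get(f, 0.0) + self.learning_rate * error


def choose(model, context, epsilon, rng):
    """Epsilon-greedy over (item, explanation) pairs; also return the choice probability."""
    greedy = max(ARMS, key=lambda arm: model.predict(arm[0], arm[1], context))
    arm = rng.choice(ARMS) if rng.random() < epsilon else greedy
    propensity = epsilon / len(ARMS) + ((1 - epsilon) if arm == greedy else 0.0)
    return arm, propensity


def true_satisfaction(item, explanation, context):
    """Made-up environment: each context prefers one item and one explanation."""
    liked_item, liked_expl = (("playlist_a", "because_you_listened")
                              if context == "morning" else ("playlist_b", "popular_now"))
    return 0.1 + 0.4 * (item == liked_item) + 0.4 * (explanation == liked_expl)


rng = random.Random(7)
model = RewardModel()
log = []  # (context, item, explanation, reward, propensity), kept for later retraining
for _ in range(6000):
    context = rng.choice(["morning", "evening"])
    (item, explanation), propensity = choose(model, context, epsilon=0.1, rng=rng)
    reward = 1 if rng.random() < true_satisfaction(item, explanation, context) else 0
    model.update(item, explanation, context, reward)
    log.append((context, item, explanation, reward, propensity))

average_reward = sum(entry[3] for entry in log) / len(log)
```

The arm here is a pair, so the same item can be shown with different explanations and the model learns which combination works in which context. Logging the propensity alongside each reward is what would let a later retraining step correct for the fact that the data was collected by a biased policy.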

AI KNOWLEDGE

BaRT Experimental Evaluation

How long did someone listen to the recommended song compared to a random recommendation?

The plot shows the probability of a user streaming the recommended song for more than 30 seconds.

BaRT significantly outperforms random recommendation.

GOLD NUGGETS

Tony Jebara - VP of Engineering, Head of ML at Spotify

Or is it...
2021
Embracing Tech Readers
