How does Spotify use AI to offer 248 million versions of its product?

#001

Let's start the Spotify application and check its recommendation engine.

Home screen is opened.

Cold Start

A potential problem that occurs when the system has a new user but no information about that user's preferences on which to base recommendations. The goal is to get some context about the user.

AI KNOWLEDGE

Origin - A car engine has difficulty starting when it is cold, but once it warms up it runs smoothly.

Spotify's Approach to Tackle Cold Start

By choosing at least three artists, the system gets more context about us.

The home screen is then organized based on the selected artists.

The home screen is organized using a series of cards and shelves.

A card is a square image that represents a playlist, podcast episode, artist page, album, and so on.

A shelf is the row we use to group a series of cards.
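The card and shelf layout described above can be pictured as a small nested data structure. This is a minimal sketch only: the class names, fields, and sample titles are assumptions made for illustration, not Spotify's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """One square tile: a playlist, podcast episode, artist page, album, and so on."""
    title: str
    kind: str  # illustrative values such as "playlist", "album", "artist"


@dataclass
class Shelf:
    """A row that groups a series of cards."""
    title: str
    cards: list[Card] = field(default_factory=list)


@dataclass
class Home:
    """The home screen: an ordered list of shelves."""
    shelves: list[Shelf] = field(default_factory=list)


# Hypothetical content, only to show the nesting.
home = Home(shelves=[
    Shelf("Made for you", [Card("Daily Mix 1", "playlist")]),
    Shelf("Recently played", [Card("Some Album", "album"), Card("Some Artist", "artist")]),
])

for shelf in home.shelves:
    print(shelf.title, "->", [card.title for card in shelf.cards])
```

Personalization then amounts to deciding which shelves appear, in which order, and which cards fill each shelf.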

Bookcase Analogy

Bookcase - Spotify Home

Bookshelves - Shelves

Books - Cards

Running with that analogy, each person’s bookcase is uniquely curated by their interests and reading history of the books they collect over time. However, unlike a physical bookcase, Spotify uses machine learning to personalize the shelves and cards based on the content they previously enjoyed or might enjoy, and present it to millions of users.

Tony Jebara - VP of Engineering, Head of ML at Spotify

What now? How to serve even more personalized content?

Classical Approach - A/B Testing

Control Group

Treatment Group

Switch shelves

A simple controlled experiment in which two versions (A and B) of a single variable are compared. The versions are identical except for one variation that might affect a user's behavior. Version A might be the currently used version (control), while version B is modified in some respect (treatment). [1]

[1] A/B Testing

AI KNOWLEDGE

A/B Test Example - Blood Pressure

Control Group - Placebo

Treatment Group - Medicine

Compare Results
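The "compare results" step is usually a statistical test. The sketch below assumes each participant either succeeds or not (for example, converts or does not convert) and applies a standard two-proportion z-test. The function name and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one success rate."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: control (A) versus treatment (B).
z, p = two_proportion_z_test(success_a=100, n_a=1000, success_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be due to chance alone.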

Cost of A/B Testing

In the quest to maximize conversions, A/B testing carries a cost: a sizable portion of your traffic is routed to a losing variant, directly reducing your business metrics (such as sales or conversions).

Most classic A/B tests are, by design, forever in ‘exploration’ mode. Determining statistically significant results is their reason for existence, hence the perpetual exploration. In an A/B test, the focus is on discovering the exact conversion rate of each variation.
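To make that cost concrete, here is a back-of-the-envelope calculation. All numbers are invented, and the helper name is hypothetical.

```python
def conversions_lost(visitors, share_to_loser, rate_winner, rate_loser):
    """Expected conversions given up by sending part of the traffic to the weaker variant."""
    return visitors * share_to_loser * (rate_winner - rate_loser)


# Assumed: 10,000 visitors split 50/50, with conversion rates of 6% and 4%.
lost = conversions_lost(visitors=10_000, share_to_loser=0.5,
                        rate_winner=0.06, rate_loser=0.04)
print(round(lost))
```

The longer the test runs with a fixed split, the more this gap accumulates.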

Exploration?

The Exploration–Exploitation Dilemma

The best long-term strategy may involve short-term sacrifices.

Exploit

Make the best decision given current information

Explore

Gather more information to make a better decision in the future

Online Advertising

Exploit: Show the most successful advert
Explore: Show a new advert

Restaurant Selection

Exploit: Go to your favorite restaurant
Explore: Try a new restaurant

Oil Drilling

Exploit: Drill at the best-known location
Explore: Drill at a new potential oil field
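One simple way to balance the two modes is the epsilon-greedy rule: exploit most of the time and explore with a small probability. The sketch below applies it to the restaurant example. The ratings and the value of `epsilon` are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Pick an option index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore: any option, chosen uniformly
    # exploit: the option with the best estimate so far
    return max(range(len(estimates)), key=estimates.__getitem__)


rng = random.Random(0)
ratings = [4.1, 4.6, 3.8]  # hypothetical current ratings of three restaurants
choices = [epsilon_greedy(ratings, epsilon=0.1, rng=rng) for _ in range(1000)]
print("share of visits to the favorite:", choices.count(1) / len(choices))
```

With `epsilon=0.1`, roughly nine visits in ten go to the current favorite, and the rest are spread across all options to keep gathering information.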

[1] Bandits in Recommender Systems

AI KNOWLEDGE

Dilemma Examples

However, Spotify doesn't use classic A/B testing, but a more sophisticated approach. Let me explain.

How can we add a twist to A/B testing: exploitation?

How can we produce faster results, since there is no need to wait for a single winning variation?

How can we have a ‘smarter’ version of A/B testing that uses machine learning algorithms to drive more traffic to variations that are performing well, while giving less traffic to variations that are underperforming?

Multi-Armed Bandit

Goal - We want to identify the machine with the highest payout and exploit it, i.e. pull it more than the others.

Problem - How to most efficiently identify the best machine to play and exploit it, while still exploring the many options in real time?

Solution - Multi-Armed Bandit algorithm
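There are several multi-armed bandit algorithms. The sketch below uses Thompson sampling with Beta posteriors, picked here only as one common example and not as the method the source has in mind. The payout probabilities are invented for the simulation.

```python
import random


def thompson_sampling(true_rates, pulls, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling; return pull counts per machine."""
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    counts = [0] * len(true_rates)
    for _ in range(pulls):
        # Draw a plausible payout rate for each machine from its Beta posterior.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        counts[arm] += 1
        # Pull the chosen arm and observe whether it pays out.
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return counts


# Hypothetical payout probabilities, unknown to the algorithm.
counts = thompson_sampling(true_rates=[0.2, 0.4, 0.7], pulls=5000)
print(counts)
```

Machines that look weak are still sampled occasionally, but most pulls drift toward the machine with the highest payout as evidence accumulates.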

Multi-Armed Bandit

Bandit - old slot machines that rob those who play them

Armed - the machines are used by pulling an arm

Multi - there are multiple machines (two or more)

eCommerce

Limited money - eCommerce customers

Bandit - eCommerce product

Problem - Which product should be shown to which customer? [1]

Solution - MAB algorithm

GOLD NUGGETS

[1] What are contextual bandits? Is A/B testing dead?

Summary - A/B Testing vs Multi-Armed Bandit (MAB)

MAB - Action & Reward

Action - Pulling the arm

Reward - Payout after pulling the arm

Can we get more context?

Remember how Spotify does it?

By choosing at least three artists, the system gets more context about us.

The multi-armed bandit algorithm outputs an action but doesn’t use any information about the state of the environment (context).

If you use a multi-armed bandit to choose whether to display cat images or dog images to the user of your website, you’ll make the same random decision even if you know something about the preferences of the user.

The contextual bandit extends the multi-armed bandit by making the decision conditional on the state of the environment.
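The cat and dog example can be turned into a toy contextual bandit by keeping one click-rate estimate per (context, action) pair. This is a minimal sketch: the class name, the two user types, and the click probabilities are all assumptions made for the simulation.

```python
import random
from collections import defaultdict


class ContextualEpsilonGreedy:
    """Keeps one click-rate estimate per (context, action) pair."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = defaultdict(int)
        self.clicks = defaultdict(int)

    def _rate(self, context, action):
        n = self.shows[(context, action)]
        return self.clicks[(context, action)] / n if n else 0.0

    def choose(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self._rate(context, a))  # exploit

    def update(self, context, action, reward):
        self.shows[(context, action)] += 1
        self.clicks[(context, action)] += reward


# Simulated users: a "cat_person" clicks cat images more often, and vice versa.
click_prob = {("cat_person", "cat"): 0.8, ("cat_person", "dog"): 0.2,
              ("dog_person", "cat"): 0.2, ("dog_person", "dog"): 0.8}
bandit = ContextualEpsilonGreedy(actions=["cat", "dog"])
env = random.Random(1)
for _ in range(4000):
    context = env.choice(["cat_person", "dog_person"])
    action = bandit.choose(context)
    reward = 1 if env.random() < click_prob[(context, action)] else 0
    bandit.update(context, action, reward)

bandit.epsilon = 0.0  # pure exploitation, to inspect what was learned
print(bandit.choose("cat_person"), bandit.choose("dog_person"))
```

A plain multi-armed bandit would keep a single estimate per image type and so could not serve the two groups differently.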

Contextual Bandit

Diagram - State, Action, Reward

Contextual Bandit in Spotify Song Selection Scenario

The context is information about the user: where they come from, previously visited pages of the site, device information, geolocation, etc.

An action is the system's choice of which song to display to the user.
An outcome is whether the user clicked on a song or not.
A reward is binary: 0 if there is no click, 1 if there is a click.

GOLD NUGGETS

However, Spotify didn't stop at contextual bandits...

BaRT - Bandits for Recsplanations as Treatments

BaRT - Spotify's contextual bandit approach to performing exploration-exploitation with recsplanations.

The goal - quickly help users find something they are going to enjoy listening to

Recsplanations (explanations attached to recommendations) are now a common way for a recommender to tell a user why they are being recommended a particular item.

BaRT learns and predicts satisfaction (e.g., click-through rate, consumption probability) for any combination of item, explanation, and context and, through careful logging and contextual bandit retraining, can learn from its mistakes in an online setting.
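The loop described above (predict satisfaction for each item and explanation pair in a context, show one, log the outcome, retrain) can be illustrated with a toy. This is not Spotify's implementation: the lookup table stands in for whatever learned model predicts satisfaction, epsilon-greedy is assumed as the exploration rule, and the items, explanations, and stream probabilities are invented.

```python
import random
from collections import defaultdict
from itertools import product

ITEMS = ["playlist_a", "playlist_b"]                                 # hypothetical
EXPLANATIONS = ["Because you like jazz", "Popular in your country"]  # hypothetical
ACTIONS = list(product(ITEMS, EXPLANATIONS))

shows = defaultdict(int)
streams = defaultdict(int)


def predicted_satisfaction(context, action):
    """Stand-in reward model: observed stream rate for (context, item, explanation)."""
    n = shows[(context, action)]
    return streams[(context, action)] / n if n else 0.0


def recommend(context, epsilon, rng):
    """Epsilon-greedy over (item, explanation) pairs; also returns the propensity."""
    best = max(ACTIONS, key=lambda a: predicted_satisfaction(context, a))
    action = rng.choice(ACTIONS) if rng.random() < epsilon else best
    # Probability with which this action was shown, logged for later retraining.
    propensity = epsilon / len(ACTIONS) + ((1 - epsilon) if action == best else 0.0)
    return action, propensity


def log_feedback(context, action, streamed):
    shows[(context, action)] += 1
    streams[(context, action)] += int(streamed)


rng = random.Random(0)
log = []
target = ("playlist_b", "Because you like jazz")  # assumed best pair for this user
for _ in range(2000):
    action, propensity = recommend("jazz_fan", epsilon=0.1, rng=rng)
    streamed = rng.random() < (0.6 if action == target else 0.1)  # simulated user
    log_feedback("jazz_fan", action, streamed)
    log.append(("jazz_fan", action, streamed, propensity))

print(max(ACTIONS, key=lambda a: predicted_satisfaction("jazz_fan", a)))
```

The point of the sketch is that the explanation is part of the action: the same playlist can perform differently depending on the reason shown with it.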

[1] Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits

AI KNOWLEDGE

BaRT Experimental Evaluation

Source: Explore, Exploit, and Explain: Personalizing Explainable Recommendations with Bandits

How long did someone listen to the recommended song, compared with a random recommendation?

The plot shows the probability of a user streaming the recommended song for more than 30 seconds.

BaRT significantly outperforms random recommendations.

GOLD NUGGETS

Tony Jebara - VP of Engineering, Head of ML at Spotify

We like to say there is no ‘one’ true Spotify. Essentially, there are 248 million versions of the product, one for every user!

Or is it...

Embracing Tech Readers

Congratulations!
