A/B test!!!

Let's serve either orange or banana to a number of randomly chosen people.

Three weeks later...

It seems that more of the customers served banana were happy than of those served orange.

Banana is awesome, let's serve only banana going forward.
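Before committing to banana, it helps to check that the gap is larger than chance alone would explain. Below is a minimal sketch using a two-proportion z-test; the counts are made up for illustration, and the function name `two_proportion_z` and the 1.96 cutoff (roughly a 5% two-sided test) are assumptions, not part of the original experiment.

```python
from math import sqrt


def two_proportion_z(happy_a, served_a, happy_b, served_b):
    """Return the z statistic for the difference in happy rates (B minus A)."""
    p_a = happy_a / served_a
    p_b = happy_b / served_b
    # Pooled rate under the hypothesis that both fruits perform the same.
    pooled = (happy_a + happy_b) / (served_a + served_b)
    se = sqrt(pooled * (1 - pooled) * (1 / served_a + 1 / served_b))
    return (p_b - p_a) / se


# Hypothetical counts, for illustration only.
orange_happy, orange_served = 480, 1000
banana_happy, banana_served = 540, 1000

z = two_proportion_z(orange_happy, orange_served, banana_happy, banana_served)
significant = abs(z) > 1.96  # assumed threshold, about 5% two-sided
winner = "banana" if z > 0 else "orange"
```

With counts like these the difference clears the assumed threshold; with smaller samples the same gap in rates might not.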

 

How about we have each customer taste a bit of both orange and banana, and we pick the one that made her happier more often?

Bandit Algorithm
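One common way to pick the better option while still serving customers is an epsilon-greedy bandit: mostly serve the current favorite, occasionally try the other. The sketch below is one possible reading of the idea, not the method used in the talk; the class name, the happiness probabilities, and `epsilon=0.1` are all assumptions for illustration.

```python
import random


class EpsilonGreedyBandit:
    def __init__(self, arms, epsilon=0.1, seed=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}  # running mean reward
        self.rng = random.Random(seed)

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(happy_prob, rounds=5000, seed=0):
    """Serve `rounds` simulated customers; reward is 1.0 if the customer is happy."""
    bandit = EpsilonGreedyBandit(happy_prob.keys(), epsilon=0.1, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.choose()
        reward = 1.0 if env.random() < happy_prob[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Hypothetical happiness probabilities, for illustration only.
bandit = simulate({"orange": 0.4, "banana": 0.7})
```

Unlike the three-week A/B test, the bandit shifts traffic toward the better fruit as evidence accumulates, so fewer customers are served the worse option along the way.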

Interesting article:

Is it too good to be true?

What's new in this algorithm

  • Deep Convolutional Neural Network

  • Memory Pool

  • Stickiness to old models. 
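The last two bullets can be sketched in a few lines. This assumes "Memory Pool" refers to experience replay (store past transitions, train on random samples) and "Stickiness to old models" refers to a periodically refreshed target network; both readings are interpretations of the slide. The class names `ReplayMemory` and `TargetSync` are hypothetical, and weights are represented as a plain dict to keep the example self-contained.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayMemory:
    """Fixed-size pool of past transitions; the oldest entries are evicted first."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


class TargetSync:
    """Keeps a frozen copy of the online weights, refreshed every `period` steps."""

    def __init__(self, online_weights, period):
        self.period = period
        self.steps = 0
        self.target_weights = dict(online_weights)

    def step(self, online_weights):
        self.steps += 1
        if self.steps % self.period == 0:
            self.target_weights = dict(online_weights)
        return self.target_weights


memory = ReplayMemory(capacity=3, seed=0)
for t in range(5):
    memory.push(t, 0, 1.0, t + 1, False)
batch = memory.sample(2)
```

Training against the frozen copy rather than the constantly changing online weights is what keeps the learning targets from chasing themselves.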

Proof of Concept

  • Based on a simplified version of DQN
  • Learns to play different genres preferred by the user when engaged with different activities. 
  • Learns to alternate between preferred genres to maximize the variety of the music

Some Example Applications

  • Which recommendation algo/settings to use for this customer now

  • When to prompt for social sharing, registration, etc.

  • Auto equalizer settings
    "Songs on iHeart just sound better!"
    (the success of Beats headphones)

  • Multi-step user registration invitation

Recommendation Models

  • Cues:

    • Location, Time, Accelerometer Reading, Recent Play History
  • Actions:

    • Which recommendation model to use
  • Rewards: 

    • Thumbs up, Favoriting, Share, Listening Time
    • Thumbs down, Skips, Stops 
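For a learning agent, the positive and negative signals above have to be folded into a single number per step. A minimal sketch of one way to do that follows; the event names, the weights, and the per-minute listening bonus are all assumed values for illustration and would need tuning against real data.

```python
# Assumed weights: positive signals add reward, negative signals subtract it.
REWARD_WEIGHTS = {
    "thumbs_up": 1.0,
    "favorite": 1.0,
    "share": 1.0,
    "thumbs_down": -1.0,
    "skip": -0.5,
    "stop": -0.5,
}
LISTEN_REWARD_PER_MINUTE = 0.1  # assumed bonus for listening time


def reward(events, listening_minutes=0.0):
    """Combine feedback events and listening time into one scalar reward."""
    event_total = sum(REWARD_WEIGHTS.get(event, 0.0) for event in events)
    return event_total + LISTEN_REWARD_PER_MINUTE * listening_minutes


liked = reward(["thumbs_up", "share"], listening_minutes=10)
disliked = reward(["skip", "thumbs_down"])
```

Unknown event names contribute nothing, so new signals can be logged before a weight is chosen for them.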

Prompt Sharing

  • Cues:

    • User share history, recent user interaction 
  • Actions:

    • To share or not to share
  • Rewards: 

    • Choose to share
    • Dismissed, App closed

Auto Equalizer Settings

  • Cues:

    • Acoustic profile, Time, Location, Accelerometer Reading
  • Actions:

    • Which equalizer preset to use
  • Rewards: 

    • Thumbs up, Fave, Share, Listening Time
    • Thumbs down, Skip, Volume Down 

Multi-Step User Registration Invitation

A salesperson makes conversation before asking you the question:

 

Do you want one? 

Introducing Shifu

A production-ready Scala port of the original DQN algorithm from DeepMind

Shifu

  • Neural Network
  • Linear Algebra Lib
  • Agent
  • Akka Interface
  • DB

How do we start?

A/B/C testing!

  1. A for strategy A
  2. B for strategy B
  3. C for Shifu driven decision between A and B
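Splitting users across the three arms can be done with a stable hash, so that a given user always lands in the same group without storing any assignment. This is a sketch under assumptions: the function name `assign_group`, the salt string, and the equal three-way split are illustrative choices, not details from the talk.

```python
import hashlib

GROUPS = ("A", "B", "C")  # A: strategy A, B: strategy B, C: Shifu picks A or B


def assign_group(user_id, salt="shifu-abc-test"):
    """Deterministically map a user ID to one of the three test groups."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return GROUPS[int(digest, 16) % len(GROUPS)]


# Rough check of the split over hypothetical user IDs.
counts = {group: 0 for group in GROUPS}
for i in range(3000):
    counts[assign_group(f"user-{i}")] += 1
```

Changing the salt reshuffles everyone, which is useful when starting a fresh experiment that should not inherit the previous grouping.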

Get the Data in!

and then

Questions/Discussion

DQN - internal

By Kailuo Wang
