Machine Learning Canvas

[ Recommender Systems ]

About Me

Garrett Eastham

edgecase

Founder & Chief Data Scientist

  • AI & Ecommerce Focus
  • CS @ Stanford
  • Background in Web Analytics
  • Career in Product Management
  • Prior: Bazaarvoice, RetailMeNot

Today's Talk

Framing Data Science Problems

Recommendation Science

Example: Real-Time Movie Recommendations

Q&A

Blending Art & Science

Data Exhaust

Data Scientists

Product Managers

Customer Portal

Ecommerce Site

Mobile App

Chat Logs

?

People Also Bought...

35%

(Total Product Sales)

Product Recommendation

Given: User (U')

?

Given: Products (|P|)

Choose: P' that maximizes user preference over |P|

The Machine Learning Canvas

Decisions

Making Predictions

Offline Evaluation

Features

ML Task

Data Sources

Collecting Data

Building Models

Value Propositions

Live Evaluation and Monitoring

B

D

A

F

H

C

I

E

G

J

The Machine Learning Canvas

Decisions

Making Predictions

Offline Evaluation

Features

ML Task

Data Sources

Collecting Data

Building Models

Value Propositions

Live Evaluation and Monitoring

B

D

A

F

H

C

I

E

G

J

The Machine Learning Canvas

Decisions

Making Predictions

Offline Evaluation

Features

ML Task

Data Sources

Collecting Data

Building Models

Value Propositions

Live Evaluation and Monitoring

B

D

A

F

H

C

I

E

G

J

The Machine Learning Canvas

Decisions

Making Predictions

Offline Evaluation

Features

ML Task

Data Sources

Collecting Data

Building Models

Value Propositions

Live Evaluation and Monitoring

B

D

A

F

H

C

I

E

G

J

ML Canvas: Recommenders

Decisions

Making Predictions

Offline Evaluation

Features

ML Task

Data Sources

Collecting Data

Building Models

Value Propositions

Live Evaluation and Monitoring

"Show me products that I might want."

Product carousel on PDP's.

Rendered on page-load.

Ranking

(i.e. - candidate scoring)

Clickstream.

Reviews.

Customer feedback.

First party through beacon or feed.

User-Product affinities.

Predict historical user-product affinity

Nightly batch + online updates

A/B testing for improved page performance goals (i.e. - conversion, revenue / user, etc.)

Merchants vs. Machine

Financial Outcome

Merchant

Effort

Coordinate Experience

Coordinate Concepts

Coordinate Inventory

Selecting a title for an email campaign

Updating an Open-to-Buy order

Choosing a set of similar products to show a user

Data Quality is Pivotal

16gb Ram

15'' Screen

4gb Ram

13'' Screen

32gb Ram

13'' Screen

16gb Ram

17'' Screen

Customer searches for "large screen laptop for gaming"

Has a known affinity for "Design-Centric" consumer goods

Candidate Selection

97%

14%

Candidate Ranking

Multiple Screens

Great for Gaming

Design-Centric

Predicting Product Affinity

Given: User (U')

Predict: Affinity (U' | P')

Model Selection

|U|

|P|

User-Product Affinity Matrix (Q)

Q is seeded with known affinities.

Issue: Very few "explicit" affinities are known (sparsity).

Model: Can we predict the full matrix Q using <0.01% of known affinities?

Approximate Matrix Q

User-Product Affinity Matrix (Q)

~

~

User Factor Matrix (X)

Y^T
YTY^T
X
XX

Item Factor Matrix (X)

\cdot
\cdot

Method: Matrix Factorization via Alternating Least Squares

Data: Product Affinities

Movie Ratings (MovieLens)

userId,movieId,rating,timestamp
1,169,2.5,1204927694
1,2471,3.0,1204927438
1,48516,5.0,1204927435
2,2571,3.5,1436165433
2,109487,4.0,1436165496
2,112552,5.0,1436165496
2,112556,4.0,1436165499
3,356,4.0,920587155
3,2394,4.0,920586920
3,2431,5.0,920586945
3,2445,4.0,920586945

Example Records (Movies)

ml-latest/ratings.csv

Determine Hyperparameters

// Load ratings data -- using Spark CSV (https://github.com/databricks/spark-csv)
val ratings = sqlContext.read.format("com.databricks.spark.csv")
                        .option("header", "true").option("inferSchema", "true")
                        .load("file:///.../movies/ml-latest/ratings.csv")
                        .map(r => {
                            Rating(r(0).toString.toInt, r(1).toString.toInt, r(2).toString.toDouble)
                        }).registerTempTable("ratings")

// Create training, validation, and test datasets
val Array(training, test) = ratings.randomSplit(Array(0.8, 0.2))

// Iterate over models and train
def trainAndTest(rank: Int, lambda: Double, numIter: Int) = {
    // Build the recommendation model using ALS on the training data
    val model = ALS.train(training, rank, numIter, lambda)

    // Evaluate the model on test data
    val usersProducts = test.map { case Rating(user, product, rate) =>
        (user, product)
    }
    val predictions = model.predict(usersProducts).map { case Rating(user, product, rate) =>
        ((user, product), rate)
    }

    val ratesAndPreds = test.map { case Rating(user, product, rate) =>
        ((user, product), rate)
    }.join(predictions)

    val MSE = ratesAndPreds.map { case ((user, product), (r1, r2)) =>
        val err = (r1 - r2)
        err * err
    }.mean()

    println("RMSE (validation) = " + MSE + " for the model trained with rank = " 
                                   + rank + ", lambda = " + lambda 
                                   + ", and numIter = " + numIter + ".")
}

Apache Spark (Scala API)

Train on Full Dataset

/* ==== Test Results ====
	[Baseline]
	RMSE (validation) = 0.6832349868066907 for the model
            trained with rank = 5, lambda = 0.1, and numIter = 5.

	[Pivot on Rank]
	RMSE (validation) = 0.6800594873467324 for the model 
            trained with rank = 10, lambda = 0.1, and numIter = 5.
	RMSE (validation) = 0.690144394236897 for the model 
            trained with rank = 20, lambda = 0.1, and numIter = 5.
	RMSE (validation) = 0.698129529945344 for the model 
            trained with rank = 50, lambda = 0.1, and numIter = 5.

	[Pivot on Lambda]
	RMSE (validation) = 0.6925651848679597 for the model 
            trained with rank = 5, lambda = 0.01, and numIter = 5.
	RMSE (validation) = 1.0692672408983346 for the model 
            trained with rank = 5, lambda = 0.5, and numIter = 5.
	RMSE (validation) = 1.7497220606946313 for the model 
            trained with rank = 5, lambda = 1.0, and numIter = 5.

	[Use Best Performing Parameters]
	RMSE (validation) = 0.6649058015762571 for the model 
            trained with rank = 10, lambda = 0.1, and numIter = 20.
 */

// Train full model
val model = ALS.train(ratings, 10, 10, 0.1)

// Save model
model.save(sc, "file:///.../movies/models/ml/v1")

Con Air Recommender

// Open raw movie data
val movies = sqlContext.read.format("com.databricks.spark.csv")
                       .option("header", "true").option("inferSchema", "true")
                       .load("file:///.../movies/ml-latest/movies.csv")
                       .registerTempTable("movies")

// Load item factors
case class ItemFactor(id: Int, features: Array[Double])
case class IndexedItemFactor(id: Int, features: Array[Double], index: Int)
val item_factors = sqlContext.load("file:///.../movies/models/ml/v1/data/product","parquet")
                             .as[ItemFactor].rdd.zipWithIndex.map(x => {
                                IndexedItemFactor(x._1.id, x._1.features, x._2.toInt)
                             })
item_factors.toDF().registerTempTable("item_factors")

// Get the feature vector for Con Air
val target_item_factors = item_factors.filter(i => i.id == 1552).take(1)(0).features

// Iterate over all other item factors and map a cosine distance
case class SimilarItem(movie_id: Int, similarity: Double)
val similar_items = sc.parallelize(item_factors.collect()).map(i => {
    // Calculate similarity
    val similarity = CosineSimilarity.cosineSimilarity(target_item_factors, i.features)

    (similarity, i)
}).takeOrdered(20)(Ordering[Double].reverse.on(x=>x._1)).map(x => SimilarItem(x._2.id, x._1))

// Prep for Spark SQL
sc.parallelize(similar_items).toDF().registerTempTable("similar_items")

// Merge with book lookup and show top 10 similar books
sqlContext.sql("SELECT m.title, s.similarity, s.rank 
                FROM movies m JOIN similar_items s ON m.movieId = s.movie_id 
                ORDER BY s.similarity DESC LIMIT 20").foreach(println)

Similar Films to Con Air

scala> sqlContext.sql("SELECT m.title, s.similarity 
                       FROM movies m 
                       JOIN similar_items s
                        ON m.movieId = s.movie_id 
                       ORDER BY s.similarity
                       DESC LIMIT 20").foreach(println)

[Con Air (1997),1.0]
[Bad Boys (1995),0.9885250995229632]
[Striking Distance (1993),0.9868059080297423]
[City Hunter (Sing si lip yan) (1993),0.9857155838604543]
[Rock, The (1996),0.9855154341766271]
[Another 48 Hrs. (1990),0.9842966650016308]
[Program, The (1993),0.9830120663020663]
[Young Guns II (1990),0.982577633206861]
[Face/Off (1997),0.9825008152957304]
[Navy Seals (1990),0.9823346682235141]
[Assassins (1995),0.9820721829300805]
[Lethal Weapon 3 (1992),0.9820630838696822]
[Sharpe's Challenge (2006),0.9820047546907581]
[My Avatar and Me (Min Avatar og mig) (2010),0.9817191629728049]
[Blue Streak (1999),0.9817106802763232]
[Days of Thunder (1990),0.9816860526998341]
[Outbreak (1995),0.9804433142553622]
[Whispers in the Dark (1992),0.9803792985641578]
[Rapid Fire (1992),0.9803632101298569]
[Cowboy Way, The (1994),0.9799338957065918]

Predicting New Affinities

Given: Known Product Affinities

Predict: Unknown Product Affinities

P_u
PuP_u
P_u'
PuP_u'

?

User-Product Affinity Matrix (Q)

P_u
PuP_u

Offline Computation

Online Computation

  • Add P(U) to training set and re-run matrix factorization
P_u = X_u \cdot Y^T
Pu=XuYTP_u = X_u \cdot Y^T
  • Must find an approximation of user factor X(U)
P_u \approx X_u \cdot Y^T
PuXuYTP_u \approx X_u \cdot Y^T
Y^T
YTY^T
X
XX
\cdot
\cdot
X_u
XuX_u

Capturing (Implicit) Affinities

Nicolas Cage

Browsing Time (Session)

Derived Movie Affinities

People also liked...

76s

18s

29s

P_u
PuP_u
P_u'
PuP_u'

Approximating X(U)

Q = X \cdot Y^T
Q=XYTQ = X \cdot Y^T
Q_u = X_u \cdot Y^T
Qu=XuYTQ_u = X_u \cdot Y^T
P_u \approx X_u \cdot Y^T
PuXuYTP_u \approx X_u \cdot Y^T

Find: 

X_u = X(U|P_u)
Xu=X(UPu)X_u = X(U|P_u)
(Y^T Y)(Y^TY)^{-1} = 1
(YTY)(YTY)1=1(Y^T Y)(Y^TY)^{-1} = 1
Y[Y(Y^TY)^{-1}] = 1
Y[Y(YTY)1]=1Y[Y(Y^TY)^{-1}] = 1
Y(Y^TY)^{-1} = (Y^T)^{-1}
Y(YTY)1=(YT)1Y(Y^TY)^{-1} = (Y^T)^{-1}

Identity Property

P_u (Y^T)^{-1} \approx X_u \cdot Y^T (Y^T)^{-1}
Pu(YT)1XuYT(YT)1P_u (Y^T)^{-1} \approx X_u \cdot Y^T (Y^T)^{-1}
X_u \approx P_u (Y^T)^{-1} = P_u \cdot Y(Y^T Y)^{-1}
XuPu(YT)1=PuY(YTY)1X_u \approx P_u (Y^T)^{-1} = P_u \cdot Y(Y^T Y)^{-1}

Computing P(U)

// Instantiate the user product matrix (in-memory)
val number_of_factors = item_factors.first.features.length
val number_of_items = item_factors.count().toInt
val Y = matrix(item_factors.collect().map(i => i.features), number_of_items, number_of_factors)

// Set up preferred films
val preferred_movies = Array(
  // Provide captured implicit movie preference weights
)

// Create G = Y * (Y.t * Y) ^ -1
val G = Y * inv(Y.t * Y)

// Find predicted user factor vector X_u => P_u * Y * (Y.t * Y) ^ -1'
val X_u = preferred_movies.map(x => x._3 * G(x._2,::))
                          .fold(DenseMatrix.zeros[Double](1, number_of_factors))((acc, v) => {
  acc + v
})

// Use X_u to get personalized movie scores Q_u = X_u * Y.t
val Q_u = X_u * Y.t

// ..... TEST .......

// Sort and pair with movie id
case class ItemRating(id: Int, score: Double)
val rated_items = (0 to (number_of_items - 1)).map(i => ItemRating(i, Q_u(0, i)))
                                              .sortBy(ir => -1.0 * ir.score).toArray
sc.parallelize(rated_items).toDF().registerTempTable("rated_items")

// Find the top recommended movies
sqlContext.sql("SELECT m.title, s.similarity, r.score 
                FROM rated_items r 
                JOIN movies m ON r.id = m.movieId 
                JOIN similar_items s ON s.movie_id = r.id 
                ORDER BY r.score DESC LIMIT 30").foreach(println)

Real-Time Personalization

// Taste Profile: "Lord of War was my JAM!"
val preferred_movies = Array(
  (3717, 403, 10.0),       // Gone in Sixty Seconds
  (1835, 30170, 50.0),     // City of Angels
  (36529, 27388, 100.0),   // Lord of War
  (733, 15060, 25.0)       // The Rock
)
// Taste Profile: "National Treasure was just the best..."
val preferred_movies = Array(
  (3717, 403, 10.0),       // Gone in Sixty Seconds
  (1835, 30170, 50.0),     // City of Angels
  (8972, 30893, 100.0),    // National Treasure
  (47810, 8756, 50.0)      // The Wicker Man
)

A

B

Improving the Results

|U|

|P|

User-Product Affinity Matrix (Q)

A

t

B

How to Contact Me

garrett@dataexhaust.io

  • Follow up questions
  • Model development / implementation
  • Product team training
  • Moral support

Thank You!

Questions?

Machine Learning Canvas: Recommender Systems

By Garrett Eastham

Machine Learning Canvas: Recommender Systems

  • 436