Anime Recommendation

Data Source

From Kaggle Anime Recommendations Database
Recommendation data from 73,516 users at myanimelist.net
https://www.kaggle.com/CooperUnion/anime-recommendations-database
Two files: anime.csv and rating.csv

Data Source

anime.csv

anime_id - myanimelist.net's unique id identifying an anime.
genre - comma-separated list of genres for this anime.
rating - average rating out of 10 for this anime.

Data Source

rating.csv

user_id - non-identifiable, randomly generated user id.
anime_id - the anime that this user has rated.
rating - rating out of 10 this user has assigned (-1 if the user watched it but didn't assign a rating).

Data Visualization

Data Preprocessing

There are 41 distinct genres across all anime.
Each anime is represented by a 41 * 1 vector with one entry per genre.
Each user is likewise represented by a 41 * 1 vector, built from that user's ratings.
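The genre encoding above could be sketched as follows. This is a minimal illustration, assuming each entry of the anime vector is a 0/1 indicator for one genre (the slides do not state the encoding). The function names and the toy genre strings are hypothetical.

```python
def build_genre_index(genre_strings):
    """Map each distinct genre name to a column index (sorted for stability)."""
    genres = set()
    for s in genre_strings:
        genres.update(g.strip() for g in s.split(",") if g.strip())
    return {g: i for i, g in enumerate(sorted(genres))}


def encode_genres(genre_string, genre_index):
    """Return a 0/1 vector with a 1 for every genre listed in genre_string.

    Assumption: a binary indicator per genre; the slides only say "41 * 1 vector".
    """
    vec = [0] * len(genre_index)
    for g in genre_string.split(","):
        g = g.strip()
        if g:
            vec[genre_index[g]] = 1
    return vec


# Toy stand-ins for the "genre" column of anime.csv (not real rows).
rows = ["Action, Comedy", "Drama", "Comedy, Drama, Romance"]
index = build_genre_index(rows)
vectors = [encode_genres(r, index) for r in rows]
```

On the real data the index would have 41 entries, so every vector would have length 41.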

Data Preprocessing

How to deal with the missing values?

For each user, replace every -1 rating with:

5, if the user has assigned no ratings at all

10 - average rating from this user, otherwise
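A minimal sketch of this fill rule, under a loud assumption: it reads "10 - average rating" literally as ten minus the user's mean rating. If the dash on the slide was only a separator, the fill value would simply be the mean, and the marked line would change accordingly. The function name is hypothetical.

```python
def fill_unrated(ratings, default=5.0):
    """Replace -1 ("watched but not rated") entries in one user's ratings."""
    rated = [r for r in ratings if r != -1]
    if not rated:
        # The user never assigned a rating: use the neutral default.
        fill = default
    else:
        # Assumption: literal reading of the slide, 10 minus the user's mean.
        fill = 10 - sum(rated) / len(rated)
    return [fill if r == -1 else r for r in ratings]
```

For example, a user with ratings 8, -1, 6 has a mean of 7, so the -1 becomes 3 under this reading.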

Data Preprocessing

User Instance

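user\_vector = \left( \sum_{i} rating_i \cdot anime\_vector_i \right) \ ./ \ \left( \sum_{i} anime\_vector_i \right)

The sums run over the anime the user has rated, and ./ is element-wise division, so each entry is the user's average rating for one genre.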
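The user instance formula can be sketched in a few lines. Two assumptions are made here that the slides do not state: anime vectors are 0/1 genre indicators, and a genre the user has never watched gets the value 0 (to avoid dividing by zero). The function name is hypothetical.

```python
def user_vector(anime_vectors, ratings):
    """Per-genre average rating: sum(vector * rating) ./ sum(vector)."""
    n = len(anime_vectors[0])
    weighted = [0.0] * n  # numerator: rating-weighted genre sums
    counts = [0] * n      # denominator: how often each genre occurs
    for vec, rating in zip(anime_vectors, ratings):
        for j, v in enumerate(vec):
            weighted[j] += v * rating
            counts[j] += v
    # Assumption: genres with no rated anime are set to 0.0.
    return [w / c if c else 0.0 for w, c in zip(weighted, counts)]
```

With two rated anime, one tagged with genres 0 and 1 (rated 8) and one tagged with genre 1 only (rated 6), the result is 8 for genre 0, 7 for genre 1, and 0 for the unseen genre 2.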

Evaluation

NDCG@50

CG

Cumulative Gain: the sum of the gains of the ranked items
Example:
gain = {3, 5, 9, 6}
CG = 3 + 5 + 9 + 6 = 23

DCG

Discounted Cumulative Gain: the gain at rank i is divided by lg(i + 1), where lg is the base-2 logarithm
Example:
gain = {3, 5, 9, 6}
DCG = 3 / lg(2) + 5 / lg(3) + 9 / lg(4) + 6 / lg(5) = 13.24
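The two examples above can be reproduced with a short sketch. It assumes lg means the base-2 logarithm, which matches the numbers on the slide. The function names are hypothetical.

```python
import math


def cg(gains):
    """Cumulative Gain: plain sum of the gains."""
    return sum(gains)


def dcg(gains):
    """Discounted Cumulative Gain: gain at rank i (1-based) over log2(i + 1)."""
    # enumerate is 0-based, so rank i corresponds to index i - 1 and the
    # discount is log2(index + 2).
    return sum(g / math.log2(idx + 2) for idx, g in enumerate(gains))


gains = [3, 5, 9, 6]
total = cg(gains)
discounted = round(dcg(gains), 2)
```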

NDCG@k

Normalized Discounted Cumulative Gain
k: number of top-ranked items included in the DCG sum
NDCG = DCG / iDCG
iDCG: ideal DCG
Example:
gain = {3, 5, 9, 6}
DCG = 3 / lg(2) + 5 / lg(3) + 9 / lg(4) + 6 / lg(5) = 13.24
iDCG = 9 / lg(2) + 6 / lg(3) + 5 / lg(4) + 3 / lg(5) = 16.58
NDCG@4 = 13.24 / 16.58 = 0.80
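A minimal sketch of NDCG@k that reproduces the example above. As in the slide, the ideal ordering is taken to be the same gains sorted in descending order. Returning 0 when the ideal DCG is 0 is an assumption made here to avoid a division by zero. The function names are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """DCG over the first k gains, with a log2(rank + 1) discount."""
    return sum(g / math.log2(idx + 2) for idx, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    # Assumption: define NDCG as 0.0 when every gain is 0.
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


score = round(ndcg_at_k([3, 5, 9, 6], 4), 2)
```

A ranking that is already in ideal order scores exactly 1, which is what makes the value comparable across users with different rating scales.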

Pros & Cons

Offline: can be computed from logged ratings, with no live user response needed
Relevance can be any real number (in contrast to MAP, which only allows binary relevance)
The score has no direct interpretation on its own, so it is only useful for comparing models.

Usage

Reference

https://www.microsoft.com/en-us/research/publication/evaluating-recommender-systems/

Model

Model 1: KNN

Build a k-d tree from the anime vectors
Query the tree with the user vector, using cosine distance
Recommend the nearest anime
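The nearest-neighbour step can be sketched as below. Note the swap: this uses a brute-force scan rather than a k-d tree, to stay self-contained. The ranking it produces is the same one a k-d tree would give on unit-normalized vectors, because Euclidean distance between unit vectors increases with cosine distance. The ids, vectors, and function names are hypothetical toy values.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearest_anime(user_vec, anime_vectors, k):
    """Return the ids of the k anime closest to the user vector."""
    ranked = sorted(
        anime_vectors,
        key=lambda anime_id: cosine_distance(user_vec, anime_vectors[anime_id]),
    )
    return ranked[:k]


# Toy data: anime_id -> genre vector, plus one user instance.
anime_vectors = {101: [1, 1, 0, 0], 102: [0, 0, 1, 0], 103: [0, 1, 1, 1]}
user_vec = [8.0, 7.0, 0.0, 0.0]
recommended = nearest_anime(user_vec, anime_vectors, k=2)
```

In practice the anime the user has already rated would probably be filtered out before recommending, which the slides do not cover.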

Model 2: Anime Clustering

Group the anime into several clusters
Assign the user instance to one of the clusters
Recommend the highest-rated anime in that cluster
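A rough sketch of the three steps above. The slides do not name a clustering algorithm, so k-means with squared Euclidean distance is an assumption here, as is the deterministic initialisation from the first k points. All names and toy values are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in points]


def recommend_from_anime_cluster(user_vec, anime, k, top_n):
    """anime: list of (anime_id, genre_vector, average_rating) tuples."""
    centroids, labels = kmeans([vec for _, vec, _ in anime], k)
    cluster = nearest(user_vec, centroids)
    in_cluster = [(aid, avg) for (aid, _, avg), lab in zip(anime, labels)
                  if lab == cluster]
    in_cluster.sort(key=lambda pair: pair[1], reverse=True)
    return [aid for aid, _ in in_cluster[:top_n]]


anime = [(1, [1, 0], 7.5), (2, [0, 1], 8.1), (3, [1, 0], 9.0), (4, [0, 1], 6.2)]
picks = recommend_from_anime_cluster([9.0, 2.0], anime, k=2, top_n=2)
```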

Model 3: User Clustering

Group the users into several clusters
Assign the user instance to one of the clusters
Recommend the anime rated highly within that cluster
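The recommendation step for user clusters might look like the sketch below. It assumes the clustering has already been done by some algorithm the slides do not name, so the centroids and per-user labels are inputs. It also assumes "high rating" means the mean rating given by members of the cluster, and that -1 ratings have already been filled or dropped. All names and toy values are hypothetical.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_cluster(user_vec, centroids):
    """Index of the centroid closest to the user vector."""
    return min(range(len(centroids)), key=lambda c: sq_dist(user_vec, centroids[c]))


def recommend_from_user_cluster(user_vec, centroids, user_labels, ratings, top_n):
    """ratings: iterable of (user_id, anime_id, rating) rows."""
    cluster = assign_cluster(user_vec, centroids)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_id, anime_id, rating in ratings:
        if user_labels[user_id] == cluster:
            totals[anime_id] += rating
            counts[anime_id] += 1
    # Mean rating per anime among the members of the chosen cluster.
    means = {a: totals[a] / counts[a] for a in totals}
    return sorted(means, key=lambda a: means[a], reverse=True)[:top_n]


centroids = [[8.0, 2.0], [2.0, 8.0]]  # precomputed cluster centres (toy)
user_labels = {1: 0, 2: 0, 3: 1}      # user_id -> cluster index (toy)
ratings = [(1, 101, 9), (2, 101, 7), (1, 102, 6), (3, 103, 10), (2, 104, 9)]
picks = recommend_from_user_cluster([7.5, 3.0], centroids, user_labels, ratings, 2)
```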

Anime Recommendation

Machine Learning

By deror1869107

8 years ago