Unpacking YouTube’s Recommender System

Over the past couple of years, YouTube has come under fire for its recommender system, with the media suggesting that it promotes violent content or bans LGBT content for violating its terms of service. Seemingly in response to all of this, Google has finally released a paper explaining YouTube’s recommender system, including how it makes recommendations and the information it gathers in doing so.

[Related Article: Trust, Control, and Personalization Through Human-Centric AI]

The paper, by Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, and Ed Chi, discusses some of the problems that conventional recommender systems face, the particular challenges a platform as large as YouTube faces, and the architecture the team used to build their system.

One of the biggest issues the system had to tackle was scalability: few recommender systems have to serve such a large user base or so many individual pieces of content. This meant that the team at Google had to make sure their system would be “effective at training and efficient at serving.”

YouTube’s Recommender System Overview

YouTube’s system learns from two types of user feedback: user engagement (clicks, watches, etc.) and satisfaction behaviors (likes, dislikes). They model their ranking problem as a “combination of classification problems and regression problems with multiple objectives. Given a query, candidate, and context, the ranking model predicts the probabilities of user taking actions such as clicks, watches, likes, and dismissals.”
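
To make the multi-objective setup concrete, here is a minimal sketch of a shared-bottom ranking model with separate classification and regression heads. It is a simplified stand-in, not the architecture described in the paper; the feature dimension, layer sizes, and task names are illustrative assumptions.

# A minimal sketch (not Google's actual implementation) of a multi-objective
# ranking model: one shared representation feeding several task-specific heads.
import tensorflow as tf

# Hypothetical input: a concatenated representation of query, candidate, and context.
features = tf.keras.Input(shape=(256,), name="query_candidate_context")

# Shared hidden layers learn a common representation for all tasks.
shared = tf.keras.layers.Dense(128, activation="relu")(features)
shared = tf.keras.layers.Dense(64, activation="relu")(shared)

# Classification heads: probability the user clicks, likes, or dismisses the video.
p_click = tf.keras.layers.Dense(1, activation="sigmoid", name="click")(shared)
p_like = tf.keras.layers.Dense(1, activation="sigmoid", name="like")(shared)
p_dismiss = tf.keras.layers.Dense(1, activation="sigmoid", name="dismiss")(shared)

# Regression head: an engagement quantity such as watch time, trained with squared error.
watch_time = tf.keras.layers.Dense(1, name="watch_time")(shared)

model = tf.keras.Model(
    inputs=features,
    outputs={"click": p_click, "like": p_like,
             "dismiss": p_dismiss, "watch_time": watch_time},
)
model.compile(
    optimizer="adam",
    loss={
        "click": "binary_crossentropy",
        "like": "binary_crossentropy",
        "dismiss": "binary_crossentropy",
        "watch_time": "mse",
    },
)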

This is a point-wise prediction approach, rather than a pair-wise or list-wise one. While the authors recognize that the latter two could improve the diversity of suggestions, for the time being a point-wise system is the most efficient to serve on a platform of YouTube’s size.
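
As a toy illustration of what point-wise ranking means in practice, each candidate is scored on its own and the list is then sorted by a combined score. The weights and predictions below are made-up placeholders; the paper does not publish its combination function.

def combined_score(preds):
    # Weighted combination of per-task predictions for a single candidate.
    # Weights are illustrative assumptions only.
    return (0.5 * preds["click"]
            + 0.3 * preds["like"]
            + 0.2 * preds["watch_time"]
            - 0.4 * preds["dismiss"])

# Hypothetical per-candidate predictions from a multi-objective ranking model.
candidates = [
    {"video_id": "a", "click": 0.40, "like": 0.10, "watch_time": 0.6, "dismiss": 0.05},
    {"video_id": "b", "click": 0.35, "like": 0.30, "watch_time": 0.8, "dismiss": 0.02},
    {"video_id": "c", "click": 0.50, "like": 0.05, "watch_time": 0.3, "dismiss": 0.20},
]

# Point-wise: each candidate is scored without reference to the others.
ranked = sorted(candidates, key=combined_score, reverse=True)
print([c["video_id"] for c in ranked])

A pair-wise or list-wise approach would instead compare candidates against one another during scoring, which can improve diversity but is more expensive to serve.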

Training the Model

The team trains their proposed models and baseline models sequentially on the YouTube platform, creating and testing the models in their intended environment rather than simulating it elsewhere. Training sequentially also lets the models learn from and adapt to the most recent data as it arrives. Evaluation happens both offline (“AUC for classification task and squared error for regression tasks”) and online, through live A/B testing.
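
As a hedged sketch of those offline metrics, AUC and squared error can be computed per task on held-out logs. The labels and predictions below are placeholders, not YouTube data.

import numpy as np
from sklearn.metrics import roc_auc_score, mean_squared_error

# Placeholder held-out labels and model outputs for one classification task (clicks).
y_true_click = np.array([1, 0, 1, 1, 0])
y_pred_click = np.array([0.8, 0.3, 0.6, 0.9, 0.2])

# Placeholder labels and predictions for one regression task (e.g., seconds watched).
y_true_watch = np.array([120.0, 5.0, 60.0, 300.0, 0.0])
y_pred_watch = np.array([100.0, 10.0, 75.0, 280.0, 15.0])

print("click AUC:", roc_auc_score(y_true_click, y_pred_click))
print("watch-time squared error:", mean_squared_error(y_true_watch, y_pred_watch))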

[Related Article: Designing Better Recommendation Systems with Machine Learning]

Discussions

The paper concludes with proposed points of discussion, including limitations the authors see in their current model, as well as further insights and directions for future research. These include:

  • Exploring new model architecture for multi-objective ranking which balances stability, trainability, and expressiveness
  • Understanding and learning to factorize
  • Model compression to reduce serving costs
