xai-org/x-algorithm: Trending on GitHub
The X For You Feed Algorithm: A Deep Dive into the Technology Behind Personalized Content
Overview
The X For You feed algorithm is a complex system that retrieves, ranks, and filters posts from two sources: in-network content from accounts you follow and out-of-network content discovered through machine learning-based retrieval. The algorithm combines these sources and ranks them together using a Grok-based transformer model that predicts engagement probabilities for each post. In this article, we'll take a closer look at the system architecture, components, and key design decisions that make the X For You feed algorithm tick.
System Architecture
The system architecture of the X For You feed algorithm can be visualized as a pipeline with several stages:
- Query Hydration: Fetch the user's recent engagement history and metadata (e.g., following list)
- Candidate Sourcing: Retrieve candidates from Thunder (recent posts from followed accounts) and Phoenix (ML-discovered posts from the global corpus)
- Candidate Hydration: Enrich candidates with core post data, author information, and video duration
- Pre-Scoring Filters: Remove posts that are duplicates, too old, from the viewer themselves, from blocked/muted accounts, or containing muted keywords
- Scoring: Apply multiple scorers sequentially, including the Phoenix Scorer, Weighted Scorer, and Author Diversity Scorer
- Selection: Sort by score and select the top K candidates
- Post-Selection Processing: Final validation of post candidates to be served
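The stages above can be sketched as a small function that chains sources, filters, and scorers. This is a minimal illustration, not the repository's code: the `Post` and `Query` types, the `run_pipeline` function, and the toy sources and scorer are all hypothetical names invented for this example, and hydration and post-selection validation are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Post:
    post_id: int
    author_id: int
    source: str          # "thunder" (in-network) or "phoenix" (out-of-network)
    score: float = 0.0


@dataclass
class Query:
    viewer_id: int
    following: set = field(default_factory=set)


# Hypothetical stage signatures, chosen for this sketch only.
Source = Callable[[Query], List[Post]]
Filter = Callable[[Query, Post], bool]         # True means "keep"
Scorer = Callable[[Query, List[Post]], None]   # updates post.score in place


def run_pipeline(query: Query, sources: List[Source], filters: List[Filter],
                 scorers: List[Scorer], top_k: int) -> List[Post]:
    # Candidate sourcing: merge candidates from every source.
    candidates = [post for source in sources for post in source(query)]
    # Pre-scoring filters: keep a post only if every filter accepts it.
    candidates = [p for p in candidates if all(f(query, p) for f in filters)]
    # Scoring: apply the scorers one after another.
    for scorer in scorers:
        scorer(query, candidates)
    # Selection: sort by score and keep the top K.
    return sorted(candidates, key=lambda p: p.score, reverse=True)[:top_k]


# Toy stand-ins for Thunder, Phoenix, one filter, and one scorer.
def thunder_source(query: Query) -> List[Post]:
    return [Post(1, 10, "thunder"), Post(2, 99, "thunder")]


def phoenix_source(query: Query) -> List[Post]:
    return [Post(3, 20, "phoenix")]


def not_from_viewer(query: Query, post: Post) -> bool:
    return post.author_id != query.viewer_id


def toy_scorer(query: Query, posts: List[Post]) -> None:
    for post in posts:
        post.score = 1.0 / post.post_id   # arbitrary placeholder score


feed = run_pipeline(Query(viewer_id=99, following={10}),
                    [thunder_source, phoenix_source],
                    [not_from_viewer], [toy_scorer], top_k=2)
feed_ids = [post.post_id for post in feed]
```

Keeping each stage behind a plain callable is one way to get the composability the article describes: a new source, filter, or scorer can be added without touching the others.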
Components
The X For You feed algorithm consists of several components, each responsible for a specific stage in the pipeline:
- Home Mixer: The orchestration layer that assembles the For You feed
- Thunder: An in-memory post store and real-time ingestion pipeline that tracks recent posts from all users
- Phoenix: The ML component with two main functions: retrieval and ranking
- Candidate Pipeline: A reusable framework for building recommendation pipelines
How It Works
The pipeline stages work together to produce a ranked feed of posts for each user. Here's a step-by-step explanation of how it works:
- Query Hydration: The user's recent engagement history and metadata are fetched.
- Candidate Sourcing: Candidates are retrieved from Thunder and Phoenix.
- Candidate Hydration: Candidates are enriched with core post data, author information, and video duration.
- Pre-Scoring Filters: Posts that are duplicates, too old, from the viewer themselves, from blocked/muted accounts, or containing muted keywords are removed.
- Scoring: Multiple scorers are applied sequentially, including the Phoenix Scorer, Weighted Scorer, and Author Diversity Scorer.
- Selection: The top K candidates are selected based on their scores.
- Post-Selection Processing: Final validation of post candidates to be served.
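The article names an Author Diversity Scorer but does not describe how it works. One plausible reading, shown here purely as an assumption, is that repeated posts from the same author have their scores attenuated before the top K are selected. The function names, the tuple layout, and the `decay` factor below are all hypothetical.

```python
from collections import defaultdict


def apply_author_diversity(candidates, decay=0.5):
    """Attenuate scores of repeated authors (assumed behavior, not confirmed).

    candidates: list of (post_id, author_id, score) tuples.
    Walking the list in descending score order, the n-th post (0-based)
    from the same author has its score multiplied by decay ** n.
    """
    seen = defaultdict(int)
    adjusted = []
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    for post_id, author_id, score in ranked:
        adjusted.append((post_id, author_id, score * decay ** seen[author_id]))
        seen[author_id] += 1
    return adjusted


def select_top_k(candidates, k):
    # Selection: sort by (adjusted) score and keep the first k.
    return sorted(candidates, key=lambda c: c[2], reverse=True)[:k]


candidates = [("a1", "alice", 0.9), ("a2", "alice", 0.8), ("b1", "bob", 0.7)]
top = select_top_k(apply_author_diversity(candidates), k=2)
top_ids = [post_id for post_id, _, _ in top]
```

Without the diversity step, both of alice's posts would fill the two slots; with it, her second post drops to 0.4 and bob's post is selected instead.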
Scoring and Ranking
The Phoenix Grok-based transformer model predicts probabilities for multiple engagement types, including likes, replies, reposts, and more. The Weighted Scorer combines these predictions into a final score, which is used to rank the candidates.
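A weighted combination of this kind can be sketched as a simple weighted sum over the predicted probabilities. The weights and the `weighted_score` function below are illustrative assumptions; the article does not give the actual weights or the full list of actions.

```python
# Hypothetical per-action weights; the real values are not given in the article.
ACTION_WEIGHTS = {"like": 1.0, "reply": 2.0, "repost": 1.5}


def weighted_score(probabilities, weights=ACTION_WEIGHTS):
    """Combine per-action engagement probabilities into one ranking score."""
    # Actions missing from the prediction contribute nothing to the score.
    return sum(w * probabilities.get(action, 0.0) for action, w in weights.items())


predictions = {"like": 0.5, "reply": 0.25, "repost": 0.0}
score = weighted_score(predictions)
```

Separating prediction from weighting means the relative importance of each action can be tuned without retraining the model.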
Filtering
Filters are applied at two stages: pre-scoring and post-selection. Pre-scoring filters remove posts that are duplicates, too old, from the viewer themselves, from blocked/muted accounts, or containing muted keywords. Post-selection filters remove posts that have been deleted, are spam, or contain violence or gore.
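The pre-scoring rules listed above can be sketched as a single pass over the candidates. The `Post` and `Viewer` types, the field names, the substring-based keyword match, and the 48-hour age limit are assumptions made for this example; the article does not specify them.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Post:
    post_id: int
    author_id: int
    text: str
    age_seconds: int


@dataclass
class Viewer:
    user_id: int
    blocked_or_muted: Set[int]
    muted_keywords: Set[str]


MAX_AGE_SECONDS = 48 * 3600  # assumed cutoff, not taken from the article


def pre_scoring_filter(viewer: Viewer, posts: List[Post],
                       max_age: int = MAX_AGE_SECONDS) -> List[Post]:
    seen_ids = set()
    kept = []
    for post in posts:
        if post.post_id in seen_ids:                   # duplicate
            continue
        seen_ids.add(post.post_id)
        if post.age_seconds > max_age:                 # too old
            continue
        if post.author_id == viewer.user_id:           # viewer's own post
            continue
        if post.author_id in viewer.blocked_or_muted:  # blocked or muted author
            continue
        text = post.text.lower()
        if any(kw.lower() in text for kw in viewer.muted_keywords):  # muted keyword
            continue
        kept.append(post)
    return kept


viewer = Viewer(user_id=1, blocked_or_muted={7}, muted_keywords={"spoiler"})
posts = [
    Post(100, 2, "hello world", 60),
    Post(100, 2, "hello world", 60),         # duplicate of the first post
    Post(101, 1, "my own post", 60),         # written by the viewer
    Post(102, 7, "from a blocked user", 60),
    Post(103, 3, "Big SPOILER ahead", 60),
    Post(104, 4, "old news", 10**6),         # older than the assumed cutoff
    Post(105, 5, "fresh post", 120),
]
kept_ids = [p.post_id for p in pre_scoring_filter(viewer, posts)]
```

Running these checks before scoring avoids spending model inference on posts that would never be shown.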
Key Design Decisions
The X For You feed algorithm relies on several key design decisions, including:
- No Hand-Engineered Features: The system relies entirely on the Grok-based transformer to learn relevance from user engagement sequences.
- Candidate Isolation in Ranking: Candidates cannot attend to each other during transformer inference.
- Hash-Based Embeddings: Both retrieval and ranking use multiple hash functions for embedding lookup.
- Multi-Action Prediction: The model predicts probabilities for many actions, rather than a single "relevance" score.
- Composable Pipeline Architecture: The candidate-pipeline crate provides a flexible framework for building recommendation pipelines.
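Candidate isolation can be pictured as an attention mask. The sketch below assumes a sequence laid out as user-context tokens followed by one token per candidate, with each candidate allowed to attend to the context and to itself but not to any other candidate. The layout, the non-causal context attention, and the `build_attention_mask` function are illustrative assumptions rather than details from the article.

```python
def build_attention_mask(num_context, num_candidates):
    """Return mask[i][j] == True when position i may attend to position j.

    Positions 0..num_context-1 are user-context tokens; the remaining
    positions are candidate posts (one token each, for simplicity).
    """
    n = num_context + num_candidates
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < num_context:
                mask[i][j] = True      # every position can see the user context
            else:
                mask[i][j] = (i == j)  # a candidate sees only itself
    return mask


mask = build_attention_mask(num_context=2, num_candidates=3)
first_candidate_row = mask[2]
```

Under this mask, a candidate's output depends only on the user context and the candidate itself, so its score would not change with whichever other posts happen to share the batch.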
Conclusion
The X For You feed combines in-network posts from Thunder with out-of-network posts retrieved by Phoenix, then ranks both together with a Grok-based transformer model that predicts engagement probabilities for each post. Understanding the pipeline stages, the components behind them, and the key design decisions makes it easier to reason about how the algorithm works and how it can be improved.




