Design a News Feed System
Step 1 - Understand the problem and establish design scope #
Requirements:
- Mobile + web app
- Features: publish posts, view friends' posts on news feed
- Sorted by reverse chronological order (simplified)
- Max 5000 friends per user
- 10 million DAU
- Supports media files (images, videos)
Step 2 - Propose high-level design and get buy-in #
Two flows: feed publishing (write) and news feed building (read).
APIs #
Feed publishing: POST /v1/me/feed — params: content, auth_token
Newsfeed retrieval: GET /v1/me/feed — params: auth_token
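To make the request and response shapes concrete, here is a minimal in-memory sketch of the two endpoints as plain functions. The `FeedApi` class, the `TOKENS` table, the status codes, and the response fields are all assumptions for illustration; the notes above only specify the paths and parameters.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical token table; a real system would verify tokens with an auth service.
TOKENS = {"token-alice": "alice"}


@dataclass
class FeedApi:
    posts: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def post_feed(self, content: str, auth_token: str) -> dict:
        """Stand-in for POST /v1/me/feed."""
        user = TOKENS.get(auth_token)
        if user is None:
            return {"status": 401}
        post = {"post_id": next(self._ids), "author": user, "content": content}
        self.posts.append(post)
        return {"status": 201, "post": post}

    def get_feed(self, auth_token: str) -> dict:
        """Stand-in for GET /v1/me/feed.

        Returns every stored post, newest first. A real feed would contain
        only friends' posts; this sketch shows the response shape only.
        """
        if TOKENS.get(auth_token) is None:
            return {"status": 401}
        return {"status": 200, "posts": list(reversed(self.posts))}
```

Both calls take `auth_token`, so authentication can be handled uniformly before any feed logic runs.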
Feed publishing flow #

- User → Load balancer → Web servers → Post service (persists to DB + cache) → Fanout service (pushes to friends' news feed cache)
- Notification service informs friends that new content is available
Newsfeed building flow #

- User → Load balancer → Web servers → Newsfeed service (fetches from newsfeed cache)
Step 3 - Design deep dive #
Feed publishing deep dive #

Web servers: Authenticate requests via auth_token and apply rate limiting to prevent spam.
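The notes do not say which rate-limiting algorithm is used. As one common option, a token bucket could look like the sketch below; the class name, parameters, and injectable `clock` are assumptions made so the example is testable.

```python
import time


class TokenBucket:
    """Token bucket limiter: one bucket per user (assumed policy)."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A web server would typically keep one bucket per user or per token and reject a publish request when `allow()` returns `False`.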
Fanout models:
| Model | Pros | Cons |
|---|---|---|
| Fanout on write (push) | Real-time, fast reads | Hotkey problem (users with many friends); wastes compute on inactive users |
| Fanout on read (pull) | No wasted compute, no hotkey problem | Slow reads (not pre-computed) |
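Hybrid approach: Push for most users; pull for celebrities and other users with many friends or followers, so their posts are fetched at read time instead of being fanned out. Consistent hashing helps mitigate the hotkey problem by spreading requests and data more evenly.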
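A minimal sketch of the hybrid approach, assuming a simple follower-count cutoff decides who counts as a celebrity. `HybridFeed`, `CELEBRITY_THRESHOLD`, and the in-memory dictionaries are hypothetical stand-ins for the real services and caches.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, kept tiny for the demo


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)         # author -> users following them
        self.following = defaultdict(set)         # user -> authors they follow
        self.posts_by_author = defaultdict(list)  # author -> [(ts, post_id)]
        self.feed_cache = defaultdict(list)       # user -> [(ts, post_id)] pushed at write time

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def publish(self, author, post_id, ts):
        self.posts_by_author[author].append((ts, post_id))
        # Fanout on write (push) only for ordinary users.
        if not self.is_celebrity(author):
            for follower in self.followers[author]:
                self.feed_cache[follower].append((ts, post_id))

    def read(self, user):
        # Start from the pre-computed feed, then pull celebrity posts on demand.
        entries = set(self.feed_cache[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.update(self.posts_by_author[author])
        # Reverse chronological order; the set removes any duplicates.
        return [post_id for _, post_id in sorted(entries, reverse=True)]
```

Ordinary posts cost one cache write per follower at publish time, while celebrity posts cost one extra lookup per followed celebrity at read time.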

Fanout service workflow:
- Fetch friend IDs from graph database
- Get friends info from user cache; filter by user settings (muted friends, selective sharing)
- Send friend list + post ID to message queue
- Fanout workers store <post_id, user_id> in news feed cache
- Only IDs are stored (not full objects) to limit memory; configurable size limit (latest content only)
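The fanout workflow above can be sketched as follows. Everything here is an in-memory stand-in: `friends` plays the graph database, `muted` the user settings, `queue.Queue` the message queue, and `FEED_LIMIT` is a hypothetical value for the configurable size limit.

```python
import queue
from collections import defaultdict, deque

FEED_LIMIT = 3  # hypothetical cap on cached post IDs per user

friends = {"alice": ["bob", "carol", "dave"]}  # graph database stand-in
muted = {"carol": {"alice"}}                   # user settings: carol muted alice
fanout_queue = queue.Queue()                   # message queue stand-in
# user_id -> most recent post IDs, newest first; oldest entries fall off the end.
news_feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))


def fan_out(author, post_id):
    """Fetch friend IDs, filter by settings, and enqueue the fanout job."""
    friend_ids = friends.get(author, [])
    recipients = [f for f in friend_ids if author not in muted.get(f, set())]
    fanout_queue.put((post_id, recipients))
    return recipients


def run_worker():
    """Drain the queue, storing only post IDs in each recipient's feed."""
    while not fanout_queue.empty():
        post_id, recipients = fanout_queue.get()
        for user_id in recipients:
            news_feed_cache[user_id].appendleft(post_id)
```

Using a bounded deque keeps only the latest IDs per user, which mirrors the size limit on the news feed cache.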

Newsfeed retrieval deep dive #

- User requests feed → Load balancer → Web server → Newsfeed service (gets post IDs from news feed cache) → Hydrate with full user and post objects from user cache and post cache → Return JSON to client
- Media content served via CDN
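The hydration step in the retrieval flow might look like the sketch below. The three dictionaries stand in for the news feed, post, and user caches; the field names, the `limit` parameter, and the example CDN URL are assumptions.

```python
import json

# In-memory stand-ins for the caches named in the retrieval flow.
news_feed_cache = {"bob": ["p2", "p1"]}  # user_id -> post IDs, newest first
post_cache = {
    "p1": {"author_id": "alice", "text": "hello"},
    "p2": {"author_id": "carol", "text": "hi",
           "media_url": "https://cdn.example.com/img.jpg"},  # hypothetical CDN URL
}
user_cache = {"alice": {"name": "Alice"}, "carol": {"name": "Carol"}}


def get_news_feed(user_id, limit=20):
    """Look up post IDs, hydrate them with post and author data, return JSON."""
    feed = []
    for post_id in news_feed_cache.get(user_id, [])[:limit]:
        post = post_cache.get(post_id)
        if post is None:
            continue  # cache miss; a real system would fall back to the database
        author = user_cache.get(post["author_id"], {})
        feed.append({
            "post_id": post_id,
            "text": post["text"],
            "author_name": author.get("name"),
            "media_url": post.get("media_url"),  # client fetches media from the CDN
        })
    return json.dumps(feed)
```

Only the media URL travels in the JSON response; the media bytes themselves are served by the CDN.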
Cache architecture (5 layers) #

- News Feed: Post IDs
- Content: Full post data; popular content is kept in a hot cache
- Social Graph: User relationships
- Action: Whether a user liked a post, replied to it, or took other actions on it
- Counters: Like count, reply count, follower/following counts
Step 4 - Wrap up #
Additional talking points:
- Database scaling: vertical vs horizontal, SQL vs NoSQL, master-slave replication, read replicas, consistency models, sharding
- Keep web tier stateless
- Cache data as much as possible
- Support multiple data centers
- Decouple components with message queues
- Monitor: QPS at peak, feed refresh latency