Design a Notification System
Three notification types: mobile push notification, SMS, Email.

Step 1 - Understand the problem and establish design scope #
Requirements:
- Supports push notification, SMS, and email
- Soft real-time: fast delivery, slight delay acceptable under high load
- Supported devices: iOS, Android, laptop/desktop
- Notifications triggered by client apps or scheduled server-side
- Users can opt out
- Daily volume: 10M mobile push, 1M SMS, 5M emails
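
To make the daily volumes concrete, the short calculation below converts them into average requests per second. It is a back-of-the-envelope sketch only: it assumes traffic is spread evenly over the day, which the requirements do not state, so real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily volumes taken from the requirements above.
daily_volume = {"push": 10_000_000, "sms": 1_000_000, "email": 5_000_000}


def average_qps(per_day: int) -> float:
    """Average notifications per second, assuming an even spread over the day."""
    return per_day / SECONDS_PER_DAY


avg = {channel: average_qps(n) for channel, n in daily_volume.items()}
total_avg = sum(avg.values())

for channel, qps in avg.items():
    print(f"{channel}: ~{qps:.0f} notifications/second on average")
print(f"total: ~{total_avg:.0f} notifications/second on average")
```

On average this is roughly 116 push notifications, 12 SMS messages, and 58 emails per second, or about 185 notifications per second in total.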
Step 2 - Propose high-level design and get buy-in #
Notification types #
iOS Push: Provider → APNS (Apple Push Notification Service) → iOS device. Requires device token + payload (JSON).
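
As an illustration of the "device token + payload" pair, the sketch below builds a minimal payload as a Python dictionary and serializes it to JSON. The `aps`, `alert`, `badge`, and `sound` keys are standard APNs payload keys. The function name, the wrapper dictionary, and the sample token are hypothetical and exist only for this example.

```python
import json


def build_ios_notification(device_token: str, title: str, body: str, badge: int = 1) -> dict:
    """Pair a device token with an APNs-style JSON payload (illustrative wrapper)."""
    return {
        "device_token": device_token,  # identifies the target device
        "payload": {
            "aps": {
                "alert": {"title": title, "body": body},
                "badge": badge,
                "sound": "default",
            }
        },
    }


notification = build_ios_notification(
    device_token="a1b2c3d4",  # placeholder, not a real token
    title="Game Request",
    body="Bob wants to play chess",
)
print(json.dumps(notification["payload"], indent=2))
```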

Android Push: Similar flow using Firebase Cloud Messaging (FCM).

SMS: Third-party services like Twilio, Nexmo.

Email: Commercial services like Sendgrid, Mailchimp.

Contact info gathering flow #

When a user installs/signs up, API servers collect contact info. DB schema:

- user table: email, phone number
- device table: device tokens (one user → multiple devices)
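
The two tables could look roughly like the following, sketched with an in-memory SQLite database. The table and column names (`users`, `devices`, `user_id`, `device_token`) are assumptions for illustration. The notes only specify which fields each table holds and that one user can have several devices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        user_id      INTEGER PRIMARY KEY,
        email        TEXT,
        phone_number TEXT
    );
    CREATE TABLE devices (
        id           INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL REFERENCES users(user_id),
        device_token TEXT NOT NULL
    );
    """
)

# One user with two devices (one-to-many).
conn.execute(
    "INSERT INTO users (user_id, email, phone_number) VALUES (?, ?, ?)",
    (1, "ana@example.com", "+15550100"),
)
conn.executemany(
    "INSERT INTO devices (user_id, device_token) VALUES (?, ?)",
    [(1, "token-phone"), (1, "token-tablet")],
)

# Look up every device token for a user before sending a push notification.
tokens = [
    row[0]
    for row in conn.execute(
        "SELECT device_token FROM devices WHERE user_id = ? ORDER BY id", (1,)
    )
]
print(tokens)
```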
Initial notification sending flow #

- Services 1–N: Microservices, cron jobs, etc. that trigger notifications
- Notification system: Centerpiece; provides APIs, builds payloads for third-party services
- Third-party services: Deliver notifications. Must be extensible (FCM unavailable in China → alternatives like Jpush, PushY)
Problems with initial design: SPOF (single server), hard to scale independently, performance bottleneck.
Improved design #

Changes:
- DB and cache moved out of notification server
- Multiple notification servers with horizontal scaling
- Message queues decouple components — each notification type gets its own queue (outage in one third-party service doesn't affect others)
Components (left to right):
- Services 1–N: Send notifications via APIs
- Notification servers: Validate requests, fetch metadata from cache/DB, push events to message queues
- Cache: User info, device info, notification templates
- DB: User, notification, settings data
- Message queues: Separate queue per notification type (iOS PN, Android PN, SMS, Email)
- Workers: Pull events from queues, send to third-party services
API example:
POST https://api.example.com/v/sms/send

Flow:
1. Service calls API
2. Notification server fetches metadata
3. Event pushed to queue
4. Workers pull and send
5. Third-party delivers
6. User receives
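
The flow above can be sketched in a single process, with Python's standard `queue.Queue` standing in for a real message broker. Everything here is illustrative: the `Event` fields, the channel names, and `fake_provider_send` are hypothetical placeholders, and the metadata lookup from cache/DB is omitted.

```python
import queue
from dataclasses import dataclass

# One queue per notification type, so a slow provider only backs up its own queue.
CHANNELS = ("ios_pn", "android_pn", "sms", "email")
queues = {channel: queue.Queue() for channel in CHANNELS}


@dataclass
class Event:
    event_id: str
    channel: str
    recipient: str
    content: str


def notification_server(event: Event) -> None:
    """Validate the request, then push the event to the queue for its channel."""
    if event.channel not in queues:
        raise ValueError(f"unsupported channel: {event.channel}")
    if not event.recipient or not event.content:
        raise ValueError("recipient and content are required")
    queues[event.channel].put(event)


sent = []  # records what the stand-in provider "delivered"


def fake_provider_send(event: Event) -> None:
    """Stand-in for a call to a third-party service."""
    sent.append((event.channel, event.recipient))


def worker(channel: str) -> int:
    """Drain one channel's queue and return the number of events handled."""
    handled = 0
    while True:
        try:
            event = queues[channel].get_nowait()
        except queue.Empty:
            return handled
        fake_provider_send(event)
        handled += 1


notification_server(Event("e1", "sms", "+15550100", "Your code is 1234"))
notification_server(Event("e2", "email", "ana@example.com", "Welcome!"))
sms_handled = worker("sms")  # only the SMS queue is drained here
```

After this runs, the SMS event has been handed to the provider while the email event is still waiting in its own queue, which is the isolation the per-type queues are meant to provide.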
Step 3 - Design deep dive #
Reliability #
Prevent data loss: Notification data persisted in a notification log DB + retry mechanism.

Exactly-once delivery? No — distributed systems can produce duplicates. Use dedupe via event ID: if event ID seen before, discard. (See ref [5] for why exactly-once is impossible.)
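
A minimal sketch of the dedupe check follows. It keeps seen event IDs in an in-process set, which is an assumption made for brevity. With multiple workers, the seen-ID store would need to be shared between them. The class and method names are hypothetical.

```python
class Deduplicator:
    """Remembers event IDs so that a repeated event can be discarded."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_process(self, event_id: str) -> bool:
        """Return True the first time an ID is seen, False on every repeat."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


dedupe = Deduplicator()
results = [dedupe.should_process(eid) for eid in ["evt-1", "evt-2", "evt-1"]]
print(results)  # the second "evt-1" is rejected
```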
Additional components #
Notification templates: Preformatted notifications with customizable parameters (item name, date, CTA). Benefits: consistent format, fewer errors, saves time.
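
One way to picture a template is with Python's standard `string.Template`. The template text and parameter names below are invented for the example. A missing parameter raises `KeyError` rather than sending a half-filled message, which is one way templates can reduce errors.

```python
from string import Template

# Hypothetical template with customizable parameters.
ORDER_SHIPPED = Template("Hi $user_name, your $item_name ships on $date. $cta")

message = ORDER_SHIPPED.substitute(
    user_name="Ana",
    item_name="keyboard",
    date="May 3",
    cta="Track your order",
)
print(message)
```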
Notification settings: Users control opt-in per channel (user_id, channel, opt_in). Check before sending.
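
The settings check might look like the sketch below, with a dictionary standing in for the settings table. Treating a missing row as "not opted in" is an assumption. The notes do not say what the default should be.

```python
# Stand-in for the settings table: (user_id, channel) -> opt_in
settings = {
    (1, "email"): True,
    (1, "sms"): False,
}


def can_send(user_id: int, channel: str, default: bool = False) -> bool:
    """Check the user's opt-in for a channel before a notification is sent."""
    return settings.get((user_id, channel), default)


print(can_send(1, "email"))  # opted in
print(can_send(1, "sms"))    # opted out
```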
Rate limiting: Cap frequency to prevent overwhelming users (excessive notifications → users disable them entirely).
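
A per-user cap could be sketched as a sliding-window counter, as below. This is one possible technique, not something the notes prescribe. The class name, the limit, and the window length are all illustrative, and state is kept in memory for simplicity.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class RateLimiter:
    """Allow at most max_per_window notifications per user per window."""

    def __init__(self, max_per_window: int, window_seconds: float) -> None:
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # user_id -> timestamps of recent sends

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        timestamps = self._sent[user_id]
        # Forget sends that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_per_window:
            return False
        timestamps.append(now)
        return True


limiter = RateLimiter(max_per_window=2, window_seconds=60)
decisions = [limiter.allow("u1", now=t) for t in (0, 1, 2)]
print(decisions)  # the third notification inside the window is rejected
```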
Retry mechanism: Failed delivery → re-enqueue. Persistent failure → alert developers.
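
The re-enqueue-then-alert behavior can be sketched as follows. The attempt limit of 3 is an arbitrary choice, the `send` and `alert` callables are hypothetical, and no backoff delay is applied between attempts, which a real system would likely want.

```python
import queue

MAX_ATTEMPTS = 3  # illustrative limit


def process(q: queue.Queue, send, alert) -> None:
    """Send each queued event; re-enqueue on failure, alert after MAX_ATTEMPTS."""
    while not q.empty():
        event, attempts = q.get()
        try:
            send(event)
        except Exception as exc:
            attempts += 1
            if attempts < MAX_ATTEMPTS:
                q.put((event, attempts))  # back to the queue for another try
            else:
                alert(event, exc)  # persistent failure: notify developers


# A provider that fails twice and then succeeds.
calls = {"n": 0}


def flaky_send(event: str) -> None:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("provider unavailable")


alerts = []
retry_queue = queue.Queue()
retry_queue.put(("e1", 0))
process(retry_queue, flaky_send, lambda event, exc: alerts.append(event))
```

Here the event is delivered on the third attempt, so no alert is raised. An event that keeps failing would be dropped from the queue after the third attempt and reported through `alert` instead.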
Security: appKey/appSecret for push notification APIs. Only verified clients can send.
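
As a rough illustration of checking appKey/appSecret, the sketch below looks up the secret registered for a key and compares it using `hmac.compare_digest` from the standard library, which avoids leaking information through comparison timing. The in-memory credential store and the sample values are hypothetical, and the notes do not describe how the check is actually implemented.

```python
import hmac

# Hypothetical credential store: appKey -> appSecret.
REGISTERED_APPS = {"app-123": "s3cr3t-value"}


def is_verified_client(app_key: str, app_secret: str) -> bool:
    """Return True only if the key is registered and the secret matches."""
    expected = REGISTERED_APPS.get(app_key)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), app_secret.encode())


print(is_verified_client("app-123", "s3cr3t-value"))
print(is_verified_client("app-123", "wrong"))
```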
Monitor queued notifications: Track total queued count. If large → add more workers.

Event tracking: Track open rate, click rate, and engagement via analytics service integration.

Updated design #

Additions: authentication + rate-limiting on notification servers, retry mechanism, notification templates, monitoring and tracking systems.
Step 4 - Wrap up #
Key design elements: message queues for decoupling, robust retry mechanism, security via appKey/appSecret, tracking/monitoring at every stage, user opt-out respect, rate limiting.
Reference materials
[1] Twilio SMS: https://www.twilio.com/sms
[2] Nexmo SMS: https://www.nexmo.com/products/sms
[3] Sendgrid: https://sendgrid.com/
[4] Mailchimp: https://mailchimp.com/
[5] You Cannot Have Exactly-Once Delivery: https://bravenewgeek.com/you-cannot-have-exactly-once-delivery/
[6] Security in Push Notifications: https://cloud.ibm.com/docs/services/mobilepush
[7] RabbitMQ: https://bit.ly/2sotIa6