Back-of-the-Envelope Estimation


Back-of-the-envelope estimation combines thought experiments with common performance numbers to judge whether a design can meet its requirements. Three concepts underpin it: powers of two, latency numbers, and availability numbers.

Power of two

Power	Approx. value	Full name	Short name
10	1 Thousand	1 Kilobyte	1 KB
20	1 Million	1 Megabyte	1 MB
30	1 Billion	1 Gigabyte	1 GB
40	1 Trillion	1 Terabyte	1 TB
50	1 Quadrillion	1 Petabyte	1 PB
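
The decimal values in the table are approximations of the exact powers of two. As a minimal sketch (the names `UNITS` and `approximation_error` are illustrative, not from the source), the gap between the two can be checked directly:

```python
# Each row: (power of two, short unit name, decimal approximation from the table).
UNITS = [
    (10, "KB", 10**3),
    (20, "MB", 10**6),
    (30, "GB", 10**9),
    (40, "TB", 10**12),
    (50, "PB", 10**15),
]


def approximation_error(power: int, approx: int) -> float:
    """Relative error of using `approx` in place of the exact value 2**power."""
    exact = 2 ** power
    return (exact - approx) / exact


for power, name, approx in UNITS:
    exact = 2 ** power
    error = approximation_error(power, approx)
    print(f"2^{power} = {exact:,} bytes (1 {name}), approximation off by {error:.1%}")
```

The error grows with each step, from roughly 2% at a kilobyte to roughly 11% at a petabyte, which is still well within the tolerance of a rough estimate.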

Latency numbers every programmer should know

Key conclusions:

Memory is fast but disk is slow
Avoid disk seeks if possible
Simple compression algorithms are fast
Compress data before sending over internet
Data centers are often in different regions, and sending data between them takes significant time
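
The conclusions above follow from the ratios between the latencies, not from their exact values. The sketch below assumes a handful of widely circulated order-of-magnitude figures (commonly attributed to Jeff Dean, circa 2010). They are not taken from this text and differ on modern hardware, so treat them as assumptions. The names `LATENCY_NS` and `ratio` are illustrative.

```python
# Assumed, order-of-magnitude latencies in nanoseconds (not measured values).
LATENCY_NS = {
    "main memory reference": 100,
    "compress 1 KB with a fast algorithm": 10_000,
    "round trip within a data center": 500_000,
    "disk seek": 10_000_000,
    "packet round trip between continents": 150_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """How many times longer the `slow` operation takes than the `fast` one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


print(f"disk seek vs. memory reference: "
      f"{ratio('disk seek', 'main memory reference'):,.0f}x")
print(f"cross-continent vs. in-data-center round trip: "
      f"{ratio('packet round trip between continents', 'round trip within a data center'):,.0f}x")
```

Under these assumed figures, one disk seek costs as much as about a hundred thousand memory references, which is why avoiding seeks matters more than most micro-optimizations.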

Availability numbers

High availability is measured as a percentage of uptime. A service level agreement (SLA) defines the uptime a provider guarantees.

Availability %	Downtime per year
99% (two nines)	3.65 days
99.9% (three nines)	8.76 hours
99.99% (four nines)	52.56 minutes
99.999% (five nines)	5.26 minutes
99.9999% (six nines)	31.5 seconds
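
The downtime column can be derived from the availability percentage. A minimal sketch, assuming a 365-day year (the function name `downtime_seconds_per_year` is illustrative):

```python
SECONDS_PER_DAY = 24 * 3600


def downtime_seconds_per_year(availability_percent: float, days_per_year: int = 365) -> float:
    """Allowed downtime in seconds per year for a given availability percentage."""
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * days_per_year * SECONDS_PER_DAY


for availability in (99, 99.9, 99.99, 99.999, 99.9999):
    seconds = downtime_seconds_per_year(availability)
    print(f"{availability}%: {seconds:,.1f} s "
          f"= {seconds / 60:,.2f} min "
          f"= {seconds / 3600:,.2f} h "
          f"= {seconds / SECONDS_PER_DAY:,.3f} days")
```

Each additional nine divides the allowed downtime by ten, so the table can be rebuilt from any single row.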

Example: Estimate Twitter QPS and storage

Assumptions:

300 million monthly active users
50% use Twitter daily
2 tweets per day per user on average
10% of tweets contain media
Data stored for 5 years

QPS estimate:

DAU = 300M × 50% = 150 million
Tweet QPS = 150M × 2 tweets / (24 h × 3,600 s/h) = ~3,500
Peak QPS = 2 × QPS = ~7,000
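
The QPS arithmetic above can be reproduced as a short script. This is a sketch that uses only the stated assumptions. The constant names are illustrative, and the peak factor of 2 is the one used in the estimate.

```python
MONTHLY_ACTIVE_USERS = 300_000_000
DAILY_ACTIVE_RATIO = 0.5          # 50% of monthly users are active daily
TWEETS_PER_USER_PER_DAY = 2
PEAK_FACTOR = 2                   # peak traffic assumed to be twice the average
SECONDS_PER_DAY = 24 * 3600

dau = MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO
tweet_qps = dau * TWEETS_PER_USER_PER_DAY / SECONDS_PER_DAY
peak_qps = PEAK_FACTOR * tweet_qps

print(f"DAU: {dau:,.0f}")
print(f"Tweet QPS: {tweet_qps:,.0f}")
print(f"Peak QPS: {peak_qps:,.0f}")
```

The exact results are about 3,472 and 6,944, which the estimate rounds to ~3,500 and ~7,000.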

Media storage estimate:

Average tweet size: tweet_id 64 bytes, text 140 bytes, media 1 MB
Media storage per day: 150M × 2 × 10% × 1 MB = 30 TB/day
5-year media storage: 30 TB × 365 × 5 = ~55 PB
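
The storage figures follow the same pattern. A minimal sketch, assuming decimal units (1 TB = 10^6 MB, 1 PB = 10^3 TB) and illustrative constant names:

```python
DAU = 150_000_000
TWEETS_PER_USER_PER_DAY = 2
MEDIA_PERCENT = 10                # 10% of tweets contain media
MEDIA_SIZE_MB = 1
RETENTION_YEARS = 5
MB_PER_TB = 10**6
TB_PER_PB = 10**3

# Integer arithmetic keeps the tweet count exact.
media_tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY * MEDIA_PERCENT // 100
daily_media_tb = media_tweets_per_day * MEDIA_SIZE_MB / MB_PER_TB
five_year_media_pb = daily_media_tb * 365 * RETENTION_YEARS / TB_PER_PB

print(f"Media tweets per day: {media_tweets_per_day:,}")
print(f"Media storage per day: {daily_media_tb:,.0f} TB")
print(f"Media storage over {RETENTION_YEARS} years: {five_year_media_pb:,.2f} PB")
```

The five-year total comes to 54.75 PB, which the estimate rounds to ~55 PB. Tweet IDs and text are left out because at 64 and 140 bytes they are negligible next to 1 MB of media.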

Tips

Rounding and approximation: use round numbers (e.g., 100,000 / 10 instead of 99,987 / 9.1). Precision is not expected.
Write down assumptions for later reference.
Label your units (5 MB not just "5").
Commonly asked estimates: QPS, peak QPS, storage, cache, number of servers. Practice makes perfect.

