When designing a system, whether in an interview or in real life, you will probably need to do some capacity estimation. Knowing rough figures for memory, storage, bandwidth, and user traffic, and being able to do quick approximate calculations, is crucial in system design interviews.
A group of 8 bits is a byte. An ASCII character takes one byte of memory, whereas a UTF-8 character takes 1 to 4 bytes. In most estimates, from storage to memory, you will find powers of 2 and their decimal approximations extremely useful.
Power of 2 | Exact value | Approximate value | Unit |
---|---|---|---|
10 | 1024 | 1,000 | 1 KB (Kilobyte) |
20 | 1024 x 1024 | 1,000,000 | 1 MB (Megabyte) |
30 | 1024 x 1024 x 1024 | 1,000,000,000 | 1 GB (Gigabyte) |
40 | 1024 x 1024 x 1024 x 1024 | 1,000,000,000,000 | 1 TB (Terabyte) |
50 | 1024 x 1024 x 1024 x 1024 x 1024 | 1,000,000,000,000,000 | 1 PB (Petabyte) |
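As a quick sanity check on the table, the Python sketch below prints each power of two next to its decimal approximation and the resulting relative error. It also shows the UTF-8 byte lengths mentioned earlier. The helper name `approximation_error` and the sample characters are illustrative choices, not part of the original material.

```python
# Compare exact powers of two with their power-of-ten approximations.
UNITS = [(10, "KB"), (20, "MB"), (30, "GB"), (40, "TB"), (50, "PB")]


def approximation_error(power: int) -> float:
    """Relative error of approximating 2**power by 10**(3 * power / 10)."""
    exact = 2 ** power
    approx = 10 ** (3 * power // 10)
    return (exact - approx) / exact


for power, unit in UNITS:
    print(f"2^{power} = {2 ** power:>21,} bytes ~ 1 {unit} "
          f"(error {approximation_error(power):.1%})")

# UTF-8 is variable width: 1 to 4 bytes per character.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(repr(ch), len(ch.encode("utf-8")), "byte(s)")
```

The error grows from roughly 2% at 1 KB to roughly 11% at 1 PB, which is usually acceptable for an order-of-magnitude estimate.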
Here are some common things to keep in mind when doing back-of-the-envelope estimation:
- User Base
  - Know the estimated or specified number of users or concurrent users who will be using the system. This is a fundamental parameter for many calculations.
- Traffic Volume
  - Estimate the expected incoming and outgoing traffic in terms of requests per second, page views per day, or data transfer rates. This helps you understand the scale of the system.
- Data Size
  - Estimate the size of the data the system will handle. This includes user-generated content, databases, files, and any other data sources.
- Request Rate
  - Understand the expected rate of incoming requests or transactions. This is crucial for estimating load on components like web servers, application servers, and databases.
- Data Storage
  - Estimate the amount of storage needed for databases, file storage, and caches. Consider the growth rate of data over time.
- Latency Targets
  - Know the desired or acceptable response time or latency for various user interactions. This helps you design components for optimal performance.
- Throughput Goals
  - Determine the desired throughput for the system, such as the number of transactions processed per second or data transfer rates.
- Peak Load
  - Identify and estimate the peak usage hours or events that might cause spikes in traffic. Peak load considerations are essential for capacity planning.
- Availability Requirements
  - Understand the required availability of the system, often expressed as a percentage (e.g., 99.99% uptime). This affects redundancy and fault tolerance planning.
- Data Backup and Retention
  - Estimate data backup requirements, including backup frequency, data retention policies, and the size of backup archives.
- Concurrency
  - Consider the expected level of concurrency, which impacts the design of components like databases and message queues.
- Cache Hit Rates
  - If caching is part of your system design, estimate cache hit rates and cache sizes. This helps determine cache efficiency.
- Network Bandwidth
  - Understand the network bandwidth required for data transfer between components, especially in distributed systems.
- CPU and Memory Usage
  - Estimate CPU and memory usage for various components, including application servers, databases, and caching layers.
- Storage Redundancy
  - If the system requires data redundancy, estimate the number of replicas or backup servers needed.
- Load Balancing
  - Consider load balancing strategies and estimate the distribution of traffic among multiple servers or instances.
- Database Queries per Second
  - Estimate the number of database queries or transactions per second. This helps determine database capacity requirements.
- Message Queues
  - If message queues are part of your design, estimate the rate of messages produced and consumed.
- Third-Party Services
  - Consider any third-party services or APIs your system relies on and estimate their performance and availability.
- Failure Rates
  - Know the failure rates of hardware components or services and factor in redundancy and fault tolerance.
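Several items in this checklist reduce to one-line formulas. The following Python sketch works through three of them: the downtime allowed by an availability target, the database load left over behind a cache, and a server count derived from peak load. All input numbers (10,000 QPS, a 90% hit rate, 500 QPS per server) and all function names are hypothetical and chosen only for illustration.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per year for an availability fraction such as 0.9999."""
    return (1 - availability) * SECONDS_PER_YEAR / 60


def backend_qps(total_qps: float, cache_hit_rate: float) -> float:
    """Requests per second that miss the cache and reach the database."""
    return total_qps * (1 - cache_hit_rate)


def servers_needed(peak_qps: float, qps_per_server: float) -> int:
    """Number of servers required to absorb peak load, rounded up."""
    return math.ceil(peak_qps / qps_per_server)


# Hypothetical inputs, for illustration only.
print(f"{downtime_minutes_per_year(0.9999):.1f} minutes of downtime per year")
print(f"{backend_qps(10_000, 0.90):.0f} QPS reach the database")
print(f"{servers_needed(10_000, 500)} servers at peak")
```

Under these assumptions, 99.99% availability leaves about 52.6 minutes of downtime per year, and a 90% cache hit rate cuts database traffic by a factor of ten.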
Example: Estimate Twitter QPS and storage requirements
Please note the following numbers are for this exercise only as they are not real numbers from Twitter.
Assumptions:
- 300 million monthly active users.
- 50% of users use Twitter daily.
- Users post 2 tweets per day on average.
- 10% of tweets contain media.
- Data is stored for 5 years.
Estimations:
- Queries per second (QPS) estimate:
  - Daily active users (DAU) = 300 million * 50% = 150 million
  - Tweets QPS = 150 million * 2 tweets / 24 hours / 3600 seconds = ~3500
  - Peak QPS = 2 * QPS = ~7000
- We will only estimate media storage here.
  - Average tweet size:
    - tweet_id: 64 bytes
    - text: 140 bytes
    - media: 1 MB
  - Media storage:
    - 150 million * 2 * 10% * 1 MB = 30 TB per day
    - 5-year media storage = 30 TB * 365 * 5 = ~55 PB
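The arithmetic in this example can be reproduced in a few lines of Python. This is a sketch using the exercise's assumed numbers (not real Twitter figures) and decimal units (1 TB = 1,000,000 MB, 1 PB = 1,000 TB). The constant and variable names are illustrative.

```python
# Assumptions from the exercise (not real Twitter numbers).
MONTHLY_ACTIVE_USERS = 300_000_000
DAILY_ACTIVE_RATIO = 0.5
TWEETS_PER_USER_PER_DAY = 2
MEDIA_RATIO = 0.10          # fraction of tweets that contain media
MEDIA_SIZE_MB = 1
RETENTION_YEARS = 5
PEAK_FACTOR = 2             # peak QPS is taken as 2x average QPS
SECONDS_PER_DAY = 24 * 3600

# QPS estimate.
dau = MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO
tweets_per_day = dau * TWEETS_PER_USER_PER_DAY
tweet_qps = tweets_per_day / SECONDS_PER_DAY
peak_qps = PEAK_FACTOR * tweet_qps

# Media storage estimate, using decimal units.
media_tb_per_day = tweets_per_day * MEDIA_RATIO * MEDIA_SIZE_MB / 1_000_000
media_pb_total = media_tb_per_day * 365 * RETENTION_YEARS / 1_000

print(f"DAU: {dau:,.0f}")
print(f"Tweet QPS: {tweet_qps:,.0f} (peak {peak_qps:,.0f})")
print(f"Media storage: {media_tb_per_day:,.0f} TB/day, "
      f"{media_pb_total:,.2f} PB over {RETENTION_YEARS} years")
```

The unrounded values come out to roughly 3,470 QPS and 54.75 PB, which the estimate rounds to ~3500 and ~55 PB.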
Things to keep in mind:
- Back-of-the-envelope estimation is all about the process. Solving the problem is more important than obtaining results. Interviewers may test your problem-solving skills. Here are a few tips to follow:
  - Rounding and approximation. It is difficult to perform complicated math operations during an interview. For example, what is the result of “99987 / 9.1”? There is no need to spend valuable time solving complicated math problems. Precision is not expected. Use round numbers and approximation to your advantage. The division question can be simplified to “100,000 / 10”.
  - Write down your assumptions. It is a good idea to write down your assumptions so you can refer to them later.
  - Label your units. When you write down “5”, does it mean 5 KB or 5 MB? You might confuse yourself with this. Writing “5 MB” removes the ambiguity.
- Commonly asked back-of-the-envelope estimations: QPS, peak QPS, storage, cache, number of servers, etc. You can practice these calculations when preparing for an interview.
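To see how little the rounding tip costs in accuracy, this small Python check compares the exact quotient from the example with the rounded one. The variable names are illustrative.

```python
# Exact division from the example versus the rounded shortcut.
exact = 99987 / 9.1
rounded = 100_000 / 10

relative_error = abs(exact - rounded) / exact

print(f"exact   = {exact:,.1f}")
print(f"rounded = {rounded:,.1f}")
print(f"relative error = {relative_error:.1%}")
```

The rounded answer is off by roughly 9%, which is well within what an order-of-magnitude estimate needs.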