System design interviews are 45 minutes of structured ambiguity. The question is intentionally underspecified — you're expected to drive the conversation. Most candidates fail not because they lack knowledge, but because they dump everything they know without structure. This cheat sheet gives you the framework, the numbers, and the component-level knowledge to approach any system design question at Indian product companies.
The 45-Minute Framework
- 0–5 min: Requirements clarification — functional requirements (what it does), non-functional (scale, latency, consistency, availability)
- 5–10 min: Capacity estimation — DAU, QPS, storage, bandwidth
- 10–20 min: High-level design — draw the boxes (client, API gateway, services, DB, cache, queue)
- 20–35 min: Deep dive — pick 2 components and go deep (usually DB design plus one interesting service)
- 35–45 min: Bottlenecks and tradeoffs — what breaks at 10x load, what you would do differently
Numbers Every Engineer Should Memorise
- 1 million seconds is approximately 11.6 days (1,000,000 / 86,400), which is useful for capacity math
- 1 billion requests per day is approximately 11,500 req/sec (10K QPS is a useful mental shorthand)
- L1 cache: ~1ns | L2 cache: ~10ns | RAM: ~100ns | SSD: ~100 microseconds | HDD: ~10ms | Network round trip within India: ~20–50ms
- Postgres can handle ~10K QPS on a single well-tuned instance for simple queries
- Redis: ~100K+ ops/sec single instance, sub-millisecond latency
- Kafka: 1M+ messages/sec per broker with proper partitioning
- S3/blob storage: treat as infinite, ~100ms latency, 11 nines of durability
- CDN cache hit: ~5ms | CDN miss (origin pull): ~50–200ms
- 1 ASCII char = 1 byte | 1 UUID = 36 bytes as text (16 bytes in binary form) | 1 photo (compressed) is approximately 300KB | 1 minute of video is approximately 50MB
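The numbers above can be turned into a quick back-of-the-envelope calculator. This is a minimal sketch: the function names, the 3x peak factor, and the 10M-photos-per-day workload are illustrative assumptions, not figures from any specific company.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(requests_per_day: float) -> float:
    """Average requests per second for a given daily request count."""
    return requests_per_day / SECONDS_PER_DAY


def peak_qps(requests_per_day: float, peak_factor: float = 3.0) -> float:
    """Peak QPS, assuming peak traffic is `peak_factor` times the average.

    The default of 3.0 is an assumption; ask the interviewer for the real ratio.
    """
    return average_qps(requests_per_day) * peak_factor


def daily_storage_gb(items_per_day: float, bytes_per_item: float) -> float:
    """Raw storage added per day in GB (1 GB = 10**9 bytes), before replication."""
    return items_per_day * bytes_per_item / 1e9


if __name__ == "__main__":
    # 1 billion requests per day, the rule of thumb from the list above.
    print(round(average_qps(1_000_000_000)))  # 11574

    # Hypothetical workload: 10M photo uploads per day at ~300 KB each.
    print(daily_storage_gb(10_000_000, 300_000))  # 3000.0 (about 3 TB per day)
```

In an interview you would do this arithmetic aloud, rounding aggressively: 1 billion per day is "about 10K QPS", and 3,000 GB per day is "about 1 PB per year before replication".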
Step 1: Requirements Clarification — Questions to Ask
- Who are the users and what are the core use cases? (Do not assume)
- What scale are we designing for — DAU? Peak QPS?
- What are the consistency requirements — eventual OK or strong required?
- What is the acceptable latency for read vs write operations?
- Is this read-heavy or write-heavy? (Changes storage and caching strategy entirely)
- Do we need to support mobile clients with intermittent connectivity?
- What is the expected data retention period?
Core Components and When to Use Each
- SQL (Postgres/MySQL): relational data, strong consistency, complex queries, transactions. Use when: user accounts, orders, payments, anything with foreign keys.
- NoSQL (Cassandra/DynamoDB): write-heavy, time-series, wide rows, eventual consistency. Use when: activity feeds, logs, IoT events, anything with a clear partition key.
- Redis: cache (TTL-based), sessions, rate limiting, leaderboards, pub/sub, real-time counters. Never use as primary storage.
- Kafka/SQS: async decoupling, event sourcing, fan-out, retry with backoff, buffering between services with different throughput.
- Elasticsearch: full-text search, filtering, aggregations. Mirror writes from your primary DB via a change stream.
- CDN (CloudFront/Akamai): static assets, images, video — anything that does not change per user. Also useful for DDoS mitigation.
- Object storage (S3): blobs, media, backups, data lake. Cheap, durable, infinitely scalable.
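The "cache in front of a primary store" relationship described above is often implemented as a cache-aside read. The sketch below uses an in-memory class as a stand-in for a TTL-based cache such as Redis. `TTLCache`, `get_user_name`, and `load_from_db` are hypothetical names introduced for illustration, not part of any library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a TTL-based cache (an assumption, not a Redis client)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def get_user_name(user_id: str, cache: TTLCache, load_from_db: Callable[[str], str]) -> str:
    """Cache-aside read: try the cache, fall back to the database, then populate the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)  # the database remains the source of truth
    cache.set(key, value)
    return value
```

Because the database stays the source of truth, losing the cache only costs latency while it warms up again. This is also the answer to the "what happens when the cache is cold?" follow-up.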
Real Examples from Indian Product Companies
Flipkart Big Billion Day: 10x normal traffic spike compressed into 2 hours. Key pattern: pre-warm inventory in Redis with atomic DECR for cart operations, Kafka to decouple checkout from payment processing, read replicas for product catalog, aggressive CDN for all static product pages. The bottleneck is almost always inventory contention — design for that first.
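One way to picture the atomic-decrement idea for inventory is "decrement first, compensate if the result goes negative". The sketch below is an in-memory simulation, not Flipkart's implementation. `InventoryCounter` is a hypothetical stand-in whose lock mimics the atomicity of Redis DECR and INCR, and `try_reserve` and `simulate` are illustrative names.

```python
import threading
from typing import List


class InventoryCounter:
    """In-memory stand-in for a Redis counter; the lock mimics atomic DECR/INCR."""

    def __init__(self, stock: int):
        self._stock = stock
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._stock -= 1
            return self._stock

    def incr(self) -> int:
        with self._lock:
            self._stock += 1
            return self._stock


def try_reserve(counter: InventoryCounter) -> bool:
    """Reserve one unit. Decrement first, then undo if the stock went negative."""
    remaining = counter.decr()
    if remaining < 0:
        counter.incr()  # compensate: this buyer did not get a unit
        return False
    return True


def simulate(stock: int, buyers: int) -> int:
    """Run `buyers` concurrent reservation attempts; return how many succeeded."""
    counter = InventoryCounter(stock)
    results: List[bool] = []
    results_lock = threading.Lock()

    def buyer() -> None:
        ok = try_reserve(counter)
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=buyer) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    print(simulate(stock=100, buyers=500))  # 100: never oversold
```

The point to make in the interview is that there is no separate "read stock, then write stock" step for two buyers to race on. The decrement and the check use the value returned by a single atomic operation.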
Swiggy peak delivery (dinner rush, 8–9 PM): 500K concurrent orders, GPS location updates every 5 seconds from delivery partners. Key pattern: WebSocket gateway sharded by order ID, Redis Geo for real-time location, separate ETA service consuming location stream via Kafka, push notifications only when ETA changes by more than 3 minutes (not on every location update). The bottleneck is the location write volume: 500K orders updating every 5 seconds is roughly 100K writes/sec, about the ceiling for a single Redis instance, so shard it across a Redis Cluster.
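The "notify only when the ETA moves by more than 3 minutes" rule can be sketched as a small stateful filter. `EtaNotifier` is a hypothetical class. Always notifying on the first update and comparing against the last notified ETA (rather than the last observed one) are both assumptions of this sketch.

```python
from typing import Dict


class EtaNotifier:
    """Decides whether an ETA update is worth a push notification."""

    def __init__(self, threshold_minutes: float = 3.0):
        self._threshold = threshold_minutes
        # Last ETA (in minutes) that the customer was actually told about, per order.
        self._last_notified: Dict[str, float] = {}

    def on_eta_update(self, order_id: str, eta_minutes: float) -> bool:
        """Return True if a push notification should be sent for this update."""
        last = self._last_notified.get(order_id)
        if last is None or abs(eta_minutes - last) > self._threshold:
            self._last_notified[order_id] = eta_minutes
            return True
        return False


if __name__ == "__main__":
    notifier = EtaNotifier()
    for eta in [30, 29, 28, 26.5, 26]:
        print(eta, notifier.on_eta_update("order-1", eta))
    # 30 True, 29 False, 28 False, 26.5 True, 26 False
```

Comparing against the last notified value means a slow drift (30, 29, 28, 26.5) still triggers a notification eventually, while GPS jitter around a stable ETA produces none.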
Amazon India search: 100M+ product catalog, 50K search QPS. Key pattern: Elasticsearch for search (mirrored from product DB via Kafka), Redis cache for hot queries (top 1% of queries account for 40% of traffic), ML ranking model runs offline and writes scores to Elasticsearch. The bottleneck is relevance ranking at scale — pre-compute as much as possible offline.
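A rough way to size the search cluster behind the hot-query cache is to subtract the traffic the cache absorbs. The helper below is a simple illustrative model: `backend_qps` is a hypothetical name, and the assumption that every request for a hot query is a cache hit (`hit_ratio=1.0`) is optimistic.

```python
def backend_qps(total_qps: float, cached_traffic_share: float, hit_ratio: float = 1.0) -> float:
    """QPS that still reaches the search backend after the cache.

    cached_traffic_share: fraction of traffic made up of queries we choose to cache.
    hit_ratio: fraction of those requests actually served from cache (assumed).
    """
    return total_qps * (1 - cached_traffic_share * hit_ratio)


if __name__ == "__main__":
    # 50K search QPS, with hot queries accounting for 40% of traffic.
    print(round(backend_qps(50_000, 0.40)))       # 30000 with a perfect hit ratio
    print(round(backend_qps(50_000, 0.40, 0.9)))  # 32000 with a 90% hit ratio
```

Even with a perfect cache, Elasticsearch still sees about 30K QPS of long-tail queries in this model, which is why the offline pre-computation of ranking scores matters.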
Common System Design Mistakes
- Jumping to solutions without clarifying requirements — interviewers often plant scale traps
- Ignoring the read/write ratio — it determines everything about storage and caching
- Designing for today's scale, not the stated scale requirement
- Treating a cache as a primary data store — caches can evict data
- Not talking about failure modes — what happens when the DB goes down? When the cache is cold?
- Proposing microservices for everything — adds operational complexity that hurts more than it helps at fewer than 10M users
- Forgetting the API design — interviewers often want to see how services communicate, not just that they exist