System Design Interview Cheat Sheet for Indian Engineers

System design interviews are 45 minutes of structured ambiguity. The question is intentionally underspecified — you're expected to drive the conversation. Most candidates fail not because they lack knowledge, but because they dump everything they know without structure. This cheat sheet gives you the framework, the numbers, and the component-level knowledge to approach any system design question at Indian product companies.

The 45-Minute Framework

0–5 min: Requirements clarification — functional requirements (what it does), non-functional (scale, latency, consistency, availability)
5–10 min: Capacity estimation — DAU, QPS, storage, bandwidth
10–20 min: High-level design — draw the boxes (client, API gateway, services, DB, cache, queue)
20–35 min: Deep dive — pick 2 components and go deep (usually DB design plus one interesting service)
35–45 min: Bottlenecks and tradeoffs — what breaks at 10x load, what you would do differently

Numbers Every Engineer Should Memorise

1 million seconds is approximately 11.5 days — useful for capacity math
1 billion requests per day is approximately 11,500 req/sec (10K QPS is a useful mental shorthand)
L1 cache: ~1ns | L2 cache: ~10ns | RAM: ~100ns | SSD random read: ~100 microseconds | HDD seek: ~10ms | Network round trip within India: ~20–50ms
Postgres can handle ~10K QPS on a single well-tuned instance for simple queries
Redis: ~100K+ ops/sec single instance, sub-millisecond latency
Kafka: 1M+ messages/sec per broker is achievable with small messages, batching, and proper partitioning
S3/blob storage: treat as infinite, ~100ms latency, 11 nines of durability
CDN cache hit: ~5ms | CDN miss (origin pull): ~50–200ms
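1 char = 1 byte (ASCII; Devanagari and most other Indic scripts take 3 bytes per character in UTF-8) | 1 UUID = 36 bytes as text, 16 bytes in binary | 1 photo (compressed) is approximately 300KB | 1 minute of video is approximately 50MB (varies widely with resolution and bitrate)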
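
The figures above are meant to be combined into quick estimates. The following is a minimal sketch of that arithmetic, not a standard formula: the function name, the 10M DAU service, the 10% write share, the 1 KB write size, and the 3x peak factor are all illustrative assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_capacity(dau, requests_per_user, write_fraction,
                      bytes_per_write, peak_factor=3):
    """Back-of-envelope load estimate. Every input is an assumption to state aloud."""
    requests_per_day = dau * requests_per_user
    avg_qps = requests_per_day / SECONDS_PER_DAY
    writes_per_day = requests_per_day * write_fraction
    storage_gb_per_day = writes_per_day * bytes_per_write / 1e9
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,  # peak_factor is a guess, not a measurement
        "storage_gb_per_day": storage_gb_per_day,
        "storage_tb_per_year": storage_gb_per_day * 365 / 1000,
    }


# Hypothetical service: 10M DAU, 100 requests per user per day,
# 10% of requests are writes of about 1 KB each.
result = estimate_capacity(
    dau=10_000_000,
    requests_per_user=100,
    write_fraction=0.10,
    bytes_per_write=1_000,
)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the service sees 1 billion requests per day, which is roughly 11.6K average QPS, and writes about 100 GB per day, or about 36.5 TB per year before replication. In an interview, saying the inputs out loud matters more than the exact outputs.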

Step 1: Requirements Clarification — Questions to Ask

Who are the users and what are the core use cases? (Do not assume)
What scale are we designing for — DAU? Peak QPS?
What are the consistency requirements — eventual OK or strong required?
What is the acceptable latency for read vs write operations?
Is this read-heavy or write-heavy? (Changes storage and caching strategy entirely)
Do we need to support mobile clients with intermittent connectivity?
What is the expected data retention period?

Core Components and When to Use Each

SQL (Postgres/MySQL): relational data, strong consistency, complex queries, transactions. Use when: user accounts, orders, payments, anything with foreign keys.
NoSQL (Cassandra/DynamoDB): write-heavy, time-series, wide rows, eventual consistency. Use when: activity feeds, logs, IoT events, anything with a clear partition key.
Redis: cache (TTL-based), sessions, rate limiting, leaderboards, pub/sub, real-time counters. Never use as primary storage.
Kafka/SQS: async decoupling, event sourcing, fan-out, retries with backoff, and buffering between services with different throughput.
Elasticsearch: full-text search, filtering, aggregations. Mirror writes from your primary DB via a change stream.
CDN (CloudFront/Akamai): static assets, images, video — anything that does not change per user. Also useful for DDoS mitigation.
Object storage (S3): blobs, media, backups, data lake. Cheap, durable, infinitely scalable.
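
The rule that a cache sits in front of primary storage, and never replaces it, is easiest to see in a cache-aside read path. The sketch below uses an in-memory class as a stand-in for a TTL-based cache such as Redis; `TTLCache`, `get_user`, and the dictionary playing the role of the database are hypothetical names for illustration, not any library's API.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache with per-key expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}     # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def get_user(user_id, cache, db, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the primary store, then populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = db[user_id]  # the primary store stays the source of truth
    cache.set(key, row, ttl_seconds)
    return row


db = {1: {"name": "Asha"}}  # hypothetical primary store
cache = TTLCache()
print(get_user(1, cache, db))  # miss: reads db, fills cache
print(get_user(1, cache, db))  # hit: served from cache
```

If the cache is wiped or evicts the key, `get_user` still returns the right answer, only more slowly. The trade-off is staleness: a write to the primary store is not visible until the TTL expires unless the writer also invalidates the key.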

Real Examples from Indian Product Companies

Flipkart Big Billion Day: 10x normal traffic spike compressed into 2 hours. Key pattern: pre-warm inventory in Redis with atomic DECR for cart operations, Kafka to decouple checkout from payment processing, read replicas for product catalog, aggressive CDN for all static product pages. The bottleneck is almost always inventory contention — design for that first.
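
The inventory-contention point can be made concrete with a decrement-then-check reservation. This is a sketch under stated assumptions, not Flipkart's implementation: `InventoryCounter` is a hypothetical in-memory stand-in whose `decr` mimics the semantics of Redis DECR (atomically subtract one and return the new value), and `reserve_one` is an illustrative helper.

```python
import threading


class InventoryCounter:
    """In-memory stand-in for atomic counters (mimics DECR/INCR semantics)."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def decr(self, sku):
        with self._lock:  # the lock plays the role of the store's atomicity
            self._stock[sku] -= 1
            return self._stock[sku]

    def incr(self, sku):
        with self._lock:
            self._stock[sku] += 1
            return self._stock[sku]


def reserve_one(counter, sku):
    """Reserve one unit; return False instead of overselling."""
    remaining = counter.decr(sku)
    if remaining < 0:
        counter.incr(sku)  # undo the decrement that went below zero
        return False
    return True


counter = InventoryCounter({"sku-1": 10})
results = []


def shopper():
    results.append(reserve_one(counter, "sku-1"))


threads = [threading.Thread(target=shopper) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results), "reservations succeeded out of", len(results))
```

The design choice is to decrement first and check afterwards, so there is no gap between reading the stock and changing it. A read-then-write version would let two shoppers both see one unit left and both succeed.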

Swiggy peak delivery (dinner rush, 8–9 PM): 500K concurrent orders, GPS location updates every 5 seconds from delivery partners. Key pattern: WebSocket gateway sharded by order ID, Redis Geo for real-time location, separate ETA service consuming location stream via Kafka, push notifications only when ETA changes by more than 3 minutes (not on every location update). The bottleneck is the location write volume — Redis Cluster handles this.
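
The rule of pushing a notification only when the ETA moves by more than 3 minutes can be sketched as a small piece of state per order. The class and method names below are hypothetical, and two details are assumptions rather than anything the source specifies: the first ETA for an order always notifies, and changes are measured against the last ETA that was pushed, not the last one computed.

```python
class EtaNotifier:
    """Decides which ETA updates are worth a push notification (illustrative)."""

    def __init__(self, threshold_minutes=3):
        self.threshold_minutes = threshold_minutes
        self._last_notified = {}  # order_id -> ETA in minutes that was last pushed

    def on_eta_update(self, order_id, eta_minutes):
        """Return True if this update should trigger a push."""
        last = self._last_notified.get(order_id)
        if last is None or abs(eta_minutes - last) > self.threshold_minutes:
            self._last_notified[order_id] = eta_minutes
            return True
        return False


notifier = EtaNotifier()
for eta in (30, 29, 28, 27, 26, 25):
    print(eta, notifier.on_eta_update("order-42", eta))
```

Comparing against the last pushed value means a slow drift of one minute per update still produces a notification once the total change exceeds the threshold. Comparing against the previous update instead would never fire in that case.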

Amazon India search: 100M+ product catalog, 50K search QPS. Key pattern: Elasticsearch for search (mirrored from product DB via Kafka), Redis cache for hot queries (top 1% of queries account for 40% of traffic), ML ranking model runs offline and writes scores to Elasticsearch. The bottleneck is relevance ranking at scale — pre-compute as much as possible offline.
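
The effect of the hot-query cache on the search cluster is simple arithmetic worth doing aloud. The sketch below uses the figures quoted above (50K QPS, 40% of traffic from hot queries); the function name and the hit rates tried are assumptions for illustration.

```python
def backend_qps(total_qps, hot_traffic_share, hot_hit_rate):
    """QPS that still reaches the search cluster after the hot-query cache."""
    served_by_cache = total_qps * hot_traffic_share * hot_hit_rate
    return total_qps - served_by_cache


# Try a perfect cache and two less optimistic hit rates on the hot queries.
for hit_rate in (1.0, 0.9, 0.5):
    print(hit_rate, round(backend_qps(50_000, 0.40, hit_rate)))
```

With a perfect hit rate on the hot queries, about 30K QPS still reaches Elasticsearch; at a 90% hit rate it is about 32K. The cache removes a large slice of load, but the cluster must still be sized for the long tail of queries.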

Common System Design Mistakes

Jumping to solutions without clarifying requirements — interviewers often plant scale traps
Ignoring the read/write ratio — it determines everything about storage and caching
Designing for today's scale, not the stated scale requirement
Treating a cache as a primary data store — caches can evict data
Not talking about failure modes — what happens when the DB goes down? When the cache is cold?
Proposing microservices for everything — adds operational complexity that hurts more than it helps at fewer than 10M users
Forgetting the API design — interviewers often want to see how services communicate, not just that they exist