System Design

Notes

Understand the problem and establish the design scope
1. What features?
2. How many users?
3. Anticipated scale?
4. Technology stack?
5. Existing services?
Propose high level design and get buy-in
1. Initial blueprint
2. Ask for feedback
3. Draw box diagrams with components
4. Back-of-the-envelope calculations to check scale
Design deep dive
Wrap up
1. Bottlenecks
2. Potential Improvements
3. Recap
4. Next scale curve
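
Sketch of a back-of-the-envelope check of scale. All inputs are made-up assumptions (10 million daily active users, 2 requests per user per day, 1 KB per request, peak traffic at 2x the average); the function name `estimate` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_active_users, requests_per_user_per_day,
             bytes_per_request, peak_factor=2):
    """Rough traffic and storage numbers from a few assumed inputs."""
    daily_requests = daily_active_users * requests_per_user_per_day
    avg_qps = daily_requests / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,  # assumed peak multiplier
        "daily_storage_gb": daily_requests * bytes_per_request / 1e9,
    }


# Assumed inputs: 10M DAU, 2 requests per user per day, 1 KB per request.
result = estimate(10_000_000, 2, 1_000)
print(result)
```

The point is the order of magnitude (hundreds of QPS, tens of GB per day), not the exact figures.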

Client-side, server-side, or middleware
Send back 429 Too Many Requests when throttled
Algorithms for rate limiting
1. Token bucket
2. Leaky bucket
3. Fixed window counter
4. Sliding window log / counter
In-memory cache supports time-based expiration
Rate-limiting rules kept on disk and cached in memory
Race conditions / synchronization
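
Minimal token bucket sketch, to make the first algorithm concrete. The class name, parameters and the injectable `clock` are assumptions for illustration. It is single-process and not thread-safe; a real limiter would need a lock or an atomic operation in the shared cache to avoid the race conditions noted above.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1):
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def handle_request(bucket):
    """Return the HTTP status a throttling middleware would send back."""
    return 200 if bucket.allow() else 429  # 429 Too Many Requests
```

With `TokenBucket(capacity=2, refill_rate=1)`, two back-to-back requests pass, the third gets 429, and one more request is allowed after a second has elapsed.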

hash(key) % N
1. good for fixed-size pools
2. bad when servers are added / removed
Consistent hashing: on average only K / N keys need to be remapped (K = number of keys, N = number of servers)
Introduce a hash ring and hash servers / nodes onto it
Server lookup is done clockwise on the ring
Problems with partition size and non-uniform distribution
1. Solution: virtual nodes (more nodes, better balance)
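
Sketch of a hash ring with virtual nodes. The class name `HashRing`, the use of MD5 as the ring hash and the default of 100 virtual nodes per server are assumptions chosen for illustration, not a recommendation.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted ring positions of all virtual nodes
        self._owner = {}      # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place `vnodes` virtual nodes for this server on the ring."""
        for i in range(self.vnodes):
            pos = _hash(f"{server}#{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = server

    def remove(self, server):
        """Drop every virtual node that belongs to this server."""
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key):
        """Walk clockwise from the key's position to the first virtual node."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._positions, _hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]  # wrap around
```

Adding a server only moves keys onto the new server; every other key keeps its owner, unlike `hash(key) % N` where changing N reshuffles most keys.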

Auto increment in relational database -> does not scale
Multi-master replication (auto_increment by K, the number of servers) -> problem when adding / removing nodes
UUID -> not numeric, does not increase with time
Centralised ticket server -> single point of failure
Snowflake approach: f(timestamp, data center ID, machine ID, sequence number)
Clock synchronisation?
Section length tuning (bits per field)?
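
Sketch of a Snowflake-style generator. The bit widths (41 timestamp, 5 data center, 5 machine, 12 sequence) follow the commonly cited layout and are an assumption here, as are the custom epoch, the class name and the injectable `clock_ms`. Clock synchronisation is not solved: the sketch simply refuses to generate IDs if the clock moves backwards.

```python
import time

DATACENTER_BITS, MACHINE_BITS, SEQUENCE_BITS = 5, 5, 12  # assumed field widths
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)


class SnowflakeGenerator:
    def __init__(self, datacenter_id, machine_id,
                 clock_ms=lambda: int(time.time() * 1000)):
        if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        self.clock_ms = clock_ms
        self.last_ms = -1
        self.sequence = 0

    def next_id(self):
        now = self.clock_ms()
        if now < self.last_ms:
            raise RuntimeError("clock moved backwards")
        if now == self.last_ms:
            # Same millisecond: bump the sequence number.
            self.sequence = (self.sequence + 1) & MAX_SEQUENCE
            if self.sequence == 0:
                # Sequence exhausted: spin until the next millisecond.
                while now <= self.last_ms:
                    now = self.clock_ms()
        else:
            self.sequence = 0
        self.last_ms = now
        # Timestamp in the high bits keeps IDs roughly sortable by time.
        return (((now - EPOCH_MS) << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (self.datacenter_id << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self.sequence)
```

Tuning the section lengths trades off lifetime (timestamp bits) against concurrency per millisecond (sequence bits) and fleet size (data center and machine bits).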

REST APIs with URL redirecting (Location header, 301 permanent vs. 302 temporary redirect)
Intuitive approach: hash table (URL -> tinyurl.com/${hash(URL)}) -> does not scale
<shortURL, longURL> tuples in RDBMS
Hash + collision resolution vs. base62 conversion
Rate limiting
Analytics
Availability vs. consistency
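
Sketch of the base62 conversion option: a unique numeric ID is turned into the short URL suffix. The alphabet ordering (0-9, a-z, A-Z) and the function names are assumptions; any fixed ordering works as long as encoder and decoder agree.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 chars


def encode_base62(n):
    """Convert a non-negative integer ID to its base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(s):
    """Inverse of encode_base62."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


print(encode_base62(11157))  # 2TX
```

Unlike hash + collision resolution, this never collides as long as the IDs are unique, but the short URL length grows with the ID and the next short URL is easy to guess.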

On app install or sign up, collect mobile token, phone number and email address
Decouple notification systems from actuators (iOS, Android, SMS, email servers) through message queues
Prevent data loss through at-least-once delivery (with retries)
Templates to avoid building content from scratch
Respect user preferences for communication channels
Rate limiting
Event tracking / analytics
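
Sketch of the retry side of at-least-once delivery, as a worker pulling from a queue might do it. `send` stands for a hypothetical call to a provider (push, SMS or email); the function name, attempt count and backoff values are assumptions.

```python
import time


def deliver_with_retries(send, message, max_attempts=5, base_delay=0.5,
                         sleep=time.sleep):
    """Call send(message) until it succeeds or attempts run out.

    Returns True on success, False if every attempt failed (the caller could
    then re-queue the message or move it to a dead-letter queue).
    """
    for attempt in range(max_attempts):
        try:
            send(message)
            return True
        except Exception:  # broad on purpose in this sketch
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False
```

Because a send can succeed while its acknowledgement is lost, retries can produce duplicates; receivers or the notification service usually de-duplicate by an event ID.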

Clients connect to chat servers via WebSockets for two-way communication (send/receive messages)
Other operations (login, group management, user profile) are stateless and can be done over HTTP
Notification service for new messages
Chat data is very large, read-to-write ratio is 1:1 -> prefer key-value store over SQL
Service discovery to find out chat server to connect to
One message sync queue per user -> either deliver the messages or store them while the user is offline
Online presence through heartbeat messages
End-to-end encryption of messages
Caching messages on the client side
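
Sketch of heartbeat-based online presence. The class name, the 30-second timeout and the injectable `clock` are assumptions; in practice the last-seen timestamps would live in a shared key-value store rather than in process memory.

```python
import time


class PresenceTracker:
    """A user counts as online if a heartbeat arrived within `timeout_s` seconds."""

    def __init__(self, timeout_s=30, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self._last_seen = {}  # user_id -> time of last heartbeat

    def heartbeat(self, user_id):
        """Record a heartbeat sent periodically by the client."""
        self._last_seen[user_id] = self.clock()

    def is_online(self, user_id):
        last = self._last_seen.get(user_id)
        return last is not None and self.clock() - last <= self.timeout_s
```

Using a timeout instead of reacting to every disconnect avoids flipping the status back and forth on short network drops.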

One request per input character, with low latency
Data gathering system, takes input queries and aggregates them (real-time or batch)
Query service, given a prefix return the 5 most searched items
Using prefix trees / tries is crucial for scalability
Limiting the max query length makes the prefix lookup an O(1) operation
Cache top queries at each node to avoid full traversal
Tries are not suited for SQL, better to use document store or key-value store (where the key is the prefix and the value is the trie node)
AJAX requests to save a full page re-render
Browser caching for data changing infrequently
Store tries in CDNs for local queries
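
Sketch of a trie that caches the top queries at each node, so a lookup only walks the prefix and never the subtree. Class and method names are hypothetical, and the trie is built in one batch from a `{query: count}` dict (the batch flavour of the data gathering system).

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.top = []       # cached (query, count) pairs for this prefix


class AutocompleteTrie:
    def __init__(self, k=5):
        self.k = k
        self.root = TrieNode()

    def build(self, query_counts):
        """Insert every query, then keep only the k most searched per node."""
        for query, count in query_counts.items():
            node = self.root
            for ch in query:
                node = node.children.setdefault(ch, TrieNode())
                node.top.append((query, count))
        self._trim(self.root)

    def _trim(self, node):
        # Sort by count descending, ties broken alphabetically.
        node.top = sorted(node.top, key=lambda qc: (-qc[1], qc[0]))[: self.k]
        for child in node.children.values():
            self._trim(child)

    def suggest(self, prefix):
        """Walk the prefix (bounded by max query length) and return the cache."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [query for query, _ in node.top]


trie = AutocompleteTrie(k=5)
trie.build({"tree": 10, "true": 35, "try": 29, "toy": 14, "wish": 25, "win": 50})
print(trie.suggest("tr"))  # ['true', 'try', 'tree']
```

The cache trades memory for latency: every node stores up to k queries, which is what makes the per-keystroke request cheap.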

Use CDN (expensive!) for streaming videos, API servers for other operations
Video uploading: original storage, transcoding servers (multiple formats and bit rates), transcoded storage, CDN, metadata servers
Video streaming: access metadata servers for search, then CDN for playback
Pre-signed uploads and DRM/encryption for security
Send only popular videos to CDN to save costs
CDN also for geographical content (popular only in one country)

Reliability is extremely important (data loss is unacceptable)
Bandwidth usage needs to be contained
Web server to handle upload/download
Database to keep track of metadata
Storage system for actual files (e.g., S3) with cold storage for inactive data
The block server analyses deltas between versions and only sends changed blocks (saves bandwidth)
Needs strong consistency by default (different clients must see the same file)
Notification service via WebSockets for updates
De-duplicate blocks to save storage space
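
Sketch of block-level delta sync and de-duplication: files are split into blocks, each block is addressed by its hash, and only blocks the store has not seen are uploaded. `BlockStore`, the fixed-size splitting and SHA-256 as the block hash are assumptions for illustration; the tiny block size in the usage is only there to keep the example readable.

```python
import hashlib


def split_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class BlockStore:
    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put_file(self, data, block_size):
        """Store a file version.

        Returns (manifest, uploaded): the ordered list of block hashes that
        describes the file, and the hashes that actually had to be uploaded.
        """
        manifest, uploaded = [], []
        for block in split_blocks(data, block_size):
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:  # unseen block: store it once
                self.blocks[digest] = block
                uploaded.append(digest)
            manifest.append(digest)
        return manifest, uploaded

    def get_file(self, manifest):
        """Rebuild a file version from its manifest."""
        return b"".join(self.blocks[d] for d in manifest)


store = BlockStore()
m1, up1 = store.put_file(b"aaaabbbbcccc", block_size=4)
m2, up2 = store.put_file(b"aaaaXXXXcccc", block_size=4)
print(len(up1), len(up2))  # 3 1
```

With fixed-size blocks an insertion in the middle of a file shifts every later block, so this sketch only saves bandwidth for in-place changes.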