2.1 System Requirements
Functional Requirements
- what features must be included
Non-Functional Requirements
- what qualities the system must have, e.g. highly available and fast
Example: Design a scalable, highly available, and fast messaging system
Thought process
- we need to scale for writes and reads, since every message will be written to the messaging system and consumed at some later point
- to scale the system for writes, I will need to partition messages and store them in multiple queues
- which partition strategy should I use? maybe a hash strategy
- where do I store messages quickly? Memory or disk
- if using memory, it can be a bounded queue
- if using disk, I can use either an append-only log or an embedded database
- if using a database, should I pick a B+ tree or an LSM tree? B+ trees give fast lookups and in-place inserts; LSM trees suit write-intensive workloads where reads are less frequent. Since a messaging queue doesn't require search, we'll use an LSM tree because it is faster for writes
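- LSM trees are common in NoSQL and embedded key-value stores (e.g. RocksDB, Cassandra), so that is where I would look for a storage engine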
- for scaling, we'll use partitioning, but should I use push or pull for reading messages?
- if using pull, I should make sure the system supports long polling in order to decrease the number of read requests
- for high availability, I need to replicate messages. should I use leader-based or leaderless replication?
- most likely leader-based, but then I will need to solve a leader election problem, maybe use a coordination service or a database that guarantees strong consistency
- for reliability, I may also need to implement some protection mechanisms, like load shedding and rate limiting, or shuffle sharding
- and should I use a reverse proxy?
- maybe, it will simplify the client-side logic for both message producers and consumers, since the reverse proxy will take care of partition discovery and message routing
- how do I make my messaging system fast? I should consider batching and compressing messages
- if my messaging system is inside a trusted environment, I may use a lightweight binary protocol directly over TCP instead of HTTP for client-server communication
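
The hash partitioning and bounded in-memory queue ideas from the thought process can be sketched together. This is only an illustration under assumptions: the partition count, the queue capacity, and the names `partition_for` and `publish` are made up for the example, and SHA-256 is just one stable hash that would work.

```python
import hashlib
import queue

NUM_PARTITIONS = 4      # assumed partition count, chosen for illustration
QUEUE_CAPACITY = 1000   # assumed per-partition bound

# One bounded in-memory queue per partition.
partitions = [queue.Queue(maxsize=QUEUE_CAPACITY) for _ in range(NUM_PARTITIONS)]


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def publish(key: str, payload: bytes) -> bool:
    """Enqueue without blocking; return False when the partition is full."""
    target = partitions[partition_for(key)]
    try:
        target.put_nowait((key, payload))
        return True
    except queue.Full:
        # The caller decides what to do: retry, drop, or push back on the producer.
        return False
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the same key on the same partition across restarts. Returning `False` on a full queue is one way to surface backpressure to producers.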
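
For the pull model, long polling means the read request waits for a message instead of returning empty right away. A minimal sketch, assuming an in-process `queue.Queue` stands in for one partition; the function name `long_poll` and its parameters are hypothetical:

```python
import queue
import threading


def long_poll(q: queue.Queue, timeout_s: float, max_messages: int = 10) -> list:
    """Wait up to timeout_s for the first message, then drain what is ready."""
    batch = []
    try:
        # Blocks until a message arrives or the timeout expires.
        batch.append(q.get(timeout=timeout_s))
    except queue.Empty:
        return batch  # nothing arrived during the wait
    while len(batch) < max_messages:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


if __name__ == "__main__":
    q = queue.Queue()
    # A producer publishes shortly after the consumer starts waiting.
    timer = threading.Timer(0.05, q.put, args=("hello",))
    timer.start()
    print(long_poll(q, timeout_s=2.0))
    timer.join()
```

Compared with polling in a tight loop, the consumer issues one request per wait window, which is what reduces the number of read requests.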
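
Rate limiting is mentioned above without a specific algorithm. One common choice is a token bucket, sketched here as an assumption rather than as the intended design; the class name and the injectable `now` argument (used to keep the example deterministic) are illustrative.

```python
import time
from typing import Optional


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_s` tokens per second."""

    def __init__(self, rate_per_s: float, capacity: float, now: Optional[float] = None):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and consume one token if the request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A broker could keep one bucket per producer and reject or delay a publish when `allow()` returns `False`.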
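
Batching and compression can be combined: group several messages into one payload, then compress the payload once. The sketch below assumes JSON as the batch encoding and zlib as the codec, both picked only for illustration; `encode_batch` and `decode_batch` are hypothetical names.

```python
import json
import zlib


def encode_batch(messages: list) -> bytes:
    """Serialize a list of string messages and compress the result."""
    raw = json.dumps(messages).encode("utf-8")
    return zlib.compress(raw)


def decode_batch(blob: bytes) -> list:
    """Reverse encode_batch: decompress, then parse the JSON list."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    msgs = ["order created"] * 100
    blob = encode_batch(msgs)
    raw_size = len(json.dumps(msgs).encode("utf-8"))
    print(f"raw={raw_size} bytes, compressed={len(blob)} bytes")
```

Compressing a whole batch usually works better than compressing each message on its own, because similar messages share repeated content. The trade-off is added latency while a batch fills.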