2.1 System Requirements

Functional Requirements

  • what features must be included

Non-Functional Requirements

  • must be highly available and fast

Example: Design a scalable, highly available, and fast messaging system

Thought process

  • we need to scale for writes and reads, since every message will be written to the messaging system and consumed at some later point
  • to scale the system for writes, I will need to partition messages and store them in multiple queues
  • which partition strategy should I use? maybe a hash strategy
  • where do I store messages so that writes are fast? in memory or on disk?
  • if using memory, it can be a bounded queue
  • if using disk, I can use either an append-only log or an embedded database
  • if using a database, should I pick a B+ tree or an LSM tree? B+ trees give fast lookups and in-place updates, while LSM trees suit write-heavy workloads with fewer reads. since a message queue doesn't need search, I'll use an LSM tree because its sequential writes are faster
    • LSM trees are common in NoSQL stores and embedded engines (e.g. Cassandra, RocksDB)
  • for scaling, we'll use partitioning. but should I use push or pull for reading messages?
    • if using pull, I should make sure the system supports long polling in order to decrease the number of read requests
  • for high availability, I need to replicate messages. should I use leader-based or leaderless replication?
    • most likely leader-based, but then I will need to solve the leader election problem, maybe with a coordination service or a database that guarantees strong consistency
  • for reliability, I may also need to implement some protection mechanisms, like load shedding and rate limiting, or shuffle sharding
  • and should I use a reverse proxy?
    • maybe. it will simplify the client-side logic for both message producers and consumers, since the reverse proxy will take care of partition discovery and message routing
  • how do I make my messaging system fast? I should consider batching and compressing messages
  • if my messaging system is inside a trusted environment, I may use a custom protocol directly over TCP instead of HTTP for client-server communication
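
A few of the choices above (hash partitioning, bounded in-memory queues, and pull with long polling) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real broker: the class `PartitionedQueue`, its method names, and the default sizes are all hypothetical, and everything runs in a single process with no replication or persistence. CRC32 stands in for "a hash strategy" because Python's built-in `hash()` for strings is not stable across processes.

```python
import queue
import zlib
from typing import List


class PartitionedQueue:
    """Hypothetical single-process broker: hash-partitioned, bounded queues."""

    def __init__(self, num_partitions: int = 4, capacity: int = 1000):
        # One bounded queue per partition; capacity caps memory use.
        self.partitions = [
            queue.Queue(maxsize=capacity) for _ in range(num_partitions)
        ]

    def partition_for(self, key: str) -> int:
        # The same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key: str, message: bytes) -> bool:
        """Enqueue without blocking; return False if the partition is full."""
        try:
            self.partitions[self.partition_for(key)].put_nowait(message)
            return True
        except queue.Full:
            # Rejecting the write is a crude form of load shedding.
            return False

    def pull(self, partition: int, max_batch: int = 10,
             wait_seconds: float = 1.0) -> List[bytes]:
        """Long poll: wait up to wait_seconds for the first message,
        then drain up to max_batch messages without further waiting."""
        q = self.partitions[partition]
        try:
            batch = [q.get(timeout=wait_seconds)]
        except queue.Empty:
            return []  # nothing arrived within the polling window
        while len(batch) < max_batch:
            try:
                batch.append(q.get_nowait())
            except queue.Empty:
                break
        return batch
```

A consumer would call `pull` in a loop. An empty list means the polling window passed with no messages, so the consumer makes one request per window instead of many short ones. Returning several messages per call is also a simple form of batching on the read path.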