Skip to main content
Definitions
Bloom Filter
- Searching for a key in a cache system might take a long time in a LSM tree because it would require to look through all SSTables before finally realizing the key doesn't exists
- Bloom filter uses multiple different hash values to determine if the key probably exists. Like 2 dimensional array, if all hash values of the key has a flag set to 1, then the key probably exists. "probably" because hash collision is possible, let's say 3 other keys combined have the same exist hash values flaged to 1, then this is a false positive.
- It is not a perfect solution, but it does eliminate out some of the long-reads for non-existing keys.
- Another draw back is that it is impossible to delete a key from the hash values. Because of hash collisions, unset those flags might messes up other keys.
Hashing - Consistent Hashing
- Using a hash ring to divide hashing space into consistent servers
- Goal is to minimize the impact when adding/removing servers due to expected or unexpected reasons.
- Key => hash value => position in hash ring => move in clockwise until reaching the first avaialble server
- in case of hot spots, dynamically adding more server into that section has limited impact to other servers
- used in Memcached and Redis, Cassandra and Riak, CDN, Load Balancer
Tree - Inorder Traversal (DFS)
- Left subtree => tree node => right subtree
- in BST => prints order
Tree - Preorder Traversal
- tree node => left subtree => right subtree
- used in copy tree?
Tree - Postorder Traversal
- left subtree => right subtree => tree node
- used in delete the tree?
Tree - B Tree
- Fast read, slow writes because it might involve updating multiple pages on disk
Tree - B+ Tree
Tree - LSM Tree (Log-Structured Merge-Tree)
- used by NoSQL
- fast writes
Tree - Levelorder Traversal (BFS)
- level1 left to right => level2 left to right => ...