Definitions
Bloom Filter
- Searching for a key in a cache system might take a long time in a LSM tree because it would require to look through all SSTables before finally realizing the key doesn't exists
- Bloom filter uses multiple different hash values to determine if the key probably exists. Like 2 dimensional array, if all hash values of the key has a flag set to 1, then the key probably exists. "probably" because hash collision is possible, let's say 3 other keys combined have the same exist hash values flaged to 1, then this is a false positive.
- It is not a perfect solution, but it does eliminate out some of the long-reads for non-existing keys.
- Another draw back is that it is impossible to delete a key from the hash values. Because of hash collisions, unset those flags might messes up other keys.
Hashing - Consistent Hashing
- Using a hash ring to divide hashing space into consistent servers
- Goal is to minimize the impact when adding/removing servers due to expected or unexpected reasons.
- Key => hash value => position in hash ring => move in clockwise until reaching the first avaialble server
- in case of hot spots, dynamically adding more server into that section has limited impact to other servers
- used in Memcached and Redis, Cassandra and Riak, CDN, Load Balancer
Mutex lock
What is it
- mutual exclusion
- a synchronization primitive used in concurrent programming to ensure that only one thread can access a shared resource at a time
How it works
- Locking:
- a thread that needs to access a sahred resource acquires the mutex lock
- if the mutex is already locked by another thread, the requesting thread is blocked until the lock is released
- Critical Section
- once the lock is acquired, the thread enters a critical section, where it accesses and modifies the shared resource
- Unlocking
- After the thread finishes with the shared resource, it releases the mutex lock
Key Points:
-
Atomic: mutexes ensures that operations on shared resources are atomic, meaning they are indivisible and cannot be interupted by other threads
-
Deadlocks: Improper use of mutexes can lead to deadlocks, where two or more threads are waiting for each other to release locks, resulting in a stalemate
-
Starvation: In some cases a thread may be repeatedly denied access to a shared resource, even if it's available, due to other threads continuously acquiring and releasing the lock.
Example:
package main
import (
"fmt"
"sync"
)
var counter int
var mu sync.Mutex
func increment() {
mu.Lock()
defer mu.Unlock()
counter++
}
func main() {
for i:=0; i<1000; i++ {
go increment()
}
// wait for al lgoroutines to finish
fmt.Println("Final counter value: ", counter)
}
Tree - Inorder Traversal (DFS)
- Left subtree => tree node => right subtree
- in BST => prints order
Tree - Preorder Traversal
- tree node => left subtree => right subtree
- used in copy tree?
Tree - Postorder Traversal
- left subtree => right subtree => tree node
- used in delete the tree?
Tree - B Tree
- Fast read, slow writes because it might involve updating multiple pages on disk
Tree - B+ Tree
Tree - LSM Tree (Log-Structured Merge-Tree)
- used by NoSQL
- fast writes
Tree - Levelorder Traversal (BFS)
- level1 left to right => level2 left to right => ...