Cassandra Cheatsheet
ref: Jordan
Key characteristics
- Wide Column Store Database
[Diagram labels: Wide Column Store, Cluster Key, Sort Key, Eventual Consistency, Tunable, ACID]
- LSM Tree Index
- Bloom filter
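A minimal sketch of the Bloom filter idea, assuming it is used to skip SSTables that definitely do not hold a key. The class name `BloomFilter`, the bit count, and the use of SHA-256 as the hash are illustrative assumptions, not Cassandra's actual implementation.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


bf = BloomFilter()
bf.add("sensor-42")
found = bf.might_contain("sensor-42")  # added keys are always reported
```

A "definitely absent" answer lets a read skip a file without touching disk, which is why the filter pairs well with an LSM tree.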
Designed for single partition read and write
- It's recommended that reads and writes of the same data go to the same partition (shard)
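One way to picture why reads and writes of the same data land on the same partition is a token ring keyed by a hash of the partition key. This is a simplified sketch: `token`, `TokenRing`, and the node names are hypothetical, MD5 is only a stand-in hash (Cassandra's default partitioner uses Murmur3), and virtual nodes are left out.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Stand-in hash for illustration; not Cassandra's partitioner.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class TokenRing:
    def __init__(self, nodes: list[str]) -> None:
        # One token per node for simplicity (no virtual nodes).
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def owner(self, partition_key: str) -> str:
        # First node whose token is >= the key's token, wrapping around.
        idx = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.ring)
        return self.ring[idx][1]


ring = TokenRing(["node-a", "node-b", "node-c"])
write_node = ring.owner("sensor-42")
read_node = ring.owner("sensor-42")  # same key, same owner
```

Because the owner depends only on the partition key, a read for a key is routed to the same place its write went.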
Index
- no global secondary index, only local indexes
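A rough sketch of what "only local indexes" implies: each node indexes just its own rows, so a lookup on an indexed column without the partition key has to ask every node. The `Node` class, the `city` column, and `query_by_city` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class Node:
    """One partition holder with a local index on the 'city' column."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}
        self.city_index: dict[str, set[str]] = defaultdict(set)

    def put(self, key: str, row: dict) -> None:
        old = self.rows.get(key)
        if old is not None:
            self.city_index[old["city"]].discard(key)  # drop stale index entry
        self.rows[key] = row
        self.city_index[row["city"]].add(key)  # index covers only local rows

    def find_by_city(self, city: str) -> list[dict]:
        return [self.rows[k] for k in sorted(self.city_index.get(city, ()))]


def query_by_city(nodes: list[Node], city: str) -> list[dict]:
    # No global index: the coordinator asks every node and merges the results.
    results: list[dict] = []
    for node in nodes:
        results.extend(node.find_by_city(city))
    return results


nodes = [Node(), Node()]
nodes[0].put("u1", {"name": "Ann", "city": "Oslo"})
nodes[1].put("u2", {"name": "Bob", "city": "Oslo"})
nodes[1].put("u3", {"name": "Cy", "city": "Rome"})
oslo = query_by_city(nodes, "Oslo")
```

The fan-out to all nodes is the cost that a global secondary index would avoid.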
Leaderless Replication
- Uses a quorum (configurable); can even use 1, which reads from and writes to a single replica
- Each write goes to all replicas; a write is considered successful if at least a quorum of nodes succeeded
- Reads go to a quorum of nodes; if the values differ, the one with the latest timestamp is used
- Read repair: other outdated values are overwritten
- This is not safe; Riak uses CRDTs for write conflict resolution
- There is also a background process called Anti Entropy (Merkle tree) that syncs the differences between nodes
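The quorum write, quorum read, latest-timestamp rule, and read repair described above can be sketched as follows. This is a toy in-memory model: `Versioned`, `Replica`, `quorum_write`, and `quorum_read` are hypothetical names, and timestamps are plain integers supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    timestamp: int


class Replica:
    def __init__(self) -> None:
        self.data: dict[str, Versioned] = {}
        self.up = True

    def write(self, key: str, v: Versioned) -> bool:
        if not self.up:
            return False
        current = self.data.get(key)
        if current is None or v.timestamp > current.timestamp:
            self.data[key] = v  # keep only the newest timestamp
        return True

    def read(self, key: str):
        return self.data.get(key) if self.up else None


def quorum_write(replicas, key, value, timestamp, quorum) -> bool:
    # Send to all replicas; succeed if at least `quorum` acknowledge.
    acks = sum(r.write(key, Versioned(value, timestamp)) for r in replicas)
    return acks >= quorum


def quorum_read(replicas, key, quorum):
    responders = [r for r in replicas if r.up][:quorum]
    if len(responders) < quorum:
        raise RuntimeError("not enough replicas for quorum")
    found = [v for v in (r.read(key) for r in responders) if v is not None]
    if not found:
        return None
    latest = max(found, key=lambda v: v.timestamp)
    for r in responders:
        r.write(key, latest)  # read repair: overwrite outdated copies
    return latest.value


replicas = [Replica(), Replica(), Replica()]
replicas[2].up = False
ok = quorum_write(replicas, "k", "v1", timestamp=1, quorum=2)
replicas[2].up, replicas[0].up = True, False
result = quorum_read(replicas, "k", quorum=2)
```

With 3 replicas and a quorum of 2, any read quorum overlaps any write quorum in at least one node, which is what lets the read see the latest acknowledged write.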
Write Conflicts?
- Last write wins, which can lead to lost writes, as opposed to Riak, which uses CRDTs to resolve write conflicts
Hinted Handoff
- If some replicas cannot handle writes, the coordinator node will store the write, to be sent to them later
Gossip Protocol
- Used to detect node failures
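The hinted handoff behaviour noted above can be sketched as a coordinator that keeps a hint for each replica that missed a write and replays it later. `StorageNode`, `Coordinator`, and `replay_hints` are hypothetical names; real hints also carry timestamps and expire, which this sketch leaves out.

```python
class StorageNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.up = True
        self.data: dict[str, str] = {}

    def write(self, key: str, value: str) -> bool:
        if not self.up:
            return False
        self.data[key] = value
        return True


class Coordinator:
    def __init__(self, replicas: list[StorageNode]) -> None:
        self.replicas = replicas
        self.hints: list[tuple[StorageNode, str, str]] = []

    def write(self, key: str, value: str) -> int:
        acks = 0
        for replica in self.replicas:
            if replica.write(key, value):
                acks += 1
            else:
                # Replica is unavailable: remember the write for later.
                self.hints.append((replica, key, value))
        return acks

    def replay_hints(self) -> None:
        # Retry stored writes; keep the ones whose target is still down.
        remaining = []
        for replica, key, value in self.hints:
            if not replica.write(key, value):
                remaining.append((replica, key, value))
        self.hints = remaining


a, b = StorageNode("a"), StorageNode("b")
b.up = False
coord = Coordinator([a, b])
acks = coord.write("k", "v")  # only node a acknowledges
b.up = True
coord.replay_hints()          # node b receives the missed write
```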
Single Node
- Memtable + SSTables. Fast writes and slow reads
- Only row-level locking, no ACID transactions
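A toy sketch of the memtable + SSTable layout, to show why writes are cheap and reads may have to check several places. `TinyLSM` and `memtable_limit` are hypothetical; a real node also has a commit log, compaction, and Bloom filters, all omitted here.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: dict[str, str] = {}
        self.sstables: list[list[tuple[str, str]]] = []  # oldest first
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value  # a write only touches memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # Freeze the memtable into an immutable, sorted table.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key: str):
        if key in self.memtable:
            return self.memtable[key]
        # A read may have to search every table, newest first.
        for table in reversed(self.sstables):
            i = bisect.bisect_left(table, (key,))
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", "1")
db.put("b", "2")  # reaches the limit and triggers a flush
db.put("a", "3")  # newer value stays in the memtable
```

Here `db.get("a")` is answered from memory, while `db.get("b")` has to fall through to the flushed table.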
Use Cases
- Write heavy Applications
- If data is generally self-contained and only needs to be fetched with other data from its partition, then use Cassandra
- e.g. sensor readings, chat messages, user activity tracking
Pros
- good when data is generally self contained, and only needs to be fetched with other data from its partition (no joins)
- useful for write heavy applications like sensor readings, chat messages, user activity tracking, etc.
- designed for massive scale (sharded, with leaderless replication)
Cons
- lack of strong consistency
- lack of support for data relationships
- lack of global secondary indexes