2.7 Durability
High durability means low probability of data loss.
Once data is successfully submitted to the system, it is not lost. How? By creating and maintaining multiple copies of data.
Backups
- Periodically copy data from non-volatile storage, such as a disk
- 3 common strategies:
- Full
- Easy to implement, easy to restore
- Takes a long time to back up everything and uses the most storage space
- Differential
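- Used with full backups: each run copies everything changed since the last full backup
- Longer restoration time than a full backup (restore the last full backup, then apply the latest differential)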
- Incremental
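- Used with full backups, like differential, but each run copies only the changes since the previous backup of any kind
- Longest restoration time (restore the last full backup, then apply every incremental backup in order)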
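The difference between the three strategies comes down to which cutoff decides what gets copied. The sketch below is only an illustration under the usual definitions (differential compares against the last full backup, incremental against the most recent backup of any kind). `FileInfo`, `select_files`, and the integer timestamps are hypothetical names and values, not part of any real backup tool.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified_at: int  # hypothetical logical timestamp of the last change


def select_files(files, strategy, last_full_at, last_backup_at):
    """Return the paths that one backup run would copy under a strategy."""
    if strategy == "full":
        # A full backup copies everything, regardless of timestamps.
        return [f.path for f in files]
    if strategy == "differential":
        # Everything changed since the last FULL backup.
        cutoff = last_full_at
    elif strategy == "incremental":
        # Only what changed since the most recent backup of any kind.
        cutoff = last_backup_at
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [f.path for f in files if f.modified_at > cutoff]


files = [FileInfo("a.txt", 5), FileInfo("b.txt", 12), FileInfo("c.txt", 18)]

# Assume the last full backup ran at t=10 and the most recent backup at t=15.
print(select_files(files, "full", 10, 15))          # ['a.txt', 'b.txt', 'c.txt']
print(select_files(files, "differential", 10, 15))  # ['b.txt', 'c.txt']
print(select_files(files, "incremental", 10, 15))   # ['c.txt']
```

The incremental run copies the least data, which is why it needs the most pieces at restore time.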
RAID
Redundant Array of Independent Disks
- Multiple disks grouped together, usually to keep redundant copies of data
- Acts as a single virtual disk to the application
- RAID 0
- Blocks of data are written to disks alternately (blocks 1, 3, 5, 7 on drive 1, blocks 2, 4, 6, 8 on drive 2)
- Decreases durability, because the failure of any one drive loses data, but increases read and write throughput because multiple drives work in parallel
- RAID 1 (mirrored volume)
- Multiple disks containing the same data
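To make the two layouts concrete, here is a minimal sketch that models drives as in-memory dictionaries. The function names (`raid0_drive`, `raid1_write`, `raid1_read`) and the `failed` parameter are made up for illustration. Real RAID is implemented in a controller or the operating system, not in application code like this.

```python
def raid0_drive(block, num_drives=2):
    """Return the 1-based drive that holds a 1-based block under RAID 0."""
    return (block - 1) % num_drives + 1


# Blocks 1, 3, 5, 7 land on drive 1 and blocks 2, 4, 6, 8 on drive 2.
layout = {d: [b for b in range(1, 9) if raid0_drive(b) == d] for d in (1, 2)}
print(layout)  # {1: [1, 3, 5, 7], 2: [2, 4, 6, 8]}


def raid1_write(mirrors, block, data):
    """RAID 1: write the same block to every mirror."""
    for mirror in mirrors:
        mirror[block] = data


def raid1_read(mirrors, block, failed=()):
    """Read the block from the first mirror that has not failed."""
    for i, mirror in enumerate(mirrors):
        if i not in failed:
            return mirror[block]
    raise OSError("all mirrors have failed")


mirrors = [{}, {}]
raid1_write(mirrors, 1, b"hello")
# The data survives the loss of mirror 0 because mirror 1 holds a full copy.
print(raid1_read(mirrors, 1, failed={0}))  # b'hello'
```

With striping, losing either drive loses half of the blocks. With mirroring, a read still succeeds as long as one mirror is healthy.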
Replication
Making copies at the application level
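A rough sketch of what application-level copying can look like is below. It assumes a toy key-value store where each dictionary stands in for a separate server. `ReplicatedStore` is a hypothetical class, and real systems also have to handle partial write failures and replicas that disagree, which this ignores.

```python
class ReplicatedStore:
    """Toy key-value store that keeps a copy of each value on every replica."""

    def __init__(self, num_replicas=3):
        # Each dict stands in for a separate server or data center.
        self.replicas = [{} for _ in range(num_replicas)]

    def put(self, key, value):
        # The application itself writes the value to every replica.
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        # Return the value from the first replica that still has it.
        for replica in self.replicas:
            if key in replica:
                return replica[key]
        raise KeyError(key)


store = ReplicatedStore()
store.put("user:1", "alice")
store.replicas[0].clear()    # simulate losing one replica's data
print(store.get("user:1"))   # alice
```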
Checksum (Data Integrity)
We need to verify that data stored on different disks is not corrupted. Compute a checksum when storing the data, then recompute it upon retrieval and compare the two values.
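The store-then-verify flow can be sketched as follows. The notes do not name a checksum algorithm, so SHA-256 from Python's standard `hashlib` is used here as an assumption. The `store` and `retrieve` helpers and the in-memory `storage` dict are hypothetical stand-ins for a real storage layer.

```python
import hashlib


def store(storage, key, data):
    """Save the data together with a checksum computed at write time."""
    storage[key] = (data, hashlib.sha256(data).hexdigest())


def retrieve(storage, key):
    """Recompute the checksum on read and compare it with the stored one."""
    data, expected = storage[key]
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError(f"checksum mismatch for {key!r}: data is corrupted")
    return data


storage = {}
store(storage, "report", b"quarterly numbers")
print(retrieve(storage, "report"))  # b'quarterly numbers'

# Simulate corruption: the bytes change but the stored checksum does not.
_, checksum = storage["report"]
storage["report"] = (b"quarterly numbars", checksum)
try:
    retrieve(storage, "report")
except ValueError as err:
    print("detected:", err)
```

A mismatch only shows that corruption happened. Recovering the data still depends on having another intact copy, such as a replica or a backup.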

