
2.7 Durability

High durability means low probability of data loss.

Once data is successfully submitted to the system, it is not lost. How? By creating and maintaining multiple copies of data.

Backups

  • Periodically copy data from non-volatile storage, such as disk, to a separate location
  • 3 common strategies:
    • Full
      • Easy to implement, easy to restore
      • Takes a long time to back up everything, takes more storage space
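    • Differential
      • Used with full backups: each differential stores everything changed since the last full backup
      • Longer restoration time than a full backup alone (restore the full backup, then apply the latest differential)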


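    • Incremental
      • Like differential, but each backup stores only the changes since the previous backup (full or incremental)
      • Longest restoration time (restore the full backup, then apply every incremental in order)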
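
The difference between differential and incremental backups can be sketched in a few lines of Python. This is a minimal illustration, not a real backup tool: the data set is modeled as a dictionary, the names `changed`, `restore`, and the `day0`..`day2` snapshots are hypothetical, and deletions are ignored for simplicity.

```python
def changed(prev, cur):
    """Return the entries of cur that are new or differ from prev."""
    return {k: v for k, v in cur.items() if prev.get(k) != v}


def restore(full, *deltas):
    """Rebuild a state by applying deltas, in order, on top of a full backup."""
    state = dict(full)
    for delta in deltas:
        state.update(delta)
    return state


# Hypothetical state of a small data set on three consecutive days.
day0 = {"a": 1, "b": 1}
day1 = {"a": 2, "b": 1}
day2 = {"a": 2, "b": 1, "c": 1}

full = dict(day0)  # full backup taken on day 0

# Differential: always compare against the last FULL backup.
diff1 = changed(day0, day1)
diff2 = changed(day0, day2)

# Incremental: compare against the PREVIOUS backup, whatever its kind.
inc1 = changed(day0, day1)
inc2 = changed(day1, day2)

# Differential restore needs the full backup plus only the latest differential.
restored_from_differential = restore(full, diff2)

# Incremental restore needs the full backup plus every incremental, in order.
restored_from_incremental = restore(full, inc1, inc2)
```

Both restores reproduce `day2`. The second incremental is smaller than the second differential, but restoring from incrementals has to replay the whole chain.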


RAID

Redundant Array of Independent Disks

  • Multiple physical disks grouped together; most RAID levels store redundant data across them
  • Acts as a single virtual disk to the application
  • RAID 0
    • Blocks of data are striped across the disks alternately (blocks 1, 3, 5, 7 on drive 1; blocks 2, 4, 6, 8 on drive 2)
    • Decreases durability, because the failure of any one drive loses data, but increases read and write throughput because the drives work in parallel
  • RAID 1 (mirrored volume)
    • Multiple disks containing the same data
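
The block placement for RAID 0 and RAID 1 can be modeled roughly as follows. This is a toy sketch with drives represented as Python lists; `stripe`, `mirror`, and `surviving_blocks` are hypothetical helper names, and real RAID controllers work on fixed-size stripes at the device level.

```python
def stripe(blocks, n_drives):
    """RAID 0: distribute blocks across drives in round-robin order."""
    drives = [[] for _ in range(n_drives)]
    for i, block in enumerate(blocks):
        drives[i % n_drives].append(block)
    return drives


def mirror(blocks, n_drives):
    """RAID 1: every drive holds a complete copy of all blocks."""
    return [list(blocks) for _ in range(n_drives)]


def surviving_blocks(drives, failed):
    """Blocks still readable after the drive at index `failed` is lost."""
    return sorted({b for i, d in enumerate(drives) if i != failed for b in d})


blocks = [1, 2, 3, 4, 5, 6, 7, 8]

raid0 = stripe(blocks, 2)   # drive 1: 1, 3, 5, 7; drive 2: 2, 4, 6, 8
raid1 = mirror(blocks, 2)   # both drives: 1..8

# Simulate losing the first drive in each layout.
raid0_after_failure = surviving_blocks(raid0, failed=0)
raid1_after_failure = surviving_blocks(raid1, failed=0)
```

After one drive fails, the striped layout has lost half of its blocks, while the mirrored layout can still serve all of them.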

Replication

Making copies at the application level
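
As a rough illustration of application-level copies, the sketch below writes every value to several in-memory replicas. `ReplicatedStore` is a hypothetical class invented for this example; real systems replicate across machines and must also handle partial failures and consistency, which are left out here.

```python
class ReplicatedStore:
    """Hypothetical key-value store that copies every write to several replicas."""

    def __init__(self, n_replicas=3):
        # Each replica is modeled as an independent dictionary.
        self.replicas = [{} for _ in range(n_replicas)]

    def put(self, key, value):
        # The application itself writes the value to every replica.
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        # Read from the first replica that still holds the key.
        for replica in self.replicas:
            if key in replica:
                return replica[key]
        raise KeyError(key)


store = ReplicatedStore()
store.put("user:1", "alice")

store.replicas[0].clear()      # simulate losing one replica
value = store.get("user:1")    # still served by a surviving replica
```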

Checksum (Data Integrity)

We need to verify that data stored on different disks has not been corrupted. Compute a checksum when storing the data, then recalculate it upon retrieval and compare the two; a mismatch means the data has changed.
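
A minimal sketch of this store-then-verify flow, assuming SHA-256 from Python's standard `hashlib` as the checksum (the notes do not name a specific algorithm). The `store` and `retrieve` helpers and the record layout are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Compute a SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def store(data: bytes) -> dict:
    """Keep the checksum alongside the data at write time."""
    return {"data": data, "checksum": checksum(data)}


def retrieve(record: dict) -> bytes:
    """Recalculate the checksum at read time and compare with the stored one."""
    if checksum(record["data"]) != record["checksum"]:
        raise ValueError("checksum mismatch: data is corrupted")
    return record["data"]


record = store(b"hello durability")
intact = retrieve(record)  # passes verification

# Simulate corruption: the data changes but the stored checksum does not.
record["data"] = b"hellO durability"
try:
    retrieve(record)
    corruption_detected = False
except ValueError:
    corruption_detected = True
```

When a mismatch is detected, a system with multiple copies can fall back to a replica whose checksum still matches.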