Durability - Cloud Applications

# Durability: Cloud Applications

rel:: [[Databases MOC]]

- Author: [[evanjones.ca]]
- Full Title: *Durability: Cloud Applications*
- Category: #stack/Articles
- URL: https://www.evanjones.ca/durability-cloud.html

## Summary

### Instance Storage

Instance storage is not guaranteed to survive reboots. Any system relying on instance storage for durability needs multi-zone replication.

### Network Disks

A network disk in one zone is about 20x as reliable as a typical physical disk: EBS volumes claim a 0.1 to 0.2 percent Annual Failure Rate (AFR), while the AFR for commodity physical disks is around 4%. Network disks replicated across zones are good for five-nines (99.999%) availability.

EBS writes are recorded to nonvolatile storage before they are acknowledged to the OS, so explicit FUA/flush/barriers are not required for durability. GCP disks appear to behave similarly.

### Object Storage

![[#^81daca]]

## References

- [durability guarantees provided by the NVMe disk interface](https://www.evanjones.ca/durability-nvme.html)
- [Linux file APIs durability](https://www.evanjones.ca/durability-filesystem.html)
- [AWS says "local instance store volumes are not intended to be used as durable disk storage"](https://docs.aws.amazon.com/whitepapers/latest/aws-storage-services-overview/durability-and-availability-4.html)
- [Google says "Local SSDs are suitable only for temporary storage such as caches, processing space, or low value data"](https://cloud.google.com/compute/docs/disks/local-ssd)
- [a comment from an AWS engineer that claims](https://news.ycombinator.com/item?id=18980193)
  > "All writes to EBS are durably recorded to nonvolatile storage before the write is acknowledged to the operating system running in the [[AWS EC2]] instance \[...\] explicit FUA / flush / barriers are not required for data durability".

### Highlights

- I previously wrote about the durability guarantees provided by the NVMe disk interface, and by Linux file APIs.
([View Highlight](https://instapaper.com/read/1408976346/16298086))
- Instance storage can be lost when the machine is restarted by the cloud provider. I would only call it durable if the application is replicated across more than one zone. ([View Highlight](https://instapaper.com/read/1408976346/16298093))
- A network disk in one zone is like a physical server: on rare occasions data will be lost (e.g. 1 out of 100k disks/year). This is probably sufficient for many applications. However, you should take periodic snapshots ([View Highlight](https://instapaper.com/read/1408976346/16298096))
- Network disk replicated across zones should be highly available and durable. This is sufficient for mission critical applications with high availability requirements (e.g. 99.999% or greater availability). ([View Highlight](https://instapaper.com/read/1408976346/16298098))
- Instance storage replicated across zones can be highly available and durable, but you should be careful due to the risk of correlated failure (e.g. accidentally turning off all instances), or misconfigurations leading to everything running in one zone, or not being correctly replicated. ([View Highlight](https://instapaper.com/read/1408976346/16298101))
- Object storage is basically as durable as a storage system gets. All important applications need backups, even applications using object storage. ([View Highlight](https://instapaper.com/read/1408976346/16298103))
- AWS says "local instance store volumes are not intended to be used as durable disk storage". Google says "Local SSDs are suitable only for temporary storage such as caches, processing space, or low value data" ([View Highlight](https://instapaper.com/read/1408976346/16298111))
- local instance storage is lower durability than a single physical disk in a physical server.
Not only will your data be lost if the disk fails, but your data will also be lost for any other issue that causes the cloud provider to turn off that machine, such as applying a security patch. It also is not clear if your data will survive the machine losing power. AWS's documentation suggests reboots will always lose data, but Google's suggests it might, if it comes back within 60 minutes. ([View Highlight](https://instapaper.com/read/1408976346/16298118))
- "EBS volumes are designed for an annual failure rate (AFR) of between 0.1 and 0.2 percent [...] This AFR makes EBS volumes 20 times more reliable than typical commodity disk drives, which fail with an AFR of around 4 percent." ([View Highlight](https://instapaper.com/read/1408976346/16298127))
- application data is safer in a cloud network disk than on a single physical disk. ([View Highlight](https://instapaper.com/read/1408976346/16298132))
- I found a comment from an AWS engineer that claims "All writes to EBS are durably recorded to nonvolatile storage before the write is acknowledged to the operating system running in the [[AWS EC2]] instance [...] explicit FUA / flush / barriers are not required for data durability". This means writes will survive both the host failing and the entire zone losing power. I have not been able to find similar information for GCP persistent disk, but testing with FUA writes and cache flushes shows that they appear to have nearly zero impact on throughput or latency, so I suspect it is the same. This means that applications could get away with O_DIRECT writes only, and FUA writes and cache flushes are probably unnecessary. However, since there is nearly no performance impact, you might as well use O_DSYNC or fdatasync anyway, since then your application will work correctly on other disks. My guess is that Amazon and Google implement this with the equivalent of battery-backed RAM, just like hardware RAID disk controllers often do.
([View Highlight](https://instapaper.com/read/1408976346/16298152))
- The object storage systems (AWS S3, GCP GCS) provide very ambitious durability claims. They store data in multiple zones, and claim they will only lose some astronomically small fraction of objects. This type of storage is basically as good as it gets in terms of durability guarantees. ([View Highlight](https://instapaper.com/read/1408976346/16298154)) ^81daca
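
The EBS highlight above suggests using O_DSYNC or fdatasync anyway, so that the application still behaves correctly on other disks. A minimal Python sketch of that advice, assuming a Linux-like host: `durable_append` and the file name `app.log` are hypothetical names for illustration and do not come from the article, and the parent-directory `fsync` reflects general Linux practice for newly created files rather than anything stated in these notes.

```python
import os
import tempfile

# Hypothetical helper for illustration; not from the article.
# os.fdatasync is missing on some platforms (e.g. macOS), so fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def durable_append(path: str, data: bytes) -> None:
    """Append data to path and return only after the OS reports it was flushed."""
    existed = os.path.exists(path)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested, so loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to flush this file's data to stable storage.
        _sync(fd)
    finally:
        os.close(fd)

    if not existed:
        # Assumption: on Linux, a newly created file is only durable once its
        # parent directory entry has been synced as well.
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "app.log")
        durable_append(log_path, b"first record\n")
        durable_append(log_path, b"second record\n")
        with open(log_path, "rb") as f:
            print(f.read())
```

Opening the file with `os.O_DSYNC` instead would be the per-write variant of the same idea. If the highlight's observation holds, the extra flush should cost very little on EBS or GCP persistent disk, while keeping the code correct on disks that do need it.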