Eyal Estrin

Posted on • Originally published at eyal-estrin.Medium

Comparison of Cloud Storage Services

When designing workloads in the cloud, it is rare to find a workload that does not need persistent storage for storing and retrieving data.

In this blog post, we will review the most common cloud storage services and the different use cases for choosing specific cloud storage.

Object storage

Object storage is perhaps the most commonly used cloud-native storage service.

It is used for a wide range of use cases, from simple storage or archiving of logs or snapshots to more sophisticated use cases such as storage for data lakes or AI/ML workloads.

Object storage is used by many cloud-native applications, from Kubernetes-based workloads using a CSI driver (such as Amazon EKS, Azure AKS, and Google GKE) to Serverless / Function-as-a-Service (such as AWS Lambda and Azure Functions).

As a cloud-native service, object storage is accessed through a REST API over HTTP or HTTPS.
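To give a feel for what access over HTTPS looks like, here is a minimal sketch that builds object URLs in the commonly documented endpoint formats of the three providers. The bucket, account, container, and key names are hypothetical, and authentication (signed requests or tokens), which real requests to private buckets need, is left out.

```python
from urllib.parse import quote


def s3_url(bucket: str, region: str, key: str) -> str:
    # Amazon S3 virtual-hosted-style endpoint
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def azure_blob_url(account: str, container: str, blob: str) -> str:
    # Azure Blob Storage endpoint, scoped to a storage account
    return f"https://{account}.blob.core.windows.net/{container}/{quote(blob)}"


def gcs_url(bucket: str, obj: str) -> str:
    # Google Cloud Storage path-style endpoint
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"


# Hypothetical names, for illustration only
print(s3_url("example-bucket", "eu-west-1", "logs/app.log"))
print(azure_blob_url("exampleaccount", "backups", "db.bak"))
print(gcs_url("example-bucket", "my file.txt"))
```

In practice, most applications would use the provider's SDK or CLI rather than building URLs by hand; the sketch only shows that each object ends up addressable over HTTPS.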

Unstructured data is stored inside object storage services as objects, in a flat hierarchy, inside containers that most cloud providers call buckets.
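The flat hierarchy can be illustrated with a small in-memory model. This is a sketch and not any provider's API: a bucket is modeled as a plain dictionary keyed by the full object name, and "folders" only appear when keys are listed by prefix and delimiter. The function name `list_keys` and the sample keys are hypothetical.

```python
def list_keys(
    bucket: dict[str, bytes], prefix: str = "", delimiter: str = "/"
) -> tuple[list[str], list[str]]:
    """Return (objects, common_prefixes) found directly under `prefix`."""
    objects: list[str] = []
    common_prefixes: set[str] = set()
    for key in sorted(bucket):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # The key lives "deeper"; report only its next path segment
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# A bucket is a flat key -> data mapping; there are no real directories
bucket = {
    "readme.txt": b"...",
    "logs/2023/app.log": b"...",
    "logs/2024/app.log": b"...",
    "snapshots/db.bak": b"...",
}

print(list_keys(bucket))            # top level of the bucket
print(list_keys(bucket, "logs/"))   # "inside" the logs/ prefix
```

The slashes are just part of the key name; the folder-like view is computed at listing time.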

Data is automatically synched between availability zones in the same region (unless we choose otherwise), and if needed, buckets can be synched between regions (using cross-region replication capability).

To support different data access patterns, each of the hyperscale cloud providers offers its customers different storage classes (or storage tiers), from real-time and near real-time to archive storage, along with the ability to configure rules for moving data between storage classes (also known as lifecycle policies).
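The idea behind a lifecycle policy can be sketched in a few lines. The example below is an illustration under stated assumptions, not a provider API: the tier names (`STANDARD`, `INFREQUENT_ACCESS`, `ARCHIVE`) and the 30/90-day thresholds are hypothetical, and real policies are configured declaratively in each provider's console, CLI, or SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    after_days: int      # object age (in days) at which the rule applies
    storage_class: str   # target tier (illustrative name)


# Hypothetical policy: hot -> infrequent access -> archive
POLICY = [
    Transition(after_days=30, storage_class="INFREQUENT_ACCESS"),
    Transition(after_days=90, storage_class="ARCHIVE"),
]


def storage_class_for(age_days: int, policy=POLICY, default: str = "STANDARD") -> str:
    """Return the tier of the latest transition whose age threshold has passed."""
    current = default
    for rule in sorted(policy, key=lambda r: r.after_days):
        if age_days >= rule.after_days:
            current = rule.storage_class
    return current


for age in (10, 45, 365):
    print(age, storage_class_for(age))
```

Rules of this kind let rarely accessed data drift toward cheaper tiers without any change to the application.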

As of 2023, all hyperscale cloud providers enforce data encryption at rest in all newly created buckets.

Comparison between Object storage alternatives:

[Image: comparison table of object storage services]

As you can see in the comparison table above, most features are available in all hyperscale cloud providers, but there are still some differences between the cloud providers:

  • AWS – Offers a low-cost storage tier called S3 One Zone-IA for scenarios where data is accessed infrequently and data availability and resiliency are not highly critical, such as secondary backups. AWS also offers a tier called S3 Express One Zone for workloads that need single-digit millisecond data access and can tolerate lower data availability or resiliency, such as AI/ML training, Amazon Athena analytics, and more.
  • Azure – Most storage services in Azure (blobs, files, queues, pages, and tables) require the creation of an Azure storage account – a unique namespace for Azure storage data objects, accessible over HTTP/HTTPS. Azure also offers Premium block blobs for high-performance workloads, such as AI/ML and IoT.
  • GCP – Cloud Storage in Google Cloud is not limited to a single region; buckets can be provisioned as dual-region or even multi-region, with data synched automatically between the regions.

Block storage

Block storage is the disk volume attached to various compute services, from VMs and managed databases to Kubernetes worker nodes and containers.

Block storage can be used as the storage for transactional databases, data warehousing, and workloads with high volumes of reads and writes.

Block storage is not limited to traditional workloads deployed on top of virtual machines. Volumes can also be mounted as persistent volumes for container-based workloads (such as Amazon ECS) and for Kubernetes-based workloads using a CSI driver (such as Amazon EKS, Azure AKS, and Google GKE).

Block storage volumes are usually limited to a single availability zone within the same region and should be mounted to a VM in the same AZ.
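The same-AZ constraint is easy to express as a pre-flight check. The sketch below is a simplified model, not a cloud SDK call: the `Volume` and `Instance` classes, the IDs, and the zone names are hypothetical, and it assumes a zonal volume (some providers also offer regional disks, which this model ignores).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    volume_id: str
    availability_zone: str


@dataclass(frozen=True)
class Instance:
    instance_id: str
    availability_zone: str


def can_attach(volume: Volume, instance: Instance) -> bool:
    """A zonal volume can only be attached to a VM in the same AZ."""
    return volume.availability_zone == instance.availability_zone


# Hypothetical resources, for illustration only
data_disk = Volume("vol-example", "eu-west-1a")
vm_same_az = Instance("vm-a", "eu-west-1a")
vm_other_az = Instance("vm-b", "eu-west-1b")

print(can_attach(data_disk, vm_same_az))
print(can_attach(data_disk, vm_other_az))
```

When a workload has to move to another AZ, the usual approach is to restore the volume from a snapshot in the target zone rather than attach it across zones.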

Comparison between Block storage alternatives:

[Image: comparison table of block storage services]

As you can see in the comparison table above, most features are available in all hyperscale cloud providers, although there are still some differences between the cloud providers.

File storage

File storage services are the equivalent of the traditional Network Attached Storage (NAS).

All major hyperscale cloud providers offer managed file storage services, allowing customers to share files between multiple Windows (CIFS/SMB) and Linux (NFS) virtual machines.

File storage is not limited to traditional workloads sharing files between multiple virtual machines. File shares can also be mounted as persistent volumes for container-based workloads (such as Amazon ECS, Azure Container Apps, and Google Cloud Run), for Kubernetes-based workloads using a CSI driver (such as Amazon EKS, Azure AKS, and Google GKE), and for Serverless / Function-as-a-Service (such as AWS Lambda and Azure Functions).

Other than the NFS or CIFS/SMB file storage services, major cloud providers also offer a managed NetApp file system (for customers who wish to have the benefits of NetApp storage) and a managed Lustre file system (for HPC workloads or workloads that require extremely high throughput).

Comparison between NFS File storage alternatives:

[Image: comparison table of NFS file storage services]

As you can see in the comparison table above, most features are available in all hyperscale cloud providers, but there are still some differences between the cloud providers:

  • AWS – Offers a low-cost option called EFS One Zone file system, for scenarios where data is accessed infrequently and data availability and resiliency are not highly critical. By default, data inside a One Zone file system is automatically backed up using AWS Backup.
  • Azure – Offers additional security protection mechanisms, such as malware scanning and sensitive data threat detection, as part of a service called Microsoft Defender for Storage.
  • GCP – Offers an enterprise-grade tier, called Enterprise tier, for critical applications such as SAP or GKE workloads, with regional high availability and data replication.

Comparison between CIFS/SMB File storage alternatives:

[Image: comparison table of CIFS/SMB file storage services]

Comparison between managed NetApp File storage alternatives:

[Image: comparison table of managed NetApp file storage services]

Comparison between File storage for HPC workloads alternatives:

[Image: comparison table of file storage services for HPC workloads]

Summary

Persistent storage is required by almost any workload, including cloud-native applications.

In this blog post, we have reviewed the various managed storage options offered by the hyperscale cloud providers.

As a best practice, it is crucial to understand the application's requirements when selecting the right storage option.

About the Author

Eyal Estrin is a cloud and information security architect with more than 20 years in the IT industry, and the author of the books Cloud Security Handbook and Security for Cloud Native Applications.

You can connect with him on Twitter.

Opinions are his own and not the views of his employer.
