
Eyal Estrin

Posted on • Originally published at eyal-estrin.Medium

Checklist for designing cloud-native applications – Part 2: Security aspects

This post was originally published by the Cloud Security Alliance.

In Chapter 1 of this series about considerations when building cloud-native applications, we introduced various topics such as business requirements, infrastructure considerations, automation, resiliency, and more.

In this chapter, we will review security considerations when building cloud-native applications.

IAM Considerations - Authentication

Identity and Access Management plays a crucial role when designing new applications.

We need to ask ourselves – Who are our customers?

If we are building an application that will serve internal customers, we need to make sure our application will be able to sync identities from our identity provider (IdP).

On the other hand, if we are planning an application that will serve external customers, in most cases we would not want to manage the identities themselves, but rather allow authentication based on SAML, OAuth, or OpenID Connect, and manage the authorization in our application.

Examples of managed cloud-native identity services: AWS IAM Identity Center, Microsoft Entra ID, and Google Cloud Identity.

IAM Considerations - Authorization

Authorization is also an important factor when designing applications.

When our application consumes services (such as compute, storage, database, etc.) from a CSP ecosystem, each CSP has its own mechanisms to manage permissions to access services and take actions, and each CSP has its own way of implementing role-based access control (RBAC).

Regardless of the built-in mechanisms to consume cloud infrastructure, we must always follow the principle of least privilege (i.e., minimal permissions to achieve a task).
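As a rough illustration of least privilege, the sketch below flags overly broad statements in a policy document. It assumes the JSON layout used by AWS IAM policies (Statement, Effect, Action, Resource); the function names and the example bucket are hypothetical, and a real review would also need to consider conditions, deny statements, and the other CSPs' policy formats.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM-style policies accept a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_overly_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements whose Action or Resource uses a bare wildcard."""
    findings = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        # "*" or "service:*" usually grants far more than a single task needs.
        broad_action = any(a == "*" or a.endswith(":*") for a in actions)
        broad_resource = any(r == "*" for r in resources)
        if broad_action or broad_resource:
            findings.append(statement)
    return findings


# Hypothetical policy: the first statement is scoped, the second is not.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}

for statement in find_overly_broad_statements(example_policy):
    print("Overly broad statement:", statement)
```

A check like this is only a starting point; the CSPs' own access analysis tools are better placed to judge which permissions are actually used.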

On the application layer, we need to design an authorization mechanism that checks every authenticated identity against an authorization engine, regardless of how the identity reached the application (interactive authentication, non-interactive authentication, or even API-based access).
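To make the idea of an application-level authorization check more concrete, here is a minimal deny-by-default RBAC sketch. The roles, permissions, and the Identity type are hypothetical placeholders, and a dedicated policy engine would replace the in-memory dictionary in a real design.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping; each permission is (resource, action).
ROLE_PERMISSIONS = {
    "viewer": {("report", "read")},
    "editor": {("report", "read"), ("report", "write")},
}


@dataclass(frozen=True)
class Identity:
    """An identity that has already been authenticated (by an IdP, a token, etc.)."""

    subject: str
    roles: frozenset = field(default_factory=frozenset)


def is_authorized(identity: Identity, resource: str, action: str) -> bool:
    """Deny by default: allow only if one of the identity's roles grants the permission."""
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in identity.roles
    )


alice = Identity(subject="alice", roles=frozenset({"viewer"}))
print(is_authorized(alice, "report", "read"))   # allowed by the viewer role
print(is_authorized(alice, "report", "write"))  # not granted, so denied
```

The same check applies whether the caller is a person in a browser or a service calling an API, which is the point of keeping authorization separate from authentication.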

Although it is possible to manage authorization using our own developed RBAC mechanism, it is time to consider more cloud-agnostic authorization policy engines such as Open Policy Agent (OPA).

One of the major benefits of using OPA is the fact that its policy engine is not limited to authorization to an application – you can also use it for Kubernetes authorization, for Linux (using PAM), and more.

Policy-as-Code Considerations

Policy-as-Code allows you to configure guardrails on various aspects of your workload.

Guardrails are offered by all major cloud providers. They are applied from outside the boundary of a cloud account (for example, at the organization level), and they limit the maximum allowed resource consumption or configuration inside it.

Examples of guardrails:

  • Limitation on the allowed region for deploying resources (compute, storage, database, network, etc.)
  • Enforce encryption at rest
  • Forbid the ability to create publicly accessible resources (such as a VM with public IP)
  • Enforce the use of specific VM instance size (number of CPUs and memory allowed)

Guardrails can also be enforced as part of a CI/CD pipeline when deploying resources using Infrastructure as Code (IaC). The IaC code is evaluated before the actual deployment phase, and resources are updated only if the code does not violate the Policy-as-Code rules.
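The pipeline check described above can be sketched in a few lines. This is an illustration only: the resource dictionaries stand in for a parsed IaC plan, and the field names (region, encrypted_at_rest, public_ip, instance_size) and the allowed values are assumptions, not the format of any specific tool.

```python
# Hypothetical guardrail settings, mirroring the example guardrails listed earlier.
ALLOWED_REGIONS = {"eu-west-1", "eu-central-1"}
ALLOWED_INSTANCE_SIZES = {"small", "medium"}


def evaluate_resource(resource: dict) -> list[str]:
    """Return the guardrail violations for one planned resource."""
    name = resource.get("name", "<unnamed>")
    violations = []
    if resource.get("region") not in ALLOWED_REGIONS:
        violations.append(f"{name}: region {resource.get('region')!r} is not allowed")
    if not resource.get("encrypted_at_rest", False):
        violations.append(f"{name}: encryption at rest is not enabled")
    if resource.get("public_ip", False):
        violations.append(f"{name}: publicly accessible resources are forbidden")
    size = resource.get("instance_size")
    if size is not None and size not in ALLOWED_INSTANCE_SIZES:
        violations.append(f"{name}: instance size {size!r} is not allowed")
    return violations


def evaluate_plan(resources: list[dict]) -> list[str]:
    """Collect violations across every resource in the plan."""
    return [v for resource in resources for v in evaluate_resource(resource)]


# Hypothetical plan: one compliant VM and one that breaks every guardrail.
plan = [
    {"name": "web-vm", "region": "eu-west-1", "encrypted_at_rest": True,
     "public_ip": False, "instance_size": "small"},
    {"name": "debug-vm", "region": "us-east-1", "encrypted_at_rest": False,
     "public_ip": True, "instance_size": "xlarge"},
]

violations = evaluate_plan(plan)
for violation in violations:
    print(violation)
# In a pipeline, a non-empty list would fail the job before deployment.
```

In practice this logic would live in a policy engine rather than in pipeline scripts, so that the same rules can be reused across teams and tools.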

Examples of Policy-as-Code: AWS Service control policies (SCPs), Azure Policy, Google Organization Policy Service, HashiCorp Sentinel, and Open Policy Agent (OPA).

Data Protection Considerations

Almost any application contains valuable data, whether the data has business or personal value, and as such we must protect the data from unauthorized parties.

A common way to protect data is to store it in encrypted form:

  • Encryption in transit – done using protocols such as TLS (where the latest supported version is 1.3)
  • Encryption at rest – done on a volume, disk, storage, or database level, using algorithms such as AES
  • Encryption in use – done using hardware supporting a trusted execution environment (TEE), also referred to as confidential computing
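For encryption in transit, a minimal sketch of a client-side TLS configuration using Python's standard ssl module is shown below. The helper name is hypothetical, and choosing TLS 1.2 as the default floor is an assumption; pass ssl.TLSVersion.TLSv1_3 to require TLS 1.3 where every peer supports it. Encryption at rest and in use are typically handled by the CSP's managed services rather than by application code.

```python
import ssl


def build_client_tls_context(
    minimum: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,
) -> ssl.SSLContext:
    """Create a client TLS context that verifies certificates and rejects old protocol versions."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse to negotiate anything older than the chosen minimum version.
    context.minimum_version = minimum
    return context


context = build_client_tls_context()
print(context.minimum_version.name)
```

The resulting context can be passed to standard library clients, for example as the context argument of urllib.request.urlopen.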

When encrypting data, we need to deal with key generation, a secure vault for key storage, key retrieval, and key destruction.

All major CSPs have their own key management service to handle the entire key lifecycle.

If your application is deployed on top of a single CSP infrastructure, prefer to use managed services offered by the CSP.

For encryption in use, select services (such as VM instances or Kubernetes worker nodes) that support confidential computing.

Secrets Management Considerations

Secrets are equivalent to static credentials, allowing access to services and resources.

Examples of secrets are API keys, passwords, database credentials, etc.

Secrets, similarly to encryption keys, are sensitive and need to be protected from unauthorized parties.

From the initial application design process, we need to decide on a secured location to store secrets.

All major CSPs have their own secrets management service to handle the entire secret lifecycle.

As part of a CI/CD pipeline, we should embed an automated scanning process to detect secrets embedded as part of code, scripts, and configuration files, to avoid storing any secrets as part of our application (i.e., outside the secured secrets management vault).
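A secret scanning step can be approximated with a few regular expressions, as in the sketch below. The patterns are illustrative assumptions rather than a complete rule set (dedicated scanners ship far more rules and also check entropy); the access key in the sample is the placeholder value used in AWS documentation.

```python
import re

# Illustrative patterns only; a real scanner needs a much larger rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret|api_key)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspected secret in the text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Hypothetical configuration file content.
sample = "\n".join([
    'region = "eu-west-1"',
    'aws_key = "AKIAIOSFODNN7EXAMPLE"',
    'password = "correct-horse-battery"',
])

for line_number, name in scan_text(sample):
    print(f"line {line_number}: possible {name}")
```

In a pipeline, any finding would normally fail the build, and the secret would be rotated and moved into the secrets management vault.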

Examples of secrets management services: AWS Secrets Manager, Azure Key Vault, Google Secret Manager, and HashiCorp Vault.

Network Security Considerations

Applications must be protected at the network layer, whether we expose our application to internal customers or customers over the public internet.

The fundamental way to protect infrastructure at the network layer is using access controls, which are equivalent to layer 3/layer 4 firewalls.

All CSPs have access control mechanisms to restrict access to services (such as VMs, databases, etc.).
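The behavior of a layer 3/layer 4 access control can be sketched as a deny-by-default rule lookup. The IngressRule type and the example rules are hypothetical and only model the idea; they are not the API of any CSP's security group service.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """A simplified allow rule: source network, protocol, and destination port."""

    source_cidr: str
    protocol: str
    port: int


def is_allowed(rules: list[IngressRule], source_ip: str, protocol: str, port: int) -> bool:
    """Deny by default: traffic passes only if at least one rule matches it."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        network = ipaddress.ip_network(rule.source_cidr)
        if address in network and rule.protocol == protocol and rule.port == port:
            return True
    return False


# Hypothetical rules: the database is reachable only from the private network,
# while HTTPS is open to everyone.
rules = [
    IngressRule(source_cidr="10.0.0.0/16", protocol="tcp", port=5432),
    IngressRule(source_cidr="0.0.0.0/0", protocol="tcp", port=443),
]

print(is_allowed(rules, "10.0.3.7", "tcp", 5432))     # inside the private range
print(is_allowed(rules, "203.0.113.9", "tcp", 5432))  # outside it, so denied
```

Keeping rules this narrow (specific source ranges and ports) is the network-layer counterpart of the principle of least privilege.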

Examples of Layer 3 / Layer 4 managed services: AWS Security groups, Azure Network security groups, and Google VPC firewall rules.

Some cloud providers support private access to their services by adding a network load balancer in front of various services, with an internal IP from the customer’s private subnet, forcing all traffic to pass inside the CSP’s backbone and not over the public internet.

Examples of private connectivity solutions: AWS PrivateLink, Azure Private Link, and Google VPC Service Controls.

Some of the CSPs offer managed layer 7 firewalls, which allow customers to filter traffic based on protocols (and not ports), inspect TLS traffic for malicious content, and more, in case your application or business requires those capabilities.

Examples of Layer 7 managed firewalls: AWS Network Firewall, Azure Firewall, and Google Cloud NGFW.

Application Layer Protection Considerations

Any application accessible to customers (internally or over the public Internet) is exposed to application layer attacks.

Attacks range from malicious code injection and data exfiltration (or data leakage) to data tampering, unauthorized access, and more.

Whether you are exposing an API, a web application, or a mobile application, it is important to implement application layer protection, such as a WAF service.

All major CSPs offer managed WAF services, and many commercial vendors offer WAF as a SaaS solution.

Examples of managed WAF services: AWS WAF, Azure WAF, and Google Cloud Armor.

DDoS Protection Considerations

Denial-of-Service (DoS) or Distributed Denial-of-Service (DDoS) is a risk for any service accessible over the public Internet.

Such attacks try to consume all the available resources (from network bandwidth to CPU and memory), directly reducing the availability of the service to customers.

All major CSPs offer managed DDoS protection services, and many commercial vendors offer DDoS protection solutions as well.

Examples of managed DDoS protection services: AWS Shield, Azure DDoS Protection, Google Cloud Armor, and Cloudflare DDoS protection.

Patch Management Considerations

Software tends to be vulnerable, and as such it must be regularly patched.

For applications deployed on top of virtual machines:

  • Create a "golden image" of a virtual machine, and regularly update the image with the latest security patches and software updates.
  • Create a regular process for applying patches to VMs that are already running.

For applications wrapped inside containers, create a "golden image" of each of the application components, and regularly update the image with the latest security patches and software updates.

Embed software composition analysis (SCA) tools to scan and detect vulnerable third-party components – in case vulnerable components (or their dependencies) are detected, begin a process of replacing the vulnerable components.
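The core of an SCA check is a comparison between pinned dependencies and a list of known-vulnerable versions, sketched below. The advisory data and package names are made up for illustration; real tools pull advisories from vulnerability databases and also resolve transitive dependencies and version ranges.

```python
# Hypothetical advisory data: package name -> versions known to be vulnerable.
KNOWN_VULNERABLE = {
    "examplelib": {"1.0.0", "1.0.1"},
    "demo-parser": {"2.3.0"},
}


def parse_requirements(text: str) -> dict[str, str]:
    """Parse 'name==version' lines, ignoring comments and unpinned entries."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pinned[name.strip().lower()] = version.strip()
    return pinned


def find_vulnerable(pinned: dict[str, str], advisories: dict[str, set] = KNOWN_VULNERABLE) -> list[str]:
    """Return the pinned dependencies that appear in the advisory data."""
    return sorted(
        f"{name}=={version}"
        for name, version in pinned.items()
        if version in advisories.get(name, set())
    )


# Hypothetical dependency file content.
requirements = """
examplelib==1.0.1   # pinned long ago
demo-parser==2.4.0
otherlib==0.9.0
"""

for dependency in find_vulnerable(parse_requirements(requirements)):
    print("Vulnerable dependency:", dependency)
```

A finding from a check like this is what would trigger the replacement process for the vulnerable component.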

Examples of patch management solutions: AWS Systems Manager Patch Manager, Azure Update Manager, and Google VM Manager Patch.

Compliance Considerations

Compliance is an important security factor when designing an application.

Some applications contain personally identifiable information (PII) about employees or customers, which requires compliance with privacy and data residency laws and regulations (such as the GDPR in Europe, the CPRA in California, the LGPD in Brazil, etc.).

Some organizations decide to comply with industry or security best practices, such as the Center for Internet Security (CIS) Benchmarks for hardening infrastructure components. Compliance can later be evaluated using compliance services or cloud security posture management (CSPM) solutions.

References for compliance: AWS Compliance Center, Azure Service Trust Portal, and Google Compliance Resource Center.

Incident Response

When designing an application in the cloud, it is important to be prepared to respond to security incidents:

  • Enable logging from both infrastructure and application components, and stream all logs to a central log aggregator. Make sure logs are stored in a central, immutable location, with access privileges limited to the SOC team.
  • Select a tool that can review logs, detect anomalies, and produce actionable insights for the SOC team.
  • Create playbooks for the SOC team, so they know how to respond in case of a security incident (how to investigate, where to look for data, who to notify, etc.).
  • To be prepared for a catastrophic event (such as a network breach or ransomware), create automated solutions that allow you to quarantine the impacted services and deploy a new environment from scratch.
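On the application side, logs are easier to aggregate and analyze when they are emitted in a structured form. The sketch below uses Python's standard logging module to produce one JSON object per log line; the field names and the component name are assumptions, and the in-memory stream stands in for whatever agent or service forwards logs to the central aggregator.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "component": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


# The StringIO buffer is a stand-in for a log shipper or stdout collector.
buffer = io.StringIO()
logger = build_logger("payments-api", buffer)
logger.warning("login failed for user_id=%s", "1234")
print(buffer.getvalue().strip())
```

Structured entries like these are what make anomaly detection and playbook-driven investigation practical for the SOC team; immutability still has to be enforced by the storage location itself.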

References for incident response documentation: AWS Security Incident Response Guide, Azure Incident response, and Google Data incident response process.


In the second blog post in this series, we talked about many security-related aspects that organizations should consider when designing new applications in the cloud.

In this part of the series, we have reviewed various aspects, from identity and access management to data protection, network security, patch management, compliance, and more.

It is highly recommended to use the topics discussed in this series of blog posts as a baseline when designing new applications in the cloud, and to continuously improve this checklist of considerations when documenting your projects.

About the Author

Eyal Estrin is a cloud and information security architect, and the author of the book Cloud Security Handbook, with more than 20 years in the IT industry. You can connect with him on Twitter.

Opinions are his own and not the views of his employer.
