Jeiman Jeya

Posted on Apr 5, 2023

Let's Dive into AWS CloudFront

#aws #cdn #devops

What is AWS CloudFront?

AWS CloudFront is a content delivery network (CDN). It securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds. It offers a simple and cost-effective way to distribute content to end users, with features such as edge caching and SSL/TLS encryption.

How does it work?

When a user requests content, CloudFront routes the request to the edge location that can serve it with the lowest latency, which is usually the one closest to the user. If the content is already cached at that edge location, it is returned immediately; otherwise CloudFront fetches it from the origin and caches it for later requests. CloudFront can also be used to encrypt content in transit, protect against DDoS attacks, and integrate with other AWS services such as S3, Application Load Balancer, Elastic Beanstalk, and more.
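To make the request flow concrete, here is a toy model in Python. It is only a sketch of the idea (pick the lowest-latency edge, serve from its cache when possible, otherwise go to the origin), not how CloudFront is implemented. The names EdgeLocation, pick_edge and origin, and the latency numbers, are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class EdgeLocation:
    """Toy stand-in for an edge location with its own cache."""
    name: str
    latency_ms: float  # hypothetical latency between this edge and the viewer
    cache: Dict[str, str] = field(default_factory=dict)

    def get(self, path: str, fetch_from_origin: Callable[[str], str]) -> Tuple[str, str]:
        # Serve from the edge cache when possible; otherwise go to the origin
        # and keep a copy for later requests.
        if path in self.cache:
            return self.cache[path], "Hit"
        body = fetch_from_origin(path)
        self.cache[path] = body
        return body, "Miss"


def pick_edge(edges: List[EdgeLocation]) -> EdgeLocation:
    # Route the viewer to the edge location with the lowest latency.
    return min(edges, key=lambda e: e.latency_ms)


def origin(path: str) -> str:
    # Hypothetical origin, e.g. an S3 bucket or a load balancer.
    return f"content of {path}"


edges = [EdgeLocation("edge-a", 120.0), EdgeLocation("edge-b", 18.0)]
edge = pick_edge(edges)
print(edge.name, edge.get("/index.html", origin)[1])  # edge-b Miss
print(edge.name, edge.get("/index.html", origin)[1])  # edge-b Hit
```

The first request is a miss because the edge has nothing cached yet; the repeat request is served from the edge without touching the origin.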

Components in AWS CloudFront

Distribution

When you want to deliver your content through CloudFront, you create what is called a distribution. With a distribution, you choose how your content is delivered to your end users through configuration settings such as:

Content origin: The source of your content, such as an S3 bucket, an Elastic Load Balancer, or an Elastic Beanstalk environment, from which CloudFront gets the files to distribute
Origin request: Whether CloudFront should include a specific set of HTTP headers, cookies, or query strings in the requests it sends to your origin
Access: Whether you want your content to be accessible by everyone or restricted to certain users
Security: Whether users must access your content over HTTPS, and which minimum TLS version to accept (for example, TLS v1.1 or v1.2)
Logs: Whether CloudFront should create standard logs or real-time logs that show viewer activity
Geographic restrictions: Whether CloudFront should prevent users in selected countries from accessing your content

CloudFront provides flexibility in configuring your CDN for any web application according to your business or engineering needs.
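As a rough illustration of how a few of these settings look in practice, the sketch below builds a partial settings fragment in Python. The field names follow the DistributionConfig structure used by the CloudFront API as far as I know it, but this is deliberately not a complete configuration: a real distribution needs many more fields (origins, cache behaviors, and so on). The function name, the bucket name, and the country codes are placeholders, and the minimum TLS setting only takes effect when you use your own certificate.

```python
from typing import Dict, List


def distribution_settings_fragment(
    blocked_countries: List[str],
    log_bucket: str,
    min_tls: str = "TLSv1.2_2021",
) -> Dict:
    """Build a PARTIAL fragment of distribution settings (not a full config)."""
    return {
        # Security: send viewers from HTTP to HTTPS.
        "DefaultCacheBehavior": {
            "ViewerProtocolPolicy": "redirect-to-https",
        },
        # Security: minimum TLS policy accepted from viewers.
        "ViewerCertificate": {
            "MinimumProtocolVersion": min_tls,
        },
        # Logs: standard logs delivered to an S3 bucket.
        "Logging": {
            "Enabled": True,
            "IncludeCookies": False,
            "Bucket": log_bucket,
            "Prefix": "cloudfront/",
        },
        # Geographic restrictions: block the listed country codes.
        "Restrictions": {
            "GeoRestriction": {
                "RestrictionType": "blacklist",
                "Quantity": len(blocked_countries),
                "Items": list(blocked_countries),
            }
        },
    }


# Placeholder values for illustration only.
fragment = distribution_settings_fragment(["XX", "YY"], "example-logs.s3.amazonaws.com")
print(fragment["Restrictions"]["GeoRestriction"]["Quantity"])
```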

Edge functions: Lambda@Edge

Lambda@Edge is a feature of AWS CloudFront that allows you to run code closer to your application's users, improving performance and reducing latency. (CloudFront also offers CloudFront Functions, a separate and more lightweight feature; this section covers Lambda@Edge.) With Lambda@Edge, you don't need to provision or manage infrastructure in multiple locations worldwide. You pay only for the requests and compute time your function consumes, and there is no charge when your code is not running. This lets you make your web applications globally distributed and faster without any server administration. You upload your code to AWS Lambda, which takes care of everything necessary to run and scale it with high availability at an AWS location close to your end user.

One use case for this is to put an HTTP Basic Authentication layer in front of your AWS services, such as S3 buckets or ECS Fargate, as an additional layer of security for your application. When users access the website, their browser shows its native Basic auth prompt. Based on the logic defined in the Lambda function, you can approve or deny access. From there, attach the Lambda function to your CloudFront distribution (for example, on the viewer request event) and every user will be prompted to enter login credentials before they can access content on your application.
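A minimal sketch of such a function in Python could look like the following. It assumes the function is attached to the viewer request event, and it reads the request from the event structure CloudFront passes to Lambda@Edge. The credentials, realm name, and helper name are placeholders for illustration; hard-coding real secrets like this is not something you would want in production.

```python
import base64
import hmac

# Hypothetical credentials, for illustration only.
EXPECTED_USER = "demo-user"
EXPECTED_PASSWORD = "demo-password"


def _expected_header() -> str:
    # Basic auth sends "Basic base64(user:password)" in the Authorization header.
    token = base64.b64encode(f"{EXPECTED_USER}:{EXPECTED_PASSWORD}".encode()).decode()
    return f"Basic {token}"


def handler(event, context):
    """Viewer-request handler: let the request through or answer with 401."""
    request = event["Records"][0]["cf"]["request"]
    # CloudFront lowercases header names and stores each as a list of {key, value}.
    auth = request.get("headers", {}).get("authorization", [])
    supplied = auth[0]["value"] if auth else ""

    # Constant-time comparison of the supplied and expected header values.
    if hmac.compare_digest(supplied.encode(), _expected_header().encode()):
        return request  # approved: CloudFront continues processing the request

    # Denied: returning a response makes the browser show its Basic auth prompt.
    return {
        "status": "401",
        "statusDescription": "Unauthorized",
        "headers": {
            "www-authenticate": [
                {"key": "WWW-Authenticate", "value": 'Basic realm="Restricted"'}
            ]
        },
        "body": "Unauthorized",
    }
```

Returning the request object lets CloudFront carry on as normal, while returning a response object short-circuits the request at the edge, so the origin never sees unauthenticated traffic.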

Invalidation

Invalidation is the process of instructing CloudFront to remove files from its edge caches before they expire. You can invalidate specific paths or, with a wildcard, everything in your distribution. The next time an end user requests an invalidated file or page, CloudFront returns to the origin to fetch the latest version.

This is especially useful if you are running web applications on S3 or an HTTP server and want your users to receive the latest updates immediately, without waiting for the TTL on the file to expire, which could be as long as 30 days depending on your cache settings. By default, when the origin does not send its own caching headers, CloudFront uses a TTL of 24 hours, meaning that after 24 hours it fetches new content from your origin. With invalidations, you define which files, pages, or HTTP paths you would like to invalidate, which gives you the flexibility to meet your engineering needs.
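As a sketch, an invalidation can be requested from Python with boto3's create_invalidation call. The helper names below are my own, and the distribution ID and paths in the usage note are placeholders. The batch is built in a separate function so you can inspect it without calling AWS.

```python
import time
from typing import Dict, List, Optional


def build_invalidation_batch(paths: List[str], caller_reference: Optional[str] = None) -> Dict:
    """Build the InvalidationBatch structure for a list of paths."""
    if not paths:
        raise ValueError("at least one path is required")
    for path in paths:
        if not path.startswith("/"):
            raise ValueError(f"invalidation paths must start with '/': {path!r}")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(distribution_id: str, paths: List[str]) -> str:
    """Submit the invalidation and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    client = boto3.client("cloudfront")
    response = client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
    return response["Invalidation"]["Id"]
```

For example, invalidate("YOUR_DISTRIBUTION_ID", ["/index.html", "/css/*"]) would invalidate one page and everything under /css/, while a single "/*" path would invalidate the whole distribution.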

Policies

CloudFront allows you to attach policies to your distribution to control caching and request handling. It offers three distinct types of policy:

Specify cache and compression settings: With a CloudFront cache policy, you define which HTTP headers, cookies, and query strings CloudFront includes in the cache key. The cache key is used to determine whether a viewer's HTTP request results in a cache hit (i.e., whether the object is served to the viewer from the CloudFront cache). Including fewer values in the cache key increases the likelihood of a cache hit. You can also specify TTL settings for objects in the CloudFront cache, and whether CloudFront requests and caches compressed versions of objects for your end users.
Specify values to include in origin requests: With a CloudFront origin request policy, you can specify the HTTP headers, cookies, and query strings that CloudFront includes in origin requests. These are the requests that CloudFront sends to the origin when there is a cache miss for your content.
Specify HTTP headers to remove or add in viewer responses: Using a CloudFront response headers policy, you can control the HTTP headers included in HTTP responses that CloudFront sends to viewers (web browsers or other clients). You can add or remove headers from the origin's HTTP response without modifying the origin or writing any code. All of this can be handled through CloudFront.
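To see why fewer values in the cache key means more cache hits, here is a small Python model of a cache key. It is an illustration of the idea only, not CloudFront's actual key format; the cache_key function and its parameters are made up for this example.

```python
from typing import Dict, Iterable, Tuple


def cache_key(
    path: str,
    headers: Dict[str, str],
    query: Dict[str, str],
    key_headers: Iterable[str] = (),
    key_query: Iterable[str] = (),
) -> Tuple:
    """Build a cache key from the path plus only the values the policy names."""
    wanted_headers = {h.lower() for h in key_headers}
    wanted_query = set(key_query)
    selected_headers = tuple(sorted(
        (name.lower(), value) for name, value in headers.items()
        if name.lower() in wanted_headers
    ))
    selected_query = tuple(sorted(
        (name, value) for name, value in query.items() if name in wanted_query
    ))
    return (path, selected_headers, selected_query)


# Two viewers request the same page with different browsers and tracking parameters.
req_a = ("/post", {"User-Agent": "BrowserA"}, {"utm_source": "mail", "page": "2"})
req_b = ("/post", {"User-Agent": "BrowserB"}, {"utm_source": "feed", "page": "2"})

# Narrow policy: only the "page" query string is part of the key.
narrow_a = cache_key(*req_a, key_query=["page"])
narrow_b = cache_key(*req_b, key_query=["page"])
print(narrow_a == narrow_b)  # True: both requests share one cached object

# Broad policy: User-Agent and every query string are part of the key.
broad_a = cache_key(*req_a, key_headers=["User-Agent"], key_query=["page", "utm_source"])
broad_b = cache_key(*req_b, key_headers=["User-Agent"], key_query=["page", "utm_source"])
print(broad_a == broad_b)  # False: each request needs its own cached copy
```

With the narrow policy the second request is a cache hit; with the broad policy it is a miss, even though the origin would have returned the same page.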

Figure: An example of a CloudFront distribution with two different configurations, serving WordPress content from an S3 bucket and from an Elastic Load Balancer connected to an EC2 instance.

Advantages of using AWS CloudFront

High performance: AWS CloudFront delivers content with low latency and high transfer speeds, improving user experience and reducing load times.
Flexible distribution options: CloudFront offers a range of configuration settings for content delivery, including geographic restrictions, access control, and security features.
Cost-effective: CloudFront offers a pay-as-you-go pricing model, making it a cost-effective solution for content delivery.
Integration with AWS services: CloudFront integrates seamlessly with other AWS services, such as S3, Elastic Beanstalk, and Application Load Balancer, making it easy to distribute content from these services.
Lambda@Edge: The Lambda@Edge feature allows developers to add custom code to CloudFront, improving application performance and reducing latency.

Disadvantages of using AWS CloudFront

Steep learning curve: CloudFront can be complex to set up and configure, requiring a significant amount of time, focus and expertise.
Limited content size to cache: CloudFront has a limit on the size of a single file that it can cache (30GB per file at the time of writing; check the current CloudFront quotas), so it may not be suitable for applications that serve larger files.
Potential cost overruns: While CloudFront is cost-effective, it can be easy to exceed usage limits, leading to unexpected costs. Keep a close eye on your cache requests and usage.
Limited geographic coverage: While CloudFront has a global network of edge locations, there may be regions where it is not available, limiting its availability for some users.
Invalidation process: Invalidations can be slow and cumbersome when your application has a large number of files at the origin, which makes it difficult to update content quickly when necessary.

Conclusion

AWS CloudFront is a powerful content delivery network (CDN) that works by caching content at edge locations around the world. CloudFront routes the request to the edge location closest to the user, delivering the content with low latency and high transfer speeds. CloudFront can also be used to encrypt content, protect against DDoS attacks, and integrate with other AWS services to further enhance and protect your applications.

The Ops Community ⚙️