Deploying an astro.build website on AWS S3, with CloudFront
It is only fitting that the first post on this blog would be about the construction of this website.
The soft requirements for this website were as follows:
1. Have an easy-to-use static site generator.
2. Have the website be as cheap as reasonably possible, while still hosted in the cloud.
3. Use Terraform for deployment. I wanted practice with Terraform, and I enjoy the paradigm of infrastructure-as-code.
After playing around with Astro, Hugo, and Pelican, I found Astro to have the best balance of ease-of-use and community support. So that was point (1) covered.
As far as point (2), I knew that it was possible, at least in theory, to deploy a website on AWS S3. If that was possible, it would almost certainly be cheaper than most other site hosting options, but would require a solely static site. This restriction suited me fine. I don’t ever see moving beyond a simple static site.
For point (3), AWS obviously has extensive Terraform integration. With all that settled, I underwent a journey of trying to find the best way to use S3 to host a static site.
Prerequisites
Throughout this process, I had a DNS record from my domain provider that
redirected all requests from bostonlee.com to www.bostonlee.com.
The Terraform AWS provider version at the time of writing is 4.65.0. I note this
because I read a few blogs on this topic, and ended up having to resort
to reading the documentation, because all of the Terraform resources had
changed between provider versions.
Stage 1: S3 Static site support
S3 has the ability to host a static site directly, through S3 website endpoints. This lets you put your website files into a bucket, and then have those files hosted at a designated endpoint.
The Terraform resource for this configuration is here. Hosting this way requires you to make the bucket publicly accessible, which in turn requires a bucket policy; the Terraform documentation for that resource is here.
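For concreteness, here is a minimal sketch of what that looks like in Terraform. The resource names are illustrative, not copied from my actual configuration:

```hcl
# Illustrative sketch of S3 website hosting (resource names are hypothetical).
resource "aws_s3_bucket" "site" {
  bucket = "www.bostonlee.com"
}

# Enable the S3 website endpoint, serving index.html as the index document.
resource "aws_s3_bucket_website_configuration" "site" {
  bucket = aws_s3_bucket.site.id

  index_document {
    suffix = "index.html"
  }
}

# S3 website endpoints require the objects to be publicly readable.
resource "aws_s3_bucket_policy" "public_read" {
  bucket = aws_s3_bucket.site.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = "*"
      Action    = "s3:GetObject"
      Resource  = "${aws_s3_bucket.site.arn}/*"
    }]
  })
}
```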
From there, I simply had to add a CNAME record pointing from
www.bostonlee.com to the website endpoint given by S3.
My domain provider for some reason would not route traffic correctly
unless the bucket name started with “www”, so the S3 website endpoint
looked like
http://www.bostonlee.com.s3-website.us-east-1.amazonaws.com instead of
http://bostonlee.com.s3-website.us-east-1.amazonaws.com. That is, I
had to name the bucket www.bostonlee.com.
This approach worked! I was able to successfully access my website. However, I underestimated the importance of HTTPS to website perception. Astute readers (or readers whose eyes are drawn to color) probably noticed the large red box on the bottom of the S3 website endpoint documentation. An excerpt:
> Amazon S3 website endpoints do not support HTTPS or access points. If you want to use HTTPS, you can use Amazon CloudFront to serve a static website hosted on Amazon S3.
When I showed my rough-draft website to friends and family, the first comment from everyone was about the lack of a secured connection. Long gone are the days when I would have to chastise relatives about installing HTTPS Everywhere. Now I was the one getting a look for not having a secure connection. And, fair enough. When everyone’s browser gives them an explicit warning about an insecure connection, it is probably not a great idea to simply use a service with no support for secure connections and move on.
Luckily, the AWS documentation provided a specific solution: Serve the website on CloudFront.
Stage 2: CloudFront serving a website from an S3 bucket
This was quite a complicated process. I admit, I don’t fully understand all of the possible arguments for a CloudFront distribution resource. However, I will do my best to lay out the pieces of infrastructure that make up the website.
I set up my Terraform to use an S3 bucket (manually-managed) as a backend:
```hcl
terraform {
  required_providers {
    aws = {
      version = ">= 2.7.0"
      source  = "hashicorp/aws"
    }
  }

  backend "s3" {
    bucket = "bostonlee.com-terraform"
    key    = "terraform.tfstate"
    region = "us-east-1"
  }
}
```
Then, I created some variables that I could use to abstract my domain name out of the components:
```hcl
variable "domain_name" {
  default = "bostonlee.com"
  type    = string
}

variable "bucket_name" {
  default = "www.bostonlee.com"
  type    = string
}
```
I then used those variables to set up an S3 bucket:
```hcl
resource "aws_s3_bucket" "website_bucket" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_public_access_block" "block_public_access" {
  bucket = aws_s3_bucket.website_bucket.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_policy" "access_control" {
  bucket = aws_s3_bucket.website_bucket.id
  policy = data.aws_iam_policy_document.access_control.json
}

data "aws_iam_policy_document" "access_control" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.website_bucket.arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [aws_cloudfront_distribution.s3_distribution.arn]
    }
  }
}
```
Note that the bucket blocks public access, and only allows access from the (yet-to-be-created) CloudFront distribution. Special thanks to this blog for the policy document configuration. The application order of the blocks in this file is not straightforward. The bucket policy depends on the policy document, which in turn depends on the CloudFront distribution. So, the CloudFront distribution will be deployed before the policy is applied (unless I am wildly misunderstanding…). However, I opted to keep all of the S3-related configuration in the same place.
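If that implicit ordering ever needs to be spelled out, Terraform's `depends_on` meta-argument can make it explicit. This is a hypothetical variant of the bucket policy above, not something my configuration actually needs:

```hcl
resource "aws_s3_bucket_policy" "access_control" {
  bucket = aws_s3_bucket.website_bucket.id
  policy = data.aws_iam_policy_document.access_control.json

  # Redundant here (the policy document already references the
  # distribution, so Terraform infers this edge), but it documents
  # the intended apply order explicitly.
  depends_on = [aws_cloudfront_distribution.s3_distribution]
}
```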
Next up is a CloudFront distribution that uses that S3 bucket. I had to play around with the arguments here, since some of them are required but the documentation does not give an easy answer for the simplest possible case. The caching behavior, for instance, is copied directly from the Terraform example docs. Seeing as I was hosting a minimal static site, I wanted the easiest possible solution. Here is what I came up with:
```hcl
locals {
  s3_origin_id = "bolee_website_origin"
}

resource "aws_cloudfront_origin_access_control" "website_access_control" {
  name                              = "example"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    domain_name              = aws_s3_bucket.website_bucket.bucket_regional_domain_name
    origin_access_control_id = aws_cloudfront_origin_access_control.website_access_control.id
    origin_id                = local.s3_origin_id
  }

  enabled             = true
  is_ipv6_enabled     = true
  default_root_object = "index.html"
  aliases             = ["www.${var.domain_name}"]

  default_cache_behavior {
    allowed_methods = ["GET", "HEAD", "OPTIONS"]
    cached_methods  = ["GET", "HEAD"]

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
    viewer_protocol_policy = "redirect-to-https"
    target_origin_id       = local.s3_origin_id

    function_association {
      event_type   = "viewer-request"
      function_arn = aws_cloudfront_function.index_redirect.arn
    }
  }

  price_class = "PriceClass_100"

  restrictions {
    geo_restriction {
      restriction_type = "whitelist"
      locations        = ["US", "CA", "GB", "DE"]
    }
  }

  viewer_certificate {
    acm_certificate_arn = aws_acm_certificate.website_acm_certificate.id
    ssl_support_method  = "sni-only"
  }
}

resource "aws_cloudfront_function" "index_redirect" {
  name    = "index_redirect"
  runtime = "cloudfront-js-1.0"
  comment = "https://docs.astro.build/en/guides/deploy/aws/#cloudfront-functions-setup"
  publish = true
  code    = file("${path.module}/redirect_function.js")
}
```
Let’s break down the pieces of this. The components of the CloudFront setup are:

- A CloudFront distribution, where the website will actually be served. This is relatively self-explanatory, though there is a lot of configuration that can be set. The part of the configuration that makes this distribution markedly different from hosting on S3 is the `viewer_certificate` block. This block allows the site to be secured (note the "redirect-to-https" argument as well). It requires a separate ACM certificate, which I created as follows:

  ```hcl
  resource "aws_acm_certificate" "website_acm_certificate" {
    domain_name       = "www.${var.domain_name}"
    validation_method = "EMAIL"
  }

  resource "aws_acm_certificate_validation" "website_certificate_validation" {
    certificate_arn = aws_acm_certificate.website_acm_certificate.arn
  }
  ```

  This creates a certificate with email validation. I had an email address set as a catchall on my domain provider (NameCheap), which allowed me to easily verify the certificate. Whenever the certificate must be recreated upon a `terraform apply`, the deployment process will wait for the ACM certificate to be verified. So, this method is not very amenable to automation; however, it is convenient.

- Notice the CloudFront Function block. This has to do with a quirk in how Astro handles URLs. URLs in Astro are represented as paths, such as `www.mydomain.com/blog/this-blog/`, but in reality those links need to get redirected "under the hood" to actual index documents, like `www.mydomain.com/blog/this-blog/index.html`. Because I was simply planning on deploying a pre-built static site to CloudFront, the Astro server would not be doing this work for me. Luckily, the Astro docs provide a nice guide for making links work in CloudFront. The guide also mentions a couple of other AWS hosting methods, including S3 static sites (I wish I had looked in the Astro docs from the outset…). The recommended function looks as follows:

  ```javascript
  function handler(event) {
    var request = event.request;
    var uri = request.uri;

    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
      request.uri += 'index.html';
    }
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
      request.uri += '/index.html';
    }

    return request;
  }
  ```

  I opted to simply include this function (as `redirect_function.js`) in the same directory as my Terraform files for the time being.

- There is also an Origin Access Control block, which helps secure the site further.
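The rewrite logic above is easy to sanity-check locally with plain Node.js. The `handler` function is copied from the Astro guide; the sample requests are my own:

```javascript
// CloudFront function from the Astro deployment guide.
function handler(event) {
  var request = event.request;
  var uri = request.uri;

  // Path-style URL: append the index document.
  if (uri.endsWith('/')) {
    request.uri += 'index.html';
  }
  // Extensionless URL: treat it as a directory.
  else if (!uri.includes('.')) {
    request.uri += '/index.html';
  }

  return request;
}

// Simulated viewer requests and their rewritten URIs.
console.log(handler({ request: { uri: '/blog/this-blog/' } }).uri); // /blog/this-blog/index.html
console.log(handler({ request: { uri: '/blog/this-blog' } }).uri);  // /blog/this-blog/index.html
console.log(handler({ request: { uri: '/styles.css' } }).uri);      // /styles.css
```

Saved to a scratch file and run with `node`, this prints the three rewritten URIs, which matches what CloudFront will request from the bucket.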
Once all of that was in place,
I simply had to run `terraform apply`,
and verify the certificate through my email.
Heading to the console, I could see the CloudFront distribution URL: d3oizcfhaagn9u.cloudfront.net. Sure enough, this link is to my site.
The final piece of the puzzle was to add
a CNAME record from the www subdomain of my site
to the CloudFront distribution
(Here is a
tutorial from NameCheap
on the subject).
As mentioned at the beginning of this article,
I set up DNS to redirect bare requests for
bostonlee.com to www.bostonlee.com.
So, the CNAME record would still work for those typing
only bostonlee.com into their browser.
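Putting the DNS pieces together, the records at my provider ended up looking roughly like this (paraphrased from Namecheap's UI, not an exact zone file):

```
# Provider-level URL redirect: bare domain forwards to www
bostonlee.com       ->      https://www.bostonlee.com/

# CNAME: www points at the CloudFront distribution
www.bostonlee.com   CNAME   d3oizcfhaagn9u.cloudfront.net
```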
And that is the current state of my website setup, as of the date of this article. If things change a fair amount, I may write more about it and add an update here.
If you have any suggestions for me on this front, I would be more than happy to hear them. I wish you luck if you want to try this out yourself!