How to route to an arbitrary S3 bucket website with Cloudflare Workers

How Cloudflare workers can overwrite the requests sent to S3

Tamás Sallai

Cloudflare and S3 bucket website

I have happily used Cloudflare with an S3 bucket website to host static sites on my own domain with HTTPS, practically for free, as described in the previous article. This setup works quite well and requires no maintenance. But I noticed a strange thing during the setup: the name of the bucket has to exactly match the domain, otherwise it won't work. Even more strangely, the value of the CNAME does not seem to matter.

My setup looks like this. Notice that the domain and the name of the bucket are different:

The CNAME is set to the bucket website's domain:

But when trying to open the page, it results in an S3 error:

Notice that it tries to find the bucket website for the domain instead of the one that is specified in the CNAME.

Why is that?

S3 bucket website routing

It turns out that S3 uses the Host header instead of the URL to find the bucket. And because the Host header flows through Cloudflare without any changes, the S3 website endpoint tries to look up a bucket named after the domain.

When a browser makes a request, the domain and the Host header match:

$ curl -H "Host: mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com" http://mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com
<!doctype html>

<html lang="en">
<head>
	<meta charset="utf-8">
	<link rel="stylesheet" href="style.css">

</head>

<body>
	Hello world!
</body>
</html>

But when they differ, S3 uses the header to find the bucket:

$ curl -H "Host: otherwebsite" http://mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com
<html>
<head><title>404 Not Found</title></head>
<body>
<h1>404 Not Found</h1>
<ul>
<li>Code: NoSuchBucket</li>
<li>Message: The specified bucket does not exist</li>
<li>BucketName: otherwebsite</li>
</ul>
<hr/>
</body>
</html>

In fact, the bucket name in the website URL can be omitted. As long as the region is correct, S3 will find the bucket:

$ curl -H "Host: mywebsite-cloudflare-test" http://s3-website-eu-west-1.amazonaws.com
<!doctype html>

<html lang="en">
<head>
  <meta charset="utf-8">
  <link rel="stylesheet" href="style.css">

</head>

<body>
    Hello world!
</body>
</html>
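
The three curl calls above are consistent with a simple lookup rule: strip the regional website endpoint suffix from the Host header if it is there, and treat whatever remains as the bucket name. The following is only a toy model of that observed behavior, not S3's actual implementation. The function name `resolveBucketName` and its parameters are made up for illustration.

```javascript
// Toy model of the bucket lookup observed in the curl examples.
// Assumption: this mirrors only what the three requests show; it is
// NOT how S3 is implemented internally.
function resolveBucketName(hostHeader, region) {
  const endpointSuffix = `.s3-website-${region}.amazonaws.com`;
  if (hostHeader.endsWith(endpointSuffix)) {
    // Host is the full website endpoint: the bucket is the part before the suffix.
    return hostHeader.slice(0, -endpointSuffix.length);
  }
  // Any other Host value is used as the bucket name as-is.
  return hostHeader;
}

const region = "eu-west-1";
console.log(resolveBucketName(
  "mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com", region));
// prints mywebsite-cloudflare-test
console.log(resolveBucketName("otherwebsite", region));
// prints otherwebsite (no such bucket, hence the 404 above)
console.log(resolveBucketName("mywebsite-cloudflare-test", region));
// prints mywebsite-cloudflare-test
```

In this model, the URL the request is sent to only selects the region. The bucket comes from the Host header alone.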

What does it have to do with Cloudflare?

In "Proxied" mode, Cloudflare sends a request to the CNAME domain but it uses the same Host header as it got from the visitor's browser.

Because of this behavior, you need to use the same name for the bucket as the domain. But bucket names must be globally unique, which means the name you need might already be taken, or you might have to use a different one for other reasons.

Cloudflare supports a page rule to rewrite the Host header, but that is only available on their paid plans.

Fortunately, Cloudflare also supports computing on the edge, called Workers, which allows fine control over how the website is fetched. And up to a limit, workers are available on the free plan.

Bucket website

To set up a bucket website, you need to create a bucket, upload the files, then allow anonymous access. For more details, see the previous article.

Then activate the website hosting and take note of the Endpoint URL:

Cloudflare DNS

Now that the S3 side is set up, configure Cloudflare. The DNS and the HTTPS settings are the same as before.

Cloudflare Worker

The last part is to create a new worker that fetches the bucket website contents with the correct Host header. This solution is based on this comment.

Go to the Workers page on Cloudflare and create a new one with this code:

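// Run handleRequest for every request that matches the worker's route.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  // Fetch the same path from the bucket website endpoint, passing the
  // original request along as the fetch options.
  return fetch(
    "http://mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com"
      + new URL(request.url).pathname,
    request);
}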

This sends a separate request to the S3 bucket website, so the Host header becomes the bucket website's domain instead of the custom domain. Note that this is a rather simple implementation that only takes the request path into consideration and won't forward query parameters. This is fine for a static site hosted in S3, but it's not a universal solution.
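
If the query string matters for your site, the URL construction can be extended. The following is a minimal sketch under the same assumptions as the worker above (same bucket endpoint, same `handleRequest` shape). The `BUCKET_ENDPOINT` constant and the `buildOriginUrl` helper are hypothetical names introduced here for illustration.

```javascript
// Assumption: the same bucket website endpoint as in the worker above.
const BUCKET_ENDPOINT =
  "http://mywebsite-cloudflare-test.s3-website-eu-west-1.amazonaws.com";

// Hypothetical helper: maps the visitor's URL to the bucket website URL,
// keeping both the path and the query string.
function buildOriginUrl(requestUrl) {
  const { pathname, search } = new URL(requestUrl);
  // `search` is "" when there is no query string, otherwise it starts with "?".
  return BUCKET_ENDPOINT + pathname + search;
}

async function handleRequest(request) {
  return fetch(buildOriginUrl(request.url), request);
}

// Register the handler only where the runtime provides the global,
// so the helper can also be tried outside of Workers.
if (typeof addEventListener === "function") {
  addEventListener("fetch", event => {
    event.respondWith(handleRequest(event.request));
  });
}
```

Keeping the URL mapping in a separate pure function makes it easy to check locally before deploying the worker.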

Then create a new route and associate the worker with that:

Now when there is a request to the domain the worker fetches the correct bucket website:

CF workers free tier

Workers offer a great way to customize how Cloudflare proxy works, but they are not free. On the other hand, there is a generous free tier that should be enough for small sites.

In the free tier, the first 100k worker requests per day are free, but the limit applies to the account, meaning that if you have multiple sites, their usage adds up. Also, keep in mind that it is counted per request, so if your site loads 10 CSS and JS files in addition to the HTML, a visit consumes 11 requests.

Apart from the daily limit, there is also a burst limit of 1000 requests/min, which translates to ~17 requests/sec if sustained for a longer time. This is not much, making the free tier suitable only for small sites.
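
To get a feel for these numbers, here is a quick back-of-the-envelope calculation. It uses only the limits quoted above (100k requests per day, 1000 requests per minute), which may have changed since this was written. The function names are made up for illustration.

```javascript
// Limits as quoted in this article; check the current Cloudflare pricing
// before relying on them.
const DAILY_REQUEST_LIMIT = 100000;
const BURST_LIMIT_PER_MINUTE = 1000;

// How many full page visits fit into the daily limit when every visit
// triggers `requestsPerVisit` worker requests.
function visitsPerDay(requestsPerVisit) {
  return Math.floor(DAILY_REQUEST_LIMIT / requestsPerVisit);
}

// The burst limit expressed as a sustained per-second rate.
function sustainedRequestsPerSecond() {
  return BURST_LIMIT_PER_MINUTE / 60;
}

console.log(visitsPerDay(11)); // prints 9090
console.log(visitsPerDay(1)); // prints 100000
console.log(sustainedRequestsPerSecond().toFixed(1)); // prints 16.7
```

So a page that needs 11 requests can be served roughly 9000 times a day across the whole account, while a single-file page gets the full 100k.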

To make the most of the free tier, minimize the number of requests going to Cloudflare. Concatenating static files, revving, and utilizing client-side caches help a lot.

And when the free tier runs out, you can always start paying based on usage. It's not as attractive as a forever-free plan without workers, but at least the cost scales with usage.

Conclusion

Cloudflare Workers is an excellent feature, as it allows arbitrary code to affect how routing works. With just a little code, you can override how Cloudflare works with S3 bucket websites.

August 25, 2020