How to use S3 signed URLs with CloudFront

S3 signed URLs offer a flexible way to share private content. Learn how to use them through CloudFront.

Motivation

S3 signed URLs provide fine control over who can access private resources. They are flexible both in how permissions are defined and in how easily signing can be automated.

CloudFront signed URLs, on the other hand, use a different mechanism to selectively grant access to resources, and that mechanism is hard to deploy and maintain. But CloudFront provides several advantages over S3: edge locations that reduce latency, HTTP/2 support for request multiplexing, and a common domain for static files and dynamic resources.

I wondered, is there a way to get all the benefits of using CloudFront while also having the flexibility of S3 signing?

As it turned out, it is indeed possible.

Setup

For a test setup, on the CloudFront side, I created a distribution and selected the S3 origin with a bucket. One important part is to select “Forward all query params, cache based on all” in the Query String Forwarding and Caching setting, as S3 signed URLs carry the signature in query parameters.

Also make sure that you don’t give access to the bucket to CloudFront. We want to make sure the objects are only accessible via S3 presigned URLs, and those are checked on the S3 side, not on CloudFront’s.

For signing URLs, do the usual steps of creating an IAM user, giving it access to the bucket, and generating an access key which we’ll use for signing.

Lastly, prepare the backend, so that you can get a signed URL for an object stored in the bucket. My bucket is called cf-signed-urls-test, and the test object I uploaded is test.txt.

The resulting signed URL for the S3 object looks like this:

https://cf-signed-urls-test.s3-eu-west-1.amazonaws.com/test.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAINGQCJMAROJWPJ3A%2F20180930%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20180930T145820Z&X-Amz-Expires=900&X-Amz-Signature=ebb4245bcd774a678c0685419ab5b4012845f61cea6aa2092661f89f3948cf8b&X-Amz-SignedHeaders=host

Then replace the domain with the CloudFront one, leaving everything else:

https://d2tphqjsvmbzmt.cloudfront.net/test.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAINGQCJMAROJWPJ3A%2F20180930%2Feu-west-1%2Fs3%2Faws4_request&X-Amz-Date=20180930T145820Z&X-Amz-Expires=900&X-Amz-Signature=ebb4245bcd774a678c0685419ab5b4012845f61cea6aa2092661f89f3948cf8b&X-Amz-SignedHeaders=host
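
The domain swap can also be scripted. Here is a minimal sketch using only the Python standard library; the function name `to_cloudfront_url` is made up for illustration, and the query string is abbreviated compared to the real URL above. The point is that only the host changes, while the path and the signed query string are passed through untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def to_cloudfront_url(signed_url: str, cloudfront_domain: str) -> str:
    """Swap only the host of an S3 presigned URL (hypothetical helper)."""
    parts = urlsplit(signed_url)
    # _replace keeps the scheme, the path and the signed query string as they are.
    return urlunsplit(parts._replace(netloc=cloudfront_domain))


# Abbreviated version of the signed URL shown above
s3_url = (
    "https://cf-signed-urls-test.s3-eu-west-1.amazonaws.com/test.txt"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=900&X-Amz-SignedHeaders=host"
)
cf_url = to_cloudfront_url(s3_url, "d2tphqjsvmbzmt.cloudfront.net")
print(cf_url)
```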

Following these steps and trying out the resulting URL, you’ll see that it does not work.

Why?

It required some investigation, but it turned out that the origin bucket selector on the AWS Console does not include the bucket’s region in the origin domain, while the JavaScript SDK signs a URL that does.

Observe the difference:

The CloudFront origin domain, if you just click on the bucket name:

cf-signed-urls-test.s3.amazonaws.com

And the S3 signed URL:

cf-signed-urls-test.s3-eu-west-1.amazonaws.com

The solution: Manually change the origin domain name to include the region, and you are all set. When you try out the URL, it will go through.
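
A quick way to catch this mismatch before deploying is to compare the host in a freshly signed URL with the origin domain configured in CloudFront. The sketch below is only an illustration: `origin_matches_signed_host` is a hypothetical helper, and it assumes the signature covers the host header, as `X-Amz-SignedHeaders=host` in the URLs above suggests.

```python
from urllib.parse import urlsplit


def origin_matches_signed_host(signed_url: str, origin_domain: str) -> bool:
    """True when the CloudFront origin domain equals the host the URL was signed for."""
    signed_host = urlsplit(signed_url).hostname or ""
    return signed_host.lower() == origin_domain.strip().lower()


signed = (
    "https://cf-signed-urls-test.s3-eu-west-1.amazonaws.com/test.txt"
    "?X-Amz-SignedHeaders=host"
)

# Origin as filled in by the console's bucket selector
print(origin_matches_signed_host(signed, "cf-signed-urls-test.s3.amazonaws.com"))
# Origin after manually adding the region
print(origin_matches_signed_host(signed, "cf-signed-urls-test.s3-eu-west-1.amazonaws.com"))
```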

X-Amz-Cf-Id

You might get an error saying that the X-Amz-Cf-Id header is not signed:

<Error>
    <Code>AccessDenied</Code>
    <Message>There were headers present in the request which were not signed</Message>
    <HeadersNotSigned>x-amz-cf-id</HeadersNotSigned>
    <RequestId>...</RequestId>
    <HostId>...</HostId>
</Error>

In this case, CloudFront does not recognise the origin as S3 and treats it as a custom origin instead. To aid debugging, it appends this ID to the request, but that also invalidates the signature. Double-check the origin URL and make sure CloudFront reports the correct one.
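
If you test the setup from a script, this specific failure can be told apart from other AccessDenied responses by parsing the error body. This is a rough sketch with the standard library's XML parser; `is_cf_id_not_signed` is a hypothetical name, and the sample body is the one shown above with its elided IDs left as they are.

```python
import xml.etree.ElementTree as ET


def is_cf_id_not_signed(error_body: str) -> bool:
    """Detect the 'x-amz-cf-id was not signed' variant of an S3 AccessDenied error."""
    root = ET.fromstring(error_body)
    code = root.findtext("Code", default="")
    not_signed = root.findtext("HeadersNotSigned", default="")
    return code == "AccessDenied" and "x-amz-cf-id" in not_signed.lower()


sample = """<Error>
    <Code>AccessDenied</Code>
    <Message>There were headers present in the request which were not signed</Message>
    <HeadersNotSigned>x-amz-cf-id</HeadersNotSigned>
    <RequestId>...</RequestId>
    <HostId>...</HostId>
</Error>"""

print(is_cf_id_not_signed(sample))
```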

Thanks to Lasse for debugging it!

Python SDK

Lasse, a fellow AWS architect, contacted me because the Python SDK works differently from the JavaScript one. Here are his findings:

When you provide the region to the client, it still generates a non-regional URL:

import boto3

bucket = "my-bucket"
region = "eu-central-1"
client = boto3.client("s3", region_name=region)

The generated URL will be https://my-bucket.s3.amazonaws.com/.... This is a problem, as non-regional URLs have several drawbacks.

First, DNS propagation takes time, so the URL does not work right after the bucket is created. And it can take quite some time; this is not the usual eventual consistency that needs just a few seconds.

Second, not all regions are guaranteed to work. In future regions this feature is disabled, and while it should keep working for existing ones, your code will be less portable.

As Lasse found out, providing the endpoint_url makes the client generate path-style URLs. Note that the endpoint needs to include the scheme:

client = boto3.client("s3", region_name=region, endpoint_url="https://s3.eu-central-1.amazonaws.com")

The result is https://s3.eu-central-1.amazonaws.com/my-bucket/.... This is also an addressing style that AWS plans to discontinue.

To generate regional signed URLs, set the addressing_style to virtual, which magically also makes the SDK generate regional URLs:

from botocore.config import Config
config = Config(region_name=region, s3={"addressing_style": "virtual"})
client = boto3.client("s3", config=config)

The result is https://my-bucket.s3.eu-central-1.amazonaws.com/..., which has the same form as the URLs the JavaScript SDK generates.
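
To check which of these three forms a given client configuration produces, the generated URL can be classified by its host and path. The following is a sketch only: `classify_s3_url` and its labels are made up for this post, and the `test.txt` key stands in for the elided object path.

```python
from urllib.parse import urlsplit


def classify_s3_url(url: str, bucket: str, region: str) -> str:
    """Label which of the addressing forms discussed above a URL uses."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == f"{bucket}.s3.amazonaws.com":
        return "non-regional virtual-hosted"
    if host == f"s3.{region}.amazonaws.com" and parts.path.startswith(f"/{bucket}/"):
        return "regional path-style"
    # Accept both the dotted form and the dashed form (s3-eu-west-1) seen earlier.
    if host in (f"{bucket}.s3.{region}.amazonaws.com", f"{bucket}.s3-{region}.amazonaws.com"):
        return "regional virtual-hosted"
    return "unknown"


bucket, region = "my-bucket", "eu-central-1"
for url in (
    "https://my-bucket.s3.amazonaws.com/test.txt",
    "https://s3.eu-central-1.amazonaws.com/my-bucket/test.txt",
    "https://my-bucket.s3.eu-central-1.amazonaws.com/test.txt",
):
    print(classify_s3_url(url, bucket, region))
```

Only the last form matches an origin domain that includes the region, so that is the one to aim for when the URL goes through CloudFront.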

Thanks to Lasse for providing all this info!

Caching

Since the access control is moved to the origin, you need to be mindful of the caching settings. By default, CloudFront will cache the response longer than the validity of the signature. In effect, the object is still accessible after the signed URL has expired.

Solution #1: disable the cache

The easiest solution is to disable the caching on the CloudFront side, so every request goes directly to S3 which in turn will check the signature. To do this, set the custom TTL to 0 in the behavior.

The problem with this is that it will also effectively nullify the edge location caching.

Solution #2: Adapt caching

While disabling the CloudFront cache works, it is not optimal. It’s better to keep it enabled, but set it up in a way that prevents (or at least limits) stale objects.

For a better solution, you need to change both the S3 signing code and the CloudFront caching settings.

Step 1: Round the signature expiration

By default, the parameters of the signed URL depend on when the signing happens. This defeats caching, as every user gets a different URL.

But you can do some rounding, so that different users get the same URL for the same resource. Expiration for an S3 signed URL is made up of two parts:

  • First, there is the X-Amz-Date, which is when the signing happened
  • Then there is the X-Amz-Expires, which determines how many seconds the signature is valid

If you round the current time to the last half-hour and set the expiration to 1 hour, there will be one URL for every 30 minutes, which makes it easily cacheable. The effective expiration time will vary between 30 and 60 minutes.

To do this, you need to make some changes to how you sign the requests. There is a library called timekeeper that provides a function to freeze time at an earlier instant.

This code rounds the date to the last half-hour and uses a constant expiration time of one hour:

const AWS = require("aws-sdk"); // AWS SDK for JavaScript v2
const tk = require("timekeeper");

// Assumes credentials and region are already configured in the environment
const s3 = new AWS.S3();

// Round the current time down to the last half-hour boundary
const getTruncatedTime = () => {
	const d = new Date();

	d.setMinutes(Math.floor(d.getMinutes() / 30) * 30);
	d.setSeconds(0);
	d.setMilliseconds(0);

	return d;
};

const params = {Bucket: '<bucket>', Key: '<key>', Expires: 3600};
// Sign as if the request was made at the truncated time
const url = tk.withFreeze(getTruncatedTime(), () => {
	return s3.getSignedUrl("getObject", params);
});
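
For readers following the Python SDK section, here is the same time arithmetic as a small Python sketch. It does not sign anything; it only shows the truncation and how much validity is left at a given moment, which is why the effective expiration lands between 30 and 60 minutes. The names `truncate_to_window` and `remaining_validity` are hypothetical, and the sample timestamp is borrowed from the X-Amz-Date in the URL earlier.

```python
from datetime import datetime, timedelta, timezone

WINDOW_MINUTES = 30            # rounding granularity from the text
EXPIRES = timedelta(hours=1)   # matches Expires: 3600


def truncate_to_window(now: datetime) -> datetime:
    """Round a timestamp down to the last half-hour boundary."""
    minute = (now.minute // WINDOW_MINUTES) * WINDOW_MINUTES
    return now.replace(minute=minute, second=0, microsecond=0)


def remaining_validity(now: datetime) -> timedelta:
    """Time left before a URL signed 'as of' the truncated time expires."""
    return truncate_to_window(now) + EXPIRES - now


now = datetime(2018, 9, 30, 14, 58, 20, tzinfo=timezone.utc)
print(truncate_to_window(now))   # 2018-09-30 14:30:00+00:00
print(remaining_validity(now))   # 0:31:40
```

A request arriving exactly on a half-hour boundary gets the full hour, while one arriving just before the next boundary gets barely more than 30 minutes.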

Step 2: Set up CloudFront caching

Now that you have cache-friendly URLs returned from the backend, you can enable caching, but limit it to let’s say 30 minutes.

This means that if there is a signed URL that was created at 13:00 and accessed at 13:59, it won’t be accessible after 14:30. Effectively, with this setup, no single URL is accessible for more than 90 minutes, which is a viable compromise. By changing how long the signature is valid and how long the response is cached, you can fine-tune this to your specific needs.
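
The 13:00 / 13:59 example can be worked through in a few lines. This is a sketch of the arithmetic only, assuming a one-hour signature and a 30-minute CloudFront cache TTL as in the text; `cached_until` is a made-up helper and the calendar date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

SIGNATURE_VALIDITY = timedelta(hours=1)   # X-Amz-Expires=3600
CACHE_TTL = timedelta(minutes=30)         # assumed CloudFront cache TTL


def cached_until(signed_at: datetime, fetched_at: datetime) -> datetime:
    """When a copy fetched from S3 at fetched_at drops out of the CloudFront cache."""
    expires_at = signed_at + SIGNATURE_VALIDITY
    if fetched_at >= expires_at:
        raise ValueError("S3 rejects the request once the signature has expired")
    # CloudFront may keep serving the object for a full TTL after the origin fetch.
    return fetched_at + CACHE_TTL


signed_at = datetime(2018, 11, 15, 13, 0, tzinfo=timezone.utc)
fetched_at = datetime(2018, 11, 15, 13, 59, tzinfo=timezone.utc)
print(cached_until(signed_at, fetched_at))   # 2018-11-15 14:29:00+00:00
print(SIGNATURE_VALIDITY + CACHE_TTL)        # upper bound: 1:30:00
```

Shortening either constant tightens the bound, at the cost of fewer cache hits.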

Conclusion

While you could use S3 presigned URLs directly to provide access to private content, using CloudFront brings several advantages. You’ll have reduced latency, HTTP/2, and a single domain, which means no CORS problems and a simplified CSP.

But you need to be aware of caching as that could defeat the short expiration times of signed URLs. The proposed solution above solves this problem while retaining all benefits of distributing content through CloudFront.

15 November 2018