Stable S3 signed URLs
How to make signed URLs cacheable
Content distribution
URL signing is a way to provide controlled access to protected content. The backend contains custom code that decides whether a user can download a file. If the decision is positive, it signs a URL using a secret that only it knows and returns the URL to the user. The backend is not involved in the download itself: S3 checks the signature and then transfers the file to the client.
For example, an ecommerce site that sells digital products wants to distribute files only to users who bought those products. In that case, the backend is responsible for deciding whether to allow the download, but then the file itself is served by S3.
Signed URLs are a cornerstone of how serverless apps handle files. As these apps need to follow the "quick and small" response model, returning a file of arbitrary size does not work. For example, a Lambda function has a limit on the response size, which would cap the size of the files it can return. Using signed URLs allows a serverless app to handle files of any size.
Signed URLs
A signed URL includes several query parameters:
https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD
&X-Amz-Credential=AKIAUB3O2IQ5EZ3CFTGG%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
&X-Amz-Date=20230713T094500Z
&X-Amz-Expires=900
&X-Amz-Signature=ee4c48e11d1250c8e02b1615f11f217c6454c7a2701c4b82abb620ef271163a3
&X-Amz-SignedHeaders=host
&x-id=GetObject
The most important is the X-Amz-Signature. Calculating it requires a secret that is only known to the backend and it is checked by S3. Because of this, a signed URL cannot be forged.
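To make the role of these parameters more concrete, the sketch below reads the validity window out of a signed URL. The signedUrlValidity helper is hypothetical and not part of the AWS SDK. It assumes that X-Amz-Date uses the compact YYYYMMDDTHHMMSSZ form seen in the example and that X-Amz-Expires is a number of seconds counted from that time.

```javascript
// Hypothetical helper (not an AWS SDK function): reads when a signed URL
// was signed and when it stops being valid.
const signedUrlValidity = (signedUrl) => {
  const params = new URL(signedUrl).searchParams;
  const date = params.get("X-Amz-Date");
  const expires = params.get("X-Amz-Expires");
  // X-Amz-Date is expected in the compact form YYYYMMDDTHHMMSSZ
  const match = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(date ?? "");
  if (match === null || expires === null) {
    throw new Error("X-Amz-Date or X-Amz-Expires is missing or malformed");
  }
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number);
  const signedAt = new Date(Date.UTC(year, month - 1, day, hour, minute, second));
  // X-Amz-Expires counts seconds from the signing time
  const expiresAt = new Date(signedAt.getTime() + Number(expires) * 1000);
  return {signedAt, expiresAt};
};

const validity = signedUrlValidity(
  "https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg" +
  "?X-Amz-Date=20230713T094500Z&X-Amz-Expires=900"
);
console.log(validity.signedAt.toISOString()); // 2023-07-13T09:45:00.000Z
console.log(validity.expiresAt.toISOString()); // 2023-07-13T10:00:00.000Z
```

Reading these values does not verify anything. Only S3 can check the signature, and changing either parameter invalidates it.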
How to sign a URL is well documented, but in practice it's mostly calling a function in the SDK:
import {getSignedUrl} from "@aws-sdk/s3-request-presigner";
import {S3Client, GetObjectCommand} from "@aws-sdk/client-s3";
return getSignedUrl(new S3Client(), new GetObjectCommand({
Bucket,
Key,
}));
Caching
By default, signed URLs are mostly unique, so if the backend signs two URLs for the same file they will still be different.
Compare these two signed URLs:
https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD
&X-Amz-Credential=ASIAUB3O2IQ5NILQOP5T%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
&X-Amz-Date=20230713T094506Z
&X-Amz-Expires=900
&X-Amz-Security-Token=IQoJb3JpZ2...
&X-Amz-Signature=0c127dbcf05090bdb74f78e2e8bbdf3e7edba7cdd8fac4f3d7c125a406ca4df8
&X-Amz-SignedHeaders=host
&x-id=GetObject
https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD
&X-Amz-Credential=ASIAUB3O2IQ5DYKXZC7R%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
&X-Amz-Date=20230713T094507Z
&X-Amz-Expires=900
&X-Amz-Security-Token=IQoJb3JpZ...
&X-Amz-Signature=2b9e8b177b4388ba5200b820057f5c8750bfa0177c852ccc3bd54ddfb7de8efc
&X-Amz-SignedHeaders=host
&x-id=GetObject
Notice that their signatures are different. This is usually not a problem as downloading a particular file is a one-time operation. In the ecommerce example, a user might want to download the file after buying the product, and maybe a few times after that, but the assumption is that all these downloads get the full contents.
But unique signed URLs defeat caching, both in the browser and on the edge. If the file is downloaded many times, every one of those requests goes to S3 and transfers all the bytes, and sometimes that is a problem.
For example, if the user's avatar image is shown on every page in a webapp and is served via a signed URL then the image will be downloaded over and over again as the user navigates the app. Another example is a photo-sharing app that shows the private photos of the user using signed URLs. As those images can be rather big, downloading them many times wastes a lot of bandwidth.
Stable signed URLs
To make signed URLs cacheable we need to make them stable. On the other hand, we can't make them too stable as that would defeat the access control mechanism. S3 only checks the signature and the expiration time, so the later a signed URL expires, the longer the content stays available after it is made private. Moreover, a single signed URL can be valid for at most 7 days.
For caching, it's enough to make the generated URLs stable for several minutes or hours. That way, if the user navigates the app then the files won't be downloaded over and over again.
Let's see the variable parts of the URL and how to stabilize them!
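The variable parts can be listed mechanically by comparing the two example URLs from the Caching section. The sketch below uses a hypothetical differingParams helper. The X-Amz-Security-Token parameter is left out because its values are truncated in the listings.

```javascript
// Hypothetical helper: names of the query parameters whose values differ
// between two URLs (or that are present in only one of them).
const differingParams = (urlA, urlB) => {
  const a = new URL(urlA).searchParams;
  const b = new URL(urlB).searchParams;
  const names = new Set([...a.keys(), ...b.keys()]);
  return [...names].filter((name) => a.get(name) !== b.get(name)).sort();
};

const object = "https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg";
const common = "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD" +
  "&X-Amz-Expires=900&X-Amz-SignedHeaders=host&x-id=GetObject";
const first = object + common +
  "&X-Amz-Credential=ASIAUB3O2IQ5NILQOP5T%2F20230713%2Feu-central-1%2Fs3%2Faws4_request" +
  "&X-Amz-Date=20230713T094506Z" +
  "&X-Amz-Signature=0c127dbcf05090bdb74f78e2e8bbdf3e7edba7cdd8fac4f3d7c125a406ca4df8";
const second = object + common +
  "&X-Amz-Credential=ASIAUB3O2IQ5DYKXZC7R%2F20230713%2Feu-central-1%2Fs3%2Faws4_request" +
  "&X-Amz-Date=20230713T094507Z" +
  "&X-Amz-Signature=2b9e8b177b4388ba5200b820057f5c8750bfa0177c852ccc3bd54ddfb7de8efc";

console.log(differingParams(first, second).join(", "));
// X-Amz-Credential, X-Amz-Date, X-Amz-Signature
```

The signature is derived from the other values, so it only becomes stable once the date and the credential are.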
Stable date
The most dynamic part of the URL is the X-Amz-Date as that changes every second. This is the time of the signature and the validity period starts from this point in time.
To make it more stable, we can round the signature time using the signingDate parameter:
const roundTo = 5 * 60 * 1000; // 5 minutes
return getSignedUrl(new S3Client(), new GetObjectCommand({
Bucket,
Key,
}), {signingDate: new Date(Math.floor(new Date().getTime() / roundTo) * roundTo)});
The above code rounds the date down to the previous 5-minute mark, so the X-Amz-Date changes only every 5 minutes instead of every second. Changing roundTo controls how often new URLs are generated.
This rounding has implications for the expiration time though. As it effectively backdates the signature, the expiration time also moves closer to the actual signing time. With the 900-second X-Amz-Expires shown in the URLs and 5-minute rounding, a freshly returned URL stays valid for between 10 and 15 minutes.
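The arithmetic behind that range can be checked with a small sketch. The remainingValidityMs helper is hypothetical. It repeats the rounding expression from the code above and assumes the 900-second expiry seen in the example URLs.

```javascript
// Hypothetical helper: milliseconds a URL remains valid when it is signed at
// `now` with the signing time rounded down to a multiple of `roundToMs`.
const remainingValidityMs = (now, roundToMs, expiresInSeconds) => {
  const signingTime = Math.floor(now.getTime() / roundToMs) * roundToMs;
  return signingTime + expiresInSeconds * 1000 - now.getTime();
};

const fiveMinutes = 5 * 60 * 1000;
// Signed exactly on a 5-minute mark: the full 900 seconds remain
console.log(remainingValidityMs(new Date("2023-07-13T09:45:00Z"), fiveMinutes, 900) / 1000); // 900
// Signed one second before the next mark: only 601 seconds remain
console.log(remainingValidityMs(new Date("2023-07-13T09:49:59Z"), fiveMinutes, 900) / 1000); // 601
```

If the expiry were shorter than the rounding window, a URL could already be expired when it is returned, so the expiry should stay comfortably larger than roundTo.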
Stable credentials
With the X-Amz-Date fixed, the other changing part is the X-Amz-Credential. Compare the values in the two URLs:
...
&X-Amz-Credential=ASIAUB3O2IQ5NILQOP5T%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
...
&X-Amz-Credential=ASIAUB3O2IQ5DYKXZC7R%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
They are clearly different, but it's not easy to see why. After all, the same backend signed both URLs.
In this case, the backend is a Lambda function and it uses its execution role to sign URLs. This is the usual serverless solution: the Lambda runtime can run multiple instances of the same function to quickly respond to changes in load.
These instances then use the same role, but they don't share the session. The Lambda runtime assumes the role multiple times resulting in the different credentials used by the different Lambda instances. Depending on which instance signs the URL, the credential will be different.
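The credential value itself hints at which kind of key signed a URL. The sketch below uses a hypothetical parseCredential helper to split X-Amz-Credential into its parts. It relies on the assumption that access key IDs starting with ASIA are temporary keys from an assumed role and those starting with AKIA are long-term keys of an IAM user.

```javascript
// Hypothetical helper: splits X-Amz-Credential into its components.
const parseCredential = (signedUrl) => {
  const credential = new URL(signedUrl).searchParams.get("X-Amz-Credential");
  if (credential === null) {
    throw new Error("X-Amz-Credential is missing");
  }
  // searchParams.get() has already decoded %2F into "/"
  const [accessKeyId, date, region, service] = credential.split("/");
  return {
    accessKeyId,
    date,
    region,
    service,
    // Assumption: "ASIA" marks temporary keys, "AKIA" long-term keys
    temporary: accessKeyId.startsWith("ASIA"),
  };
};

const objectUrl = "https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg";
const fromRole = parseCredential(
  objectUrl + "?X-Amz-Credential=ASIAUB3O2IQ5NILQOP5T%2F20230713%2Feu-central-1%2Fs3%2Faws4_request"
);
const fromUser = parseCredential(
  objectUrl + "?X-Amz-Credential=AKIAUB3O2IQ5EZ3CFTGG%2F20230713%2Feu-central-1%2Fs3%2Faws4_request"
);
console.log(fromRole.accessKeyId, fromRole.temporary); // ASIAUB3O2IQ5NILQOP5T true
console.log(fromUser.accessKeyId, fromUser.temporary); // AKIAUB3O2IQ5EZ3CFTGG false
```

A check like this can be handy in tests to confirm that URLs are signed with the intended key.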
To stabilize this, we need to opt out of IAM roles for signing and instead generate an access key for an IAM user and use that.
Note that this goes against the general best practice of avoiding permanent credentials, and the Secret Access Key needs a great deal of care to handle. It is possible to do it securely though, as we've covered in this article. As we've discussed there, the safest way is to store it in an SSM parameter.
With the Secret Access Key stored in SSM, the backend needs permissions to read it:
data "aws_iam_policy_document" "backend" {
# ...
statement {
actions = [
"ssm:GetParameter",
]
resources = [
module.access_key.parameter_arn
]
}
}
Then the best practice is to implement caching so that the parameter is not fetched for every request:
const cacheOperation = (fn, cacheTime) => {
let lastRefreshed = undefined;
let lastResult = undefined;
let queue = Promise.resolve();
return () => {
const res = queue.then(async () => {
const currentTime = new Date().getTime();
if (lastResult === undefined || lastRefreshed + cacheTime < currentTime) {
lastResult = await fn();
lastRefreshed = currentTime;
}
return lastResult;
});
queue = res.catch(() => {});
return res;
};
};
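import {SSMClient, GetParameterCommand} from "@aws-sdk/client-ssm";
// fetch the decrypted parameter from SSM at most once every 15 seconds
const getSecretAccessKey = cacheOperation(() => new SSMClient().send(new GetParameterCommand({
Name: process.env.SECRET_ACCESS_KEY_PARAMETER,
WithDecryption: true,
})), 15 * 1000);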
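To see the effect of the wrapper without calling SSM, it can be run against a stub. The sketch below repeats cacheOperation so that it runs on its own. The fetchSecret stub and its return value are made up for illustration and only mimic the shape of the GetParameter response.

```javascript
// The wrapper from the article, repeated so the example runs on its own
const cacheOperation = (fn, cacheTime) => {
  let lastRefreshed = undefined;
  let lastResult = undefined;
  let queue = Promise.resolve();
  return () => {
    const res = queue.then(async () => {
      const currentTime = new Date().getTime();
      if (lastResult === undefined || lastRefreshed + cacheTime < currentTime) {
        lastResult = await fn();
        lastRefreshed = currentTime;
      }
      return lastResult;
    });
    // keep the queue alive even if this call fails
    queue = res.catch(() => {});
    return res;
  };
};

// Stub standing in for the SSM call; it counts how often it is invoked
let fetchCount = 0;
const fetchSecret = async () => {
  fetchCount += 1;
  return {Parameter: {Value: "stub-secret"}};
};

const getCachedSecret = cacheOperation(fetchSecret, 60 * 1000);

// Three overlapping calls are queued one after another and share a single fetch
const demo = Promise.all([getCachedSecret(), getCachedSecret(), getCachedSecret()])
  .then((results) => {
    console.log(results.map((res) => res.Parameter.Value).join(", "));
    // stub-secret, stub-secret, stub-secret
    console.log(fetchCount); // 1
  });
```

Because the calls are chained on the queue, concurrent requests arriving during a cold start wait for the first fetch instead of each starting their own.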
Finally, use the credentials of the IAM user to sign the URL:
const accessKeyId = process.env.ACCESS_KEY_ID;
const secretAccessKey = (await getSecretAccessKey()).Parameter.Value;
return getSignedUrl(new S3Client({
credentials: {
accessKeyId,
secretAccessKey,
},
}), new GetObjectCommand({
Bucket,
Key,
}), {signingDate: new Date(Math.floor(new Date().getTime() / roundTo) * roundTo)});
The resulting URL is stable and only changes when the signingDate changes:
https://terraform-20230713081606344000000003.s3.eu-central-1.amazonaws.com/test.jpg
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD
&X-Amz-Credential=AKIAUB3O2IQ5EZ3CFTGG%2F20230713%2Feu-central-1%2Fs3%2Faws4_request
&X-Amz-Date=20230713T094500Z
&X-Amz-Expires=900
&X-Amz-Signature=ee4c48e11d1250c8e02b1615f11f217c6454c7a2701c4b82abb620ef271163a3
&X-Amz-SignedHeaders=host
&x-id=GetObject