CloudFront is a proxy between the visitors and the backend servers. When it gets a request, it forwards it to one of the origins, then returns the response to the visitor. What complicates this process is that CloudFront also transforms the request. It can change the path, the headers, the cookies, and the query parameters, which makes it hard to pinpoint the problem when all you see is an error message.
Unfortunately, CloudFront does not provide any built-in tools to inspect what goes on the wire. It offers logs, but those do not contain the details sent to the origin.
If the origin you are trying to debug writes detailed logs, you can consult them to see what the problem is. But usually that is not the case, and this approach only works when the origin exposes such logs.
An alternative solution is to use a service that provides a public endpoint and request logging, then point the CloudFront distribution to that endpoint. This gives visibility into what the origin request looks like on the wire.
This article describes two solutions: one is a hosted tool, the other is self-hosted.
Webhook.site is a webhook tester that gives a publicly available URL and stores the requests sent to it. It offers a nice web UI.
When you go to the site it gives you a unique path under the webhook.site domain. This is the unique debug endpoint you can use:
The next step is to modify the CloudFront Origin to send requests to this endpoint. Change the Origin Domain Name, the Origin Path, and make sure to use HTTPS only:
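If you would rather script this change than click through the console, the edit can be sketched as a small Python function that works on the distribution's configuration as a plain dictionary. This is a rough sketch under a few assumptions: `point_origin_at` and the `origin_id` argument are names made up for this example, the origin is assumed to already be a custom (non-S3) origin, and the field names follow the CloudFront API's `DistributionConfig` structure.

```python
import copy
from urllib.parse import urlsplit


def point_origin_at(distribution_config, origin_id, debug_url):
    """Return a copy of a DistributionConfig dict whose origin targets debug_url.

    Hypothetical helper for illustration; the input dict is left untouched.
    """
    parts = urlsplit(debug_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("expected an https:// debug URL")

    config = copy.deepcopy(distribution_config)
    for origin in config["Origins"]["Items"]:
        if origin["Id"] != origin_id:
            continue
        custom = origin.get("CustomOriginConfig")
        if custom is None:
            raise ValueError("this sketch only handles custom (non-S3) origins")
        origin["DomainName"] = parts.hostname
        # The Origin Path starts with "/" and must not end with one.
        origin["OriginPath"] = parts.path.rstrip("/")
        custom["OriginProtocolPolicy"] = "https-only"
        return config
    raise KeyError(f"no origin with Id {origin_id!r}")


# Possible usage with boto3 (not executed here; the IDs are placeholders):
#   client = boto3.client("cloudfront")
#   current = client.get_distribution_config(Id=DISTRIBUTION_ID)
#   updated = point_origin_at(current["DistributionConfig"], ORIGIN_ID, DEBUG_URL)
#   client.update_distribution(Id=DISTRIBUTION_ID, IfMatch=current["ETag"],
#                              DistributionConfig=updated)
```

Keeping the transformation separate from the API calls makes it easy to review the new configuration before applying it, and to restore the original values afterwards.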
Wait for the distribution to deploy, then send a request to the CloudFront domain to trigger an origin request:
The webhook.site webapp automatically refreshes the list with the new request and prints all sorts of useful information about it:
For example, notice the URL of the request, especially the path:
Or the headers in the request, which are quite different from what the browser sent to CloudFront. One header that can cause trouble is the Host header:
Of course, any kind of HTTP tester works, not just the browser:
A curl request produces a similar origin request but is easier to configure and automate:
With curl, it's easy to add headers to the viewer request and see how they are forwarded to the origin.
As an illustration, let's add an Authorization header! This is one of the headers that CloudFront handles in a special way, based on the cache behavior settings.
Let's see how CloudFront forwards this header by default:
curl -H "Authorization: b" https://d6mq5inawh4fd.cloudfront.net/
No header reaches the origin:
This is the default behavior when this header is not included in the cache key. From the documentation:
GET and HEAD requests – CloudFront removes the Authorization header field before forwarding the request to your origin.
Let's change the Cache Based on Selected Request Headers to "All":
With this setting, the same viewer request results in a lot more headers being forwarded, and the Authorization header is among them:
But note that this setting also overwrites the Host header, which is now the domain of the CloudFront distribution instead of the origin. This can cause problems with origins that use the Host header to route between services, such as API Gateway.
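When comparing requests by eye gets tedious, a small helper can list what changed between the viewer request and the origin request. The sketch below is one way to do that in Python. `diff_headers` is a made-up name, and the sample values are illustrative: they follow the default case described above, so copy the real headers from your debug endpoint.

```python
def diff_headers(viewer, origin):
    """Compare two header dicts case-insensitively.

    Returns (removed, added, changed): names only the viewer sent, names only
    the origin received, and names whose values differ as (viewer, origin).
    """
    v = {name.lower(): value for name, value in viewer.items()}
    o = {name.lower(): value for name, value in origin.items()}
    removed = sorted(set(v) - set(o))
    added = sorted(set(o) - set(v))
    changed = {k: (v[k], o[k]) for k in sorted(set(v) & set(o)) if v[k] != o[k]}
    return removed, added, changed


# Illustrative values for the default behavior: Authorization is dropped
# and Host is the origin's domain.
viewer = {"Host": "d6mq5inawh4fd.cloudfront.net", "Authorization": "b"}
default_origin = {"Host": "webhook.site"}
print(diff_headers(viewer, default_origin))
# (['authorization'], [], {'host': ('d6mq5inawh4fd.cloudfront.net', 'webhook.site')})
```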
Debugging cookies and query parameters
Similarly, you can debug how cookies and query parameters are forwarded too. These are also changed by CloudFront, depending on the cache behavior settings:
Let's send a request with a cookie and a query parameter and see what reaches the origin!
curl --cookie "a=b" "https://d6mq5inawh4fd.cloudfront.net/?param=b"
With the above configuration, both the cookie and the query parameter reach the origin:
Forwarding everything is a safe way to make sure CloudFront won't mess up the requests sent to the origin, but it also makes edge caching a lot less effective. As a rule of thumb, forward only the parts of the request that change the response.
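To see why forwarding less helps caching, here is a deliberately simplified model of a cache key in Python. It is not CloudFront's actual algorithm; `cache_key` and its parameters are invented for this illustration. The idea it shows is that only the listed headers, cookies, and query parameters take part in the key, so requests that differ elsewhere can share a cached response.

```python
from urllib.parse import parse_qsl, urlsplit


def cache_key(url, headers, cookies, *, key_headers=(), key_cookies=(), key_params=()):
    """Build a toy cache key from the path and the listed request parts only."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        parts.path,
        tuple((h.lower(), lowered.get(h.lower())) for h in sorted(key_headers)),
        tuple((c, cookies.get(c)) for c in sorted(key_cookies)),
        tuple((p, params.get(p)) for p in sorted(key_params)),
    )


# Only "param" is part of the key, so the differing "tracking" parameter
# and the cookie do not split the cache.
first = cache_key("https://example.com/?param=b&tracking=1", {}, {"a": "b"}, key_params=["param"])
second = cache_key("https://example.com/?param=b&tracking=2", {}, {"a": "c"}, key_params=["param"])
print(first == second)  # True
```

In this model, adding every cookie and query parameter to the key would give each of these requests its own cache entry, which is the trade-off described above.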
Locally with rbaskets and ngrok
Instead of using a managed service, you can also start a request interceptor locally. I found Request Baskets (rbaskets) easy to start with Docker, and it provides similar functionality.
To start the service and tie it to a port (55555 in this case), use this docker command:
$ docker run -p 55555:55555 darklynx/request-baskets
Executing: /bin/rbaskets -l 0.0.0.0 -db bolt -file /var/lib/rbaskets/baskets.db
2020/09/07 07:14:52 [info] generated master token: HwA8MyXrHmYh6urV6ZItMN-wjDobCulqIaT5Pzrtmdnx
2020/09/07 07:14:52 [info] service version: v1.0.0-23-g1d3c492 from commit: 1d3c492 (1d3c492299010246dbe17f0c6834270537ff39c7)
2020/09/07 07:14:52 [info] using Bolt database to store baskets
2020/09/07 07:14:52 [info] Bolt database location: /var/lib/rbaskets/baskets.db
2020/09/07 07:14:52 [info] HTTP server is listening on 0.0.0.0:55555
It starts the service locally, which might be sufficient if the machine you are using is accessible from the internet. If it is, then point the CloudFront origin to this service and you can start sending requests.
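To get a feel for what a request interceptor like this does, here is a minimal sketch using only Python's standard library. It is not how Request Baskets is implemented. The `LoggingHandler` and `make_server` names are made up for this illustration, and port 55555 simply mirrors the one used above.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class LoggingHandler(BaseHTTPRequestHandler):
    captured = []  # every request seen so far, shared across connections

    def _record(self):
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length) if length else b""
        entry = {
            "method": self.command,
            "path": self.path,  # includes the query string
            # Repeated header names collapse to the last value in this sketch.
            "headers": dict(self.headers.items()),
            "body": body.decode("utf-8", errors="replace"),
        }
        self.captured.append(entry)
        print(json.dumps(entry, indent=2), flush=True)

        payload = b"logged\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        if self.command != "HEAD":
            self.wfile.write(payload)

    # Handle the common methods the same way.
    do_GET = do_HEAD = do_POST = do_PUT = do_PATCH = do_DELETE = do_OPTIONS = _record

    def log_message(self, format, *args):
        pass  # silence the default access log; _record prints the details


def make_server(port=55555, host="0.0.0.0"):
    """Create the logging server without starting it."""
    return ThreadingHTTPServer((host, port), LoggingHandler)


# To run it: make_server().serve_forever()
```

A ready-made tool is still more convenient for day-to-day use, since it adds a web UI and persistence, but a script like this can be enough for a quick check.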
If the machine is behind a firewall or a NAT, you can use a tunnel, such as ngrok. It exposes a local port under a publicly available domain name, and because it uses a connection that originates from your machine, it can get through firewalls.
You don't need to install anything, as you can run ngrok with npx:
$ npx ngrok http 55555
ngrok by @inconshreveable (Ctrl+C to quit)
Session Status online
Session Expires 7 hours, 59 minutes
Update update available (version 2.3.35, Ctrl-U to update)
Region United States (us)
Web Interface http://127.0.0.1:4040
Forwarding http://d0b629c4b31b.ngrok.io -> localhost:55555
Forwarding https://d0b629c4b31b.ngrok.io -> localhost:55555
Connections ttl opn rt1 rt5 p50 p90
0 0 0.00 0.00 0.00 0.00
Now when there is a request to the ngrok domain (https://d0b629c4b31b.ngrok.io), ngrok forwards it to the local port 55555, where the rbaskets service is listening.
With rbaskets, you need to create a basket first. Open the ngrok domain and use the Create button to generate an empty basket:
It gives a URL that you can use with CloudFront:
To point the distribution to this debug endpoint, modify the Origin Domain Name and the Origin Path. Since ngrok provides an HTTP as well as an HTTPS URL, you can keep the Origin Protocol Policy set to whatever you use for the real origin.
Finally, send a request to the distribution:
And inspect what reaches the origin:
When CloudFront forwards a request to the origin, it makes modifications to it. It changes the path, the headers, the cookies, and the query parameters. This behavior can lead to strange errors.
When you cannot inspect the logs on the origin, for example when it's a managed service, you are back to trial and error, as CloudFront does not provide any help to inspect what it sends.
To debug origin requests effectively, you can use a webhook tester that gives a publicly accessible endpoint and logs the requests it receives. When CloudFront uses this endpoint as the origin you'll see what is sent on the wire, making debugging possible.