When CloudFront goes to the origin to fetch content, it behaves like any other third-party service. In effect, by default, whatever CloudFront can access, everyone else can access too.
Usually, this is a problem.
How do you protect the origin so that only CloudFront can reach it and nobody can access it directly?
It depends on the origin itself.
Protecting S3 buckets
This is the most straightforward case, and it is the most likely to pop up when you search for serving private content. You have a private bucket, but you want CloudFront to have access to it.
This is the only use of the Origin Access Identity (OAI), a principal that you can reference in the bucket's resource policy.
This is how to make it work: create an OAI, assign it to the distribution, then update the bucket policy to grant the OAI access. This gives access to that distribution while keeping the bucket private.
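The bucket policy step can be sketched as follows. This is a minimal illustration, not a full setup: the bucket name and the OAI ID are placeholders, and the helper build_bucket_policy is a hypothetical name used only for this example. It grants read access on objects to a single OAI and nothing else.

```python
import json


def build_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Return a bucket policy that lets one OAI read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                # The OAI is referenced as a special CloudFront IAM principal.
                "Principal": {
                    "AWS": (
                        "arn:aws:iam::cloudfront:user/"
                        f"CloudFront Origin Access Identity {oai_id}"
                    )
                },
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Placeholder values; substitute your own bucket name and OAI ID.
policy = build_bucket_policy("example-private-bucket", "E2EXAMPLEOAIID")
print(json.dumps(policy, indent=2))
```

The printed JSON is what you would attach to the bucket as its resource policy. Since no other principal is allowed, direct requests to the bucket are still denied.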
Protecting API Gateways
This is when you have an API and you want to restrict access to the distribution only.
API Gateway supports restricting access with API keys, which means there is a header that must be sent along with every request.
CloudFront can set a custom header when it accesses the origin so that it can send the x-api-key with the secret value.
The API is kept private, as it requires the secret, but when accessed through CloudFront the request goes through.
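As a rough sketch, the custom header part of the origin configuration could look like the fragment below. The shape follows the CustomHeaders structure of a CloudFront origin as I understand it; the key value is a placeholder and the helper api_key_origin_headers is a hypothetical name for this example.

```python
def api_key_origin_headers(api_key: str) -> dict:
    """Build the CustomHeaders fragment of a CloudFront origin config.

    CloudFront adds these headers to every request it sends to the origin,
    so API Gateway sees the x-api-key header without the viewer knowing it.
    """
    return {
        "Quantity": 1,
        "Items": [
            {"HeaderName": "x-api-key", "HeaderValue": api_key},
        ],
    }


# Placeholder secret; use the value of the key created in API Gateway.
custom_headers = api_key_origin_headers("replace-with-your-api-key")
print(custom_headers)
```

This fragment would go under the origin entry that points at the API in the distribution config. The API methods themselves still need to be configured to require an API key.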
The approach that does not work
VPC security is usually handled by security groups, and you can configure them to allow only CloudFront IPs. There is an official list of the IP ranges that the service uses, and also a notification so that you can handle future changes to the list.
In fact, there is an official script that manages the settings of a security group so that it only allows CloudFront IPs.
It seems good, but unfortunately, this is not a good solution.
It makes sure that only CloudFront can access the instances, but it does not limit access to your distribution. In effect, any CloudFront distribution can access them, even ones outside your account.
This means that anybody can create a new distribution, point it to your instance, and gain access.
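To see why the IP-based check is not enough, here is a small sketch of what such a filter effectively does. It assumes the published list has the shape of AWS's ip-ranges.json (a "prefixes" array with "ip_prefix" and "service" fields). The sample data uses documentation-only address ranges, not real CloudFront ranges, and is_cloudfront_ip is a hypothetical helper.

```python
import ipaddress


def is_cloudfront_ip(source_ip: str, ip_ranges: dict) -> bool:
    """Return True if source_ip falls in any range tagged CLOUDFRONT."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(prefix["ip_prefix"])
        for prefix in ip_ranges["prefixes"]
        if prefix["service"] == "CLOUDFRONT"
    )


# Made-up sample in the shape of ip-ranges.json (documentation ranges only).
sample_ranges = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2"},
    ]
}

print(is_cloudfront_ip("203.0.113.10", sample_ranges))  # inside the CLOUDFRONT range
print(is_cloudfront_ip("198.51.100.7", sample_ranges))  # EC2 range, not CloudFront
```

The check only knows that a request came from a CloudFront address. Nothing in it identifies which distribution sent the request, which is exactly the gap described above.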
The kinda expensive approach
If you have an ALB, then you can configure WAF to require a custom header to be present in order to allow the request. Then add that header to the distribution, and you are all set.
The downside? Cost.
WAF has a fixed monthly cost, and for small applications, it could make up the majority of the bill.
See here for a guide on how to set it up.
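For orientation, a rule that allows requests carrying the secret header might look roughly like the sketch below. It assumes the newer WAFv2 rule format and a web ACL whose default action is to block. The header name, rule name, metric name, and secret are all placeholders, and header_allow_rule is a hypothetical helper; check the current WAF documentation before relying on the exact fields.

```python
def header_allow_rule(header_name: str, secret: str) -> dict:
    """Build a WAFv2-style rule that allows requests with the secret header.

    Intended for a web ACL whose default action is Block, so requests
    without the exact header value never reach the load balancer.
    """
    return {
        "Name": "allow-cloudfront-header",  # placeholder rule name
        "Priority": 0,
        "Statement": {
            "ByteMatchStatement": {
                # Passed as bytes here, as the AWS SDK for Python expects.
                "SearchString": secret.encode("utf-8"),
                "FieldToMatch": {"SingleHeader": {"Name": header_name.lower()}},
                "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
                "PositionalConstraint": "EXACTLY",
            }
        },
        "Action": {"Allow": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "allowCloudFrontHeader",  # placeholder metric name
        },
    }


# Placeholder header name and secret; the same pair goes on the distribution.
rule = header_allow_rule("X-Origin-Secret", "change-me")
print(rule["Statement"]["ByteMatchStatement"]["FieldToMatch"])
```

The same header name and value are then added as a custom origin header on the distribution, so only requests coming through it match the rule.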
Handle it on your servers
If you have instances that you can configure, you can implement an API-key-like functionality. See the next section for the details.
Protecting a custom backend
When you have a custom backend, be it Apache, Nginx, or a Node.js server that you can configure freely, the easiest way is to implement the API-key-like functionality. Set a custom header on the CloudFront distribution with a secret value, then configure your backend to only allow requests that have that header set to that value.
This way, you give access to CloudFront, but forbid direct access.
The downside is that your backend still receives the direct requests and needs to handle them. This can result in increased server load, and in some cases, it might make it easier to overload the server.
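The header check on the backend can be sketched as a small WSGI middleware. This is only an illustration under assumptions: the header name X-Origin-Secret and the secret value are placeholders, and require_origin_secret and hello_app are hypothetical names. The same idea applies to Apache, Nginx, or Node.js using their own configuration or middleware.

```python
import hmac

# WSGI exposes the request header "X-Origin-Secret" under this environ key.
HEADER_ENVIRON_KEY = "HTTP_X_ORIGIN_SECRET"


def require_origin_secret(app, secret: str):
    """Wrap a WSGI app so it rejects requests lacking the secret header."""

    def guarded(environ, start_response):
        supplied = environ.get(HEADER_ENVIRON_KEY, "")
        # Constant-time comparison avoids leaking the secret through timing.
        if not hmac.compare_digest(supplied.encode(), secret.encode()):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)

    return guarded


def hello_app(environ, start_response):
    """Stand-in for the real backend application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from the origin\n"]


# Placeholder secret; it must match the custom header set on the distribution.
application = require_origin_secret(hello_app, "change-me")
```

Direct requests without the header get a 403 response, while requests forwarded by the distribution pass through. Rejecting early keeps the cost of handling direct requests low, though the server still has to accept the connection.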
Limiting public access to your services is a good practice, and if you have CloudFront, restricting direct access is desirable. It is especially important when you utilize signed URLs, but even if you don't, you should do it this way.