How to use cfn-init to set up EC2 instances with CloudFormation
Install packages, deploy files, run commands, and start services in a new instance right from the CloudFormation template
Configuring EC2 instances
CloudFormation is AWS's Infrastructure-as-Code tool that lets you deploy multiple resources based on a template file that you write. It can create and manage nearly every type of resource in AWS, such as VPCs, Lambda functions, DynamoDB tables, and EC2 instances. A well-architected application is ready to run after its deployment with CloudFormation, without requiring any additional manual steps.
This works for serverless applications, permissions, and storage, but initializing EC2 instances does not fit into this pattern that easily. For example, a webserver needs to install packages, download additional files, and start services. Tools such as Ansible and RunCommand solve this problem, but they are more advanced and add a separate step. In many cases, I don't want a separate step to set up something simple; I want to use the stack right after CloudFormation is done deploying it.
EC2 provides the so-called "user data" which is a script that the instance runs when it starts. As you can write arbitrary code here, it allows setting up the instance with all the required packages and services.
But EC2 user data is hard to use. You can edit it only when the instance is stopped, so trying out a change takes a long time, which makes for a bad developer experience. On top of that, writing everything as a bash script usually results in a mess. There are ways to bring structure into a script, but that is more of an ad-hoc solution.
Fortunately, there is a solution that integrates with CloudFormation well and yields a more usable structure: cfn-init.
Cfn-init
Cfn-init is a set of helper scripts that interface with the CloudFormation stack and do two things. First, they read how the instance should be initialized from the CloudFormation stack and execute it. Second, they signal CloudFormation whether the initialization finished or failed, so it fits into the lifecycle of the stack. CloudFormation waits until the initialization is complete, and it also rolls back if there was an error.
It looks like this:
Resources:
  Instance:
    Type: AWS::EC2::Instance
    # create a signal for cfn-signal
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M
    # defines how to set up the instance
    Metadata:
      "AWS::CloudFormation::Init":
        configSets:
          setup:
            - install_server
        install_server:
          packages:
            yum:
              httpd: []
          files:
            "/var/www/html/index.html":
              content: |
                Hello world!
              mode: "000644"
              owner: root
              group: root
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
    Properties:
      # ...
      UserData:
        # runs the cfn-init scripts
        Fn::Base64:
          !Sub |
            #!/bin/bash -xe
            yum update -y aws-cfn-bootstrap
            /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource Instance --configsets setup --region ${AWS::Region}
            /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource Instance --region ${AWS::Region}
It builds on the user data to initialize and kick off the init process. It does 3 things:
- Installs the aws-cfn-bootstrap scripts
- Runs the cfn-init magic
- Signals to CloudFormation the result of step 2
The advantage of this structure is that the user data is static. You don't need to change it when you want to install more services. The arguments of the commands are either supplied by CloudFormation directly (--stack ${AWS::StackName}, --region ${AWS::Region}), or reference something in the template (--resource Instance, which is the resource name, and --configsets setup).
For step 3 to work, CloudFormation needs to know that an initialization signal is coming. This is the CreationPolicy part. Without this, CloudFormation waits only until the instance resource is created (the instance is running) and when that happens it marks it as complete.
Configs
The primary building block is a config. It defines the packages, groups, users, sources, files, commands, and services, in this order. The most important from this list are:
- packages define which package to install with a package manager (such as yum or apt).
- sources download compressed files and extract them to a directory. You can download repositories straight from GitHub this way.
- files create files with defined content.
- commands run arbitrary commands, such as building an app or copying files.
- services instruct the system service manager (systemd, usually) to start a service and to keep it running.
The documentation page is especially helpful when you write cfn-init scripts. Pay extra attention to the ordering in a config, as it's easy to get wrong.
In the example above, this is the install_server config definition:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        # ...
        install_server:
          # install Apache
          packages:
            yum:
              httpd: []
          # put an index.html to /var/www/html
          files:
            "/var/www/html/index.html":
              content: |
                Hello world!
              mode: "000644"
              owner: root
              group: root
          # start Apache
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
This config installs the httpd package (Apache), puts a file to /var/www/html/index.html, and instructs sysvinit to start the httpd service.
The result is a running webserver that serves the Hello world! file.
ConfigSets
ConfigSets define which configs are run and in what order. Each config runs to completion before cfn-init moves on to the next one.
In the example above, the setup configSet defines that the install_server config is run:
Resources:
  Instance:
    # ...
    Metadata:
      "AWS::CloudFormation::Init":
        configSets:
          setup:
            - install_server
        install_server:
          # ...
When the cfn-init command is called, you need to specify which configSet to run: --configsets setup.
How it works step-by-step
Let's see what happens in the above example!
First, CloudFormation creates an EC2 instance, because there is a Resource with the Type of AWS::EC2::Instance:
Resources:
  Instance:
    Type: AWS::EC2::Instance
When the instance is started, it runs the user data:
Resources:
  Instance:
    Properties:
      # ...
      UserData:
        Fn::Base64:
          !Sub |
            #!/bin/bash -xe
            yum update -y aws-cfn-bootstrap
            /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource Instance --configsets setup --region ${AWS::Region}
            /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource Instance --region ${AWS::Region}
When cfn-init runs, it first needs to identify the stack the instance is part of. For this, it needs --region, --stack, and --resource. The first two come from CloudFormation, while the third one is the resource name.
With these parameters, cfn-init grabs the metadata defined in the CloudFormation template.
Then the remaining parameter identifies which configSet to run. In this case, it is --configsets setup, which means the setup configSet will be run.
It then looks at the configSets part and finds setup:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        configSets:
          setup:
            - install_server
The first config in the set is install_server. So it moves on and installs the packages:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        install_server:
          packages:
            yum:
              httpd: []
Then it puts the files in place:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        install_server:
          files:
            "/var/www/html/index.html":
              content: |
                Hello world!
              mode: "000644"
              owner: root
              group: root
And finally, it starts the httpd service:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        install_server:
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
Once cfn-init is complete, the user data calls cfn-signal to let CloudFormation know that the instance is done:
Resources:
  Instance:
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M
    Properties:
      # ...
      UserData:
        Fn::Base64:
          !Sub |
            # ...
            /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource Instance --region ${AWS::Region}
Because of the CreationPolicy, CloudFormation creates a signal that cfn-signal can call. The -e $? argument passes the exit code of the previous command, which is cfn-init, and the other arguments identify which signal to call.
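One detail worth checking is how -e $? interacts with the -e flag in the #!/bin/bash -xe line. The sketch below does not call the real cfn-init. It uses a hypothetical stand-in function, fake_cfn_init, that simply fails with exit code 3, so it can run anywhere. It illustrates that $? carries the exit code of the previous command, and that when the shell runs with -e a failing command ends the script before the next line is reached. In that case no signal would be sent, and CloudFormation would presumably have to wait for the CreationPolicy timeout. Capturing the code with || is one possible workaround, offered here as an assumption and not as the article's method.

```shell
# Hypothetical stand-in for /opt/aws/bin/cfn-init: exits with the given code.
stub='fake_cfn_init() { return "$1"; }'

# Without -e: the next command still runs and $? holds the exit code.
without_e=$(sh -c "$stub"'; fake_cfn_init 3; echo "signal -e $?"')

# With -e: the script stops at the failing command, so nothing is echoed.
with_e=$(sh -e -c "$stub"'; fake_cfn_init 3; echo "signal -e $?"') || true

# Capturing the code explicitly keeps the script alive under -e.
captured=$(sh -e -c "$stub"'; status=0; fake_cfn_init 3 || status=$?; echo "signal -e $status"')

echo "without -e: $without_e"
echo "with -e:    ${with_e:-<nothing was signalled>}"
echo "captured:   $captured"
```

If you rely on a fast rollback after a failed initialization, it may be worth testing which of these behaviors your user data actually shows.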
Ordering
Knowing in which order cfn-init runs the parts of the configuration is essential, as it does not follow the order in which the elements appear in the template.
First, the configSet defines the order of the configs. Whichever appears later in the list will be run later.
Second, inside a config the elements are run in an order that is defined by AWS. For example, it always installs packages before running the commands. It's a best practice to keep the template in the same order in which cfn-init will run the elements.
And third, there is ordering inside a config step too, if it makes sense. It's especially important with commands as those will be run in alphabetical order. It's a best practice to start them with 01, 02, and so on.
For example, installing Node.js requires two configs because it needs to run a command first and install a package next:
Resources:
  Instance:
    Metadata:
      "AWS::CloudFormation::Init":
        configSets:
          setup:
            - add_nodejs_repo
            - install_nodejs
        add_nodejs_repo:
          commands:
            01_add_repo:
              command: |
                curl -sL https://rpm.nodesource.com/setup_14.x | bash -
        install_nodejs:
          packages:
            yum:
              nodejs: []
              "gcc-c++": []
              make: []
The setup configSet ensures that the add_nodejs_repo config with the command will be the first one to run and the install_nodejs config with the packages will be the second.
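The alphabetical ordering of commands mentioned above is why zero-padded prefixes matter. The sketch below sorts two sets of hypothetical command names with sort, on the assumption that cfn-init compares the names as plain strings (the byte-wise LC_ALL=C ordering is used here as an approximation). Without padding, a name starting with 10 sorts before one starting with 1_ or 2_.

```shell
# Hypothetical command names, as they might appear as keys under "commands:".
unpadded=$(printf '%s\n' 1_add_repo 2_build 10_cleanup | LC_ALL=C sort | paste -s -d ' ' -)
padded=$(printf '%s\n' 01_add_repo 02_build 10_cleanup | LC_ALL=C sort | paste -s -d ' ' -)

echo "unpadded: $unpadded"
echo "padded:   $padded"
```

With the 01_, 02_ prefixes the sorted order matches the order a reader would expect from the numbers.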
Tips
While you are writing the cfn-init scripts, it's better to comment out the CreationPolicy in the template. If you make a mistake with it in place, CloudFormation rolls back the whole stack and destroys the instance. Just don't forget to uncomment it when you're done.
You can also run the cfn-init command yourself after you SSH into the instance, which makes for a quick deploy-and-try-out process. Just right click on the instance, get the user data, and copy-paste the cfn-init part.
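As a rough illustration of what that copy-pasted line looks like, the sketch below only prints the command and does not execute it. The flags are the ones from the user data above. The stack name my-stack and the region eu-west-1 are placeholder values, and the sudo prefix is an assumption: user data runs as root, while an SSH session usually does not.

```shell
# Prints (does not run) the cfn-init command for a given stack and region.
# The resource name "Instance" and the configSet "setup" match the template above.
cfn_init_command() {
  stack="$1"
  region="$2"
  echo "sudo /opt/aws/bin/cfn-init -v --stack $stack --resource Instance --configsets setup --region $region"
}

# "my-stack" and "eu-west-1" are placeholders for your own values.
printed=$(cfn_init_command my-stack eu-west-1)
echo "$printed"
```

The user data shown in the console already has the real stack name and region filled in, so in practice copying that line is enough.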
You can inspect the logs at /var/log/cfn-init.log and /var/log/cfn-init-cmd.log. These are extremely useful for debugging.
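A small helper can make checking both logs less tedious. This is a minimal sketch: the function name show_cfn_logs is made up for illustration, and the optional directory argument exists only so the helper can be tried against a different folder than /var/log.

```shell
# Prints the last 20 lines of each cfn-init log found in a directory.
# The directory defaults to /var/log, where the two logs named above live.
show_cfn_logs() {
  log_dir="${1:-/var/log}"
  for name in cfn-init.log cfn-init-cmd.log; do
    if [ -r "$log_dir/$name" ]; then
      echo "== $name =="
      tail -n 20 "$log_dir/$name"
    fi
  done
}

# Prints nothing when the logs are missing or not readable by the current user.
show_cfn_logs
```

On the instance itself you may need to run it as root if the log files are not readable by your user.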
It's better to keep the template in a way that reflects the actual ordering of the elements. Keep the configs according to the configSet, and the parts of the config according to their implicit ordering. Also prefix the commands with 01_, 02_, and so on, just to make it easy to see which comes after which.
The WordPress AWS-provided template is a great reference point: https://s3.eu-west-1.amazonaws.com/cloudformation-templates-eu-west-1/WordPress_Single_Instance.template.
A command accepts a test argument that defines whether the command needs to be run or not. It's especially useful for running something only once, such as adding test data to the database only if it's empty. The test checks the exit code of the script. For example, docker-compose --version succeeds only if docker compose is installed. To invert the exit code, append ; (( $? != 0 )). For example, test: docker-compose --version >/dev/null 2>&1; (( $? != 0 )) runs the command only if docker compose is not installed.
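The inversion trick above can be tried locally without docker compose. In the sketch below, true and false are stand-ins for a check that succeeds or fails, and the hypothetical helper run_or_skip prints "run" when the test expression exits with 0 and "skip" otherwise. It assumes the test is evaluated by bash, since (( )) is a bash construct. The last line shows the shell's ! operator as an alternative way to invert an exit code.

```shell
# Evaluates a test expression with bash and reports what a command guarded by it would do.
run_or_skip() {
  if bash -c "$1"; then echo run; else echo skip; fi
}

# "true" stands in for a check that succeeds (the tool is installed).
installed_plain=$(run_or_skip 'true >/dev/null 2>&1')
installed_inverted=$(run_or_skip 'true >/dev/null 2>&1; (( $? != 0 ))')

# "false" stands in for a check that fails (the tool is missing).
missing_plain=$(run_or_skip 'false >/dev/null 2>&1')
missing_inverted=$(run_or_skip 'false >/dev/null 2>&1; (( $? != 0 ))')

# The ! operator inverts the exit code as well.
missing_bang=$(run_or_skip '! false >/dev/null 2>&1')

echo "installed: plain=$installed_plain inverted=$installed_inverted"
echo "missing:   plain=$missing_plain inverted=$missing_inverted bang=$missing_bang"
```

Only the inverted forms report "run" when the check fails, which is the behavior wanted for "install it only if it is missing".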
Conclusion
The cfn-init scripts offer an easy way to define how to initialize an EC2 instance right in CloudFormation. They provide a modularized and more declarative way to manage packages and run commands than the user data.