AWS: How to query the available CPU credits for t2/t3 instances
How to know how much CPU power your burstable instance has
Motivation
The instances in the t2 and t3 instance families, i.e. the instance types that start with either t2. or t3., are burstable. That means the
instance collects CPU credits over time that can be spent later. If in the long run you use less than what you earn (the baseline performance, between
5% and 40% of the available CPU, depending on the instance type), you won't even notice how this system works. But if you use more than that, the instance
will either get throttled (standard mode) or you'll be charged for the excess usage (unlimited mode).
It is useful to know how much horsepower an instance has left. Unfortunately, from inside the instance itself it is hard to tell whether it is overused or not.
But since this data can be queried from CloudWatch, it is just a matter of scripting to get an up-to-date overview of the numbers.
CPU credit balance
The credit balance is automatically posted to CloudWatch by AWS under the AWS/EC2 namespace. It is reported at a fixed interval of 5 minutes, which cannot be shortened even when detailed monitoring is enabled.
The metric that keeps track of the available credits is called CPUCreditBalance.
When using the AWS CLI, the --dimensions parameter can be used to filter for the instance ID, like this: --dimensions Name=InstanceId,Value=$INSTANCE_ID
But how do you know the instance ID? It can be queried from other APIs, or, if you are running the script on the instance itself, from the metadata service:
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
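On instances where the metadata service only accepts token-based requests (IMDSv2), the plain curl call above does not return the ID. A minimal sketch of the token flow follows. The function name imds_get and the 60-second token lifetime are my own choices for illustration, not part of the original script.

```shell
# Fetch one value from the EC2 instance metadata service using IMDSv2.
# imds_get is a hypothetical helper name; the 60 s token TTL is an arbitrary choice.
imds_get() {
  imds_base="http://169.254.169.254/latest"
  # Step 1: request a short-lived session token.
  imds_token=$(curl -s -X PUT "$imds_base/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  # Step 2: present the token when reading the metadata path.
  curl -s -H "X-aws-ec2-metadata-token: $imds_token" "$imds_base/meta-data/$1"
}

# Usage (only works on an EC2 instance):
#   INSTANCE_ID=$(imds_get instance-id)
#   INSTANCE_TYPE=$(imds_get instance-type)
```

The same helper covers any other metadata path, so the instance type can be read through it as well.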
Another set of required parameters is the time interval, consisting of a start and an end time. Both are expected in ISO-8601 format, which is
readily available using date --iso-8601=seconds
The --end-time is the current time, so that the call returns the most recent data point: --end-time $(date --iso-8601=seconds)
The --start-time should be at least 5 minutes in the past: --start-time $(date --iso-8601=seconds -d "10 mins ago")
It is better to specify a longer, but not too long, interval and then sort and filter the results than to use one that is too short and risk not
consistently getting back a data point. Even a longer interval returns nothing when the instance was recently started, as there are no metrics available yet.
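Both timestamps can also be derived from a single clock reading, so the window is exactly as long as intended. This is a sketch assuming GNU date, the same tool that provides the --iso-8601 flag. The function name metric_window and the 15-minute default are illustrative choices of mine.

```shell
# Print "<start> <end>" for a window ending now, both as ISO-8601 timestamps.
# Assumes GNU date; metric_window and its 900 s default are hypothetical choices.
metric_window() {
  window_seconds=${1:-900}
  now=$(date +%s)
  start=$(date --iso-8601=seconds -d "@$((now - window_seconds))")
  end=$(date --iso-8601=seconds -d "@$now")
  printf '%s %s\n' "$start" "$end"
}

# Usage:
#   window=$(metric_window 900)
#   START_TIME=${window% *}   # part before the space
#   END_TIME=${window#* }     # part after the space
```

A 15-minute window spans three reporting intervals, which leaves some slack for a late data point without pulling in much extra data.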
The --statistics parameter defines how the data points are aggregated. Since --period 300 matches the 5-minute reporting interval, each period holds
a single data point and nothing is actually aggregated, so this call returns the raw values. Therefore it makes no difference which statistic you choose,
apart from SampleCount, which counts the data points instead of reporting their value. I'll use Maximum in these examples. Just make
sure to read the matching field from the response.
The CloudWatch CLI call is then:
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUCreditBalance \
--start-time $(date --iso-8601=seconds -d "10 mins ago") \
--end-time $(date --iso-8601=seconds) \
--period 300 \
--statistics Maximum \
--dimensions Name=InstanceId,Value=$INSTANCE_ID
This returns a JSON similar to this one:
{
"Label": "CPUCreditBalance",
"Datapoints": [
{
"Timestamp": "2019-04-13T08:59:00Z",
"Maximum": 104.93587318333333,
"Unit": "Count"
},
{
"Timestamp": "2019-04-13T09:04:00Z",
"Maximum": 105.02401506666666,
"Unit": "Count"
}
]
}
Notice that there are 2 data points in this example. With some jq magic it is easy to extract the value with the highest timestamp:
| jq '.Datapoints | sort_by(.Timestamp | fromdateiso8601) | .[-1].Maximum'
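When the instance has only just started, Datapoints is empty and the filter above yields null. The sketch below wraps the extraction in a function that prints nothing and returns a non-zero status in that case. It sorts on the raw timestamp string instead of calling fromdateiso8601. That is a simplification of mine, and it assumes every timestamp in one response uses the same format and zone, so that string order equals time order. The function name latest_maximum is hypothetical.

```shell
# Read a get-metric-statistics response on stdin and print the newest Maximum.
# Prints nothing and returns non-zero when there are no data points.
# A balance of exactly 0 is still printed, since jq treats 0 as a valid value.
latest_maximum() {
  jq -er '.Datapoints | sort_by(.Timestamp) | .[-1].Maximum // empty'
}

# Usage with a trimmed-down response (values are made up):
#   echo '{"Datapoints":[
#     {"Timestamp":"2019-04-13T09:04:00Z","Maximum":105.25},
#     {"Timestamp":"2019-04-13T08:59:00Z","Maximum":104.5}]}' | latest_maximum
#   prints 105.25
```

Checking the exit status lets a caller tell "no data yet" apart from a real balance of zero.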
The full script that returns the CPU credits for an instance is:
CPU_POSITIVE_BALANCE=$(aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUCreditBalance \
--start-time $(date --iso-8601=seconds -d "10 mins ago") \
--end-time $(date --iso-8601=seconds) \
--period 300 \
--statistics Maximum \
--dimensions Name=InstanceId,Value=$INSTANCE_ID |
jq '.Datapoints | sort_by(.Timestamp | fromdateiso8601) | .[-1].Maximum')
Surplus balance
If you have unlimited mode enabled, you should also account for the surplus balance. This is a similar metric, but instead of tracking the remaining CPU credits it tracks the credits spent beyond the earned balance, i.e. the negative balance.
The concept is the same, but the metric name is CPUSurplusCreditBalance:
CPU_SURPLUS_BALANCE=$(aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUSurplusCreditBalance \
--start-time $(date --iso-8601=seconds -d "10 mins ago") \
--end-time $(date --iso-8601=seconds) \
--period 300 \
--statistics Maximum \
--dimensions Name=InstanceId,Value=$INSTANCE_ID |
jq '.Datapoints | sort_by(.Timestamp | fromdateiso8601) | .[-1].Maximum')
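The two calls differ only in the metric name, so they can be folded into one function. The sketch below also swaps jq for the CLI's own --query option, which takes a JMESPath expression. That is an alternative I am substituting here, not what the script above does. The function name latest_credit_metric is hypothetical, and the 15-minute window is an arbitrary choice.

```shell
# Print the newest Maximum of an AWS/EC2 metric for one instance.
# Usage: latest_credit_metric <metric-name> <instance-id>
# latest_credit_metric is a hypothetical helper; it assumes GNU date and the AWS CLI.
latest_credit_metric() {
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name "$1" \
    --start-time "$(date --iso-8601=seconds -d '15 mins ago')" \
    --end-time "$(date --iso-8601=seconds)" \
    --period 300 \
    --statistics Maximum \
    --dimensions "Name=InstanceId,Value=$2" \
    --query 'sort_by(Datapoints, &Timestamp)[-1].Maximum' \
    --output text
}

# Usage:
#   CPU_POSITIVE_BALANCE=$(latest_credit_metric CPUCreditBalance "$INSTANCE_ID")
#   CPU_SURPLUS_BALANCE=$(latest_credit_metric CPUSurplusCreditBalance "$INSTANCE_ID")
```

As far as I know, text output prints None when the expression evaluates to null, so the caller should check for a non-numeric result before doing arithmetic on it.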
Then to calculate the real credit balance, subtract the surplus balance from the positive one. This can be safely done as one of the two metrics is always 0.
CPU_BALANCE=$(echo "$CPU_POSITIVE_BALANCE - $CPU_SURPLUS_BALANCE" | bc)
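A data point can be missing, for example right after launch, in which case the jq filter prints null instead of a number. The sketch below performs the same subtraction but checks its inputs first. Treating a missing surplus as 0 and a missing positive balance as an error is a design choice of mine, as are the helper names is_number and net_credit_balance. It uses awk rather than bc for the arithmetic.

```shell
# Succeeds only if the argument looks like a decimal number.
is_number() {
  printf '%s\n' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$'
}

# Subtract the surplus balance from the positive balance, to two decimals.
# A non-numeric surplus (empty, "null", "None") counts as 0;
# a non-numeric positive balance makes the function return non-zero.
net_credit_balance() {
  is_number "$1" || return 1
  surplus=$2
  is_number "$surplus" || surplus=0
  awk -v p="$1" -v s="$surplus" 'BEGIN { printf "%.2f\n", p - s }'
}

# Usage:
#   CPU_BALANCE=$(net_credit_balance "$CPU_POSITIVE_BALANCE" "$CPU_SURPLUS_BALANCE")
```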
Maximum CPU credits
The maximum amount of CPU credits affects how much the instance can accumulate as well as when you'll be charged for the surplus credits. I consider it vital info.
The maximum value is how many credits the instance gets in 24 hours. There are tables in the AWS docs, but I couldn't find them in a parseable format.
To remedy this, I made a JSON file that you can use to query this data. If you see any discrepancy between the docs and the JSON, please open a PR.
With this repository in place you can get the hourly collected credits for burstable instance types. Calculating the maximum is then just a matter of multiplication:
CPU_CREDITS_PER_HOUR=$(curl -s https://sashee.github.io/aws-data/burstable_instances_cpu_credit_per_hour.json | \
jq --arg instanceType "$INSTANCE_TYPE" '.[$instanceType]')
MAX_CREDITS=$(echo "$CPU_CREDITS_PER_HOUR * 24" | bc)
Where to get the instance type? If you use the metadata service:
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
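The lookup and the multiplication can also be sketched as one function that reports an unknown instance type instead of multiplying null. The JSON in the usage example is a made-up two-entry sample in the shape the jq filter above implies (instance type mapped to hourly credits), not a copy of the published file. The function name max_credits is hypothetical.

```shell
# Read the credits-per-hour table on stdin and print the 24-hour maximum
# for the given instance type. Returns non-zero if the type is not listed.
max_credits() {
  jq -er --arg instanceType "$1" '.[$instanceType] // empty | . * 24'
}

# Usage with sample data:
#   echo '{"t2.small": 12, "t2.micro": 6}' | max_credits t2.small
#   prints 288
#
# Usage with the published file:
#   MAX_CREDITS=$(curl -s https://sashee.github.io/aws-data/burstable_instances_cpu_credit_per_hour.json |
#     max_credits "$INSTANCE_TYPE")
```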
Printing
Now that you have both the CPU_BALANCE and the MAX_CREDITS, it is time to print them:
# %.0f for both values, in case MAX_CREDITS is not a whole number (which %d rejects)
printf "%.0f/%.0f" "$CPU_BALANCE" "$MAX_CREDITS"
This will print an informative status, for example:
105/288
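For a status bar, a percentage next to the raw numbers can be easier to read at a glance. This is a small sketch, with credit_status as a hypothetical function name. It uses awk so that fractional inputs are rounded rather than rejected.

```shell
# Print "<balance>/<max> (<percent>%)" with all values rounded to whole numbers.
# Returns non-zero when the maximum is zero, negative, or not a number.
credit_status() {
  awk -v b="$1" -v m="$2" 'BEGIN {
    if (m + 0 <= 0) exit 1          # avoid dividing by zero
    printf "%.0f/%.0f (%.0f%%)\n", b, m, 100 * b / m
  }'
}

# Usage:
#   credit_status 105.02 288
#   prints 105/288 (36%)
```

A negative balance, as produced by surplus credits in unlimited mode, shows up as a negative percentage.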
Conclusion
With some scripting you can get the up-to-date status of your instance. It can be easily integrated with tmux, zsh, or any other tool you use. This can be a convenient indicator of where you stand in terms of bursting capabilities.