How to delay calling a Lambda function using Step Functions

Call a state machine with a wait task first

Tamás Sallai

Delaying

Lambda is event-driven, meaning it starts running when an event triggers it. Because of this, it does not support delaying that event and starting the processing at a later time.

While a function could wait as part of its code, that is an expensive operation, as you pay for the total runtime of the function multiplied by its memory allocation. Also, a function can run for at most 15 minutes. In theory, you could use a small Lambda for waiting that calls itself to get around this hard execution limit, but that is more of an interesting development challenge than a realistic approach.

In AWS, you can use Step Functions to solve this instead. It is an orchestration service that lets you define a workflow as a state machine that can use other services and built-in operations. One of these built-in operations is the Wait task, which pauses the execution for a configurable amount of time. What makes it a viable solution is that, contrary to a Lambda function, you don't pay for the total execution time: a Standard workflow is billed per state transition, not for the time spent waiting.

A delayer state machine has two states: a Wait and a Task that calls a Lambda function.

The Wait task

This task stops the execution for some time. The interesting part here is how to define how long that pause should be.

The task supports a constant number of seconds via the Seconds parameter, but it can also take a path, via the SecondsPath parameter, that references a value in its input.

A state machine-based orchestration service defines not just what tasks to run but also what data flows between those tasks. In this simple example, the state machine gets some input and passes it to the Wait task. That task can then use the input to know how long to wait.

{
	"States": {
		"Wait": {
			"Type": "Wait",
			"SecondsPath": "$.delay_seconds",
			"Next": "..."
		}
	}
}

The parameter is SecondsPath, and it defines a path: $.delay_seconds. Step Functions looks up the delay_seconds field in the input object, which must be a number, and waits for that many seconds. The great thing about this structure is that the caller defines the delay instead of it being hardcoded.
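To make the lookup concrete, here is a minimal Python sketch of what resolving $.delay_seconds against the input amounts to. The resolve_simple_path helper is hypothetical and deliberately simplified: it handles only "$" followed by plain field names, not the full path syntax that Step Functions accepts.

```python
import json


def resolve_simple_path(path, data):
    """Resolve a dotted path such as "$.delay_seconds" against data.

    Illustration only: supports "$" followed by plain field names.
    """
    if not path.startswith("$"):
        raise ValueError("a reference path must start with '$'")
    value = data
    for field in path[1:].split("."):
        if field:  # skip the empty piece produced by the leading "."
            value = value[field]
    return value


# The same input object that the caller passes to the state machine
execution_input = json.loads('{"delay_seconds": 5, "test": "value"}')

delay = resolve_simple_path("$.delay_seconds", execution_input)
print(delay)  # 5
```

A path of just "$" refers to the whole input object, which is why the helper returns the data unchanged in that case.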

Calling the Lambda

The second state is a Task that calls a Lambda function. It needs the function's ARN, as usual, and the state machine also needs the lambda:InvokeFunction permission.

To get that permission, the state machine uses an IAM role that trusts the states.amazonaws.com service and has an identity-based policy that allows lambda:InvokeFunction.
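As a rough sketch, the two policy documents for such a role could look like the following, written here as Python dictionaries and printed as JSON. The FUNCTION_ARN value is a made-up placeholder, and the article's own deployment may structure these differently (for example, as Terraform resources).

```python
import json

# Hypothetical placeholder: substitute the ARN of the function to call
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:example"

# Trust policy: lets the Step Functions service assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "states.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Identity-based policy: lets the role invoke this one function
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": FUNCTION_ARN,
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

Scoping Resource to a single function ARN keeps the role from invoking any other function in the account.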

States definition

Combining all of the above, the final states definition:

{
	"StartAt": "Wait",
	"States": {
		"Wait": {
			"Type": "Wait",
			"SecondsPath": "$.delay_seconds",
			"Next": "Call function"
		},
		"Call function": {
			"Type": "Task",
			"Resource": "${aws_lambda_function.lambda.arn}",
			"End": true
		}
	}
}

It starts at the Wait task, which pauses the execution based on the delay_seconds input. When it wakes up, it moves to the Call function state, which calls the Lambda function. When that finishes, the execution ends.

Testing

Let's use the AWS CLI to call the state machine and start an execution:

aws stepfunctions start-execution \
	--state-machine-arn $(terraform output -raw delayer_arn) \
	--input '{"delay_seconds": 5, "test":"value"}'

The example code is deployed via Terraform, and its delayer_arn output is the ARN of the state machine. The --input argument defines an object with a delay_seconds and a test value.

The log shows that the Lambda function is called after the delay.

The function gets the whole input object in the event argument.
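For illustration, a function that logs what it receives might look like this minimal Python handler. This is a sketch rather than the article's actual function, which may use a different runtime; the handler name and the returned object are assumptions.

```python
def handler(event, context):
    # The Wait task passes its input through unchanged, so the event is
    # the same object that was given to start-execution.
    print("delay_seconds:", event["delay_seconds"])
    print("test:", event["test"])
    return {"received": event}


# Local call with the input from the CLI example; context is unused here
result = handler({"delay_seconds": 5, "test": "value"}, None)
```

Note that delay_seconds is still part of the event, so the function can see how long its invocation was delayed.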

Absolute timing

Instead of waiting for N seconds, the Wait task also supports a timestamp, and it halts the execution until that time is reached. These are the Timestamp option and its TimestampPath counterpart:

{
	"Wait": {
		"Type": "Wait",
		"TimestampPath": "$.at",
		"Next": "Call function"
	}
}

The timestamp format is ISO 8601 with some additional restrictions; for example, it must not contain a numeric time zone offset. I found that the %Y-%m-%dT%H:%M:%SZ format works fine when the timestamp is in UTC.

To test it, call the state machine with a timestamp that is 10 seconds from now:

aws stepfunctions start-execution \
	--state-machine-arn $(terraform output -raw scheduler_arn) \
	--input "{\"at\": \"$(TZ='UTC' date --date="10 seconds" +"%Y-%m-%dT%H:%M:%SZ")\", \"test\":\"value\"}"
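The --date option used above is specific to GNU date, so the command may not work as-is on macOS or BSD. As a portable alternative, here is a small Python sketch that builds the same input object. The build_input helper and its now parameter are hypothetical, added so the result can be checked against a fixed time.

```python
import json
from datetime import datetime, timedelta, timezone

# Same format string as in the shell command: UTC with a literal "Z"
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_input(delay_seconds, now=None):
    """Return the JSON input with an "at" timestamp delay_seconds from now."""
    now = now or datetime.now(timezone.utc)
    # Convert to UTC so that the trailing "Z" is always accurate
    at = (now + timedelta(seconds=delay_seconds)).astimezone(timezone.utc)
    return json.dumps({"at": at.strftime(TIMESTAMP_FORMAT), "test": "value"})


# A fixed "now" makes the output reproducible
fixed_now = datetime(2022, 1, 25, 12, 0, 0, tzinfo=timezone.utc)
print(build_input(10, now=fixed_now))
# {"at": "2022-01-25T12:00:10Z", "test": "value"}
```

Calling build_input(10) without the now argument gives a timestamp 10 seconds from the current time, and the resulting string can be passed as the --input value.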

The inspector shows the steps taken.

And the logs show that the function was called after the delay.

January 25, 2022