How to paginate the AWS JS SDK using async generators

The AWS SDK returns lists in batches. Learn how to use async generators to hide the pagination.

Paginated responses

The functions of the AWS SDK that return lists are paginated operations. That means that you get a limited number of elements in one call along with a token to get the next batch. This can be a problem if you are not aware of it: you might get all the elements during development, but your function can silently miss elements later, once the list outgrows a single page.

But while getting the first batch is just await lambda.listFunctions().promise(), paginating until all the pages are retrieved requires more work.

I was curious about how to make a function that returns all elements irrespective of how many pages it needs to fetch to get them while also keeping the await-able structure.

Pagination

In the case of Lambda functions, the lambda.listFunctions call returns a structure with a list of all your lambdas if you don’t have too many of them (at most 50):

const functions = await lambda.listFunctions().promise();
{
	Functions: [...]
}

To simulate a paginated case, you can set the MaxItems parameter:

const functions = await lambda.listFunctions({
	MaxItems: 1
}).promise();

This time a NextMarker is also returned, indicating there are more items:

{
	Functions: [...],
	NextMarker: ...
}

To get the next batch, provide the last marker:

// functions is the result of the previous call
const nextPage = await lambda.listFunctions({
	MaxItems: 1,
	Marker: functions.NextMarker,
}).promise();

Repeat this until the response contains no NextMarker.
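
Written out by hand, that loop might look like the sketch below. The fakeLambda object is a hypothetical in-memory stand-in for the real client (it is not part of the AWS SDK) that knows three function names and leaves out NextMarker on the last page, so the snippet runs without credentials. The listAllManually name is also made up for this example.

```javascript
// Hypothetical in-memory stand-in for the Lambda client; not part of the AWS SDK
const fakeLambda = {
	listFunctions: ({MaxItems = 50, Marker} = {}) => ({
		promise: async () => {
			const all = ["fn-a", "fn-b", "fn-c"].map((FunctionName) => ({FunctionName}));
			const start = Marker ? Number(Marker) : 0;
			const end = start + MaxItems;
			return {
				Functions: all.slice(start, end),
				// No NextMarker on the last page
				...(end < all.length ? {NextMarker: String(end)} : {}),
			};
		},
	}),
};

// Fetches page after page, passing the last NextMarker as the next Marker
const listAllManually = async (client) => {
	const all = [];
	let Marker;
	do {
		const page = await client.listFunctions({MaxItems: 1, Marker}).promise();
		all.push(...page.Functions);
		Marker = page.NextMarker;
	} while (Marker);
	return all;
};
```

Calling listAllManually(fakeLambda) makes three calls and resolves with all three functions. It works, but the bookkeeping has to be repeated for every listing call.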

A solution with async generators

Async generators are a relatively new feature of JavaScript. They are like traditional generator functions, but they are async, meaning you can await inside them.
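
As a minimal illustration, unrelated to AWS, the sketch below defines an async generator that awaits before each yield and a helper that consumes it with for await..of. Both countTo and collect are names made up for this example.

```javascript
// A hypothetical async generator that awaits before each yield
const countTo = async function*(n) {
	for (let i = 1; i <= n; i++) {
		// Any Promise can be awaited here; this timer stands in for a network call
		await new Promise((resolve) => setTimeout(resolve, 10));
		yield i;
	}
};

// Collects every value of an async iterable into an Array
const collect = async (iterable) => {
	const res = [];
	for await (const value of iterable) {
		res.push(value);
	}
	return res;
};
```

Calling collect(countTo(3)) returns a Promise that resolves to [1, 2, 3].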

To collect all the Lambda functions no matter how many calls are needed, use:

const getAllLambdas = async () => {
	const EMPTY = Symbol("empty");

	const res = [];
	for await (const lf of (async function*() {
		let NextMarker = EMPTY;
		while (NextMarker || NextMarker === EMPTY) {
			const functions = await lambda.listFunctions({
				Marker: NextMarker !== EMPTY ? NextMarker : undefined,
			}).promise();
			yield* functions.Functions;
			NextMarker = functions.NextMarker;
		}
	})()) {
		res.push(lf);
	}

	return res;
}

// use it
const functions = await getAllLambdas();

Breaking it down

The most important thing is to keep track of the NextMarker returned by the last call and use that for making the next one. For the first call, Marker should be undefined. But a missing NextMarker also signals that the last page has been reached, so the loop needs a separate initial value to tell "no call made yet" apart from "no more pages". A Symbol is a safe option for this, as the API can never return it.

const EMPTY = Symbol("empty");
let NextMarker = EMPTY;
while (NextMarker || NextMarker === EMPTY) {
	// Marker: NextMarker !== EMPTY ? NextMarker : undefined

	NextMarker = functions.NextMarker;
}

After the call, we need to yield the functions returned:

yield* functions.Functions;

The yield* makes sure that each element is returned as a separate value by the generator.
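
The difference between yield and yield* can be seen in a small sketch. The generator and helper names here are made up for illustration.

```javascript
// Yields the whole array as a single value
const withYield = async function*() {
	yield ["a", "b"];
};

// Delegates to the array, yielding its elements one by one
const withYieldStar = async function*() {
	yield* ["a", "b"];
};

// Collects every value of an async iterable into an Array
const collect = async (iterable) => {
	const res = [];
	for await (const value of iterable) {
		res.push(value);
	}
	return res;
};
```

Here collect(withYield()) resolves to [["a", "b"]], a list with one element that is itself a list, while collect(withYieldStar()) resolves to ["a", "b"].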

Finally, a for await..of loop collects the results and returns them as an Array:

const res = [];
for await (const lf of (async function*() {
	...
})()) {
	res.push(lf);
}
return res;

To use it, just call the function and wait for the resulting Promise to resolve:

const functions = await getAllLambdas();

Making it generic

The same Marker/NextMarker pattern appears throughout the AWS SDK. But unfortunately, the naming is different for different services. For example, to get the CloudWatch Logs log groups, you need to provide a nextToken parameter instead of Marker. This makes it impossible to support all the listing functions with a single wrapper that has the parameter names built in.

Luckily, as the pattern is the same, we can make a wrapper function that handles everything but the naming:

const getPaginatedResults = async (fn) => {
	const EMPTY = Symbol("empty");
	const res = [];
	for await (const lf of (async function*() {
		let NextMarker = EMPTY;
		while (NextMarker || NextMarker === EMPTY) {
			const {marker, results} = await fn(NextMarker !== EMPTY ? NextMarker : undefined);

			yield* results;
			NextMarker = marker;
		}
	})()) {
		res.push(lf);
	}

	return res;
};

It follows the same structure as before, but it gets an fn parameter that does the actual API call and returns the list and the marker.

To get all the Lambda functions with this wrapper:

const lambdas = await getPaginatedResults(async (NextMarker) => {
	const functions = await lambda.listFunctions({Marker: NextMarker}).promise();
	return {
		marker: functions.NextMarker,
		results: functions.Functions,
	};
});

Adapted to the log groups:

const logGroups = await getPaginatedResults(async (NextMarker) => {
	// named response so that it does not shadow the outer logGroups
	const response = await logs.describeLogGroups({nextToken: NextMarker}).promise();
	return {
		marker: response.nextToken,
		results: response.logGroups,
	};
});
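
The contract of fn can also be checked without touching AWS. The sketch below repeats the wrapper from above so that it runs on its own, then defines a hypothetical fakeList function that serves five items two per page and records the marker it receives on each call. The items, PAGE_SIZE, calls, and fakeList names are assumptions made for this example only.

```javascript
// Same wrapper as above, repeated so that the snippet runs on its own
const getPaginatedResults = async (fn) => {
	const EMPTY = Symbol("empty");
	const res = [];
	for await (const lf of (async function*() {
		let NextMarker = EMPTY;
		while (NextMarker || NextMarker === EMPTY) {
			const {marker, results} = await fn(NextMarker !== EMPTY ? NextMarker : undefined);

			yield* results;
			NextMarker = marker;
		}
	})()) {
		res.push(lf);
	}

	return res;
};

// Hypothetical data source: five items served two per page
const items = ["a", "b", "c", "d", "e"];
const PAGE_SIZE = 2;
// Records the marker passed in on every call
const calls = [];

const fakeList = async (marker) => {
	calls.push(marker);
	const start = marker === undefined ? 0 : Number(marker);
	const end = start + PAGE_SIZE;
	return {
		results: items.slice(start, end),
		// undefined on the last page, which ends the loop in the wrapper
		marker: end < items.length ? String(end) : undefined,
	};
};
```

Calling getPaginatedResults(fakeList) resolves with all five items after three calls: the first one receives undefined as the marker, the next two receive "2" and "4".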

Conclusion

Paginated outputs can be problematic if you are not aware of them, and even when you are, consuming them is not straightforward. Fortunately, with async generators, you can get the same single-call, await-able structure that the other methods in the SDK offer.

30 July 2019