Efficient multi-level pagination in GraphQL

How to implement pagination in nested structures

Tamás Sallai

Paginating responses

Pagination is needed almost every time an endpoint returns a list of items. There are exceptions, for example when the number of returned items has a known upper bound, but in general it is bad API design to let the API crash when it tries to return too many values. This is especially a problem with serverless applications as they enforce the "quick and small" approach by limiting the running time and the response size.

The solution is, of course, pagination. The API returns a limited number of items and the client can fetch more pages in subsequent queries. For example, a frequently used approach in AppSync is to return a paginated type that has a list and a token:

type PaginatedUsers {
	users: [User!]!
	nextToken: String
}

type Query {
	allUsers(nextToken: String): PaginatedUsers!
}

The nextToken is both an input and an output here. The first page gets no input token:

query MyQuery {
	allUsers {
		users {
			id
		}
		nextToken
	}
}

Then the response may or may not include an output nextToken. If it does, the next query's input token will be this value:

query MyQuery {
	allUsers(nextToken: "...") {
		users {
			id
		}
		nextToken
	}
}

The client keeps fetching pages until a response has no output token. At that point it has all the results from the API, while each individual response stays small and constrained.
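The loop described above can be sketched in a few lines. This is only an illustration: fetchAllUsersPage is a hypothetical in-memory stand-in for the allUsers query, and the page size and the token format (the start index as a string) are assumptions made for the example, not something AppSync defines.

```javascript
// Hypothetical in-memory backend standing in for the allUsers query.
// It serves 2 users per page and uses the start index as the token.
const ALL_USERS = [{id: "id1"}, {id: "id2"}, {id: "id3"}, {id: "id4"}, {id: "id5"}];
const PAGE_SIZE = 2;

const fetchAllUsersPage = async (nextToken) => {
	const start = nextToken ? Number(nextToken) : 0;
	const end = start + PAGE_SIZE;
	return {
		users: ALL_USERS.slice(start, end),
		// null signals that there are no more pages
		nextToken: end < ALL_USERS.length ? String(end) : null,
	};
};

// Keep requesting pages until a response comes back without a nextToken
const fetchAllUsers = async (fetchPage) => {
	const users = [];
	let nextToken = undefined;
	do {
		const page = await fetchPage(nextToken);
		users.push(...page.users);
		nextToken = page.nextToken;
	} while (nextToken);
	return users;
};

fetchAllUsers(fetchAllUsersPage).then((users) => console.log(users.length)); // prints 5
```

The first call passes no token, and every later call passes the token from the previous response, which mirrors the two queries shown above.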

Multi-level pagination

Pagination affects all types of APIs, but GraphQL adds an extra twist: pagination on multiple levels.

Let's say a photo-sharing app has users and they have images with a schema that looks like this:

type User {
	id: ID!
	username: String!
	images(nextToken: String): PaginatedImages!
}

type PaginatedUsers {
	users: [User!]!
	nextToken: String
}

type PaginatedImages {
	images: [Image!]!
	nextToken: String
}

type Image {
	url: AWSURL!
}

type Query {
	allUsers(nextToken: String): PaginatedUsers!
	user(id: ID!): User!
}

Notice that getting all images for all users requires two layers of pagination:

  • The allUsers returns a PaginatedUsers type
  • Then the User.images returns a PaginatedImages

This means a query that fetches users and also their images might get a response with multiple pagination tokens:

query MyQuery($nextToken: String) {
	allUsers(nextToken: $nextToken) {
		users {
			id
			images {
				images {
					url
				}
				nextToken
			}
		}
		nextToken
	}
}

And a response with 2 tokens:

{
	"data": {
		"allUsers": {
			"users": [
				{
					"id": "id2",
					"images": {
						"images": [
							{
								"url": "..."
							}
						],
						"nextToken": "token1"
					}
				}
			],
			"nextToken": "token2"
		}
	}
}

The client should continue pagination on every token it receives. Implementing an efficient algorithm for this turns out to be quite complicated.
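To make "all tokens" concrete, here is a minimal sketch that walks a response shaped like the one above and lists every place where pagination has to continue. The collectPendingTokens helper and its output format are made up for this illustration.

```javascript
// List every token in an allUsers response that still needs to be followed.
// The {kind, ...} output format is an assumption of this sketch.
const collectPendingTokens = (response) => {
	const {users, nextToken} = response.data.allUsers;
	const pending = [];
	for (const user of users) {
		if (user.images.nextToken) {
			// More images for this user: the user's id is needed to continue
			pending.push({kind: "images", userId: user.id, nextToken: user.images.nextToken});
		}
	}
	if (nextToken) {
		// More users: continues the top-level allUsers query
		pending.push({kind: "users", nextToken});
	}
	return pending;
};

const response = {
	data: {
		allUsers: {
			users: [
				{id: "id2", images: {images: [{url: "..."}], nextToken: "token1"}},
			],
			nextToken: "token2",
		},
	},
};

console.log(collectPendingTokens(response));
```

Each user in the page can contribute its own token, so a single response can leave many paginations open at the same time, plus one more for the user list itself.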

Problem: Nested pagination changes the query

Pagination usually means sending the same query multiple times. But not here: to fetch the next page of images the query should not start from the allUsers again. Instead, it should fetch just the one specific user with the images:

query MyQuery($id: ID!, $nextToken: String) {
	user(id: $id) {
		images(nextToken: $nextToken) {
			images {
				url
			}
			nextToken
		}
	}
}

All pages after the first one should follow this structure, using the token returned by the previous or the first query.

Problem: Result structure

The second problem is what the result should look like.

Since the different queries have different structures, it is not useful to just collect all the results and return them in an array. Instead, a paginated response should look like a non-paginated one. This way it can be transparent to the caller whether pagination was necessary or not.

For example, the images field should contain the image objects of all the pages concatenated, and its nextToken should be overwritten with the one from the last response (which is null if all pages are fetched):

{
	"data": {
		"allUsers": {
			"users": [
				{
					"username": "User 2",
					"id": "id2",
					"images": {
						"images": [
							{"url": "..."},
							{"url": "..."}
						],
						"nextToken": null
					}
				}
			],
			...
			"nextToken": null
		}
	}
}
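The merge rule for a single paginated field can be sketched as follows. The mergePages helper is hypothetical and the URLs are placeholder values; it only illustrates the "concatenate the items, keep the last token" idea.

```javascript
// Merge follow-up pages into the first page so that the result has the
// same shape as a single, non-paginated response.
const mergePages = (firstPage, laterPages, itemsField) => {
	const allPages = [firstPage, ...laterPages];
	return {
		...firstPage,
		// concatenate the items of every page in order
		[itemsField]: allPages.flatMap((page) => page[itemsField]),
		// the token of the last page wins (null when everything is fetched)
		nextToken: allPages[allPages.length - 1].nextToken,
	};
};

// The first page comes from the allUsers response, the second from a follow-up user query
const firstImages = {images: [{url: "https://example.com/1.jpg"}], nextToken: "token1"};
const secondImages = {images: [{url: "https://example.com/2.jpg"}], nextToken: null};

const merged = mergePages(firstImages, [secondImages], "images");
console.log(JSON.stringify(merged, null, 2));
```

Because the merged object has the same fields as a single page, the caller cannot tell whether one or several requests produced it.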

Implementation

Here's a generic solution that can be adapted to any schema structure:

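// sendQuery(nextToken, previousPages, ...args) fetches one page
// processItem(item) post-processes each item and can return a Promise
// fieldNames = {items, nextToken} names the fields of the paginated type
const makePaginatedFetcher = (sendQuery, processItem, fieldNames) => (...args) => async (currentPage = undefined, previousPages = []) => {
	// process the current page and fetch the following pages in parallel
	const [currentItems, nextItems] = await Promise.all([
		(async () => {
			return {
				...currentPage,
				[fieldNames.items]: currentPage ? await Promise.all(currentPage[fieldNames.items].map(processItem)) : [],
			};
		})(),
		(async () => {
			// send a query if there is no page yet or the current page has a token
			if (!currentPage || currentPage[fieldNames.nextToken]) {
				const nextPage = await sendQuery(currentPage?.[fieldNames.nextToken], previousPages, ...args);
				if (nextPage) {
					// recurse: the new page becomes the current one
					return makePaginatedFetcher(sendQuery, processItem, fieldNames)(...args)(nextPage, [...previousPages, nextPage[fieldNames.items]]);
				}
			}
		})(),
	]);
	// concatenate the items and keep the token of the last fetched page
	return {
		...currentItems,
		[fieldNames.items]: [...(currentItems[fieldNames.items] ?? []), ...(nextItems?.[fieldNames.items] ?? [])],
		[fieldNames.nextToken]: nextItems ? nextItems[fieldNames.nextToken] : currentItems[fieldNames.nextToken],
	};
};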

It needs 3 configuration items:

  • sendQuery defines how to send a paginated query
  • processItem gets called with each item, handling things like nested pagination
  • fieldNames defines the fields for the items and the pagination token

To use it, define a fetcher for the images:

const imagesSubquery = `
	images {
		url
	}
	nextToken
`;

const fetchImages = makePaginatedFetcher(async (nextToken, _previousPages, userid) => {
	return (await query(`
		query MyQuery($id: ID!, $nextToken: String) {
			user(id: $id) {
				images(nextToken: $nextToken) {
					${imagesSubquery}
				}
			}
		}
	`, {id: userid, nextToken})).user.images;
}, (image) => image, {items: "images", nextToken: "nextToken"})

Note that the first page of images comes from the other (allUsers) query, so this fetcher only needs to define the user query for the subsequent pages.

Then the user fetcher:

const fetchUsers = makePaginatedFetcher(async (nextToken) => {
	return (await query(`
		query MyQuery($nextToken: String) {
			allUsers(nextToken: $nextToken) {
				users {
					username
					id
					images {
						${imagesSubquery}
					}
				}
				nextToken
			}
		}
	`, {nextToken})).allUsers;
}, async (user) => {
	return {
		...user,
		images: await fetchImages(user.id)(user.images),
	};
}, {items: "users", nextToken: "nextToken"});

The processItem function (the second argument) kicks in for every user and calls fetchImages, which returns all the images of that user, paginating as necessary.

And that's it, the only thing left is to call it:

const users = await fetchUsers()()

The result is all the users with all the images in a format that hides whether pagination happened or not. This makes it easy to switch an existing non-paginated query to the paginated one.
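To try the fetchers without a real API, the sketch below swaps the GraphQL calls for a hypothetical in-memory backend. The IMAGES data, the page size of 2, the index-based tokens, and the requests log are all assumptions made for this example. makePaginatedFetcher is copied from above so that the block runs on its own.

```javascript
// Copied from the article so that the sketch runs on its own
const makePaginatedFetcher = (sendQuery, processItem, fieldNames) => (...args) => async (currentPage = undefined, previousPages = []) => {
	const [currentItems, nextItems] = await Promise.all([
		(async () => {
			return {
				...currentPage,
				[fieldNames.items]: currentPage ? await Promise.all(currentPage[fieldNames.items].map(processItem)) : [],
			};
		})(),
		(async () => {
			if (!currentPage || currentPage[fieldNames.nextToken]) {
				const nextPage = await sendQuery(currentPage?.[fieldNames.nextToken], previousPages, ...args);
				if (nextPage) {
					return makePaginatedFetcher(sendQuery, processItem, fieldNames)(...args)(nextPage, [...previousPages, nextPage[fieldNames.items]]);
				}
			}
		})(),
	]);
	return {
		...currentItems,
		[fieldNames.items]: [...(currentItems[fieldNames.items] ?? []), ...(nextItems?.[fieldNames.items] ?? [])],
		[fieldNames.nextToken]: nextItems ? nextItems[fieldNames.nextToken] : currentItems[fieldNames.nextToken],
	};
};

// Hypothetical data: user id -> image URLs (placeholder values)
const IMAGES = {id1: ["a1", "a2", "a3"], id2: ["b1"], id3: ["c1", "c2"]};
// Log of the simulated network requests
const requests = [];

// Returns one page of a list, using the start index as the token
const slicePage = (list, token, size) => {
	const start = token ? Number(token) : 0;
	const end = start + size;
	return {items: list.slice(start, end), nextToken: end < list.length ? String(end) : null};
};

const imagesPage = (userId, token) => {
	const {items, nextToken} = slicePage(IMAGES[userId], token, 2);
	return {images: items.map((url) => ({url})), nextToken};
};

const fetchImages = makePaginatedFetcher(async (nextToken, _previousPages, userId) => {
	requests.push(`user(${userId}).images(${nextToken})`);
	return imagesPage(userId, nextToken);
}, (image) => image, {items: "images", nextToken: "nextToken"});

const fetchUsers = makePaginatedFetcher(async (nextToken) => {
	requests.push(`allUsers(${nextToken ?? "-"})`);
	const {items, nextToken: newToken} = slicePage(Object.keys(IMAGES), nextToken, 2);
	// like the real query, each user comes with the first page of its images
	return {users: items.map((id) => ({id, images: imagesPage(id, undefined)})), nextToken: newToken};
}, async (user) => ({
	...user,
	images: await fetchImages(user.id)(user.images),
}), {items: "users", nextToken: "nextToken"});

const result = fetchUsers()();
result.then((users) => {
	console.log(JSON.stringify(users, null, 2));
	console.log(requests);
});
```

With this made-up data set the log ends up with 3 requests: two for allUsers and a single follow-up for the one user whose images do not fit on the first page. The other users need no extra request because their first images page already arrives with the allUsers response.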

[Screenshot: the network tab showing the requests that were made]

Conclusion

Nested pagination is a unique problem in GraphQL, as GraphQL allows querying nested structures in a single request. Nesting in queries is the main advantage of using GraphQL as it saves a lot of roundtrips, but it also makes pagination harder to implement.

The utility function shown in this article solves pagination in an efficient way: subsequent requests are only sent when they are needed and no data is fetched twice.

March 21, 2023