Hard-to-debug unhandled rejection cases

Three examples of when an unhandled rejection happens

Tamás Sallai


Code is available on GitHub


Unhandled rejections

In one of my projects I'm building a cache that works across workers. With that library, I can turn any function into a cached one: the cache key is calculated from the arguments, and if the call for a given key has already finished, the stored result is returned.

One core synchronization requirement for this type of caching is that the function is called only once for a given set of arguments. That means that if the function is already running, the caller just waits until that execution completes.

It is based on an object that keeps track of in-progress calculations. A simplified version:

const inProgress = {};

const postTask = (key, fn) => {
	if (inProgress[key] === undefined) {
		// start task
		return inProgress[key] = fn();
	}else {
		// task already running
		return inProgress[key];
	}
}
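To see the deduplication at work, the sketch below repeats the simplified postTask and calls it twice with the same key. The slowSquare task and the "square:4" key are made up for illustration. Note that this simplified version never removes finished entries from inProgress.

```javascript
// same simplified postTask as above
const inProgress = {};

const postTask = (key, fn) => {
	if (inProgress[key] === undefined) {
		// start task
		return inProgress[key] = fn();
	}else {
		// task already running
		return inProgress[key];
	}
};

// hypothetical task that counts how many times it is started
let calls = 0;
const slowSquare = () => {
	calls++;
	return new Promise((res) => setTimeout(() => res(16), 10));
};

const first = postTask("square:4", slowSquare);
const second = postTask("square:4", slowSquare);

// both callers got the same Promise and the task started only once
console.log(first === second, calls); // true 1
```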

Then I wanted to support worker threads as well. This means that if any process is currently calling the function then all parallel calls will wait for the results, even if a worker thread is doing the calculation. This mechanism builds on BroadcastChannel as threads don't share memory. A BroadcastChannel is a cross-context messaging port that enables global messaging.
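As a rough illustration of the messaging primitive, the sketch below opens two BroadcastChannel instances with the same name, standing in for the coordinator and a worker. It assumes Node.js 18 or later, where BroadcastChannel is available as a global. The channel name and the variable names are made up, and in the real setup the two ends would live in different threads.

```javascript
// two ends of the same named channel (hypothetical name)
const coordinatorSide = new BroadcastChannel("cache-demo");
const workerSide = new BroadcastChannel("cache-demo");

// resolves with the first message the coordinator side receives
const received = new Promise((res) => {
	coordinatorSide.addEventListener("message", ({data}) => res(data));
});

// a message is delivered to the other instances on the same name,
// not back to the sender
workerSide.postMessage({type: "start", key: "k1"});

received.then((data) => {
	console.log(data.type, data.key);
	// close both ends so the process can exit
	coordinatorSide.close();
	workerSide.close();
});
```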

Cross-thread coordination, however, requires a rather complex messaging protocol between the workers and the main thread. For that, I implemented a coordinator that runs on the main thread and handles the workers' requests to start tasks.

When a worker wants to call the function, it first checks with the coordinator that a call with the same arguments is not already in progress. If one is, the worker needs to wait for the finished signal. If not, the coordinator creates an entry in the inProgress object and waits for the worker to report that it finished the function.

A simplified version of the coordinator:

channel.addEventListener("message", ({data: {type, key}}) => {
	if (type === "start") {
		if (inProgress[key] === undefined) {
			// no active calls to the function
			inProgress[key] = new Promise((res, rej) => {
				// listen for finish and finish_error messages from the worker
				const handler = ({data: msg}) => {
					if (msg.key === key && ["finish", "finish_error"].includes(msg.type)) {
						channel.removeEventListener("message", handler);
						delete inProgress[key];
						if (msg.type === "finish_error") {
							rej(msg.reason);
						}else {
							res();
						}
					}
				}
				channel.addEventListener("message", handler);
				channel.postMessage({type: "startack", key});
			});
		}else {
			// tell the worker that it's already in progress
			channel.postMessage({type: "inprogress", key});
			inProgress[key].finally(() => {
				channel.postMessage({type: "finished", key});
			});
		}
	}
});

This implementation works. For example, starting the task in a worker and then calling the function locally won't call it twice:

// send start and wait for startack
await postToChannel({type: "start", key}, "startack");

// post local task
postTask(key, () => {console.log("called")});

// finish in the worker
await postToChannel({type: "finish", key});

// no calls to the function

I was happy with the implementation, up until I started writing test cases for rejections. What happens if the function rejects? In that case, the worker will send a finish_error event, the coordinator rejects the inProgress Promise, and all the calls will be rejected as well, just as expected.

What I did not expect to see were unhandled rejections. And as I subsequently found out, tracking them down is quite challenging and often surprising.

This article describes the three causes of unhandled rejections I encountered while working on this project. Each has a different root cause and posed a different challenge.

Case #1: No reject handler

Let's start with the one that comes with the fewest surprises! If nothing handles a rejection, then it becomes an unhandled rejection. While it seems trivial, it still bit me.

Usually, rejections behave similarly to exceptions: they go up the chain of async functions. This is why I hardly ever encounter this problem: apart from a few forgotten awaits, it never happens.

For example, deleting a file but forgetting the await produces an unhandled rejection:

fs.rm(file);

But adding an await everywhere usually solves this problem:

await fs.rm(file);

Here the Promise returned by fs.rm is awaited, so if it rejects, the enclosing async function rejects too and the rejection travels up to its caller.
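The difference can be observed with Node's unhandledRejection process event. The following is a minimal sketch: failingRm is a made-up stand-in for a failing call such as fs.rm, and the listener only records the reasons so that the process does not exit.

```javascript
// record unhandled rejections instead of crashing the process
const unhandled = [];
process.on("unhandledRejection", (reason) => {
	unhandled.push(reason.message);
});

// hypothetical stand-in for an async call that fails
const failingRm = async (file) => {
	throw new Error(`cannot remove ${file}`);
};

const withoutAwait = async () => {
	// the returned Promise is dropped, nothing handles its rejection
	failingRm("a.txt");
};

const withAwait = async () => {
	// the rejection propagates to the caller of withAwait
	await failingRm("b.txt");
};

const demo = (async () => {
	await withoutAwait();
	await withAwait().catch(() => {/* handled by the caller */});
	// wait a bit so Node has a chance to report unhandled rejections
	await new Promise((res) => setTimeout(res, 10));
	return unhandled;
})();

demo.then((list) => console.log(list)); // [ 'cannot remove a.txt' ]
```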

So, what went wrong in my use-case?

When a worker calls the function with some arguments, the coordinator creates a Promise. This makes it easy for local calls to wait for the result, as postTask can simply return this Promise:

const postTask = (key, fn) => {
	if (inProgress[key] === undefined) {
		return inProgress[key] = (async () => {
			// ...
		})();
	}else {
		// return the Promise
		return inProgress[key];
	}
}

That means depending on how many postTask calls a key gets, the Promise will be used zero or more times. The problem case here is the zero. What if only the worker is running the function? In that case, the inProgress[key] Promise will be rejected but without anything to handle it.

// send start, wait for startack
await postToChannel({type: "start", key}, "startack");
// send finish_error
await postToChannel({type: "finish_error", key, reason: "failed #3"});

// unhandled rejection

The solution is rather simple after figuring out the cause: make sure that at least one rejection handler is always attached:

channel.addEventListener("message", ({data: {type, key}}) => {
	if (type === "start") {
		if (inProgress[key] === undefined) {
			// no active calls to the function
			inProgress[key] = new Promise((res, rej) => {
				// ...
			});
			inProgress[key].catch(() => {}); // <== attach a rejection handler
		}else {
			// ...
		}
	}
});
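One concern with an empty catch is that it might hide the error from real consumers. It does not: catch returns a separate Promise, and the original one still rejects for anyone who awaits it. A small sketch of this, where markHandled is a hypothetical helper name and not part of the library:

```javascript
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// hypothetical helper: mark a Promise as handled without changing it
const markHandled = (promise) => {
	promise.catch(() => {});
	return promise;
};

const demo = (async () => {
	// zero consumers: no unhandled rejection is reported
	markHandled(Promise.reject("failed, nobody listening"));

	// one consumer: it still observes the rejection
	const shared = markHandled(Promise.reject("failed, one listener"));
	let seen;
	try {
		await shared;
	}catch (reason) {
		seen = reason;
	}

	// wait a bit so Node has a chance to report unhandled rejections
	await new Promise((res) => setTimeout(res, 10));
	return {seen, unhandledCount: unhandled.length};
})();

demo.then((result) => console.log(result));
// { seen: 'failed, one listener', unhandledCount: 0 }
```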

Case #2: Promise.finally

When a worker wants to start working on a task, it needs to inform the coordinator with a start message. It then receives either a startack, meaning it can call the function, or an inprogress, meaning another thread is already calling the function. After an inprogress, the worker needs to wait for a finished message telling it that the result is ready.

This is sent by the coordinator:

channel.addEventListener("message", ({data: {type, key}}) => {
	if (type === "start") {
		if (inProgress[key] === undefined) {
			// ...
		}else {
			channel.postMessage({type: "inprogress", key});
			inProgress[key].finally(() => {
				channel.postMessage({type: "finished", key});
			});
		}
	}
});

The above implementation is wrong. If the function call rejects there will be an unhandled rejection:

// worker1 starts and waits for the startack
await postToChannel({type: "start", key}, "startack");
// worker2 starts and gets an inprogress
await postToChannel({type: "start", key}, "inprogress");
// worker1 finishes with error
await postToChannel({type: "finish_error", key, reason: "failed #4"});

// unhandled rejection

Why is an unhandled rejection thrown there? It turns out that finally returns a new Promise, and if the original Promise is rejected then this new one is rejected as well. Since nothing handles it, it becomes an unhandled rejection.
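This behavior can be reproduced in isolation. In the sketch below the original Promise already has a rejection handler, as in Case #1, yet the Promise derived through finally is still reported. The variable names are illustrative only.

```javascript
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

const demo = (async () => {
	const original = Promise.reject("failed");
	// the original Promise is handled...
	original.catch(() => {});
	// ...but finally() returns a new Promise that rejects with the same reason
	const derived = original.finally(() => {});

	// wait a bit so Node has a chance to report unhandled rejections
	await new Promise((res) => setTimeout(res, 10));
	return {sameObject: derived === original, unhandled};
})();

demo.then((result) => console.log(result));
// { sameObject: false, unhandled: [ 'failed' ] }
```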

The solution? Make sure that it can't reject:

inProgress[key].catch(() => {}).then(() => {
	channel.postMessage({type: "finished", key});
});

This is usually not a problem, as the Promise returned by finally is typically returned and awaited, which handles the rejection.
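Another way to get a fire-and-forget callback that runs on both outcomes is to pass it as both arguments of then. This is only a sketch of an alternative, not what the library does: onSettled is a hypothetical helper, and its derived Promise can only reject if the callback itself throws.

```javascript
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

// hypothetical helper: run callback once the Promise settles, either way
const onSettled = (promise, callback) =>
	promise.then(() => callback(), () => callback());

const demo = (async () => {
	const messages = [];
	const failing = Promise.reject("failed");
	failing.catch(() => {}); // the handler from Case #1

	onSettled(failing, () => messages.push("finished"));
	onSettled(Promise.resolve(), () => messages.push("finished"));

	// wait a bit so Node has a chance to report unhandled rejections
	await new Promise((res) => setTimeout(res, 10));
	return {messages, unhandledCount: unhandled.length};
})();

demo.then((result) => console.log(result));
// { messages: [ 'finished', 'finished' ], unhandledCount: 0 }
```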

Case #3: Late return

I encountered the third one while writing test code for the coordinator.

In the library, when a task is posted the code first reads the filesystem to see if the result is already saved there. If not, then it proceeds with calling the function.

A simplified version:

const postTask = (key, fn) => {
	if (inProgress[key] === undefined) {
		return inProgress[key] = (async () => {
			// do something async...
			// (assumes the Promise-based setTimeout from node:timers/promises)
			await setTimeout(1);
			return fn();
		})();
	}else {
		return inProgress[key];
	}
}

In the tests I wanted to control the sequence of events. For that, I usually use a Promise with its resolve and reject functions extracted, which is the pattern that Promise.withResolvers wraps in syntactic sugar:

const {promise, resolve} = withResolvers();
// task returns the promise
const result = postTask(key, () => promise);
// ... other steps

// finish the task
resolve();

await result;
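The withResolvers helper used in these tests is not shown. A minimal sketch of it, assuming it mirrors the shape of Promise.withResolvers, could look like this:

```javascript
// assumed helper: a Promise with its resolve/reject functions extracted
const withResolvers = () => {
	let resolve, reject;
	const promise = new Promise((res, rej) => {
		resolve = res;
		reject = rej;
	});
	return {promise, resolve, reject};
};

// the Promise settles only when the extracted function is called
const {promise, resolve} = withResolvers();
promise.then((value) => console.log(value)); // done
resolve("done");
```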

This works fine when the Promise is resolved. But when it rejects, it raises an unhandled rejection:

const {promise, reject} = withResolvers();
// post task
const result = postTask(key, () => promise);

// ... do other steps

result.catch(() => {});

// reject
reject("failed #6");

The interesting part is that the result is properly rejected, and a catch handler is attached to it before the rejection. So where does the unhandled rejection come from?

The problem is the order of operations. Since postTask does not call the function immediately, reject() runs before fn() has returned the Promise, so at that moment nothing is attached to it. In a sizeable codebase this was not easy to find, but putting the two parts next to each other makes it more visible:

const postTask = (key, fn) => {
	if (inProgress[key] === undefined) {
		return inProgress[key] = (async () => {
			// do something async...
			await setTimeout(1);
			return fn();
		})();
	}else {
		return inProgress[key];
	}
}

const result = postTask(key, () => promise);
reject("failed #6");

In the example, setTimeout(1) delays the fn() call, so reject() runs first. At that point the rejected Promise has no rejection handler, as the async function has not yet returned it, so it is reported as an unhandled rejection.
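The timing can be reproduced without the rest of the library. In this sketch lateCall is a made-up function with the same shape as postTask, and sleep stands in for the asynchronous step. The result has a catch handler, yet the inner Promise is still reported because it is rejected before fn() hands it over.

```javascript
const unhandled = [];
process.on("unhandledRejection", (reason) => unhandled.push(reason));

const sleep = (ms) => new Promise((res) => setTimeout(res, ms));

// same shape as postTask: fn is only called after an async step
const lateCall = async (fn) => {
	await sleep(20);
	return fn();
};

const demo = (async () => {
	let reject;
	const promise = new Promise((_, rej) => { reject = rej; });

	const result = lateCall(() => promise);
	result.catch(() => {}); // the result is handled

	// promise itself has no handler yet: fn has not been called
	reject("failed");

	// wait until lateCall has finished as well
	await sleep(50);
	return unhandled;
})();

demo.then((list) => console.log(list)); // [ 'failed' ]
```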

To solve it, I needed to make sure that the function was already called when doing the rejection:

const {promise, reject} = withResolvers();
const {promise: calledPromise, resolve: calledResolve} = withResolvers();
const result = postTask(key, () => {
	// resolve calledPromise
	calledResolve();
	return promise;
});
// wait until the task function is called
await calledPromise;
// then reject
reject("failed #7");