How to properly implement unique constraints in DynamoDB
Combine transactions and condition expressions to make sure a value is unique in the database
Guaranteeing uniqueness
Let's say an application stores user objects and as part of the item it records email addresses. A table with a single user might look like this:
When other users come and register to the app, they can specify their email address as well. But that means they possibly use one that is already taken:
This is usually a problem, as an email address might be used to login to the site, and having multiple users with the same email would cause problems.
So, how to make sure the email address a user wants to register with is not already taken by an existing user?
The naive solution is to query the database before the insert and throw an error if the email is already in it. The problem with this solution is a race condition. Network calls can be delayed for arbitrary durations, and the email address might be taken between the check and the insert. A solution can not depend on a prior query.
In SQL, this is easy. There is a UNIQUE constraint that enforces this property on the database-level. It creates an index that the database will check before inserting the new user:
CREATE TABLE User (
ID varchar(255) PRIMARY KEY,
email varchar(255) NOT NULL,
UNIQUE (email),
);
INSERT INTO User values("lFr1D", "test@example.com");
INSERT INTO User values("vHPZL", "test@example.com");
-- Duplicate entry 'test@example.com' for key 'email'
But DynamoDB is not a SQL database, so this is not an option here.
One solution is to make the email the key of the user object. As DynamoDB is a key-value data store, it guarantees that only a single item can have a given key. But this is a bad approach. One, you can only make one attribute the key. If users also have usernames and you want to make that unique too, you need to choose between the two. Second, emails in general make a bad key. Users might want to change the email they use for your service and that is increasingly hard if it is referenced throughout the database. Also, in the future, you might want to add the option to associate multiple email addresses with a single user. Making the email the key of the object makes this very hard to do.
DynamoDB has no support for a unique constraint out-of-the-box.
Unique constraints in DynamoDB
So, is there a way to implement this functionality using the features DynamoDB supports?
Yes, there is.
We need two things to make it happen. First, DynamoDB supports transactions. A transaction contains individual operations and provides the atomicity property for the whole set. It means that either all of the operations are effective, or none are. This is enforced by the database itself, so there is no race condition for checking the unique constraint and inserting a user record.
The second important feature is the ability to add check conditions, such as you can define that "insert an element if no element with this key exists".
These are the only two things you need to implement the unique constraint that works reliably and without race conditions.
Storing the unique values
Transactions and check conditions still work on items, so we can't write a transaction that checks that no user has this email address. This means we need to create a new table that has the unique values (the email address, in this example) as keys.
Since DynamoDB supports composite keys, where the key is made up of a partition key and a sort key, it's best to store the type of the unique constraint as well as the value. In the case of emails, the partition key can be the email address and the sort key a constant value, such as "email".
By adding the type to the unique constraint's key, you can store different types in a single table. To define a unique constraint for the username too, just add items with the type "username". Also, by defining the rarer of the two parts of the key as the partition, you'll end up with more partitions, which is a best practice when working with DynamoDB.
You can implement the same in a single-table design, a table modeling technique in which you use only a single table to store different types of items and differentiate them using prefixes.
It does not matter which style you use to store data, as they are easily interchangeable.
Inserting a new user item into the database involves two operations bundled together in a transaction. The first one inserts the user item, as you'd normally do without the unique constraint. The second one inserts the email into the second table, with the condition that the item does not exist. If this condition fails, the whole transaction is canceled, which means there is a user with this email address already in the database.
Implementing unique constraints in Node.js
Let's see how the multi-table approach looks like in code!
First, we have two tables, one for the users and one for the unique constraints. In Terraform the definition might look like this:
provider "aws" {
}
resource "random_id" "id" {
byte_length = 8
}
resource "aws_dynamodb_table" "users" {
name = "ddb_unique_constraint_sample_${random_id.id.hex}_user"
billing_mode = "PAY_PER_REQUEST"
hash_key = "ID"
attribute {
name = "ID"
type = "S"
}
}
resource "aws_dynamodb_table" "uniques" {
name = "ddb_unique_constraint_sample_${random_id.id.hex}_unique"
billing_mode = "PAY_PER_REQUEST"
hash_key = "value"
range_key = "type"
attribute {
name = "value"
type = "S"
}
attribute {
name = "type"
type = "S"
}
}
This defines the two tables and their keys.
Usually, the names of the tables are passed via environment variables. Let's assume that the USERS_TABLE
and the UNIQUES_TABLE
are the two names:
const {USERS_TABLE, UNIQUES_TABLE} = process.env
When you work with the official Javascript SDK, you can choose between the AWS.DynamoDB
class and the DocumentClient
. The latter provides a subset
of functions of the former, and its main selling point is that it converts between DynamoDB and Javascript types. Effectively, you don't need to spell out the
item types, such as in the ugly-looking {S: "value"}
. In this article, I'll opt for the DocumentClient
solution.
With the DocumentClient
, the structure of the transaction is this:
const AWS = require("aws-sdk");
const docClient = new AWS.DynamoDB.DocumentClient();
await docClient.transactWrite({
TransactItems: [
// transaction items
]
}).promise();
The TransactItems
array defines the individual operations inside the transaction. These operations can be Put
, Delete
, Update
,
and ConditionCheck
. To insert a user object, we'll use the Put
operation.
A Put
operation supports several options. One is the ConditionExpression
that defines the preconditions for this operation and the transaction will
fail if any of these conditions are not satisfied. The attribute_not_exists(#pk)
check along with an ExpressionAttributeNames: {"#pk": "value"}
ensures
that the email is not present in the unique constraint table before inserting it.
The transaction implementation:
await docClient.transactWrite({
TransactItems: [
{
Put: {
TableName: USERS_TABLE,
// check not exists before (no user with this ID)
ConditionExpression: "attribute_not_exists(#pk)",
ExpressionAttributeNames: {
"#pk": "ID",
},
// the attributes
Item: {
ID: id,
email,
}
},
},
{
Put: {
TableName: UNIQUES_TABLE,
// check not exists before (no user with this email)
ConditionExpression: "attribute_not_exists(#pk)",
ExpressionAttributeNames: {
"#pk": "value",
},
// the item attributes
Item: {
value: email,
type: "email",
}
},
},
]
}).promise();
This reliably takes care of enforcing the email uniqueness when a user is inserted into the database.
Modify a user
Let's see how to update the unique constraint when a user changes their email address! In this case, we need to do 3 things:
- update the user item
- delete the current email address from the constraint table
- add the new email address to the constraint table
Notice that we need the current email address as well as the new one for this. Issue a get
to retrieve the current values for the user item:
const currentUser = (await docClient.get({
TableName: USERS_TABLE,
Key: {ID: id},
}).promise()).Item;
There is a possible but very subtle error that can introduce a race condition and unfortunately most tutorials don't address this problem.
Notice that this get
is not part of the transaction. In DynamoDB, transactions are either all-writes or all-reads, so you can't read the value (the
current email address) and be sure that it is not changed by the time the write transaction hits the database. Because of this, you need to add checks into the
transaction that the state of the items (the current value of the email address) is what you've read moments before.
This check needs a ConditionExpression
that compares the email value in the database with the one read previously:
{
ExpressionAttributeNames: {"#email": "email"},
ExpressionAttributeValues: {":currentemail": currentUser.email},
ConditionExpression: "#email = :currentemail"
}
This check is crucial. Without it, a user might have concurrently changed their email address and the database becomes inconsistent.
The three operations, with the proper checks:
// change user's email to newEmail
const currentUser = (await docClient.get({
TableName: USERS_TABLE,
Key: {ID: id},
}).promise()).Item;
await docClient.transactWrite({
TransactItems: [
{
Update: {
TableName: USERS_TABLE,
// update the email
UpdateExpression: "SET #email = :email",
ExpressionAttributeNames: {"#email": "email"},
ExpressionAttributeValues: {
":email": newEmail,
":currentemail": currentUser.email
},
// check the value in the db matches the previously read one
ConditionExpression: "#email = :currentemail"
},
},
{
// remove the old email from the unique constraint
Delete: {
TableName: UNIQUES_TABLE,
Key: {
value: currentUser.email,
type: "email",
},
ConditionExpression: "attribute_exists(#pk)", // optional
ExpressionAttributeNames: {
"#pk": "value",
},
}
},
{
// add the new email to the unique constraint
Put: {
TableName: UNIQUES_TABLE,
ConditionExpression: "attribute_not_exists(#pk)",
ExpressionAttributeNames: {
"#pk": "value",
},
Item: {
value: email,
type: "email",
}
},
},
]
}).promise();
Notice the third Put
. It's the same operation that we used to add a user to the database.
Delete a user
The final use-case is to handle user deletion. In this case, their email address should be removed from the unique constraint table.
This works almost the same as the modification. To delete the current email address, we need to read the value first. Then, as it's outside the write transaction, we need to add a condition to make sure the value is the same as expected.
const currentUser = (await docClient.get({
TableName: USERS_TABLE,
Key: {ID: id},
}).promise()).Item;
await docClient.transactWrite({
TransactItems: [
{
Delete: {
TableName: USERS_TABLE,
Key: {ID: id},
ExpressionAttributeNames: {"#email": "email"},
ExpressionAttributeValues: {
":email": currentUser.email
},
ConditionExpression: "#email = :email"
},
},
{
Delete: {
TableName: UNIQUES_TABLE,
Key: {
value: currentUser.email,
type: "email",
},
ConditionExpression: "attribute_exists(#pk)", // optional
ExpressionAttributeNames: {
"#pk": "value",
},
}
},
]
}).promise();
Conclusion
As you've seen in this article, implementing a unique constraint is possible in DynamoDB but also that it requires quite some planning to do right.