What is AWS AppSync

A first look at GraphQL and its AWS-managed implementation

Tamás Sallai

AWS AppSync is a managed GraphQL service that you can use to offer a serverless API. GraphQL is an open standard, so you can also run a GraphQL server yourself on an EC2 instance and configure it any way you’d like. But opting in to AppSync makes it trivial to get up and running on AWS.

In this article we’re going to look at what GraphQL is and why you would use AppSync if you want to offer a graph-based API on AWS.

GraphQL vs REST

First, let’s talk about GraphQL! When you need to provide an API for a webapp or a mobile app, the obvious choice is REST.

REST builds on resources that are available under a URL and the clients can interact with them through HTTP verbs. For example, a backend that stores Todo items for users might offer these resources:

  • GET /user => Get the list of users
  • GET /user/:id => Get user by id
  • POST /user => Create new user
  • GET /user/:id/todos => Get the Todo items for a given user

When the client wants to show the users and their Todo items, it can send a request for the users, then for individual users, then for Todos for a given user.

On the other hand, the query model in GraphQL is based on graphs. The backend can define queries that act as an entry point to the object graph and then the clients can navigate between objects through their fields.

For example, a GraphQL schema that defines users and Todo items might look like this:

type Todo {
  name: String
  checked: Boolean
}

type User {
  id: ID
  name: String
  # [] means an array of items
  todos: [Todo]
}

type Query {
  allUsers: [User]
  user(id: ID): User
}

schema {
  query: Query
}

The Query is the entry point, so clients can get a user by its id. Then they can traverse the graph and get the Todos for that specific user through the User.todos field.

Written as a query, it looks like this:

query {
  user(id: "user1") {
    name
    todos {
      name
      checked
    }
  }
}

Here, user is the query, and it gets an argument (id: "user1"). The request then specifies which properties to return for the object. Notice that the user might have multiple Todo items. GraphQL returns all of the user's Todos, each with its name and checked properties.

[Figure: Querying user1 with its Todo items]

The response:

{
	"data": {
		"user": {
			"name": "Bob",
			"todos": [
				{
					"name": "todo1",
					"checked": false
				},
				{
					"name": "todo2",
					"checked": true
				}
			]
		}
	}
}

Just like with REST, the API does not care about how the response is assembled. In most cases, the fields and objects correspond to tables and items in a database, but you can also call a function that returns random data. As long as the structure of the data is what the consumers expect (and in the case of GraphQL, what is defined in the schema) the response is valid.
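
To make this concrete, here is a minimal sketch of how a backend might assemble the response above. It is not how AppSync works internally. The in-memory USERS and TODOS dictionaries stand in for database tables, and every name in it (resolve_user, resolve_user_todos, run_user_query, the todo_ids field) is an assumption made for illustration.

```python
# Hypothetical in-memory data standing in for database tables.
USERS = {
    "user1": {"id": "user1", "name": "Bob", "todo_ids": ["todo1", "todo2"]},
    "user2": {"id": "user2", "name": "Joe", "todo_ids": ["todo3"]},
}
TODOS = {
    "todo1": {"name": "todo1", "checked": False},
    "todo2": {"name": "todo2", "checked": True},
    "todo3": {"name": "todo3", "checked": True},
}


def resolve_user(user_id):
    """Resolver for Query.user: look up one user by id (None if missing)."""
    return USERS.get(user_id)


def resolve_user_todos(user):
    """Resolver for User.todos: follow the edge from a user to its Todo items."""
    return [TODOS[todo_id] for todo_id in user["todo_ids"]]


def pick(obj, fields):
    """Keep only the fields that the query asked for."""
    return {field: obj[field] for field in fields}


def run_user_query(user_id):
    """Assemble the response for: user(id) { name todos { name checked } }."""
    user = resolve_user(user_id)
    if user is None:
        return {"data": {"user": None}}
    todos = [pick(todo, ["name", "checked"]) for todo in resolve_user_todos(user)]
    return {"data": {"user": {"name": user["name"], "todos": todos}}}
```

Calling run_user_query("user1") builds the same structure as the response shown above. Swapping the dictionaries for database calls, or for a function that returns random data, would not change what the client sees as long as the shape matches the schema.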

Advantages over REST

Notice that the schema documents what is possible with the API. It defines the types and their fields and how a request can traverse the graph.

REST APIs can take any shape and there is no enforced way to document them. OpenAPI is a specification that remedies this problem, but the backend needs to opt in. With GraphQL, you can be sure that the API is documented in the schema.

Also, REST APIs usually have an n+1 request problem. If you want to get all users with all their Todo items, you need to send one request for the list and then one more for each user:

  • GET /user
  • GET /user/id1/todos
  • GET /user/id2/todos

[Figure: n+1 requests. Participants: browser, API, db. The browser sends GET /user, GET /user/1/todos, and GET /user/2/todos, and the API runs one database query for each request.]

With GraphQL, the query specifies the whole graph that it needs in the response:

query {
  allUsers {
    name
    todos {
      name
      checked
    }
  }
}

This single query returns all the users and all their Todo items.

Notice that the backend does not need fewer database queries to assemble this response; the client just sends fewer requests. In practice, the number of roundtrips is the biggest contributor to the slowness that users experience, and GraphQL drastically reduces that number.

[Figure: 1 request. Participants: browser, API, db. The browser sends one GraphQL query for users and todos; the API lists the users, gets the Todos for user 1 and user 2 from the database, and returns users and todos in a single response.]
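
The difference between roundtrips and database queries can be sketched with a small simulation. Everything here is hypothetical: FakeDb and FakeApi are in-memory stand-ins that only count calls, and the user ids and Todo names are made up for the example.

```python
class FakeDb:
    """Hypothetical database that counts the queries it serves."""

    def __init__(self):
        self.queries = 0
        self._todos_by_user = {"id1": ["todo1", "todo2"], "id2": ["todo3"]}

    def list_users(self):
        self.queries += 1
        return list(self._todos_by_user)

    def todos_for(self, user_id):
        self.queries += 1
        return self._todos_by_user[user_id]


class FakeApi:
    """Hypothetical backend that counts the roundtrips clients make."""

    def __init__(self, db):
        self.db = db
        self.requests = 0

    # REST-style endpoints: every call is one roundtrip.
    def get_users(self):
        self.requests += 1
        return self.db.list_users()

    def get_user_todos(self, user_id):
        self.requests += 1
        return self.db.todos_for(user_id)

    # GraphQL-style endpoint: one roundtrip, the server walks the graph.
    def all_users_with_todos(self):
        self.requests += 1
        return {user: self.db.todos_for(user) for user in self.db.list_users()}


def fetch_rest(api):
    """Client code for the REST flow: 1 request for the list, 1 per user."""
    return {user: api.get_user_todos(user) for user in api.get_users()}


def fetch_graphql(api):
    """Client code for the GraphQL flow: a single request."""
    return api.all_users_with_todos()


def measure(fetch):
    """Run a fetch strategy on fresh fakes; return (data, requests, queries)."""
    db = FakeDb()
    api = FakeApi(db)
    data = fetch(api)
    return data, api.requests, db.queries
```

With two users, measure(fetch_rest) counts 3 requests and 3 database queries, while measure(fetch_graphql) counts 1 request and the same 3 database queries. Both return identical data.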

Why AppSync

GraphQL is an open standard, so there are several servers to choose from. One of them is AWS AppSync, which is available in the AWS cloud. As usual, if your architecture uses other products in AWS then it’s natural to look into their solution first.

AppSync is serverless. You don’t need to run instances to drive the API, which makes it trivial to get started with GraphQL. Note that you can still use servers to serve the requests (for example, querying for the users might be handled by an EC2 instance), but you don’t need to run servers for the API itself. That said, I found DynamoDB and Lambda to work naturally with AppSync, and this stack can provide a truly serverless solution.

Apart from offering a managed solution, AppSync also adds a bunch of features on top of GraphQL:

  • Auth directives that integrate with IAM and Cognito
  • Additional scalar types, such as AWSDateTime and AWSJSON that provide extra validation
  • Request/response templates using VTL, though they take some getting used to
  • Integration with other AWS services, such as getting/storing items in DynamoDB or calling Lambda functions

Then there are subscriptions. A subscription is a real-time notification channel: clients subscribe to it and get an event through WebSocket when some data is changed. For example, if a client shows the Todo items for a user, it might want to update the list in real time when another client makes a request that changes it (a request that changes data is called a Mutation in GraphQL terminology).

Connections for subscriptions are managed by AWS, so you don’t need to keep track of the connected clients and send messages to a WebSocket channel. On the other hand, you need to define what events trigger a subscription. When I first read about AppSync I had the impression that it uses some “magic” to make it easy. In reality, getting subscriptions right is no easier than for other types of APIs (this article is a great resource) and it even has a few undocumented edge cases.

[Figure: AppSync subscriptions. Participants: user1, API, user2. Steps: subscribe for newTodos, addTodo mutation, store Todo, notification.]
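
The flow can be modeled in a few lines of plain code. This is only a conceptual sketch, not the AppSync API: TinyApi, its methods, and the triggers table are hypothetical, and callbacks stand in for WebSocket connections. In AppSync itself, the link between a mutation and a subscription is declared in the schema with the @aws_subscribe directive, and the service handles the delivery.

```python
from collections import defaultdict


class TinyApi:
    """Hypothetical in-memory model of mutations that trigger subscriptions."""

    def __init__(self):
        self.todos = defaultdict(list)        # user id -> stored Todo items
        self.subscribers = defaultdict(list)  # subscription name -> callbacks
        # Which mutation triggers which subscriptions. This mapping is the part
        # you have to define yourself; nothing is inferred automatically.
        self.triggers = {"addTodo": ["newTodos"]}

    def subscribe(self, subscription, callback):
        """Register a client callback (stands in for a WebSocket connection)."""
        self.subscribers[subscription].append(callback)

    def add_todo(self, user_id, name):
        """The addTodo mutation: store the Todo, then notify subscribers."""
        todo = {"name": name, "checked": False}
        self.todos[user_id].append(todo)
        for subscription in self.triggers.get("addTodo", []):
            for callback in self.subscribers[subscription]:
                callback(todo)
        return todo
```

A client that calls api.subscribe("newTodos", handler) gets handler invoked with the new Todo every time any client runs api.add_todo(...). A mutation that is not listed in triggers would change data without notifying anyone, which is one way subscriptions can go wrong.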

Finally, AppSync writes a ton of logs to CloudWatch and it also publishes metrics about the API.

A separate graph shows latency values that give some insight into how fast the API is.

Conclusion

If you build on top of AWS, AppSync is a great product to provide a GraphQL-based API. It integrates with everything you’ll need and offers a fully managed solution. On the other hand, it builds on a lot of concepts that are not that mainstream and it takes a lot of learning to get up to speed with it.

19 October 2021