How to detect device connection and disconnection in AWS IoT Core

The presence topics publish an MQTT message on status change

Tamás Sallai

Device lifecycle events

Connected devices can push data to AWS IoT Core via MQTT. Then the backend can react to these messages via topic rules. This forms the basis of an AWS-based IoT solution.

That made me wonder: is there a way to detect connection and disconnection events without the devices actively pushing data to the cloud? For example, it would be great to have a dashboard with up-to-date information about the connection status of each device.

Yes, it's possible. In this article, we'll look into how the presence topics work and how to write a topic rule that writes connection and disconnection events into a DynamoDB table.

Reserved topics

AWS IoT Core provides an MQTT broker that connected devices can use. With MQTT, by default, any device can publish to any topic. In the case of AWS, there are some reserved topics that connected devices can't use.

Among these reserved topics are the two presence topics: $aws/events/presence/connected/<clientId> and $aws/events/presence/disconnected/<clientId>.

Presence topics in the AWS documentation

AWS publishes to these topics whenever a device connects or disconnects. You can handle these events with the same topic rule mechanism as normal messages. The only difference is that AWS publishes them automatically instead of a device.

Activity log

IoT Core provides an activity panel for things. It subscribes to MQTT events related to that device and provides an easy visualization of the presence topics:

Activity stream shows the messages related to a thing

Events structure

The event structure is documented on the lifecycle events page of the AWS IoT Core documentation.

An example connected event:

{
	"clientId": "thing_73d6f002860b417b",
	"timestamp": 1672742949382,
	"eventType": "connected",
	"sessionIdentifier": "bab409a3-306a-40a4-9a63-8b817ddac9b9",
	"principalIdentifier": "b39fb9e51fe1baf71bd98728fa6337efdc6852c5e4a846eecb58080c64dd86d1",
	"ipAddress": "3.75.237.130",
	"versionNumber": 8
}

And its disconnected pair:

{
	"clientId": "thing_73d6f002860b417b",
	"timestamp": 1672742949397,
	"eventType": "disconnected",
	"clientInitiatedDisconnect": true,
	"sessionIdentifier": "bab409a3-306a-40a4-9a63-8b817ddac9b9",
	"principalIdentifier": "b39fb9e51fe1baf71bd98728fa6337efdc6852c5e4a846eecb58080c64dd86d1",
	"disconnectReason": "CLIENT_INITIATED_DISCONNECT",
	"versionNumber": 8
}
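To work with these payloads in backend code, it can help to parse them into a typed structure first. The following is a minimal sketch, not part of the original setup: the `PresenceEvent` class and `parse_presence_event` function are hypothetical names, and only the fields shown in the two examples above are read.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PresenceEvent:
    """Hypothetical container for the fields shown in the example events."""
    client_id: str
    event_type: str  # "connected" or "disconnected"
    timestamp_ms: int  # milliseconds since the Unix epoch
    session_id: str
    version_number: int
    disconnect_reason: Optional[str] = None  # present only on disconnected events

    @property
    def occurred_at(self) -> datetime:
        # Convert the millisecond timestamp to a timezone-aware UTC datetime.
        return datetime.fromtimestamp(self.timestamp_ms / 1000, tz=timezone.utc)


def parse_presence_event(payload: str) -> PresenceEvent:
    """Parse the JSON payload of a presence message."""
    data = json.loads(payload)
    return PresenceEvent(
        client_id=data["clientId"],
        event_type=data["eventType"],
        timestamp_ms=data["timestamp"],
        session_id=data["sessionIdentifier"],
        version_number=data["versionNumber"],
        disconnect_reason=data.get("disconnectReason"),
    )
```

Fields that appear only on one event type, such as `disconnectReason`, are read with `data.get` so the same parser handles both connected and disconnected events.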

Event ordering

Notice the two fields related to ordering: timestamp and versionNumber. While it seems easy to decide whether a device is connected or disconnected at a given moment, it gets complicated as you read through the documentation. The first obstacle is mentioned at the top of the page:

Lifecycle messages might be sent out of order. You might receive duplicate messages.

Ok, then just order the events by timestamp. But that also won't work:

An approximation of when the event occurred, expressed in milliseconds since the Unix epoch. The accuracy of the timestamp is +/- 2 minutes.

This means the latest timestamp might indicate the device is disconnected but a later event with an earlier timestamp might say it's connected.

Ok, then just order by versionNumber. Well, it's not that easy either:

If a client is not connected for approximately one hour, the version number is reset to 0.

So you can't rely on a higher number meaning a later event. Storing all events and then querying the last 2 minutes would probably work, but I haven't explored this in detail yet.
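One way to approach the idea of looking at recent stored events is a conservative resolver that refuses to decide when conflicting events are too close together. This is only a sketch under assumptions that the documentation does not confirm: it assumes a sessionIdentifier covers exactly one connect/disconnect pair, so a session with a disconnected event is treated as over, and it uses a 4-minute window because two timestamps that are each off by 2 minutes in opposite directions can differ by that much. The function name `resolve_status` and the return values "uncertain" and "unknown" are made up for this example.

```python
from typing import Any, Iterable, Mapping

# Assumption: each timestamp may be off by 2 minutes, so two events closer
# than 4 minutes apart cannot be ordered reliably by timestamp alone.
DEFAULT_UNCERTAINTY_MS = 4 * 60 * 1000


def resolve_status(
    events: Iterable[Mapping[str, Any]],
    uncertainty_ms: int = DEFAULT_UNCERTAINTY_MS,
) -> str:
    """Return 'connected', 'disconnected', 'uncertain', or 'unknown'."""
    # Collapse events per session. This also absorbs duplicate deliveries.
    # Assumption: a session that has a disconnected event has ended.
    sessions = {}
    for event in events:
        session = sessions.setdefault(
            event["sessionIdentifier"],
            {"state": "connected", "timestamp": event["timestamp"]},
        )
        if event["eventType"] == "disconnected":
            session["state"] = "disconnected"
        session["timestamp"] = max(session["timestamp"], event["timestamp"])

    if not sessions:
        return "unknown"

    ordered = sorted(sessions.values(), key=lambda s: s["timestamp"])
    latest = ordered[-1]
    # If another session with a different outcome is within the uncertainty
    # window, the timestamps cannot tell which one really happened last.
    for other in ordered[:-1]:
        too_close = latest["timestamp"] - other["timestamp"] <= uncertainty_ms
        if too_close and other["state"] != latest["state"]:
            return "uncertain"
    return latest["state"]
```

For the connected/disconnected pair shown earlier, both events share a session, so the result is "disconnected" regardless of the order in which they arrive. A dashboard could show "uncertain" as a separate state and re-check once the window has passed.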

Processing lifecycle events with topic rules

Topic rules run for every message published to a topic that matches the topic filter.

A rule is defined by an SQL statement: the FROM clause defines the topic filter, an optional WHERE clause further restricts the events, and the SELECT clause transforms the message.

A topic rule that handles both connection and disconnection events:

SELECT * as event, timestamp, versionNumber, topic(4) as eventType, topic(5) as clientId
	FROM '$aws/events/presence/+/+'
	WHERE topic(4) = 'connected' or topic(4) = 'disconnected'

The first + matches both connected and disconnected. These are the only two presence events now, but if AWS adds others later, this rule should not match them. That's what the WHERE clause is doing: topic(4) = 'connected' or topic(4) = 'disconnected'.

topic(4) is the fourth segment of the topic (notice the 1-indexing), which is the first +. The statement also maps it to eventType, as I find it a good practice to map wildcards to friendlier names in topic rules.

The second + matches every clientId. When devices connect with their thing name as the client ID, which an IoT policy can enforce, the topic rule knows which thing connected or disconnected. The statement maps topic(5) to clientId.
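To make the 1-indexed topic() function and the two wildcards more concrete, here is a rough local emulation of the rule in Python. It is not how the rule engine is implemented; `topic_segment` and `apply_presence_rule` are hypothetical helpers, and the payload-derived columns other than event are left out for brevity.

```python
import json
from typing import Any, Dict, Optional


def topic_segment(topic: str, n: int) -> str:
    """Mimic topic(n) from the rule: segments are counted from 1."""
    return topic.split("/")[n - 1]


def apply_presence_rule(topic: str, payload: str) -> Optional[Dict[str, Any]]:
    """Return the transformed message, or None if the rule would not match."""
    segments = topic.split("/")
    # FROM '$aws/events/presence/+/+': fixed prefix, then exactly two segments.
    if len(segments) != 5 or segments[:3] != ["$aws", "events", "presence"]:
        return None
    # WHERE topic(4) = 'connected' or topic(4) = 'disconnected'
    event_type = topic_segment(topic, 4)
    if event_type not in ("connected", "disconnected"):
        return None
    return {
        "event": json.loads(payload),  # SELECT * as event
        "eventType": event_type,  # topic(4) as eventType
        "clientId": topic_segment(topic, 5),  # topic(5) as clientId
    }
```

Calling `apply_presence_rule("$aws/events/presence/connected/thing_1", '{"timestamp": 1}')` yields a dictionary with the original payload under event, plus eventType "connected" and clientId "thing_1". A topic with any other fourth segment returns None, which mirrors the WHERE clause.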

Storing the events

Topic rule actions open up the rest of the AWS cloud. They can call a Lambda function, publish a message to SNS or SQS, write to S3 or DynamoDB, call an HTTP endpoint, and so on.

In this example, we'll put the data into a DynamoDB table. Storing the raw events is usually a best practice for later debugging.

DynamoDBv2 action

As usual, IoT Core needs an IAM role configured for this topic rule that allows the dynamodb:PutItem action for the table:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "dynamodb:PutItem",
			"Resource": "arn:aws:dynamodb:..."
		}
	]
}
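For reference, a rule with a DynamoDBv2 action could be created from Python roughly as follows. This is a sketch that assumes the boto3 library is installed and AWS credentials are configured; the helper names and all arguments (rule name, SQL, role ARN, table name) are placeholders to be supplied by you, not values from this article's setup.

```python
from typing import Any, Dict


def build_rule_payload(sql: str, role_arn: str, table_name: str) -> Dict[str, Any]:
    """Build the topicRulePayload for a rule with a single DynamoDBv2 action."""
    return {
        "sql": sql,
        "awsIotSqlVersion": "2016-03-23",
        "actions": [
            {
                "dynamoDBv2": {
                    # Role that IoT Core assumes; it needs dynamodb:PutItem.
                    "roleArn": role_arn,
                    "putItem": {"tableName": table_name},
                }
            }
        ],
    }


def create_presence_rule(
    rule_name: str, sql: str, role_arn: str, table_name: str
) -> None:
    """Create the topic rule in the account of the current credentials."""
    # boto3 is a third-party dependency, imported here so that
    # build_rule_payload stays usable without it.
    import boto3

    boto3.client("iot").create_topic_rule(
        ruleName=rule_name,
        topicRulePayload=build_rule_payload(sql, role_arn, table_name),
    )
```

Note that the DynamoDBv2 action writes each top-level attribute of the rule's output to its own column, so the output has to contain the table's key attributes.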

After the devices connect and disconnect, the items accumulate in the table:

Presence events in the DDB table

Conclusion

Lifecycle events are powerful constructs to detect when devices interact with IoT Core. With this, it is possible to create a dashboard with up-to-date statuses and send notifications on changes.

But watch out for event ordering and duplication. As usual with MQTT, you can't rely on accurate timestamps or exactly-once delivery. While things usually happen as expected, there are edge cases.

May 30, 2023
