How to detect device connection and disconnection in AWS IoT Core
The presence topics publish an MQTT message on status change
Device lifecycle events
Connected devices can push data to AWS IoT Core via MQTT. Then the backend can react to these messages via topic rules. This forms the basis of an AWS-based IoT solution.
That made me wonder: is there a way to detect connection and disconnection events without the devices actively pushing data to the cloud? For example, it would be great to have a dashboard with up-to-date information about the connection status of each device.
Yes, it's possible. In this article, we'll look into how the presence topics work and how to write a topic rule that writes connection and disconnection events into a DynamoDB table.
Reserved topics
AWS IoT Core provides an MQTT broker that connected devices can use. With MQTT, by default, any device can publish to any topic. In the case of AWS, there are some reserved topics that connected devices can't use.
Among these reserved topics are the two presence topics: $aws/events/presence/connected/<clientId> and $aws/events/presence/disconnected/<clientId>.
AWS publishes to these topics whenever a device connects or disconnects, so the same topic rule mechanism that handles normal messages can handle these events too. The only difference is that AWS publishes them automatically.
Activity log
IoT Core provides an activity panel for things. It subscribes to MQTT events related to that device and provides an easy visualization of the presence topics:
Events structure
The event structure is documented here.
An example connected event:
{
  "clientId": "thing_73d6f002860b417b",
  "timestamp": 1672742949382,
  "eventType": "connected",
  "sessionIdentifier": "bab409a3-306a-40a4-9a63-8b817ddac9b9",
  "principalIdentifier": "b39fb9e51fe1baf71bd98728fa6337efdc6852c5e4a846eecb58080c64dd86d1",
  "ipAddress": "3.75.237.130",
  "versionNumber": 8
}
And its disconnected pair:
{
  "clientId": "thing_73d6f002860b417b",
  "timestamp": 1672742949397,
  "eventType": "disconnected",
  "clientInitiatedDisconnect": true,
  "sessionIdentifier": "bab409a3-306a-40a4-9a63-8b817ddac9b9",
  "principalIdentifier": "b39fb9e51fe1baf71bd98728fa6337efdc6852c5e4a846eecb58080c64dd86d1",
  "disconnectReason": "CLIENT_INITIATED_DISCONNECT",
  "versionNumber": 8
}
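The two example payloads share most fields but not all of them: ipAddress appears only on the connected event, while clientInitiatedDisconnect and disconnectReason appear only on the disconnected one. A minimal Python sketch of parsing both into a single type, treating the event-specific fields as optional. The PresenceEvent class and the parse_presence_event function are hypothetical names for illustration, not part of any AWS SDK:

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PresenceEvent:
    client_id: str
    timestamp_ms: int
    event_type: str
    session_id: str
    version_number: int
    # Only present on "connected" events in the examples above.
    ip_address: Optional[str] = None
    # Only present on "disconnected" events in the examples above.
    disconnect_reason: Optional[str] = None


def parse_presence_event(payload: str) -> PresenceEvent:
    """Parse a presence message body (a JSON string) into a PresenceEvent."""
    data = json.loads(payload)
    return PresenceEvent(
        client_id=data["clientId"],
        timestamp_ms=data["timestamp"],
        event_type=data["eventType"],
        session_id=data["sessionIdentifier"],
        version_number=data["versionNumber"],
        ip_address=data.get("ipAddress"),
        disconnect_reason=data.get("disconnectReason"),
    )
```

Using .get() for the event-specific fields means the same parser accepts both message types without branching on eventType.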
Event ordering
Notice the two fields related to ordering: timestamp and versionNumber. While it seems easy to decide whether a device is connected or disconnected at a given moment, it gets very complicated as you read through the documentation. The first obstacle is mentioned at the top of the page:
Lifecycle messages might be sent out of order. You might receive duplicate messages.
Ok, then just order the events by timestamp. But that also won't work:
An approximation of when the event occurred, expressed in milliseconds since the Unix epoch. The accuracy of the timestamp is +/- 2 minutes.
This means the latest timestamp might indicate the device is disconnected but a later event with an earlier timestamp might say it's connected.
Ok, then just order by versionNumber. Well, it's not that easy either:
If a client is not connected for approximately one hour, the version number is reset to 0.
So you can't rely on a higher number meaning a later event. Storing all events and then querying those from the last 2 minutes would probably work, but I haven't explored this in detail yet.
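One way to picture that idea is the Python sketch below. It is an untested-in-production illustration, not a recommended algorithm, and it rests on two assumptions that go beyond the documentation quoted above: each connection gets its own sessionIdentifier (so a connected and a disconnected event with the same identifier belong together), and two timestamps that are each accurate to +/- 2 minutes cannot be ordered reliably when they are less than 4 minutes apart. The function names are hypothetical.

```python
# Documented timestamp accuracy is +/- 2 minutes. Two such timestamps can be
# off in opposite directions, so this sketch treats events closer than
# 4 minutes apart as unorderable (a conservative assumption).
TIMESTAMP_ACCURACY_MS = 2 * 60 * 1000
AMBIGUITY_WINDOW_MS = 2 * TIMESTAMP_ACCURACY_MS


def dedupe(events):
    """Drop duplicate deliveries, keyed by session and event type."""
    seen = set()
    unique = []
    for event in events:
        key = (event["sessionIdentifier"], event["eventType"])
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


def connection_status(events):
    """Return 'connected', 'disconnected', 'uncertain' or 'unknown'
    for the stored events of a single client."""
    unique = dedupe(events)
    if not unique:
        return "unknown"

    # Assumption: a session with a disconnected event is over, so its
    # connected event no longer says anything about the current state.
    closed = {e["sessionIdentifier"] for e in unique
              if e["eventType"] == "disconnected"}
    candidates = [e for e in unique
                  if e["eventType"] == "disconnected"
                  or e["sessionIdentifier"] not in closed]

    latest = max(candidates, key=lambda e: e["timestamp"])
    # A contradicting event inside the window could really be the later one.
    ambiguous = any(
        e["eventType"] != latest["eventType"]
        and latest["timestamp"] - e["timestamp"] < AMBIGUITY_WINDOW_MS
        for e in candidates
    )
    return "uncertain" if ambiguous else latest["eventType"]
```

For the two example events shown earlier, which share a sessionIdentifier, this returns "disconnected" regardless of the order in which they arrive. Note that versionNumber is deliberately ignored here because of the reset behavior.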
Processing lifecycle events with topic rules
Topic rules run for every message published to a topic that matches the topic filter.
A rule consists of an SQL statement: the FROM clause defines the topic filter, an optional WHERE clause further restricts the events, and the SELECT clause transforms them.
A topic rule that handles both connection and disconnection events:
SELECT * as event, timestamp, versionNumber, topic(4) as eventType, topic(5) as clientId
FROM '$aws/events/presence/+/+'
WHERE topic(4) = 'connected' or topic(4) = 'disconnected'
The first + matches both connected and disconnected. These are the only two presence events now, but it's a good idea to make sure that if AWS adds others, this rule won't match them. That's what the WHERE clause does: topic(4) = 'connected' or topic(4) = 'disconnected'.
topic(4) is the fourth (notice the 1-indexing) segment of the topic, which is the first +. The statement also maps it to eventType, as I've found it a best practice to map wildcards to friendlier names in topic rules.
The second + matches every clientId. When the client ID is the thing name, which AWS recommends and an IoT policy can enforce, the topic rule knows which thing connected or disconnected. So it maps topic(5) to clientId.
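To make the 1-indexing and the two wildcards concrete, here is a small Python sketch that imitates what the rule does with a topic name. It is only an illustration of the matching logic described above, not how IoT Core implements it, and topic_segment and rule_matches are hypothetical helper names:

```python
from typing import Optional


def topic_segment(topic: str, n: int) -> Optional[str]:
    """Imitate topic(n): the n-th segment of the topic, counted from 1."""
    segments = topic.split("/")
    return segments[n - 1] if 1 <= n <= len(segments) else None


def rule_matches(topic: str) -> bool:
    """Imitate FROM '$aws/events/presence/+/+' plus the WHERE clause."""
    segments = topic.split("/")
    # Each + stands for exactly one segment, so the topic needs five of them.
    if len(segments) != 5 or segments[:3] != ["$aws", "events", "presence"]:
        return False
    # The WHERE clause: only the two known presence event types pass.
    return topic_segment(topic, 4) in ("connected", "disconnected")


topic = "$aws/events/presence/connected/thing_73d6f002860b417b"
print(rule_matches(topic))      # True
print(topic_segment(topic, 4))  # connected
print(topic_segment(topic, 5))  # thing_73d6f002860b417b
```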
Storing the events
Topic rule actions open the rest of the AWS cloud. They can call a Lambda function, publish a message to SNS or SQS, put it to S3 or DynamoDB, call an HTTP endpoint, and so on.
In this example, we'll put the data into a DynamoDB table. Storing the raw events is usually a best practice for later debugging.
As usual, IoT Core needs an IAM role configured for this topic rule that allows the dynamodb:PutItem action for the table:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "dynamodb:PutItem",
      "Resource": "arn:aws:dynamodb:..."
    }
  ]
}
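The rule, the role, and the table can be tied together in several ways (console, infrastructure-as-code, SDK). As one possible sketch, the Python code below builds a payload for the boto3 IoT client's create_topic_rule call. It assumes the dynamoDBv2 rule action, which writes each selected attribute to its own column; the article does not say which of the two DynamoDB actions is used, so treat that choice as an assumption. The function names and all arguments are placeholders, and boto3 is a third-party package that is imported only when the rule is actually created.

```python
def build_rule_payload(sql: str, table_name: str, role_arn: str) -> dict:
    """Build the topicRulePayload for a rule that stores results in DynamoDB."""
    return {
        "sql": sql,
        "awsIotSqlVersion": "2016-03-23",
        "actions": [
            {
                # Assumption: the dynamoDBv2 action (one column per attribute).
                "dynamoDBv2": {
                    # Role that IoT Core assumes; needs dynamodb:PutItem.
                    "roleArn": role_arn,
                    "putItem": {"tableName": table_name},
                }
            }
        ],
    }


def create_presence_rule(rule_name: str, sql: str,
                         table_name: str, role_arn: str) -> None:
    """Create the topic rule; requires boto3 and AWS credentials."""
    import boto3  # third-party, imported lazily so the builder stays testable

    iot = boto3.client("iot")
    iot.create_topic_rule(
        ruleName=rule_name,
        topicRulePayload=build_rule_payload(sql, table_name, role_arn),
    )
```

With the dynamoDBv2 action, the table's key attributes have to be present in the rule's output, so the key schema needs to line up with the names chosen in the SELECT clause.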
After the devices connect and disconnect, the items accumulate in the table:
Conclusion
Lifecycle events are powerful constructs to detect when devices interact with IoT Core. With this, it is possible to create a dashboard with up-to-date statuses and send notifications on changes.
But watch out for event ordering and duplication. As usual with MQTT, you can't rely on accurate timestamps or exactly-once delivery. Things usually happen as expected, but there can be edge cases.