MQTT-TLS can now be used with Amazon IoT

Hi guys,

I updated the MQTT-TLS library to 0.2.12. This version works on 0.6.3 (default), 0.7.x, and 0.8.x firmware.
The library can also connect to Amazon IoT (TLS) using a private key.

Here is a sample video.

I tested this version against the test.mosquitto.org, iot.eclipse.org, and AWS IoT MQTT servers.
I hope this library is helpful for developers building IoT projects on the Photon.

Regards,
Hirotaka

Thank you! Very kind :slight_smile:

Great work!

This is a really amazing lib, thanks for the efforts @hirotakaster!

@hirotakaster, I will test this during the weekend as well. I have been looking for an AWS compliant library as well. Thanks for the effort!!!

@hirotakaster, have you tried this in threaded mode as well? It would be great if the device could be doing other work while trying to connect to the internet in case of network issues.

@GrtVHecke I have not tested it in threaded mode. Sorry, this library disables threading support in mbedTLS. (FYI: the mbedTLS used for DTLS in the Particle firmware has threading disabled too.)

Ah, that may explain why my device was locking up.

Is there anything else we should be aware of? :slight_smile:

It's hard to point out anything specific because there are so many application use cases.
If you have any requests or run into problems, I'd be happy if developers reported them or sent a pull request with a source code diff on GitHub.

Just making sure I understand correctly, are we talking about this:

SYSTEM_THREAD(ENABLED)
becoming
SYSTEM_THREAD(DISABLED)

Or a different type of multi-threading?

I mean the "MBEDTLS_THREADING_C" option in mbedTLS. This library does not set SYSTEM_THREAD(DISABLED).

Thanks @hirotakaster,

Three questions I'm hoping you can help with:

  1. How do you set a message to QoS level 0?
  2. How do you set a time out of 1 minute on the connection?
  3. Does the library act in a non-blocking way when the connection is lost?
    i.e. Are the commands below non-blocking
  • client.isConnected()
  • client.publish
  • client.subscribe
  • client.loop
  • client.connect

Thanks in advance

Handy tips for using AWS MQTT Brokers.

TLDR: There are messaging limits you need to be aware of. This is mainly useful for people with > 20 devices, devices that send a lot of messages, devices that drop off networks regularly, or devices that must succeed in sending publish events.

I spent a good 2 hours on the phone today with two AWS specialists (Who were awesome and extremely helpful). There are a few limitations you may need to be aware of as a potential future user of AWS MQTT.

1. Standard Limits (these can be increased if you ask nicely and let them know why)

  • Default inbound publishes (device > AWS) are capped at 3000/second per account per region
  • Default outbound publishes (AWS > device) are capped at 6000/second per account per region
  • 100 publishes per second, per client (much higher than the 1/second with Particle, even with bursting up to 4 in 1 second over a 4 second period).

2. The limits that may catch you out (they cannot be increased as of 8 March 2018)

  • In-flight messages that are yet to be acknowledged are capped at 100 at any point in time.
    However: Not every message is acknowledged (QoS 0 messages are not acknowledged).
  • Device shadows are limited to 10 publishes / second (if you have 1000+ devices, this is feasible to hit). There seems to be no way around this other than writing some Lambda code that caches device states in a separate database and then slowly updates the device shadows.
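
One way to stay under a publishes-per-second limit like the ones above is to throttle on the sending side with a token bucket. This is only a minimal sketch under my own assumptions: `TokenBucket` and its parameters are hypothetical names (not part of AWS IoT or MQTT-TLS), and the caller passes in the current time in milliseconds (for example from millis() on a Photon).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of AWS IoT or MQTT-TLS): a token bucket that
// allows on average `ratePerSec` publishes per second, with bursts of up to
// `capacity` publishes.
class TokenBucket {
public:
    TokenBucket(double ratePerSec, double capacity, uint32_t nowMs)
        : rate_(ratePerSec), capacity_(capacity), tokens_(capacity), lastMs_(nowMs) {}

    // Returns true if one publish may be sent at time nowMs (and consumes a
    // token); returns false if the caller should wait and try again later.
    bool tryAcquire(uint32_t nowMs) {
        // Unsigned subtraction stays correct across a millisecond-counter rollover.
        uint32_t elapsedMs = nowMs - lastMs_;
        lastMs_ = nowMs;
        tokens_ = std::min(capacity_, tokens_ + rate_ * elapsedMs / 1000.0);
        if (tokens_ < 1.0) return false;
        tokens_ -= 1.0;
        return true;
    }

private:
    double rate_;
    double capacity_;
    double tokens_;
    uint32_t lastMs_;
};
```

For the shadow limit you would construct it as `TokenBucket(10.0, 10.0, now)` and only publish when `tryAcquire(now)` returns true. Note that this throttles a single sender; the shadow limit across many devices still needs something on the cloud side.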

In-flight messages (be careful)

  • It is ideal for the broker to 'acknowledge' that a message is received; otherwise your device doesn't know whether it should try again or assume it was sent. This is a problem for critical applications (e.g. over-temperature sensors).
  • What is an 'in-flight' message? It is one where either the client or the broker has yet to fully respond to the acknowledgement request (this occurs when you send QoS level 1 messages).
  • 'In-flight' messages do not show up in the AWS CloudWatch metrics list, so you can't be alerted to the fact that you have lost messages.
  • When client devices lose connectivity and don't disconnect from the broker, the broker will hold the messages for an hour and keep retrying to send them (this consumes part of your 100 in-flight limit if they are QoS 1).

So how do we avoid losing messages?

  • Set the timeout of your devices very low (< 1 minute). This way, when a client device drops off the network, the messages waiting for it in the queue are dropped quickly too, and hopefully you don't exceed the 100 limit.
  • Don't use QoS 1. It seems the best way to ensure your message is acknowledged by the broker is to:
  1. Pass a unique message number in each 'publish'.
  2. Have AWS IoT, when it gets your first message, republish your original message to your device with the code.
  3. Have your device retry the message if it doesn't get the unique message code back within a certain timeframe (e.g. 5 seconds). Have the code send a non-AWS cloud alert on failure (e.g. Particle Cloud or an on-board 3rd-party SIM SMS).
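
The device-side bookkeeping for the three steps above can be sketched roughly like this. It is an illustration only, with my own assumptions: `AckTracker` and all of its methods are hypothetical names, the actual publish and the republish rule on the AWS side are left out, and time is passed in as milliseconds.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: tracks publishes that are still waiting for their
// unique message number to be echoed back by the broker.
class AckTracker {
public:
    AckTracker(uint32_t timeoutMs, int maxRetries)
        : timeoutMs_(timeoutMs), maxRetries_(maxRetries) {}

    // Step 1: record an outgoing publish and return its unique message number.
    uint32_t track(const std::string& payload, uint32_t nowMs) {
        uint32_t id = nextId_++;
        pending_[id] = Pending{payload, nowMs, 0};
        return id;
    }

    // Step 2: call when the republished message number arrives on the device.
    // Returns true if it matched a pending publish.
    bool acknowledge(uint32_t id) { return pending_.erase(id) > 0; }

    // Step 3: collect overdue message numbers. Those still under the retry
    // budget go to `retry` (their timer restarts); the rest go to `failed`
    // and are dropped, so the caller can raise an alert through another channel.
    void poll(uint32_t nowMs, std::vector<uint32_t>& retry, std::vector<uint32_t>& failed) {
        for (auto it = pending_.begin(); it != pending_.end();) {
            Pending& p = it->second;
            if (nowMs - p.sentMs < timeoutMs_) { ++it; continue; }
            if (p.retries < maxRetries_) {
                ++p.retries;
                p.sentMs = nowMs;
                retry.push_back(it->first);
                ++it;
            } else {
                failed.push_back(it->first);
                it = pending_.erase(it);
            }
        }
    }

    // Payload to resend for a pending message number, or nullptr if unknown.
    const std::string* payload(uint32_t id) const {
        auto it = pending_.find(id);
        return it == pending_.end() ? nullptr : &it->second.payload;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    struct Pending { std::string payload; uint32_t sentMs; int retries; };
    std::map<uint32_t, Pending> pending_;
    uint32_t timeoutMs_;
    int maxRetries_;
    uint32_t nextId_ = 1;
};
```

With `AckTracker tracker(5000, 2)`, a message is sent once, retried up to twice at 5 second intervals, and then reported in `failed`. The message number has to be carried inside the payload (or topic) so the republish rule can send it back.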

Hope this helps.

If you want a guide on how to setup acknowledgement requests by republishing, check this guide I made: https://github.com/CameronTurner/AWSQoS1WorkAround

The instructions need some polish, but it is all there. Message if you need help.

Hi @Cameron,

  1. How do you set a message to QoS level 0?

You can use this function:
bool publish(const char *, const char *, EMQTT_QOS, uint16_t *messageid = NULL);

Like this:
client.publish("outTopic/message", "hello world", MQTT::QOS0);

  2. How do you set a time out of 1 minute on the connection?

This library sends an MQTT keep-alive ping every 15 seconds.
The TCP-layer timeout depends on the Particle API (TCPClient).

  3. Does the library act in a non-blocking way when the connection is lost?

When the MQTT or TCP connection is lost or fails, you can detect it in loop() because client.isConnected() returns false (see the sample source).
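
If isConnected() returns false, one option for keeping loop() responsive is to limit how often a reconnect is attempted. The following is a minimal sketch under my own assumptions: `ReconnectBackoff` is a hypothetical helper, not part of MQTT-TLS, and the time is passed in as milliseconds. It only spaces out the attempts; it does not make client.connect() itself non-blocking.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of MQTT-TLS): decides when the next connect
// attempt is allowed, so loop() can do other work between attempts.
class ReconnectBackoff {
public:
    // Keep maxMs well below 2^31 so doubling the delay cannot overflow.
    ReconnectBackoff(uint32_t initialMs, uint32_t maxMs)
        : initialMs_(initialMs), maxMs_(maxMs), delayMs_(initialMs) {}

    // True if a connect attempt should be made at nowMs. The first attempt is
    // allowed immediately; each later attempt doubles the wait, up to maxMs.
    bool shouldAttempt(uint32_t nowMs) {
        if (started_ && nowMs - lastAttemptMs_ < delayMs_) return false;
        if (started_) delayMs_ = std::min(maxMs_, delayMs_ * 2);
        started_ = true;
        lastAttemptMs_ = nowMs;
        return true;
    }

    // Call once the client reports that it is connected again.
    void reset() {
        started_ = false;
        delayMs_ = initialMs_;
    }

    uint32_t currentDelayMs() const { return delayMs_; }

private:
    uint32_t initialMs_;
    uint32_t maxMs_;
    uint32_t delayMs_;
    uint32_t lastAttemptMs_ = 0;
    bool started_ = false;
};
```

In loop() the idea would be: if client.isConnected() is true, call client.loop() and backoff.reset(); otherwise call client.connect("sparkclient") only when backoff.shouldAttempt(millis()) returns true.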

Great, so as long as we run client.isConnected() first, there is no blocking?

What's the time out on a Publish event and a Connect event if it disconnects / is disconnected / can't connect?

Thanks!

I've also noticed that when using this within a larger set of code, the device will not be able to connect for OTA updates.

If I trigger client.disconnect() from the cloud, then it seems to allow OTA updates again.

I'm wondering if the 15 second keep-alive ping is blocking access for the other OTA update threads, or limiting the device resources so much that the OTA can't complete before a time out, or otherwise results in it blocking somehow?

The TCPClient read may be blocking the OTA. Check how the OTA update behaves while tcpclient.read() is running, and check the firmware source code.
The MQTT isConnected() method is non-blocking (see the library source code); it just checks the tcpclient.connected() status.

I'm thinking of extending this out to 5 minutes, that should hopefully reduce the load on the TCP connection yeah?

Where is the best place to change this correctly? (there are a lot of files to go through and I haven't been able to find any documentation yet) :slight_smile:

The MQTT timeout can be changed via MQTT_KEEPALIVE.
For the TCP connection timeout, you would have to change the Particle firmware source code.
(Something like this thread, but I have not tested it: TCPClient Connectivity & Long Timeout Killing UX)

@hirotakaster Thanks for this library. I'm trying to reproduce your AWS IoT test, but seeing issues with the minimal example case. I used this process:

  1. Create keys from AWS IoT

  2. Create a project and include @hirotakaster's MQTT-TLS library -> project.properties:
    dependencies.MQTT-TLS=0.2.12

  3. Copy a2-example.ino from library

  4. Copy certificate from AWS into library .ino file:

I am assuming the "root CA for AWS IoT" from Symantec goes into:
#define AMAZON_IOT_ROOT_CA_PEM

"certificate for this thing" goes into:
#define CELINT_KEY_CRT_PEM

"A private key" goes into:
#define CELINT_KEY_PEM
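
When pasting PEM files into string macros like these, it is easy to lose a line break or one of the BEGIN/END marker lines, so a quick sanity check before attempting the connection can rule that out. This is only a sketch under my own assumptions: `looksLikePem` and `EXAMPLE_CERT_PEM` are hypothetical names, the body is a placeholder rather than a real certificate, and I am assuming each macro holds the PEM text as one C string with "\r\n" at the end of every line. It checks the shape of the text only, not whether the certificate is valid.

```cpp
#include <cstddef>
#include <string>

// Placeholder PEM text, NOT a real certificate. Each line of the pasted file
// becomes one string literal that ends in "\r\n".
#define EXAMPLE_CERT_PEM \
"-----BEGIN CERTIFICATE-----\r\n" \
"UExBQ0VIT0xERVI=\r\n" \
"-----END CERTIFICATE-----\r\n"

// Hypothetical sanity check (not part of MQTT-TLS): true if `pem` starts with
// the "-----BEGIN <label>-----" marker, ends with the matching END marker
// (followed only by whitespace), and has a line break between the markers.
bool looksLikePem(const std::string& pem, const std::string& label) {
    const std::string begin = "-----BEGIN " + label + "-----";
    const std::string end = "-----END " + label + "-----";
    if (pem.compare(0, begin.size(), begin) != 0) return false;
    std::size_t endPos = pem.rfind(end);
    if (endPos == std::string::npos || endPos <= begin.size()) return false;
    // Only spaces and line breaks may follow the END marker.
    if (pem.find_first_not_of(" \r\n", endPos + end.size()) != std::string::npos) return false;
    // There must be a line break somewhere between the two markers.
    std::size_t firstBreak = pem.find('\n', begin.size());
    return firstBreak != std::string::npos && firstBreak < endPos;
}
```

For example, `looksLikePem(EXAMPLE_CERT_PEM, "CERTIFICATE")` is true, while the same text pasted as a single line without "\r\n" fails the check. The label for the private key depends on what the key file's first line says.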

  5. Replace server URL:
    MQTT client("abc123myurlgoeshere.iot.us-west-2.amazonaws.com", 8883, callback);

  6. Then, following your video, I should be able to test by subscribing to the outTopic/message topic.

First, is all of this correct? Second, when I attempt this on a Photon, compiling in the Desktop IDE:

  • client.connect("sparkclient"); executes but does not result in a connection
  • client.isConnected() is always false

After adding debug code all through MQTT-TLS.cpp, I find that the TLS handshake is successful, but then something happens while waiting for MQTT to connect.

Can you shed some light here? Without additional code, this library should execute with just our personal AWS keys added, right? Also, the execution seems to stop in a way that has me suspecting memory issues, but can I call System.freeMemory() from the library somehow to confirm as I learn what's happening? Prior to the connection attempt, System.freeMemory() returns 28632.

Lastly, for anyone else looking at Google IoT Core: I'm doing this initially with AWS IoT in order to facilitate larger file transfers through MQTT payloads (greater than Particle's 255- or 256-byte packets allow), and because it looks like @hirotakaster has got it working there. Ultimately, I'd prefer using Google IoT Core MQTT (and I wish this was built into Particle in a way that allowed 50kb+ transfers without struggling through HTTPS or MQTT-TLS integrations). I'm hoping to eventually transition to Google IoT Core with MQTT-TLS, but I suspect there are other authentication complications (i.e. a JSON Web Token requirement? [JWT](https://cloud.google.com/iot/docs/how-tos/credentials/jwts)). If you are also working on something like this, please reach out to me.

@ian.c
Yes, please use cert.pem and private.key for CELINT_KEY_CRT_PEM and CELINT_KEY_PEM.
Next, it would be a good idea to check the AWS IoT Core certificate policy and its ARN rules.

I want to test on Google IoT Core but have not been able to yet. Google IoT Core is currently in private beta, and I could not join as a beta user.