I have a server-side application that needs to know within about a minute when a Cellular Particle (B523) is no longer “online”.
I understand so far that the cellular devices work with UDP and this use case is probably not optimal. Nevertheless, I would like to verify the possibilities.
I probably can’t work with the “spark/status” event, since this doesn’t trigger “offline” with the Gen 3 devices.
One idea would be to use Particle.keepAlive() and query the “last_heard” status. So far I have only managed this via the API and not via a webhook, so I would have to constantly poll the status of every device.
Another idea would be to use a custom event and forward it to my application via WebHook. Within the application, the “offline” event is triggered after a certain time window if no WebHook was received.
My “dream” would be to recognize within the particle cloud when a device is not online and only then to send a WebHook to my application.
Ideally, the whole thing should consume as little data as possible.
Does anyone have a good idea how my project can be managed most sensibly?
I hope it’s all right to tag @rickkas7 here, because I know that he has experience with such things.
An offline event is generated for all devices, but for cellular and Gen 3 devices it may take a half hour for it to occur. The reason is that the default keep-alive is 23 minutes, so there’s no way to know if the device went away or not, as it’s not transmitting data.
While you could reduce the keep-alive, this has two side-effects:
Each keep-alive uses 122 bytes of data. If you set it to 60 seconds you’ll use about 5.5 MB of data per month just in keep-alive pings.
Polling the last-heard will only work with a small number of devices. Eventually you will hit the API rate limit to poll each device separately.
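For anyone who wants to check the keep-alive arithmetic, here is a minimal sketch. The 122 bytes per ping is the figure quoted above; the 31-day month and the function name are my own assumptions, and the result ignores any other traffic the device generates.

```python
BYTES_PER_KEEPALIVE = 122  # per-ping figure quoted in this thread


def monthly_keepalive_bytes(interval_seconds, days=31,
                            bytes_per_ping=BYTES_PER_KEEPALIVE):
    """Estimate the bytes used per month by keep-alive pings alone."""
    pings = (days * 24 * 60 * 60) // interval_seconds
    return pings * bytes_per_ping


if __name__ == "__main__":
    # Compare a 60 second keep-alive with the 23 minute default.
    for interval in (60, 23 * 60):
        megabytes = monthly_keepalive_bytes(interval) / 1_000_000
        print(f"{interval:>5} s keep-alive: {megabytes:.2f} MB/month")
```

With a 60 second interval and a 31-day month this comes to 5,446,080 bytes, which matches the roughly 5.5 MB mentioned above.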
The best way is to have the device publish an event (with NO_ACK) periodically. You can get these either using a webhook, or my preference, using Server Sent Events.
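To give an idea of what receiving the events over SSE involves, here is a rough sketch of a parser for the generic Server Sent Events text format (`event:` and `data:` fields, with a blank line ending each event). The event name `heartbeat` and the JSON payload with a `coreid` field are assumptions for illustration; check the actual stream from your account before relying on them. Opening the HTTP connection itself is left out.

```python
import json


def parse_sse(lines):
    """Yield (event_name, data) tuples from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event or "message", "\n".join(data)
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment line, often sent to keep the connection open
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


if __name__ == "__main__":
    # Hypothetical sample of what a stream might deliver.
    sample = [
        ":ok\n",
        "\n",
        "event: heartbeat\n",
        'data: {"data":"1","coreid":"abc123"}\n',
        "\n",
    ]
    for name, payload in parse_sse(sample):
        print(name, json.loads(payload)["coreid"])
```

In a real service you would feed this the decoded lines of the long-lived HTTP response and pass each device ID on to whatever tracks the heartbeats.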
Since you’re going to need some sort of service for keeping track of which devices missed their pings anyway, SSE is the easiest way to receive these quickly and efficiently for small-to-medium scale. One benefit during prototyping is that your SSE server can be behind a firewall and NAT and the connection is encrypted but you don’t need a TLS/SSL server certificate.
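The service that keeps track of missed pings can be quite small. The following is only a sketch under my own assumptions: the class and method names are made up, and the 150 second timeout (2.5 times a 60 second heartbeat) is chosen so that a single lost NO_ACK publish does not immediately mark the device offline.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat per device and report newly offline devices."""

    def __init__(self, timeout_seconds=150.0):
        # Assumed value: 2.5 x a 60 s heartbeat, so one lost publish is tolerated.
        self.timeout_seconds = timeout_seconds
        self._last_seen = {}
        self._offline = set()

    def record(self, device_id, now=None):
        """Store a heartbeat; return True if the device just came back online."""
        now = time.monotonic() if now is None else now
        self._last_seen[device_id] = now
        if device_id in self._offline:
            self._offline.discard(device_id)
            return True
        return False

    def sweep(self, now=None):
        """Return devices that crossed the timeout since the previous sweep."""
        now = time.monotonic() if now is None else now
        newly_offline = [
            device_id
            for device_id, seen in self._last_seen.items()
            if device_id not in self._offline
            and now - seen > self.timeout_seconds
        ]
        self._offline.update(newly_offline)
        return sorted(newly_offline)
```

Call `record()` for every heartbeat event that arrives (via SSE or a webhook) and `sweep()` on a timer, for example every 10 seconds. Each device is reported once when it goes offline, and `record()` tells you when it returns.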
The use of SSE instead of webhooks seems absolutely sensible to me. Never heard about that before, but I will definitely take a closer look at it!
You say a keep-alive of 60 seconds results in 5.5 MB of data a month. I am aware of this and would accept it for now.
Would publishing an event (with NO_ACK) every 60 seconds consume less data? I couldn’t find anything specific about this…
One more thing: you say that an offline event is generated for all devices. I can reproduce that, even though it usually takes more than an hour in my tests. That does make some sense to me with the default keep-alive of 23 minutes.
What I cannot really understand is why it takes the same time if I set the keep-alive to 60 seconds.
In my understanding, the offline event should be triggered on the cloud side after about 1.5 to 2.5 times the period configured as the client’s keep-alive, or am I wrong?
It would be great if you could give me feedback again!
One question still came up in my tests: Does the Cellular.getDataUsage() API work with the B523? I could not find any clear indication of this in the documentation and attempts to call this API have so far been unsuccessful.
I would be very happy to receive feedback on this and the questions from the previous post!
Thanks a lot @rickkas7!
Now that I think about it, I think you might need to miss two keep-alive pings for the device to be marked offline. Since the offline indicator doesn’t know what keep-alive you’ve set, it assumes the default setting, so 46 minutes to an hour doesn’t seem unreasonable to me.
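To make the two readings concrete, here is a small sketch that compares the timeout expected in the question (1.5 to 2.5 times the configured keep-alive) with the guess above (two missed pings at the default 23 minutes, regardless of what the device has configured). Both functions only model the guesses made in this thread; neither is confirmed cloud behaviour.

```python
DEFAULT_KEEPALIVE_MINUTES = 23  # default quoted earlier in this thread


def expected_window_minutes(configured_keepalive_minutes):
    """The asker's expectation: 1.5x to 2.5x the configured keep-alive."""
    return (1.5 * configured_keepalive_minutes,
            2.5 * configured_keepalive_minutes)


def guessed_timeout_minutes(configured_keepalive_minutes, missed_pings=2):
    """The guess above: the cloud always assumes the default keep-alive."""
    # The configured value is deliberately ignored; that is the hypothesis.
    del configured_keepalive_minutes
    return missed_pings * DEFAULT_KEEPALIVE_MINUTES


if __name__ == "__main__":
    for keepalive in (1, 23):
        print(keepalive, expected_window_minutes(keepalive),
              guessed_timeout_minutes(keepalive))
```

Under the second reading, a 60 second keep-alive and the default both give the same 46 minute minimum, which would fit the observation that shortening the keep-alive did not speed up the offline event.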
Because keep-alive pings use ACK mode, using NO_ACK would cut the data usage roughly in half.
I know from MQTT, for example, that the clients inform the broker of their keep-alive when connecting, and the broker then sets the offline timeout accordingly to 1.5 times this period.
I would have guessed that the Particle keep-alive might work similarly.
I will probably go the way you described in your first answer. That seems the most sensible to me!