Electron 0.5.3, Reconnecting loses subscriptions


#1

Hey guys,

I’ve noticed that occasionally (not always) when my Electron devices lose their connection for a while, they don’t seem to re-subscribe. I know this was fixed as a bug in 0.4.x. I’m using FW 0.5.3 at the moment, and I’m seeing this behavior now and then in production.

I know there are only 4 slots for subscriptions. We only subscribe once, on first connect to the cloud, but I’d like to defend against this kind of failure. We’re running with the system thread enabled in SEMI_AUTOMATIC mode, and we have logic to reconnect to the cloud after a sleep or disconnect, but it does not currently re-subscribe if we’ve previously set up subscriptions.

Is there a safe way of re-subscribing? If I call Particle.disconnect() and then Particle.unsubscribe() while disconnected from the cloud, can I safely call Particle.subscribe() again? The devices publish messages fine but never receive any from the cloud, which is a big problem.

Thanks!


#2

Yes you can.

How does that reconnection logic work?

Why are you still using the dated 0.5.3? 0.6.4 has proven to be the most stable version for Electrons (to date).

Background:
The device OS keeps track of the functions, variables and subscriptions that were registered. When the connection is lost and regained, a short handshake is performed in which some kind of hash that represents the registered items is checked. If the hash matches, the cloud should still know about them and no further action needs to be taken. If the hash doesn’t match, a more elaborate “negotiation” takes place to re-register everything, without any need for the application code to do that.
If a subscription doesn’t work after a reconnect something else is wrong - e.g. a 0.5.3 bug that was fixed already.
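
To illustrate the idea only, here is a minimal sketch of such a hash check in plain C++. It is not the actual Device OS or cloud implementation: the FNV-1a checksum is a stand-in for whatever hash is really used, and `registrationHash`, `CloudSession` and `onReconnect` are hypothetical names made up for this example.

```cpp
#include <cstdint>
#include <set>
#include <string>

// Stand-in checksum (32-bit FNV-1a) over the sorted set of registered names.
// Assumption: the real hash used by Device OS is not specified in this thread.
uint32_t registrationHash(const std::set<std::string>& names) {
    uint32_t h = 2166136261u;
    for (const std::string& name : names) {
        for (unsigned char c : name) {
            h ^= c;
            h *= 16777619u;
        }
        // Mix in a separator byte so name boundaries contribute to the hash.
        h ^= 0xFFu;
        h *= 16777619u;
    }
    return h;
}

// Hypothetical model of what the cloud remembers about one device.
struct CloudSession {
    std::set<std::string> known;

    // Returns true if a full re-registration was needed, false if the
    // hashes matched and the existing registrations were kept.
    bool onReconnect(const std::set<std::string>& deviceItems) {
        if (registrationHash(known) == registrationHash(deviceItems)) {
            return false;
        }
        known = deviceItems;  // the more elaborate "negotiation"
        return true;
    }
};
```

In this model the first connect triggers a full registration, and a later reconnect with the same items is a cheap hash comparison that changes nothing.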


#3

Thanks!

Right now Particle.connected() returns true for us even if a device is offline. It seems to update when we attempt to publish, so for now we’re checking the results of publishes to decide whether we need to reconnect. We also always reconnect after sleeping.
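
A minimal sketch of that kind of check, written as plain C++ so it can be tested off-device. `PublishHealth` and its failure threshold are hypothetical names chosen for illustration, not our production code; the caller is assumed to pass in the result of each publish attempt.

```cpp
#include <cstdint>

// Hypothetical helper: decides when failed publishes justify a reconnect.
class PublishHealth {
public:
    explicit PublishHealth(uint8_t maxFailures) : maxFailures_(maxFailures) {}

    // Feed in the success/failure result of each publish attempt.
    // Returns true when the caller should disconnect and reconnect.
    bool record(bool publishOk) {
        if (publishOk) {
            failures_ = 0;      // any success clears the streak
            return false;
        }
        if (++failures_ < maxFailures_) {
            return false;       // not enough consecutive failures yet
        }
        failures_ = 0;          // start a fresh streak after signalling
        return true;
    }

private:
    uint8_t maxFailures_;
    uint8_t failures_ = 0;
};
```

Requiring several consecutive failures, rather than one, avoids tearing down the session over a single dropped publish.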

I think the bigger plan is to add an acknowledgement from our cloud, a sort of quick back-and-forth message that tests our subscriptions. If the device doesn’t get the publishes from the cloud we will disconnect and reconnect, re-subscribing in the process.
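
A rough sketch of how that acknowledgement check could be structured, as a small state machine in plain C++. Everything here is an assumption for illustration: `SubscriptionWatchdog`, the `Action` values and both timing parameters are hypothetical, and the firmware would supply the clock (e.g. millis()) and carry out the actions itself.

```cpp
#include <cstdint>

// Hypothetical watchdog: pings the cloud periodically and asks for a
// re-subscribe when the acknowledgement does not arrive in time.
class SubscriptionWatchdog {
public:
    enum class Action { None, SendPing, Resubscribe };

    SubscriptionWatchdog(uint32_t pingIntervalMs, uint32_t ackTimeoutMs)
        : pingIntervalMs_(pingIntervalMs), ackTimeoutMs_(ackTimeoutMs) {}

    // Call from the subscription handler when the cloud's ack arrives.
    void onAck() { awaitingAck_ = false; }

    // Call regularly with a millisecond clock. Unsigned subtraction keeps
    // the comparisons correct across clock wrap-around.
    Action poll(uint32_t nowMs) {
        if (awaitingAck_) {
            if (nowMs - pingSentMs_ >= ackTimeoutMs_) {
                awaitingAck_ = false;
                pingSentMs_ = nowMs;  // wait a full interval before pinging again
                return Action::Resubscribe;
            }
            return Action::None;
        }
        if (!started_ || nowMs - pingSentMs_ >= pingIntervalMs_) {
            started_ = true;
            awaitingAck_ = true;
            pingSentMs_ = nowMs;
            return Action::SendPing;
        }
        return Action::None;
    }

private:
    uint32_t pingIntervalMs_;
    uint32_t ackTimeoutMs_;
    uint32_t pingSentMs_ = 0;
    bool awaitingAck_ = false;
    bool started_ = false;
};
```

On SendPing the firmware would publish the test event with Particle.publish(); on Resubscribe it would run the Particle.disconnect(), Particle.unsubscribe(), Particle.subscribe(), Particle.connect() sequence discussed above.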

This is an edge case for us: it works 99% of the time, but we need to protect the 1% of devices that don’t seem to perform this re-subscribe. They silently stop receiving messages, even though the Particle API returns a 200 OK when we publish to them and the devices themselves keep publishing happily. We’ve noticed this behavior is often preceded by a period of being offline.

We’re working on moving to 0.6.4 and 0.7.0 in production.
It’s proven to be a slow process: running appropriate testing takes a lot of time.