Losing Spark.subscribe() connection/subscription

hi all,

I have two cubes subscribed to each other's events via Spark.subscribe(), and every 2 minutes each publishes an event to the other via Spark.publish(). They are on different networks.

The setup works great for many hours, but after half a day or so one Spark stops processing the events published by the other. Both Sparks are still connected (I think a reconnect happens in between, because the Internet upstream reconnects each night).

If I am subscribing to the same event from elsewhere (webpage) I CAN see that the events are still successfully published by both Sparks, even in the broken state after some hours of uptime.

I further narrowed it down by publishing a debug event each time my event handler is called, which does not happen.

Therefore I know that both Sparks are still connected to the cloud and both are publishing events but somehow lost the Spark.subscribe() subscription and my event handler is no longer called.

Is this a known problem?
Is there anything necessary or available to periodically “refresh” a Spark.subscribe() subscription?
Could it be useful to periodically call Spark.subscribe() again with the same event?

I’ll take a run at this, but I could be way off!

I wonder if it’s the internet disconnect (and subsequent cloud disconnect) that is causing the subscription to be lost? That kinda makes sense to me, that the cloud drops the subscriptions once it sees the device has gone offline, but I would need to try it out to be sure. Maybe @dave knows!

You can have more interaction with the cloud connection by using SYSTEM_MODE(SEMI_AUTOMATIC) mode and with code like this in your loop:


if (!Spark.connected()) {
   Spark.connect();
   Spark.subscribe(...);  // setup subscriptions
}

So that whenever the connection to the cloud is dropped, your code knows about it and can re-register the subscriptions.

With multiple disconnects, the code will be adding multiple event handlers for the same event in the core, which may lead to some odd side-effects. But it would be good to know if this gives you a half-way fix.

I imagine a full fix is to have the firmware itself re-register the registered events on reconnect to the cloud.

Heyo!

The core firmware currently sends the ‘subscribe’ message to the cloud on the first handshake, and then doesn’t send them again when the connection is dropped / re-established. I would think the firmware would remember and send them again, but I think that subscribe workaround could work in the meantime.

The cloud can’t remember which subscriptions the core should have on a fresh connection; they need to be sent by the core.
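
To picture the behaviour described above, here is a small host-side model, not firmware code. FakeCloud and FakeDevice are made-up names for illustration only; the model simply assumes the cloud keeps subscriptions per connection while the device keeps its handlers locally.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the cloud: subscriptions live only as long
// as the current session.
struct FakeCloud {
    std::set<std::string> session_subscriptions;

    void on_subscribe(const std::string& name) { session_subscriptions.insert(name); }
    void on_new_session() { session_subscriptions.clear(); }  // fresh handshake starts empty
    bool would_deliver(const std::string& name) const {
        return session_subscriptions.count(name) > 0;
    }
};

// Hypothetical stand-in for the device: the list of event names with a
// local handler survives a reconnect.
struct FakeDevice {
    std::vector<std::string> registered;

    void subscribe(FakeCloud& cloud, const std::string& name) {
        registered.push_back(name);   // local registration
        cloud.on_subscribe(name);     // tell the cloud once
    }

    // What a firmware-side fix could do after a reconnect: replay the
    // names it already knows, without registering anything new locally.
    void resend_all(FakeCloud& cloud) const {
        for (const auto& name : registered) cloud.on_subscribe(name);
    }
};
```

In this model, calling on_new_session() makes would_deliver() return false until resend_all() runs, even though the device's own list never changed.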

Thanks!
David

Agreed! Issue filed - https://github.com/spark/core-firmware/issues/278

Yes, this would be harmful, since each call to Spark.subscribe() fills a new subscription slot in the firmware, and there are only 4 of these. So for now, it's best to register the event again only after a disconnect, once the core has reconnected.
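
As a rough illustration of why repeated calls run out of room, here is a host-side sketch of a fixed-size slot table. SlotTable and kMaxSlots are hypothetical names; the only detail taken from this thread is the limit of 4 slots, and the real firmware's data structure may look quite different.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Slot count as stated in this thread; everything else is an assumption.
constexpr std::size_t kMaxSlots = 4;

struct SlotTable {
    std::array<std::string, kMaxSlots> names;
    std::size_t used = 0;

    // Mimics a subscribe call that always takes a fresh slot, even when
    // the same event name is already present.
    bool add(const std::string& name) {
        if (used == kMaxSlots) return false;  // table is full
        names[used++] = name;
        return true;
    }
};
```

With this model, four add("light-up") calls succeed and the fifth returns false, which is the kind of failure periodic re-subscribing would eventually hit.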

This is made more complex because Spark.connect() doesn't work instantly. So it's best to detect a reconnection to the cloud like this:

void loop() {
   static bool prev_connected = true;
   bool connected = Spark.connected();
   if (connected && !prev_connected) {
       Spark.subscribe(...);
   }
   prev_connected = connected;
   if (!connected)
      Spark.connect();
}

This will still fail after 4 disconnects. To avoid filling up the subscription slots, we have to go lower level:

Declare

extern SparkProtocol spark_protocol;

at the top of the file, and in loop(), instead of using Spark.subscribe(), use

spark_protocol.send_subscription("eventName", SubscriptionScope::MY_DEVICES);

This should then work for any number of disconnects. Please note that this is a hack and it will be fixed in a future version of the firmware, at which point you should remove this code.
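
The reconnect check in loop() is an edge detector: it fires only on the pass where the connection goes from down to up. Here is that logic pulled out into a small class so it can be tried on a PC. ReconnectDetector is a hypothetical helper, not part of the firmware.

```cpp
#include <cassert>

// Hypothetical helper: remembers the previous connection state and
// reports the transition from "not connected" to "connected".
class ReconnectDetector {
public:
    explicit ReconnectDetector(bool initially_connected = true)
        : prev_(initially_connected) {}

    // Call once per loop() pass with the current connection state.
    // Returns true only on the pass where the link has just come back.
    bool update(bool connected_now) {
        const bool just_reconnected = connected_now && !prev_;
        prev_ = connected_now;  // must be updated on every pass
        return just_reconnected;
    }

private:
    bool prev_;
};
```

On the device, the idea would be to pass Spark.connected() into update() and resend the subscription when it returns true. Note the detector can only fire if some pass of loop() actually sees the disconnected state.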

Hope that helps!

Thanks again for your help.

I still have two questions concerning this solution:

a)
When using “spark_protocol.send_subscription”, do I still need to do one initial Spark.subscribe()? So “spark_protocol.send_subscription” just resends the subscription information to the cloud, while the internal connection to the handler is there anyway from the first call.

b)
Does the solution for detecting the disconnect work in system mode “automatic”?
The daily internet reconnect only lasts a few seconds (but involves a change of the public IP). So my concern was that my main loop would never see Spark.connected() being false.

No, it's not needed - the internal registration of the handler is still there, so we just update the cloud.

Yes, it works with automatic mode.

(Disclaimer - I've not tried this code - just posting from what I understand about how the system works in the hope it gives you a possible workaround until a fix is available in the firmware. Let me know how it goes!)

I can pretty much confirm that it’s a problem with any disconnect of the device from your router. I have a reed switch on one Spark controlling whether one LED is on or the other, and just to test it I left it plugged in for a day while I was at work. It was working fine when I left but when I got back, it only started working again when I manually hit the RESET button on the Spark Core that was Subscribed (Hitting reset on the publisher did nothing.)

I just unplugged my router momentarily and plugged it back in. Both cores went dark blue, then reconnected, and behold, it stopped working.

This seems like a SERIOUS issue that needs to be addressed, since the main claim to fame for the functionality of this product isn’t really viable if cloud functions stop working any time there’s a connection issue at all.

I attempted to use the code you provided as-is (including adding extern at the top of the file and all that) in the code I had, and as soon as I shut off my router the Spark Core started blinking green as if it was already attempting to connect to the Wi-Fi network, and stayed that way even after the internet was restored and the other Spark Core had gone through its normal re-connect. (The green flashing didn't stop until I RESET the core and then it connected normally and functionality returned to normal.)

After that didn't work, I figured it had something to do with how I simply dropped what you had into my loop without changing anything, so I moved the bracket down to include the second part:

void loop() { ...

   static bool prev_connected = true;
   bool connected = Spark.connected();
   if (connected && !prev_connected) {
       spark_protocol.send_subscription("light-up", SubscriptionScope::MY_DEVICES);
         //moved it from here
   prev_connected = connected;
   if (!connected)
      Spark.connect();
     }   // to here

// rest of the stuff I have going on etc
}

After resetting my router again, the status LED seemed to go through the normal steps except that it wasn't registering the publish event I had subscribed to anymore (stopped functioning just the same way it did during the first test I did before changing any code).

I noticed that spark_protocol.send_subscription("eventName", SubscriptionScope::MY_DEVICES); has no place to set up a handler (and it gave me an error when I tried to just add a comma and throw one in there). In my original code in void setup, I have

Spark.subscribe("light-up", ledOn, MY_DEVICES);

Where ledOn hooks up to a handler which does the stuff I want it to do, which I'm assuming is why it wouldn't work when re-connecting (I don't see a way for it to hook up with the handler in setup?)
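
One possible reason the moved-bracket version never resends, offered as a guess from reading the snippet rather than from testing on hardware: with the bracket moved, prev_connected is only updated inside the if, so it stays true forever and the condition can never fire. The host-side sketch below replays a connection trace through both variants. The function names are made up for illustration, and a counter stands in for the send_subscription call.

```cpp
#include <cassert>
#include <vector>

// Variant from earlier in the thread: state is updated on every pass.
int count_resends_original(const std::vector<bool>& trace) {
    bool prev_connected = true;
    int resends = 0;
    for (bool connected : trace) {
        if (connected && !prev_connected) {
            ++resends;  // stands in for send_subscription(...)
        }
        prev_connected = connected;
    }
    return resends;
}

// Variant with the closing bracket moved down: the state update now
// sits inside the if and only runs when the condition is already true.
int count_resends_moved_bracket(const std::vector<bool>& trace) {
    bool prev_connected = true;
    int resends = 0;
    for (bool connected : trace) {
        if (connected && !prev_connected) {
            ++resends;
            prev_connected = connected;
            // The reconnect call would also sit here, where it can
            // never run while disconnected.
        }
    }
    return resends;
}
```

For a trace of connected, disconnected, disconnected, connected, connected, the first function counts one resend and the second counts none.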

I’ll look at fixing this issue over the coming days.

FYI, I was having a similar problem with a spark that stops receiving subscribe/publish events. In my case it wasn’t due to the wifi disconnecting; the code to deal with that didn’t help because mine was never in that disconnected state. But I was able to use that solution for a workaround. I keep track of the last received event and if too much time has passed I call:
spark_protocol.send_subscription("eventName", SubscriptionScope::MY_DEVICES);
Before this fix my spark typically stopped receiving events within 12 hours. After the fix it’s been running properly for over 3 days.

FWIW, my other core publishes every minute, and this core calls ‘send_subscription’ if it hasn’t received an event in the last 5 minutes. To track duration I’m just using millis() and subtracting prior from current to figure out elapsed time.
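
A minimal sketch of that timeout idea, written so it can be compiled on a PC. EventWatchdog is a hypothetical helper, and the timestamps are assumed to be 32-bit millisecond counts such as those returned by millis(). Subtracting unsigned values keeps the elapsed time correct even when the counter wraps around.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks when the last event arrived and says
// when it is time to resend the subscription.
class EventWatchdog {
public:
    EventWatchdog(std::uint32_t timeout_ms, std::uint32_t now_ms)
        : timeout_ms_(timeout_ms), last_event_ms_(now_ms) {}

    // Call from the event handler with the current time.
    void on_event(std::uint32_t now_ms) { last_event_ms_ = now_ms; }

    // Call from loop(). Returns true when no event has arrived for at
    // least timeout_ms, then restarts the timer so the resend is not
    // repeated on every pass.
    bool should_resubscribe(std::uint32_t now_ms) {
        // Unsigned subtraction stays correct across counter wraparound.
        if (now_ms - last_event_ms_ < timeout_ms_) return false;
        last_event_ms_ = now_ms;
        return true;
    }

private:
    std::uint32_t timeout_ms_;
    std::uint32_t last_event_ms_;
};
```

On the device, the idea would be to call on_event(millis()) in the handler and, in loop(), resend the subscription whenever should_resubscribe(millis()) returns true, with a timeout of 300000 ms to match the 5 minutes above.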

Yes, this is one of the core features I use Spark Core modules for.

It would be nice if there were a function to pull a published value manually. If the core is in sleep mode, you cannot handle subscribed events.

Hey All,

Just wanted to pop in and say I’m following this thread! :)

When the core loses and re-establishes its cloud connection, I think it doesn’t re-send the subscription request, so the cloud doesn’t know which messages to send it. When we put the ttl to use on the events, I think it’ll be possible to pull your still-fresh events back down to your device :)

The firmware team is focusing very hard on the Photon right now, but I’ll ping them about this again when they have a chance to look at it.

Thanks!
David

Any update on this? The prior solution won’t compile on the Photon, so I removed it and the issue appeared to have been resolved, but when I use a Core without it I lose the subscription within a day.

This issue remains on the Core till a newer firmware release for the Core is official.

I just had a Photon lose its subscription. Is there an equivalent of this that will work on the Photon?
spark_protocol.send_subscription("temp", SubscriptionScope::MY_DEVICES);

Is this still on the radar to be fixed? I’m experiencing a similar issue with losing Particle.subscribe after a few days, likely after a wifi blip or something. I haven’t seen a good way to work past the issue, short of issuing a reboot nightly or something.

The issue has been fixed in 0.4.x versions of firmware.