Detecting if Spark.publish() fails to communicate with the cloud

I have found that if my Internet connection goes down just before a Spark.publish(), the latter will execute and return OK but the event will not get to the cloud. There is no documented return value for Spark.publish(), thus no way for the firmware to know if the event ever actually got to the cloud. It takes some time (perhaps 20 seconds or so) before a Core recognizes that the Internet connection was lost and tries to recover. In this period of time, the firmware will continue to operate normally but the events that get published do not get to the cloud.

Is there a return value from Spark.publish() that addresses this? Is there some other way for the firmware to know that a published event did not actually make it to the cloud?

The cloud does send an acknowledgement message back down to the device after a successful publish, and I believe publish returns a boolean indicating whether the blocking_send completed. So it’s not quite the full cloud acknowledgement, but it is definitely something you can check in your code. Look for more hooks for positive confirmation in the code and protocol in early fall this year; I’m planning on adding a lot of this.

Thanks,
David

Thanks, Dave. I tried this but I get a compiler error saying that the return from Spark.publish() is a void; specifically: “no return statement in function returning non-void [-Wreturn-type]”. It would be very helpful to have the boolean return indicating success. Please put this high up on the todo list.

BobG is correct. Although the documentation indicates that “false” will be returned if it fails, the type appears to be void and the compiler chokes on any attempt to treat it otherwise. My program uses two versions of Spark.publish - one that directly communicates with the cloud, and one that calls a handler. I have noticed that the handler is sometimes not called. It sometimes just misses one cycle, sometimes a burst, and sometimes never calls it again until I reset. Very frustrating to not be able to detect it - and even more frustrating that it happens to begin with.

@Dave
I’m curious if there’s been any activity on this recently.

We’ve been tracking an issue similar to what has been described here where in some random cases–even though Particle.connected() is true and Particle.publish() returns true–a particular published event never actually makes it to the Particle event stream.

I strongly suspect that a specific published event is being dropped or lost (presumably by the ISP or router), yet because the events before and after it went through, the Photon assumes the errant event was successfully published. I’m able to reproduce this by simply publishing an incrementing counter once every second: at random, numbers will be missing from the event stream, even though every publish reports “successful”.
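
One way to quantify the loss in a counter test like this is to log the counter values that arrive on the event stream and compute the gaps. Here is a minimal sketch in plain C++ (no Particle API involved). The helper name `missingCounters` is hypothetical, and the sketch assumes the values were observed in strictly increasing order:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the counter values in the order they were
// observed on the event stream, return the values that never showed up.
// Assumption: the observed values are strictly increasing.
std::vector<uint32_t> missingCounters(const std::vector<uint32_t>& seen) {
    std::vector<uint32_t> missing;
    for (std::size_t i = 1; i < seen.size(); ++i) {
        assert(seen[i] > seen[i - 1]);  // enforce the ordering assumption
        // Every value strictly between two neighbours was lost.
        for (uint32_t v = seen[i - 1] + 1; v < seen[i]; ++v) {
            missing.push_back(v);
        }
    }
    return missing;
}
```

Feeding it the logged values gives a concrete list of lost events to compare against the publish return values recorded on the device.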

Ultimately it boils down to this: Is there a way to verify 100% that a particular published event successfully makes it to the Particle cloud? Perhaps some way to call Particle.publish in a blocking mode that waits for a positive response before continuing? As noted above, Particle.connected() and evaluating the return value from Particle.publish() do not appear to accomplish this at present.

NOTE: We are using threading in our software (SYSTEM_THREAD(ENABLED)). I’m assuming this is probably part of the problem, though we definitely need to use threads, so hopefully there’s a solution that doesn’t involve turning them off.

For 90% of the data we transmit it’s perfectly acceptable for a random errant publish to be lost, but for the other 10% we have to have a guarantee that the data reaches the Particle cloud.

Any thoughts?

AFAIK there is no such synchronous version of Particle.publish(), but the currently available workaround is to self-subscribe to the events that you need sent reliably and, if you don’t receive the event in due time, resend it.
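
The bookkeeping for such a resend scheme could look roughly like the sketch below. It is plain C++ and deliberately leaves out the actual publish and subscribe calls. The class `PendingEvents` and its method names are hypothetical, not part of the firmware API, and it assumes each critical event carries a unique id that comes back in the echoed event:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: remembers events that were published but not yet
// echoed back through the device's own subscription.
class PendingEvents {
public:
    explicit PendingEvents(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {
        assert(timeoutMs > 0);
    }

    // Call right after publishing; nowMs would come from millis() on a device.
    void sent(const std::string& id, uint32_t nowMs) { pending_[id] = nowMs; }

    // Call from the subscription handler when the echo arrives.
    // Returns true if this id was still waiting for confirmation.
    bool confirmed(const std::string& id) { return pending_.erase(id) > 0; }

    // Ids whose echo has not arrived within the timeout; resend these.
    // Unsigned subtraction keeps this correct across a millis() rollover.
    std::vector<std::string> overdue(uint32_t nowMs) const {
        std::vector<std::string> out;
        for (const auto& entry : pending_) {
            if (nowMs - entry.second >= timeoutMs_) out.push_back(entry.first);
        }
        return out;
    }

    std::size_t size() const { return pending_.size(); }

private:
    uint32_t timeoutMs_;
    std::map<std::string, uint32_t> pending_;
};
```

On a device, the idea would be to call `sent()` after each critical publish, `confirmed()` from the subscribe handler, and to poll `overdue()` from `loop()` to decide what to publish again (calling `sent()` again with the new timestamp). The timeout value is a judgment call that depends on your connection.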

On the other hand, I guess you have already thought of this, but how does your test code behave if you stretch the publishing rate from 1/sec to e.g. only every 1100 ms?
Just to see if the rate limiting is getting too strict. Especially if you had a connection dropout earlier you’ll get some system events generated after reconnect that also count for the rate limit.
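
If you want to rule rate limiting out on the device side, one option is to keep a local publish budget and skip or defer a publish when the budget is used up. The following is a minimal token-bucket sketch in plain C++. `PublishBudget` is a hypothetical name, and the capacity and refill interval are parameters because the actual cloud limits are not stated in this thread; check the current documentation for the real numbers:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a token bucket that allows short bursts but caps the
// long-run publish rate at one event per refillMs.
class PublishBudget {
public:
    PublishBudget(uint32_t capacity, uint32_t refillMs)
        : capacity_(capacity), refillMs_(refillMs),
          tokens_(capacity), lastRefillMs_(0) {
        assert(capacity > 0 && refillMs > 0);
    }

    // Returns true (and uses one token) if a publish is allowed at nowMs.
    bool tryAcquire(uint32_t nowMs) {
        // Whole tokens earned since the last refill point.
        uint32_t earned = (nowMs - lastRefillMs_) / refillMs_;
        if (earned > 0) {
            // Add the earned tokens without exceeding the capacity.
            tokens_ = (earned >= capacity_ - tokens_) ? capacity_
                                                      : tokens_ + earned;
            lastRefillMs_ += earned * refillMs_;
        }
        if (tokens_ == 0) return false;
        --tokens_;
        return true;
    }

private:
    uint32_t capacity_;
    uint32_t refillMs_;
    uint32_t tokens_;
    uint32_t lastRefillMs_;
};
```

With, say, `PublishBudget budget(4, 1000)` (illustrative values only), a publish would be attempted only when `budget.tryAcquire(millis())` returns true. Events generated by the system after a reconnect are not visible to this counter, so it can only narrow the problem down, not exclude it.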

@ScruffR

Thanks for the quick reply. Self-subscribing to critical events is actually a very interesting idea. I’ll need to explore this more for viability with our application, but it might work. Obviously this doubles our data usage for our Electron-based product variants, but that’s probably still more efficient than having to do a full-on HTTP POST over TCP for delivery verification.

Regarding your question about timing, unfortunately I don’t believe it’s related to rate limiting (at least not directly), as we’ve previously observed this packet-loss behavior with far less frequent publishing schedules. I will try to confirm this though.

Ultimately though, it’s clear that the existing Particle.publish() function call can’t possibly know whether the event is successfully published until long after the function has been called, because it doesn’t block at all and immediately returns true (with threads enabled at least). If I had to guess, I’d say the Particle.publish() function simply returns whether Particle.connected() is true or not. I would think a blocking version of the Particle.publish() function would be extremely beneficial for applications where it’s important to confirm that a specific event was successfully published to the cloud and where data usage is restricted (e.g. the Electron).

Nope, “unfortunately” Particle.publish() currently doesn’t even check Particle.connected() and can even “crash” the code if the device is not connected. But AFAIK that’s being dealt with for 0.5.0.
The return value of Particle.publish() is more like an indication whether the publishing “order” has been successfully handed over to the WiFi task.

Without back-and-forth communication, I’d say it would still not be certain that a publish succeeded, even if the packet was sent and acknowledged, because other parties are involved in the communication. So some extra data “cost” would need to be introduced.

But this might again be a possible proposal that @mdma and @jvanier might have an opinion about.
The Particle.publish() delivery issue was already part of some other threads where System.sleep() needed to be postponed in order to allow for delivery. So this might go hand in hand.

Thanks for the response. I’m working on testing the idea of subscribing to the device’s own published events to confirm receipt, and I think I’ve actually discovered a bug.

Supposedly there should be a way to do something like this:

Particle.subscribe("MyEvent", myEventHandler, myDevId);

The expected behavior is that this subscribes to the “MyEvent” event prefix from the local device only. Instead, it ends up listening for any “MyEvent” event on the public firehose event stream (it seems to be disregarding the myDevId string entirely).

Is there an acknowledged issue with subscribing to events from a specified device ID only, or perhaps there is some other shorthand (besides the device coreid) I need to pass to the function instead to indicate I only want to receive events from the local device?

That is not a bug, it’s just not implemented yet.
You’d need to publish as PRIVATE and send an indicator (e.g. the device ID) as payload (or even better, as a suffix of the event name).
Subscribe will still filter for the prefix, but in code you can check for desired suffixes.
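
The suffix check inside the handler could be sketched like this in plain C++. The name `isOwnEvent` and the `/` separator between prefix and device ID are assumptions for illustration; any naming convention works as long as publisher and handler agree on it:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true only if eventName is exactly
// prefix + "/" + deviceId, e.g. "MyEvent/abc123".
bool isOwnEvent(const char* eventName, const char* prefix,
                const char* deviceId) {
    assert(eventName && prefix && deviceId);
    std::size_t prefixLen = std::strlen(prefix);
    // The subscription already filters by prefix, but check it anyway.
    if (std::strncmp(eventName, prefix, prefixLen) != 0) return false;
    // Require the separator so "MyEventX/..." does not match "MyEvent".
    if (eventName[prefixLen] != '/') return false;
    // Everything after the separator must equal the device ID.
    return std::strcmp(eventName + prefixLen + 1, deviceId) == 0;
}
```

The subscribe handler would call this with the received event name and the device’s own ID, and ignore any event for which it returns false.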

But it’s still planned for implementation, maybe something @Dave could comment on.

Hi All,

Good questions! Better event support, including persistent / stored events, and event queries, and event acknowledgement are features that we get lots of requests for. I don’t know exactly when they’ll be on the roadmap, but I’m hoping we can get them in sometime this year. I’m sorry I don’t have a more concrete timeline yet. It’ll require both cloud and firmware changes, so we still have some work before this will be ready.

If you need acknowledgements that your events were published, you can set up a webhook to listen for your events. It’ll send a “hook-sent” event back to your device every time the event is received, so you know which ones were sent out.

I hope that helps!

Thanks,
David

Thanks @Dave for chiming in.

And any ETA for Particle.subscribe() for Device ID?


@solarplug, where have you found the docs about this?

I know it was there, but I can’t find it anymore (maybe exactly for the above reason).
If it’s still buried somewhere in the docs, we’d need to hide it for the time being.

I found the Particle.subscribe syntax with the myDevId portion by looking through the firmware on GitHub (I guess I always assumed it was possible, so I didn’t check the documentation first). I had found some references to it in other forum posts, but I’m not sure offhand which ones specifically (and it’s entirely possible they were referring to the JavaScript API and not the Wiring calls).

It would be great to have this functionality down the road, but for now we will proceed with embedding the device ID in the event name to allow for targeted subscribing.

Thanks!