Particle subscribe callback takes 30 seconds to a minute to fire

Hello,
I am using a P2-based system with Device OS 5.4.0.
We are using the following API to register the callback:
Particle.subscribe(DevId, Callback);

Our integration uses the webhook interface to publish and subscribe.

We are observing that the subscribe callback is called with a delay ranging from 500 ms to more than 30 seconds.

The questions are:
1- Is there a guaranteed time within which a callback will be invoked? Is it possible to get the callback to fire within a few seconds 100% of the time?
2- Is there any configuration we can add to ensure that the callbacks are called within a certain duration (maybe 1-2 seconds)?

Regards,
Anil

If a subscribe on a P2, Photon 2, or Argon is taking more than a few seconds, it's usually related to this problem:

When these devices communicate with the Particle cloud, they do so over UDP, specifically using DTLS (datagram TLS) over UDP. The reason you don't need to set up port forwarding in your router is that when a device on the LAN makes an outgoing UDP request, the router sets up a temporary port mapping to return the response packet to the Particle device.

Since port mappings are a finite resource, the router will eventually remove these temporary mappings, typically after they have been unused for a few minutes.

The Particle device periodically sends a packet ("keep-alive") so the router does not delete this port forwarding. The default keep-alive for Wi-Fi devices is 25 seconds. (For cellular devices, it's 23 minutes for the Particle SIM.)

If your site's router deletes the port mapping, there is no notification to either side, so the cloud will still believe the device is online, and the device will still be breathing cyan. However, any packet sent from the cloud to the device will fail, because the mapping no longer exists and the packet is discarded. This affects function calls, variable requests, subscribed events, and OTA updates.

However, when the device sends its next keep-alive, the router will typically recreate the port mapping. If this happens within the retry interval for the subscribe, the event will go through on a retry, which is why it arrives much later than expected, but still arrives.

The first thing to try is to reduce the Particle.keepAlive() interval. Try 15 seconds and see if that makes a difference.
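As a minimal sketch of that change, the keep-alive can be set from firmware like this. The 15-second value is only the trial value mentioned above, and placing the call in setup() is an assumption about your firmware layout, so adjust both to fit your application.

```cpp
#include "Particle.h"

SYSTEM_THREAD(ENABLED);

void setup() {
    // Trial value: 15 seconds instead of the 25-second Wi-Fi default.
    Particle.keepAlive(15);
}

void loop() {
}
```

If the delays disappear at 15 seconds, the router's mapping timeout is probably shorter than the default keep-alive.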

Monitoring the logs at the trace logging level may provide additional insight as well:

SerialLogHandler logHandler(LOG_LEVEL_TRACE);

Hi Rick,

I am talking about using the webhook interface to publish messages and get the callbacks back on my P2 device.

I am calling the following API to register a callback:

Particle.subscribe(System.deviceID(), sys_id_subscriber_cb);

To publish a message to Particle, the following API is called:

Particle.publish(_publish_event_type, quoted_message, TTL, PRIVATE | _ack_nack);

This publish() call triggers the callback that was registered through the subscribe() API.

The problem is that this callback occasionally fires only after 20 or 30 seconds. Is there any configuration or API that can help reduce this callback time and keep it consistently below 5 seconds or so?

It would be a great help if you could connect me to Particle’s cloud team, as this might block our device development.

Appreciate all the help.

Thank you,

Anil

That is exactly what I was referring to, and is one cause of that behavior.

The other cause of that behavior is blocking loop() instead of returning from it promptly. Because subscription handlers are called between calls to loop(), a blocking loop() will prevent your subscription handler from being called.

I don't see where the webhook is involved, unless you are subscribing to the hook-response or hook-error on-device. In that case, also check the integration log timestamps to make sure that the external server you are calling isn't causing a delay.
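One common way to keep loop() from blocking is to replace delay() calls with an elapsed-time check. The sketch below is plain C++ so the timing logic can be tried off-device; intervalElapsed, PeriodicTask, and pollTask are hypothetical names made up for illustration, not part of Device OS. On the device you would call pollTask(task, millis()) from loop().

```cpp
#include <cstdint>

// Hypothetical helper: true once at least intervalMs has passed since lastMs.
// Unsigned subtraction keeps the result correct when the millisecond
// counter rolls over from 0xFFFFFFFF back to 0.
bool intervalElapsed(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastMs) >= intervalMs;
}

// Hypothetical state for one periodic job.
struct PeriodicTask {
    uint32_t intervalMs;  // how often the job should run
    uint32_t lastRunMs;   // timestamp of the most recent run
    uint32_t runCount;    // stands in for the real work in this sketch
};

// Call on every pass through loop(). It returns immediately when the
// interval has not elapsed, so loop() returns quickly and subscription
// handlers get a chance to run.
bool pollTask(PeriodicTask &task, uint32_t nowMs) {
    if (!intervalElapsed(nowMs, task.lastRunMs, task.intervalMs)) {
        return false;
    }
    task.lastRunMs = nowMs;
    task.runCount++;  // placeholder for the periodic work
    return true;
}
```

The point of the design is that no single pass through loop() waits on the clock. The periodic work itself still needs to be short.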

The cloud won't delay sending the event to the subscription handler, so there's no setting to make it faster. It will only be stopped by network issues (keep-alive or packet loss requiring retry) or an on-device issue (blocking loop, or a blocking operation on the system thread).

Make sure you are using SYSTEM_THREAD(ENABLED) so that cloud messages can be processed independently of your code. The exception is that function and subscription handlers are still only called between calls to loop(), even in threaded mode.

Make sure your code is not using noInterrupts() or SINGLE_THREADED_BLOCK for extended periods of time, as this will block the system thread. That can cause the event destined for the subscription handler to be lost, and it will then only arrive after a retry, which adds delay.
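A rough sketch of keeping a SINGLE_THREADED_BLOCK short is to copy the shared data inside the block and do the slow work outside it. The names sharedReading and processReading are hypothetical placeholders, and this assumes another thread in your firmware updates the shared value.

```cpp
#include "Particle.h"

SYSTEM_THREAD(ENABLED);

// Hypothetical value that another thread also updates.
int sharedReading = 0;

// Hypothetical slow operation (placeholder body).
void processReading(int reading) {
    Log.info("reading=%d", reading);
}

void setup() {
}

void loop() {
    int snapshot;

    // Hold the block only long enough to copy the shared value.
    SINGLE_THREADED_BLOCK() {
        snapshot = sharedReading;
    }

    // The slow work runs outside the block, so the system thread
    // is not held up while it executes.
    processReading(snapshot);
}
```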

Hi Rick,

Any update on this issue? Can we try reaching the cloud team? Please share the contact and I can discuss it further.

Regards,

Anil