Webhook response subscription lost after reset on 2.0.1

This example code and webhook will particle.publish every 20s, and on receiving the subscribed webhook response, it will particle.publish “received” to be viewed in the Console.

But after activating the particle.function RST from the Console resulting in System.reset(), when the device gets online again it no longer publish “received” to the console and seem to have lost it’s webhook response subscription.

When uploading slightly changed code, (ex. add/remove a delay(1)), webhook reception will start again.

This is on B523 with OS2.0.1, but likely the same with Boron.

Am I missing something?

Code:

SYSTEM_MODE(SEMI_AUTOMATIC)
SYSTEM_THREAD(ENABLED);

SerialLogHandler logHandler;

static bool SystemResetNow = false;
static unsigned long lastUploadTime = 0-5000;
const pin_t WATCHDOG_PIN = D22; 


void setup() {

    pinMode(WATCHDOG_PIN, OUTPUT);
    
    SystemResetNow = false;
    lastUploadTime = millis();

	Particle.function("RST", startReset);
    Particle.subscribe(System.deviceID() + "/hook-response/upload/", uploadResponseHandler);

	Particle.process();
    Particle.connect();
}

 
void loop() {
    
    if (Particle.connected() && Time.isValid()){
    
        if (((millis() - lastUploadTime) >= 20*1000)){
            lastUploadTime = millis();
            delay(1); // to re-enable webhook response reception after RST, include/exclude this line and upload
            Particle.publish("upload", "{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}", 60, PRIVATE, WITH_ACK);
        }
    }

    if(SystemResetNow == true){
        SystemResetNow = false; 
        delay(3000); 
        System.reset();
    }
    
    digitalWrite(WATCHDOG_PIN, !digitalRead(WATCHDOG_PIN));
}

int startReset(String input) {
    Log.info("startReset()");
    SystemResetNow = true;
    return 1;
}

void uploadResponseHandler(const char *event, const char *data) {
    Log.info("Response: OK");
    if(Particle.connected()) Particle.publish("log", "Received", PRIVATE, NO_ACK);
}

(The webhook is meaningless - just gets the demo going)

{
    "event": "upload",
    "responseTopic": "{{PARTICLE_DEVICE_ID}}/hook-response/{{PARTICLE_EVENT_NAME}}",
    "url": "https://www.google.com",
    "requestType": "GET",
    "noDefaults": false,
    "rejectUnauthorized": true,
    "responseTemplate": "V"
}

@thrmttnw - hm, would you mind opening a support ticket for this (support.particle.io)? It would be great to have your DeviceID and Integration name for more detailed troubleshooting. I’d like to reproduce and also watch traffic in your account.

Done … oh, should have mentioned that the webhook and device is set up in a product in console.

I believe the reason the delay is required is that the subscribe does not complete until some time after connecting to the cloud.

With the way that it’s written, without the delay, the webhook is triggered and the response comes back before the subscription is complete. It’s not that it was not subscribed, it just didn’t happen yet if the delay is left out.

Unfortunately there is no call to find out if your subscription has been activated yet, but it’s a non-zero but small number of milliseconds.

It fails in the same way with or without the delay. The delay is only (un)commented to make a code change that provokes restarting to receive the web hooks.

With or without the delay it looks like this:

Just a note on your subscribe handler.
Since the a subscription will only ever fire when you have an active connection there is little point in checking if(Particle.connected()) (unless your code would somehow directly call that function :wink: ).

Also your screenshot does not show any sign of an event that would fit your filter.

How do you trigger the webhook?
What does the integration log tell you about the triggering event?

Thank you for taking a look.

Yes, just to make sure the response subscription should be in place before triggering it with the publish, to rule that out.

The integration log tells me

  1. when the publish of "upload" triggering the web hook has fired
  2. if/when the device received a web hook response that triggered a publish of "received"
  3. before the reset, the device receive a response triggering the publish of "received"
  4. after the reset, the device no longer receive a web hook response multiple times in a row

It unfortunately is quite a bit less than what is needed for debugging.

I have some SSE logging below parallel to the integration log above.

The SSE log tells me:

  1. To me the subscribe to the response looks correct
  2. To me the web hooks contain/acts the same before and after the reset
  3. The only difference is that after the reset, the device no longer publish “received”, most likely because it no longer receive the web hook response to trigger that
2021-02-12T14:49:22.199 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:49:22.493 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:49:22.493 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:49:22.689 +00:00	e00fce68ba235059f4a1588f log Received
2021-02-12T14:49:42.196 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:49:42.307 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:49:42.359 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:49:42.602 +00:00	e00fce68ba235059f4a1588f log Received
2021-02-12T14:50:02.274 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:50:02.363 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:50:02.421 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:50:02.661 +00:00	e00fce68ba235059f4a1588f log Received
2021-02-12T14:50:22.289 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:50:22.312 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:50:22.393 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:50:22.676 +00:00	e00fce68ba235059f4a1588f log Received
2021-02-12T14:50:31.799 +00:00	e00fce68ba235059f4a1588f spark/status offline
2021-02-12T14:50:31.834 +00:00	e00fce68ba235059f4a1588f spark/status online
2021-02-12T14:50:31.840 +00:00	e00fce68ba235059f4a1588f particle/device/updates/enabled true
2021-02-12T14:50:31.840 +00:00	e00fce68ba235059f4a1588f particle/device/updates/forced false
2021-02-12T14:50:38.693 +00:00	e00fce68ba235059f4a1588f particle/device/updates/pending false
2021-02-12T14:50:48.222 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:50:48.344 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:50:48.395 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:51:08.228 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:51:08.290 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:51:08.338 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V
2021-02-12T14:51:28.222 +00:00	e00fce68ba235059f4a1588f upload "data":"{\"T\":\"2021-02-02T12:51:03Z\",\"R\":-72,\"I\":60,\"D\":\"{}\"}
2021-02-12T14:51:28.296 +00:00	particle-internal        hook-sent/upload 
2021-02-12T14:51:28.364 +00:00	particle-internal        e00fce68ba235059f4a1588f/hook-response/upload/0 V

Is it normal for the integration log to not show web hook responses? If so, how can you debug web hooks without an SSE log?

@marekparticle

Same behaviour on a B-SOM dev board (powered with 12V 2.1A and 1200mAH LiPo to be sure).

OTA the above firmware and let it start up, remove power for a minute, after it is online let it send and receive 3 times, provoke RST, and afterwards it no longer receives. A serial trace confirms this.

Doing it a number of times, it does occasionally keep working, but too rare.

The web hook response from our server confirms uploads, before deleting data on the device, to mitigate historic Particle cloud outages being a showstopper for the target customer base.

I have not found a work-around, so the deployed field test devices have been downgraded to OS1.5.2.

For support:

Expected behaviour: The device both before and after a particle.function “RST” induced reset, is expected to publish log "receive " as proof of receiving a webhook response after publising “upload” every 20s.

Actual behaviour: after a particle.function “RST” induced reset, the device no longer publish log "receive " as proof of receiving a webhook response after publising “upload” every 20s.

Delay(1) significance: the expected and actual behaviour happens with or without delay(1). delay(1) is used as a code change so a SW download will “hard reset” the device and restore expected behaviour. Power cycling the device for a minute will also restore expected behavior.

I just set up an Argon, with nothing connected but power, in a Console product with the above webhook and code on OS2.0.1, and it fails in the same way. [edit: changed D22 to D6 in the code]

Loosing web hook reception was originally discovered happening in field test units without a reset. And remotely inducing a system.reset() did not restore it, but uploading slightly modified code did restore web hook reception.

Thanks for this - I’ll make sure that our team sees it!

After 10 days on 1.5.2, I have not yet seen any devices loosing webhook response reception.

I finally managed to replicate this behaviour on 2.0.1.
It does NOT happen on 1.5.2.

Taking this to the DeviceOS team.

1 Like

So far 1.5.2 is not a fix for us.

Our implementation of the excellent PublishQueueAsyncRK lib with i2c connected FRAM (MB85RC256V-FRAM-RK lib) is unreliable on 1.5.2 on our field test units.

Together with web hook response confirms from our server on uploads, we use this to mitigate historic Particle cloud outages being a showstopper for the target customer base.

Are there known errors on 1.5.2 related to i2c or known system memory leaks on 1.5.2?

As any code change restores the webhook reponse reception on OS 2.0.1, does anyone know if there a way to induce a reset from user code that has the same effect as a code change (OTA SW download)?

Not sure if this will help in the given case, but we had similar experiences in the past and a temporary workaround was to considerably change the “signature” of exposed functions/variables and subscriptions by adding a dummy item with an ever mutating name.

This way any previously stored (and possibly invalid) connection context would be ignored and a new context would be created.
This would come at the expense of extra “negotiation data” needing to be exchanged between device and cloud - which may be one of the reasons for introducing some of these changes (with potential for being overly protective :wink: ).

1 Like

Thanks, a good idea - that should fix it.

But the example below had no positive effect.

Added this before setup() :

static bool dummyer = true;

And before particle.function … I added this:

Particle.variable(String(random(10, 1000)), dummyer); 

Still learning. In console, the log under device view I used above is limited, whereas the top level event log includes webhook responses.