[Fixed] Webhooks callbacks are not reliable!

Dave · June 8, 2015, 2:43pm

Hey all!

So, there is an old bug in the Core firmware where the subscribe request isn’t sent after the session is dropped and reconnected. If your connection is dropping frequently, this can cause your subscriptions to stop working. I believe this is fixed in the newer Photon firmware, which will be available for Cores soon as well. I suspect that is what’s causing the frequent subscribe failures you’re seeing above.

Thanks,
David

peekay123 · June 8, 2015, 2:46pm

@Dave, ok but this does not address the issue of webhooks not firing when they are called from CLI and using the dashboard to monitor the response. You did mention above that you may know what’s up.

Dave · June 8, 2015, 2:50pm

Heya!

Totally, I suspect that’s a separate issue. My guess is that if the very first event / response doesn’t come through, but subsequent requests / responses do come through normally, that could be a side effect of the internal messaging that hooks use. The workaround would be to do a test run first, or to run it again if no response is heard within some timeout. I’m going to test that more fully and build a fix for it if that’s an issue, but it might take me a sprint or two to fix.

Thanks!
David

peekay123 · June 8, 2015, 2:53pm

@Dave, I can’t vouch for it being only the first event. The first event often doesn’t fire, but my problem is mostly with subsequent events, which often fail to fire for several publishes in a row. Mind you, I have not tested this on the latest firmware on my Core-based RGBPongClock.

kennethlimcp · June 10, 2015, 3:52pm

@BulldogLowell,

any code for me to test? Webhooks is now my new toy

I have some magic that allows me to see what might be the issue

ruben · June 15, 2015, 12:54pm

My problem is that every external API request counts as a hit, so I’d rather not have to try multiple times to get a parsed response. Alas, I do not have magic glasses. Let me know if you need any other info from my setup.

BulldogLowell · July 12, 2015, 8:26pm

Would anyone know if it is possible that I am being rate limited (e.g. Weather Underground) because of Particle’s servers making calls to an API?

In other words, is it possible that the Weather Underground API is seeing the aggregated volume of all of our webhooks as if they are from a single IP address?

I’m not getting consistent returns and it is troublesome.

When I make the calls from my Chrome browser… they are returned consistently. Using particle publish, not so much.

Moors7 · July 12, 2015, 8:32pm

Following the docs, that might indeed be possible as that API seems to be fairly popular. Perhaps @dave can elaborate?

BulldogLowell · July 12, 2015, 8:36pm

Well, that may explain my problem, then.

Thanks Jordy, I missed that one…

Dave · July 13, 2015, 8:48pm

Hi @BulldogLowell,

I checked the logs and I’m not seeing host-specific rate limiting for Weather Underground. We do have a default upper bound for any host, to make sure we’re not being too aggressive. I wonder if you’re hitting your maximum number of webhooks per minute (6 per minute per device), or maybe just publishing too fast (try to average no more than 1 per second)? Please feel free to PM me with any details and I’m happy to look into it more deeply for you.
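
For anyone who wants to stay under the two limits quoted here (6 webhooks per minute per device, and publishes averaging no more than 1 per second), below is a minimal sketch of a client-side throttle. It is only an illustration and not how the cloud enforces its limits: the class name `PublishThrottle` and its parameters are made up for this example, and timestamps are passed in as arguments so the logic can be checked off-device (on a Core or Photon you would pass `millis()`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: allows at most MaxEvents publishes per sliding
// window, and enforces a minimum gap between consecutive publishes.
// Unsigned subtraction keeps the comparisons valid across millis() rollover.
template <std::size_t MaxEvents>
class PublishThrottle {
public:
    PublishThrottle(uint32_t windowMs, uint32_t minGapMs)
        : windowMs_(windowMs), minGapMs_(minGapMs) {
        static_assert(MaxEvents > 0, "MaxEvents must be at least 1");
        assert(windowMs > 0);
    }

    // Returns true (and records the event) if a publish is allowed at `now`.
    // A rejected attempt is not recorded and does not count against the limit.
    bool tryAcquire(uint32_t now) {
        // Enforce the minimum gap since the last accepted publish.
        if (hasLast_ && (now - lastMs_) < minGapMs_) {
            return false;
        }
        // Forget accepted publishes that have aged out of the window.
        while (count_ > 0 && (now - stamps_[head_]) >= windowMs_) {
            head_ = (head_ + 1) % MaxEvents;
            --count_;
        }
        // Window is full: too many publishes in the last windowMs_.
        if (count_ == MaxEvents) {
            return false;
        }
        stamps_[(head_ + count_) % MaxEvents] = now;
        ++count_;
        lastMs_ = now;
        hasLast_ = true;
        return true;
    }

private:
    uint32_t windowMs_;
    uint32_t minGapMs_;
    uint32_t stamps_[MaxEvents] = {};  // ring buffer of accepted timestamps
    std::size_t head_ = 0;             // index of the oldest timestamp
    std::size_t count_ = 0;            // number of timestamps in the window
    uint32_t lastMs_ = 0;
    bool hasLast_ = false;
};
```

On the device, a global `PublishThrottle<6> throttle(60000, 1000);` combined with `if (throttle.tryAcquire(millis())) Spark.publish("io_temp_int");` in `loop()` would skip any publish that exceeds either limit. The numbers 6, 60000 and 1000 simply restate the limits from this post.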

Thanks,
David

Dave · July 16, 2015, 10:46pm

Hi Everybody!

This has been driving me mad, so I’m very happy to report I just fixed an issue we discovered that might have been making the webhook responses less reliable. Essentially there was a configuration issue on one of the connections in the flow, and it was causing problems that were hard to detect. Can you give it a try and let me know if it’s better / worse / etc for you?

Thanks again for everyone’s help troubleshooting this, and for your patience, I’m really excited that we might be putting this particular issue to rest.

Thanks!
David

mayhew1955 · July 21, 2015, 8:11am

Hi David,

Thank you for your perseverance. I started testing again last night and will continue tonight to check that we have the reliability sorted out.

Fingers crossed,

mayhew1955 · July 21, 2015, 9:22pm

Hi David,

I have persevered with my tests and have started seeing dropped responses. It does seem to be slightly better but it is far from reliable.

My setup is as follows:

void setup() {

    //  particle serial monitor
    Serial.begin(115200);

    //  subscribe to webhooks
    bool subscribed = Spark.subscribe("hook-response/io_", ioBridgeCommand, MY_DEVICES);
    if (!subscribed)
         Serial.println("subscription failed for iobridge");

    // and wait at least 10 seconds to allow time to connect
    delay(10000);

}

void loop() {

    Serial.println("Requesting salon temp");

    // publish the event that will trigger our first webhook
    Spark.publish("io_temp_int");

    // and wait at least 60 seconds before continuing
    delay(60000);

    // publish the following 4 events in the same manner
    Spark.publish("io_heat_on");
    delay(60000);
    Spark.publish("io_pump_on");
    delay(60000);
    Spark.publish("io_heat_off");
    delay(60000);
    Spark.publish("io_pump_off");
    delay(60000);


}

// simple response test
void ioBridgeCommand(const char *name, const char *data) {

    Serial.println("ioBridge command response");

    String strName = String(name);
    String strData = String(data);
    Serial.println(strName);
    Serial.println(strData);

}

And the responses are:
5 replies
5 replies
3 replies
4 replies
3 replies
1 reply
3 replies
4 replies

I believe that the servers being called are reliable; I never have a problem with curl. I can say that there is some improvement though, because the fact that we can receive only 1 reply and then 3 or 4 replies is better than before. My observations before were that once a webhook died, it stayed dead!

I’m sorry it’s not 100% reliable but I must underline that I have found a satisfactory workaround for my setup so am no longer relying on this fix.

Dave · July 22, 2015, 4:07pm

Hi @mayhew1955,

Thanks for posting your test results! This is very helpful, I’ll continue to work on this. As a general note though, checking for a response from your server in the subscription should allow your code / device to be confident the server was reached. I think ultimately in any reliable system there needs to be positive checks / confirmation all the way through the stack.

Thanks!
David

ruben · July 24, 2015, 6:25pm

I am still not getting responses. Did we have any movement on that?
What would you recommend? Waiting a few seconds for a response and if not, then re-issuing a call?
Is there a simple example for how you'd do this?

trackdork · July 26, 2015, 9:00pm

I've been struggling with this for some time too. I have a system that shuts down my Photon after processing a hook response, then wakes up 10 minutes later and attempts to get another hook response. I have had to resort to re-sending events every 15 seconds or so until I get a response. I don't have a clean dashboard screenshot of this, but I don't see responses there either most of the time. I'd say I get 1 in 3.

EDIT: After some more thought, I guess the real problem with this is that, with my Electron arriving in a few months, sending an extra hook request or two, or three or FIVE is a near-showstopper for a system that is on a limited data budget. My entire architecture relies on this mechanism. I have no doubt that the team can find a better solution than we have now, I just hope that the issue gets the attention it needs.

Dave · July 27, 2015, 3:59pm

Hi @ruben and @trackdork,

Definitely, especially in a bandwidth constrained environment, messages must be as reliable as possible. I’ll continue to dig into this until it’s resolved.

Thanks!
David

trackdork · July 28, 2015, 3:00am

Thanks @Dave, I appreciate the hard work. It’s a challenge no doubt when significant parts of the overall system are out of your control. Let the community know if there’s anything we can do to help!

ruben · August 18, 2015, 3:26am

Just a quick report that I finally put in a loop to keep calling the webhook unless there is a response. Bounces between 2 and 3 calls to get a response. I am fairly sure that wunderground is getting hit with requests even when I am not getting responses in reasonable time (10-25 sec) but I’ll double check tomorrow.

sazp96 · August 18, 2015, 1:43pm

Hello folks,

I’m having the same issue. My workaround that has worked well so far, is to call the webhook every 15 seconds until I get a response.

It takes anywhere from 1 to 5 calls to the webhook before I get a response. Having said that, I always get a response in a 3 minutes window.

@Dave, I have detail logs with time stamps. I will PM them to you. Hope this helps in your investigation.
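
The retry workaround that several posters describe above (re-publish every 15 seconds or so until a hook response arrives) could be sketched roughly as follows. This is a non-blocking illustration under stated assumptions, not anyone's actual code from this thread: the class name `HookRetry` is hypothetical, and the attempt cap is an added assumption, motivated by the concerns about API hit counts and limited data budgets. Timestamps are passed in so the logic can be checked off-device; on a Core or Photon you would pass `millis()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks one outstanding webhook request and tells
// the caller when it is time to publish (or re-publish) the trigger event.
class HookRetry {
public:
    HookRetry(uint32_t retryIntervalMs, uint8_t maxAttempts)
        : intervalMs_(retryIntervalMs), maxAttempts_(maxAttempts) {
        assert(maxAttempts > 0);
    }

    // Begin a new request cycle; the first publish is due immediately.
    void start() {
        waiting_ = true;
        gaveUp_ = false;
        attempts_ = 0;
    }

    // Call this from the subscribe handler when a hook-response arrives.
    void onResponse() { waiting_ = false; }

    // Call this from loop(). Returns true when the caller should publish now.
    bool shouldPublish(uint32_t now) {
        if (!waiting_) {
            return false;
        }
        // Still inside the wait period after the previous attempt.
        if (attempts_ > 0 && (now - lastSendMs_) < intervalMs_) {
            return false;
        }
        // The last allowed attempt also timed out: stop retrying.
        if (attempts_ >= maxAttempts_) {
            waiting_ = false;
            gaveUp_ = true;
            return false;
        }
        lastSendMs_ = now;
        ++attempts_;
        return true;
    }

    bool waiting() const { return waiting_; }
    bool gaveUp() const { return gaveUp_; }
    uint8_t attempts() const { return attempts_; }

private:
    uint32_t intervalMs_;
    uint8_t maxAttempts_;
    uint32_t lastSendMs_ = 0;
    uint8_t attempts_ = 0;
    bool waiting_ = false;
    bool gaveUp_ = false;
};
```

On the device, this might be used with a global `HookRetry retry(15000, 5);`, a call to `retry.start()` whenever fresh data is wanted, `if (retry.shouldPublish(millis())) Spark.publish("io_temp_int");` in `loop()`, and `retry.onResponse();` inside the subscribe handler. The 15-second interval and the cap of 5 attempts mirror the figures reported in this thread; both are tunable. Since each retry may still reach the external API even when no response comes back, a cap like this limits how many hits a single request can cost.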
