Hard fault with Soft AP calls

I’ve been using JS code to poll a P1 in Soft AP mode at http://192.168.0.1/device-id, at a rate of one request every 4 seconds. Once the JS code gets back the correct information, indicating that the device is connected to the P1, it walks the user through setting up credentials (i.e. getting the public key, sending creds, etc.). I’m running into some difficulties with this approach.

  1. The P1 will occasionally hard fault when connected via Soft-AP. This was relatively frequent but hasn’t been a problem since upgrading the firmware to the 0.5.1rc.

  2. Despite being connected, the JS code will occasionally not receive the device id. Once the P1 is restarted, the problem is usually fixed.

  3. When credentials with a bad password are sent, the device often hits a hard fault. We are trying to keep this process as simple and bug-free as possible, so we removed the ability to store more than one set of credentials. In the loop we try to account for bad credentials being sent with the following code:

     if (WiFi.connecting()) {
       delay(1000);
       Particle.connect();
       delay(1000);
       if (waitFor(Particle.connected, 5000)) {
         // Successful connection
         menuID = 76;
       } else {
         // Bad credentials
         menuID = 78;
       }
     }

I would say about 50 percent of the time the device freezes and never reaches the "// Bad credentials" branch. Otherwise it works well.

Thanks for all your help!

Just to note:
Failing to Particle.connect() within 5 seconds is not a reliable indication of wrong credentials.
You might want to call WiFi.connect() and waitFor(WiFi.ready, TIMEOUT) instead, with a slightly longer timeout.

The presence of wrong creds should also not cause a hard fault, so I’d rather think that some other part of your code is causing it, but that is hard to tell without seeing more of it.

Thanks Scruff,

I did try the following code last night, and it was less reliable.

if (WiFi.connecting()) {
  delay(1000);
  WiFi.connect();
  delay(1000);
  if (waitFor(WiFi.ready, 5000)) {
    delay(1000);
    Particle.connect();
    delay(1000);
    if (waitFor(Particle.connected, 5000)) {
      // Successful connection
      menuID = 76;
    } else {
      // Bad credentials
      menuID = 78;
    }
  } else {
    menuID = 78;
  }
}

I’ll strip down the code to something as simple as possible and get back to you shortly. Every Particle call we make is preceded by if (WiFi.ready() && Particle.connected()) {

I stopped most of the rest of our code from running in the loop during this process, and it improved reliability quite a bit. No more freezes. I am still getting one weird situation that only occurs after several attempts (5ish) at adding incorrect credentials.

  if (WiFi.connecting()) {
    subMenuForWifi(8);
    delay(1000);
    Particle.connect();
    delay(1000);
    if (waitFor(Particle.connected, 5000)) {
      menuID = 76;
    } else {
      menuID = 78;
      WiFi.clearCredentials();
      delay(500);
      WiFi.off();
      delay(500);
    }
  }

I’m trying to clear the credentials as soon as a unit is unable to connect. This works nearly every time, but occasionally the chip continually broadcasts the Soft AP despite the WiFi.off(); call above. Then when I try to reconnect to it, it is unresponsive. When I restart a device, it’s good to go again.

I think there is an open issue about this on GitHub, along with others (see the pull requests for particle-iot/device-os on GitHub).

On the other hand, it would be interesting to know why the solution with waitFor(WiFi.ready, 5000) would be less reliable, since the opposite would be expected.
If you used WiFi.connect() you could also supply the flag WIFI_CONNECT_SKIP_LISTEN in case you don't want to enter listening mode on fail.

When you are entering the else branch of your if (waitFor(Particle.connected, 5000)) you might actually be connected to WiFi, so you might want to add a WiFi.disconnect() before clearing the credentials.
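
To make that ordering concrete, here is a minimal host-side sketch, not something tested on a P1. `FakeWiFi` and `resetAfterFailedSetup` are names made up for illustration; only the method names (`ready`, `disconnect`, `clearCredentials`, `off`) mirror the WiFi calls already used in this thread. The fake just records the call order so the sequence can be checked off-device.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the on-device WiFi object (an assumption for
// illustration only). It records which calls were made and in what order.
struct FakeWiFi {
    bool joined = false;             // pretend we are associated with an AP
    std::vector<std::string> calls;  // log of calls, in order

    bool ready() { return joined; }
    void disconnect() { calls.push_back("disconnect"); joined = false; }
    void clearCredentials() { calls.push_back("clearCredentials"); }
    void off() { calls.push_back("off"); }
};

// Tear down after a failed setup attempt: leave the network first (if we are
// actually joined), then clear the stored credentials, then power the radio down.
template <typename WiFiLike>
void resetAfterFailedSetup(WiFiLike& wifi) {
    if (wifi.ready()) {
        wifi.disconnect();
    }
    wifi.clearCredentials();
    wifi.off();
}
```

Whether this ordering actually avoids the lingering Soft AP broadcast is an open question; it only rules out clearing credentials while still associated.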


Looking at that code (I removed the superfluous delay() statements):

if (WiFi.connecting()) {
  WiFi.connect();
  if (waitFor(WiFi.ready, 5000)) {
    Particle.connect();
    if (waitFor(Particle.connected, 5000)) {
      // Successful connection
      menuID = 76;
    } else {
      // Bad credentials /// <--- this is a wrong conclusion
      menuID = 78;       ///      only the cloud is inaccessible but WiFi is already ready
    }
  } else {
    menuID = 78;         /// <--- and here 5sec might just not be long enough
  }
}

I guess the inner else is wrong, since the credentials cannot be wrong there; otherwise you would never have got into the true branch.
For checking the WiFi credentials, I'd completely dump the Particle.connect() attempt.
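
If the cloud check is still wanted, one option is to report three outcomes instead of two, so that "WiFi joined but cloud unreachable" is never shown as "bad credentials". This is a rough host-side sketch under assumptions: `SetupResult` and `classifySetup` are hypothetical names, and the two callables stand in for something like waitFor(WiFi.ready, timeout) and waitFor(Particle.connected, timeout) on the device.

```cpp
#include <functional>

// Hypothetical result type: keeps "WiFi never came up" apart from
// "WiFi is fine, but the cloud could not be reached in time".
enum class SetupResult {
    CloudConnected,
    WifiOkCloudUnreachable,
    WifiNotReady
};

// Both arguments are assumed to block up to their own timeout and return
// true on success. The cloud wait only runs once WiFi is known to be ready.
SetupResult classifySetup(const std::function<bool()>& waitWifiReady,
                          const std::function<bool()>& waitCloudConnected) {
    if (!waitWifiReady()) {
        return SetupResult::WifiNotReady;  // possibly bad creds, or timeout too short
    }
    if (!waitCloudConnected()) {
        return SetupResult::WifiOkCloudUnreachable;  // creds accepted by the AP
    }
    return SetupResult::CloudConnected;
}
```

Only `WifiNotReady` would then map to the existing menuID = 78 screen, and even that case may just mean the timeout was too short; the middle case would need its own message.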

I don’t know if it’s related to your issue, but the 0.5.1-rc.1 firmware includes a fix for “a timing-critical bug in WICED that causes system freeze”

Thanks for the tip on the issue in GitHub! I’m guessing it’s all related. We want to make sure they get access to the Particle cloud, so it seems to be working fine as is. Although I have to admit, due to the intermittent nature of the bug, I could be completely wrong about waitFor(WiFi.ready, 5000) being less reliable. I’ll quantify a little better next time.

That new firmware did wonders for stabilizing some devices with spotty network connections, but we had it installed already for the current issue. Hopefully 0.6.0 takes care of these last smaller issues!

We noticed reproducible hard faults when we have two Photons in listening mode at the same time. The presence of the second causes the first one to hard fault during pairing. This may or may not be your problem, but it might help your tests. At first it looked like this happened “occasionally and sporadically” until we realized it was virtually 100% of the time if another developer was doing a pairing test at the same time.

Ps. We used firmware version 4.9.

That fits our pattern pretty well, thanks for the tip. It’s very likely we had another unit in setup mode while we were running tests. Since that’s unlikely to happen with customers, we will definitely test with only one unit at a time now.

@hine If you can confirm this is your problem, let us know! I’d be curious to know whether this is an ongoing, unfixed bug still present in 0.5.1.

Also curious whether it is a known problem that Particle has reproduced, and whether there is a plan in place for a fix. Cheers!

@philip, have you filed a GitHub issue about this already?
That’s the best way to get the devs to have a look at something like this.

It’s also a good place to find out whether anybody else has reported this before and what other issues are known and which fixes are pending.

Was there any progress on this? We noticed issues in noisy environments, such as busy office buildings with many APs.