All normal-pri user threads suspended for two seconds shortly after WiFi connect starts?

I’m running with SYSTEM_THREAD(ENABLED) and SYSTEM_MODE(MANUAL).

My loop() is just

    delay(1000);
    
    if (Particle.connected()) {
        Particle.process();
    }

At the end of setup() I call Particle.connect() and then create a persistent worker thread that spins in a 100 ms loop. At the top of each iteration in that thread, I reset a one-second watchdog timer in an I2C-connected device (a Pololu Tic500 stepper controller board).

I’ve noticed that within a few seconds of starting up, I miss one of the watchdog resets. I’ve inserted timing calls based on millis(), and I consistently see hangs of almost exactly two seconds (roughly 2000 to 2010 ms, which is oddly precise). This also happens if I just use loop() for this instead of a separate thread. Further debugging suggests that my thread (or loop()) gets switched out at arbitrary points (like in the middle of an I2C call), not just while yielding.
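For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of a gap tracker. It is not the code from my project: StallDetector, sample() and the 500 ms threshold are hypothetical names and values chosen for illustration. The tick count is passed in so the class compiles as plain C++; on the device you would feed it millis() at the top of each iteration.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Device OS): tracks the gap between
// successive iterations of a loop that is expected to run every ~100 ms.
class StallDetector {
public:
    explicit StallDetector(uint32_t thresholdMs) : thresholdMs_(thresholdMs) {}

    // Call once per iteration with the current tick count (millis() on-device).
    // Returns true if the gap since the previous call exceeded the threshold.
    bool sample(uint32_t nowMs) {
        if (!started_) {
            // First call only records a baseline; there is no gap to measure yet.
            started_ = true;
            lastMs_ = nowMs;
            return false;
        }
        // Unsigned subtraction stays correct across the 32-bit rollover.
        const uint32_t gap = nowMs - lastMs_;
        lastMs_ = nowMs;
        if (gap > worstGapMs_) {
            worstGapMs_ = gap;
        }
        if (gap > thresholdMs_) {
            ++stallCount_;
            return true;
        }
        return false;
    }

    uint32_t worstGapMs() const { return worstGapMs_; }
    uint32_t stallCount() const { return stallCount_; }

private:
    uint32_t thresholdMs_;
    uint32_t lastMs_ = 0;
    uint32_t worstGapMs_ = 0;
    uint32_t stallCount_ = 0;
    bool started_ = false;
};
```

Calling sample() both at the top of the loop and around individual I2C calls is one way to narrow down where the thread was switched out.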

I can solve my problem by creating the thread with a priority of 7 instead of OS_THREAD_PRIORITY_DEFAULT (which is 0). With any lower priority value I still get preempted.

I feel like Particle should manage its internal thread priorities better: this took many hours to figure out, and it forces me to use a dedicated thread when the regular loop thread would probably have been just fine. It seems like something in the Particle WiFi stack runs at priority 6 and spin-waits for exactly two seconds. That thread should either run at a much lower priority or yield during the wait; alternatively, OS_THREAD_PRIORITY_DEFAULT and the priority of the loop thread should be raised.

Hey, welcome to the Particle Forum!

Instead of delay(1000), try

    for(uint32_t _ms = millis(); millis() - _ms < 1000; Particle.process());

to see if that helps. Also, if you could post your code - that would help us to better assist you. Thanks!
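Spelled out, the one-liner above keeps calling Particle.process() until 1000 ms have elapsed, rather than sleeping in delay(). A rough sketch of the same pattern follows, written as portable C++ with the clock and the work passed in as callables so it can be tried off-device. The name processFor and its parameters are hypothetical, not a Device OS API.

```cpp
#include <cstdint>

// Hypothetical helper: repeatedly invokes process() until durationMs have
// elapsed according to now(). Returns how many times process() was called.
template <typename NowFn, typename ProcessFn>
uint32_t processFor(uint32_t durationMs, NowFn now, ProcessFn process) {
    uint32_t calls = 0;
    const uint32_t start = now();
    // Unsigned subtraction keeps the elapsed time correct across rollover.
    while (static_cast<uint32_t>(now() - start) < durationMs) {
        process();
        ++calls;
    }
    return calls;
}
```

On the device this would presumably be invoked with millis as the clock and a small lambda wrapping Particle.process() as the work, which is what the for loop above does inline.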

I tried that and (as I expected, unfortunately) it didn’t change anything.

See https://github.com/rgiese/camera-slider/blob/973c9cb46241e83af25445739e31e7815d8e31f4/packages/firmware/slider/Main.cpp for my current code. If you change line 72 from a 7 to a 6, I get preempted and my external watchdog falls over.

Is there a way for me to enumerate other running threads without doing a full build including the DeviceOS? uxTaskGetSystemState doesn’t seem to be available to me using the regular cloud compiler.

I do not have experience using threads. However, I have looked through some posts on this forum that you might find useful for solving the issues you are encountering. I will list them below.

System resources are not thread-safe. The USB serial debug port (Serial) can only be called safely from multiple threads if you surround all accesses with WITH_LOCK(). An example would be changing the following line

    Serial.printlnf("-- Current state: %s", g_State->getName());

to

    WITH_LOCK(Serial) {
        Serial.printlnf("-- Current state: %s", g_State->getName());
    }

You might find these helpful:
https://community.particle.io/t/particle-threads-tutorial/41362

https://community.particle.io/t/delays-in-multithreading/50338

They recommend the following code for the first connection within setup():

    Particle.connect();
    waitFor(Particle.connected, 30000); 

and then the following within loop() to maintain the connection:

    if (!Particle.connected()) {  // NOT connected to Particle cloud
        Particle.connect();
        waitFor(Particle.connected, 30000);
    }

So, your loop() might look like this:

    void loop() {
        delay(100);
        if (!Particle.connected()) {  // NOT connected to Particle cloud
            Particle.connect();
            waitFor(Particle.connected, 30000);
        }
        Particle.process();
    }

@rgiese

I think you are making your solution too complicated and ‘fighting’ the way Device OS works on the system thread. By the way, which Device OS version are you using? Particle has changed the post-connection behaviour to ensure reliable registration of variables and functions. Have you declared any of these or made any Particle.subscribe() calls?

With SYSTEM_THREAD(ENABLED) there is no need to call Particle.process() in loop().

You should call Particle.connect() in setup() and wait with a timeout until it has connected, as described above.

Have you tried with SYSTEM_MODE(SEMI_AUTOMATIC) or not cloud connecting?

I’m using version 1.5.2 of Device OS which is what the cloud-based compiler is currently building against, so I’m assuming it’s the latest and greatest.

I’m not doing anything with the Particle cloud in my example (no subscribe or publish).

I don’t want to have to wait for the connect to finish, because that’s the point of manual mode. There clearly is something doing a busy wait at pri 6 for exactly two seconds and there shouldn’t be.

You can instruct the cloud build system to build against other versions too.
There are more recent pre-release versions out which you could also try, e.g. 1.5.4-rc.1 and 2.0.0-rc.1. The latter will be an LTS version, so it is feature frozen and will focus on the stability of those features, while the former may sport some newer features not present in 1.5.2.