SYSTEM_THREAD 0.4.7 Photon

There was a post saying that this message is due to a cloud update. It happens because the Photon resets right after a flash instead of replying "flash OK"; it's not a real flash error.


Actually - I believe that is not the case.

My photon(s) accept a download and then just continue what they were doing - I now have to press Reset after EVERY download.

NB This is probably because I am no longer able to use automatic mode and am running Manual - with a Particle.process() every 10 seconds.

It's now actually preventing me from deploying products, as I can no longer rely on remote updates to work properly.

Regards

Graham

I’ve successfully reproduced and fixed the issue. This is caused by the application thread being blocked. The system is attempting to notify the application that a reset is about to happen, but the application isn’t processing this.

The purpose behind the notification is to give the application a chance to put the device in a safe state prior to reset.

I added a 30s timeout on the reset event so that an unresponsive application doesn’t prevent the system from resetting. Please note that an unresponsive application will cause problems in other areas, such as not handling cloud functions, and button events, since these are all handled on the application thread.

One way to keep the application thread responsive is to move blocking code to a separate thread, which can block freely, and keep the application loop() free for processing application events.
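The thread-offloading idea above can be sketched off-device with standard C++. This is only a desktop analogy: `std::thread` stands in for whatever thread facility the firmware provides, and `blockingWork` and `runResponsiveLoop` are hypothetical names, not Particle APIs.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for slow, blocking application code
// (for example a long sensor read). It may block freely because
// it runs on its own thread.
void blockingWork(std::atomic<bool>& done) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    done.store(true);
}

// Simulates the application loop: each pass is short and returns
// promptly, so pending events could be serviced between passes.
// Returns how many passes ran before the worker finished.
int runResponsiveLoop() {
    std::atomic<bool> done{false};
    std::thread worker(blockingWork, std::ref(done));
    int passes = 0;
    do {
        ++passes;  // one short, non-blocking pass of "loop()"
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    } while (!done.load());
    worker.join();
    return passes;
}
```

The point of the sketch is only that the loop keeps cycling while the slow work is in progress; the flag is atomic because two threads touch it.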


Hi,

Thanks again for the responses…

I am now even more confused though. My application is not normally blocking - until something goes wrong, then we remain in Setup until we have a connection. That I am about to change - so that we enter Loop() without waiting for a valid connection.

BUT - you also say that the system is trying to tell the application about pending events ???. As far as I saw posts - this is a FUTURE feature - and not in 0.4.7 ???.

I am using 0.4.7 as it's the ONLY release which is full - 0.4.8 is still RC1 and 0.4.9 is still a while away ??? - Q1 ???.

Having done more testing, my fixes for re-initialising the UDP seem to work (testing still ongoing!!). Now I have the issue of not exiting Setup after a reboot. That's my problem and will be fixed very soon now ;-)).

Look forward to your reply re notifications though - as I am not handling ANY, since I don’t know how to - or if it's even supported in 0.4.7 ???.

Should I really be using 0.4.8-rc1 in a 'live' (albeit user trial) scenario ???.

Thanks again

BR
Graham

One way to know that your application isn’t blocked is to toggle the D7 LED periodically. Please try that when performing an OTA - my guess is that the LED will not be toggling.
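A minimal sketch of such a heartbeat, written as plain C++ so it can be tried off-device. `Heartbeat` is a hypothetical helper, not part of the firmware; on a device one would presumably pass it `millis()` each time through `loop()` and write the result to the LED pin with `digitalWrite()`.

```cpp
#include <cstdint>

// Hypothetical helper: flips an LED state every intervalMs milliseconds.
struct Heartbeat {
    uint32_t intervalMs;
    uint32_t lastToggleMs = 0;
    bool ledOn = false;

    explicit Heartbeat(uint32_t interval) : intervalMs(interval) {}

    // Call once per loop() pass with the current millisecond tick.
    // Unsigned subtraction keeps the comparison valid when a 32-bit
    // tick counter wraps around.
    bool update(uint32_t nowMs) {
        if (nowMs - lastToggleMs >= intervalMs) {
            ledOn = !ledOn;
            lastToggleMs = nowMs;
        }
        return ledOn;
    }
};
```

If the LED stops changing during an OTA, `loop()` is not being re-entered. Where D7 is already in use, any spare output would serve the same purpose.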

0.4.9 will be released next week, so it’s not far off.

Hi,

Thanks for the update - will wait for 0.4.9 then ;-)).

As for toggling D7 - great for a test app but my D7 is a digital output controlling an external relay :open_mouth:

I know that my app is still running after update, as I have a 10 second loop with 10 1sec ‘states’ and each state sends debug via serial1, so I can see it still running all my code…as if nothing changed.

My app is not blocked at all - apart from a few ‘stutters’ while the OTA (app only) comes down. It then just carries on as if nothing has happened - until some time later (if I don’t press Reset!!) - usually a minute or so, it suddenly decides to Reset. I can hear it do so as all Relay outputs suddenly revert to OFF, so I can hear the relays drop out :wink:

NB I can’t really ‘simulate’ this on a cut-down app, as the whole app is now pretty comprehensive…30k binary.

Once this ‘instability’ is fixed the Photon really will be a truly excellent product, we already have a couple on live trial, and are planning a LOT more ;-)). I am about to leave one in Florida controlling my pool heatpump - when I return to the UK :-(( - hence why stability is paramount ;-). The trial units in the UK are already doing a comprehensive job running a whole-house heatpump system, controlling the HP flow temps as well as hot water and house heating…

An excellent product ;-)).

BR

Graham

How often does loop() exit? That’s the key here. Your app may be running a local loop, but as far as the system is concerned, if loop() doesn’t return then the app is blocked. If each loop iteration takes 10 seconds, there are a number of events, so it will take several iterations to work down to the reset event.
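To put rough numbers on this, here is a back-of-envelope sketch. It assumes, purely for illustration, that the application works off one queued event each time `loop()` returns; the real dispatch policy is not stated in this thread, and `secondsUntilResetHandled` is a hypothetical name.

```cpp
// Hypothetical model: one queued event is handled each time loop()
// returns (an assumption made only for this illustration).
// eventsAhead:   events queued in front of the reset notification.
// loopPeriodSec: how long one loop() pass takes, in seconds.
// Returns the seconds that pass before the reset notification is handled.
int secondsUntilResetHandled(int eventsAhead, int loopPeriodSec) {
    int passesNeeded = eventsAhead + 1;  // the reset event is handled last
    return passesNeeded * loopPeriodSec;
}
```

Under that assumption, five events ahead of the reset notification and a 10 second loop would mean about a minute before the device resets, while the same queue with a loop that returns every second would clear in about six seconds.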

Nope,

The loop exits every second, and runs delay(xxx) for much of the rest of the time. I just have 4 second timeslots where we run a bitbanged IO to read some 1wire temp sensors (takes most of the 1sec timeslot - with an 800mSec delay anyway :-O), and the rest delay when their job is done (MINIMAL time used). I can calculate it if it will help, but I can’t see that it makes much odds.

I turned OFF RTOS timers when I was having issues, so now I just count elapsed time for each 1 second run through the loop, and delay for (1000-elapsed), then exit the loop.

Top of loop code

starttime = micros();

End of loop code

endtime = micros();
  Serial.print("\r\n");
  //Serial.print(pubdata);
  i = 1000 - (endtime-starttime)/1000;
  if (i > 0)
    delay(i);

So - assuming you actually DO some processing during delay, I am giving a LOT back.
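The pacing arithmetic in that snippet can be isolated into a small function, sketched here in plain C++. The types of `i`, `starttime` and `endtime` are not shown in the post, so this version assumes unsigned 32-bit timestamps and clamps explicitly; `remainingDelayMs` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: given microsecond timestamps taken at the top
// and bottom of a loop pass, returns how many milliseconds to wait so
// that the whole pass lasts periodMs. Returns 0 if the pass overran.
uint32_t remainingDelayMs(uint32_t startUs, uint32_t endUs, uint32_t periodMs) {
    uint32_t elapsedMs = (endUs - startUs) / 1000;  // integer division, as in the post
    return elapsedMs < periodMs ? periodMs - elapsedMs : 0;
}
```

The explicit clamp matters if the result is stored in an unsigned variable: without it, a pass that overruns the period would produce a very large delay instead of none.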

Also I am running :
SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);

anyway, so (according to the docs and posts) I don’t even need to exit the loop at all, do I ???.

So you see I am actually giving most of the processing time back, and my code (although complex) should be actually quite light on performance :-O.

Hope this helps…

BR
Graham

For 0.4.7, delay() in application code doesn’t pump application events; it is simply a delay. 0.4.9 will address this, so that loop() doesn’t need to exit so long as delay() or Particle.process() is called.

The consequence of this is that the application guidelines for single and multithreaded apps will be the same regarding blocking - call delay or Particle.process() from time to time.

If I've not completely missed the train there, I might add this is true when SYSTEM_THREAD(ENABLED) is used (as Graham indicated), but is not entirely true for single threaded application code.
Or am I wrong there @mdma?

If it's correct, this comment should just complete the picture - otherwise I'll delete this post when clarified.

That’s right - the discussion is about threaded behavior, so threading was implicit, but thanks for calling this and making it clear. :+1:

For single threaded, delay()/Particle.process() runs the system tasks and application callbacks. What I’m proposing for 0.4.9 is that even for threaded operation, delay() processes application events (cloud function calls, event notifications, etc.). This ensures that application events get processed in a timely manner and keeps the model familiar, yet we still have the system tasks running as a separate thread.
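The idea of a delay that pumps application events can be illustrated with a small desktop simulation. This is not the firmware's implementation: `AppEvent` and `pumpingDelay` are hypothetical names, and the queue is filled up front rather than by a concurrently running system thread, so no locking is shown.

```cpp
#include <chrono>
#include <functional>
#include <queue>
#include <thread>

// Hypothetical pending application event (a cloud function call,
// an event notification, ...). Not the firmware's real data structure.
using AppEvent = std::function<void()>;

// Waits roughly `ms` milliseconds, draining queued events in 1 ms
// steps instead of idling. Returns the number of events handled.
int pumpingDelay(std::queue<AppEvent>& pending, int ms) {
    int handled = 0;
    for (int elapsed = 0; elapsed < ms; ++elapsed) {
        while (!pending.empty()) {
            pending.front()();  // run the handler on the caller's thread
            pending.pop();
            ++handled;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return handled;
}
```

Because the handlers run on the same thread that called the delay, they never run concurrently with the rest of the application code, which is what lets ordinary single-threaded code stay unchanged.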

To help understand why I chose this approach, the alternative would be to post all application events on yet another thread separate from loop(), but then everyone’s application code must then be coded as multithreaded (using volatile for all data shared between the loop, cloud functions and event handlers). I’d prefer to make it opt-in. If adding delays in the application doesn’t work for a particular case, application developers can choose to create a separate thread themselves for their blocking application code that then leaves the loop() (which could be empty) to pump application events.


Guys,

Many thanks for this clarification - I must admit that previous responses to a similar posting stated that delay(xxx) did indeed do some system processing - and nothing told me that this was different under various threading modes :-O.

So - moving forward…

In 0.4.9 will you be using the FreeRTOS vTaskDelay for delay ???. In this case ANY other tasks would get the unrequired processing time. This is how I have always used FreeRTOS in my PIC projects for many years now ;-), and it works VERY well.

There was also conflicting advice regarding Particle.process() - stating that it is simply a stub function, and as the system thread is running separately, that Particle.process() is NOT required :-O.

Might I ask if this could be clarified/explained in the docs as it is clearly a little understood (or even mis-understood) operational detail !!.

I suspect that we will go with what we have and await 0.4.9 before making many more changes.

NB we WILL be able to reliably upgrade OTA to 0.4.9 won’t we ??? - ie without a physical presence to reset or re-set-up wifi etc. The reason I ask is that it might just arrive AFTER I leave my module 5,000 miles behind me - at the end of next week ;-O.

Finally @mdma, can I thank you for joining this discussion - it is GREAT to have a Particle developer's responses - your forum colleagues are EXCELLENT and have given me LOTS of extremely valuable support - but are not actually Particle developers (NO criticism intended AT ALL !!!) ;-))…

Best Regards

Graham

@GrahamS, could it be that you’re referring to the info from this post of mine a while back, when you say you got other information?
Lost memory after publish?

If so, please re-read!

If your application doesn't block the OTA process then it will work fine, but I can't make any promises given that we've already seen an issue because of application delays, and it's these issues we are looking to overcome by adding behavior changes in 0.4.9.

If you wanted to be totally certain the update is successful, you could implement your own "standby" cloud function. When called, this would put the device in a standby state (e.g. shutting off any critical machinery), and short-circuit loop() to be a no-op so that no delays are executed.

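// Flag set by the cloud function below. It needs a different name
// from the function, otherwise the two declarations clash.
bool standbyMode = false;

int standby(String arg)
{
    // shut down critical things, put the device in a safe state
    digitalWrite(D7, LOW);
    standbyMode = true;
    return 1;
}

void setup()
{
    Particle.function("standby", standby);
    // rest of setup()...
}

void loop()
{
    // while in standby, skip all work so that loop() returns immediately
    if (standbyMode)
        return;

    // rest of normal loop
}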

In future, this functionality will be part of the core firmware, allowing you to remotely put devices in standby mode, reset them, etc. from the Dashboard.


Word. They are excellent, and deeply appreciated.

Sure :smile: The docs will be updated as we refine our system threading model. System threading is Beta so that we can find out the best balance of features and behaviors before setting the API in stone. I feel we are close, so the API will be finalized soon.
