P1 Flashing Issues

Hello,

We are running Device OS 0.6.3 and are trying to upgrade to firmware built for 1.4.4, but we have observed that multiple devices (~10% of 800) simply ignore the flash request, even though sending the command receives a positive acknowledgement.

I was wondering if Particle can guide us as to why this is happening and help us diagnose the problem.

Thanks
Amogh

Have you tried putting the device into safe mode first?

We have now built the functionality to put the device into safe mode remotely, but some devices are unable to take this update.

Flashes work without any problems (100% of the time) when the device is in safe mode, but fail (~50% of the time) when our application firmware is running.

My understanding was that a remote flash would work with the APPLICATION firmware still RUNNING, which makes me curious why the P1

  • rejects the request, and
  • when it does accept it, goes through a RED SOS #13 flash.

@amoghjain, this is as I suspected.

Your application is most likely not giving enough time to Device OS. You probably have a tight loop that is not calling any Device OS functions that yield. In these areas of the code, call Particle.process().

Refer to https://docs.particle.io/reference/device-os/firmware/photon/#particle-process-

Let us know how it goes.
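
To make the suggestion above concrete, here is a minimal sketch of one way to keep a long-running loop yielding at a bounded interval. YieldGuard is a hypothetical helper, not part of Device OS; it is plain C++ that takes the current time and a callback, so the timing logic can be checked off-device.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper (NOT a Device OS API): invokes `yield_fn` whenever at
// least `interval_ms` has elapsed since the previous invocation.
class YieldGuard {
public:
    YieldGuard(uint32_t interval_ms, std::function<void()> yield_fn)
        : interval_ms_(interval_ms), yield_fn_(std::move(yield_fn)) {}

    // Call from inside any long-running loop with the current time in ms.
    // Returns true if yield_fn was invoked on this call.
    bool poll(uint32_t now_ms) {
        // Unsigned subtraction stays correct across a millisecond counter wrap.
        if (started_ && static_cast<uint32_t>(now_ms - last_ms_) < interval_ms_) {
            return false;
        }
        started_ = true;
        last_ms_ = now_ms;
        yield_fn_();
        ++yield_count_;
        return true;
    }

    uint32_t yield_count() const { return yield_count_; }

private:
    uint32_t interval_ms_;
    std::function<void()> yield_fn_;
    uint32_t last_ms_ = 0;
    uint32_t yield_count_ = 0;
    bool started_ = false;
};
```

On a device, the assumption is that you would construct it with a callback that calls Particle.process() and then call guard.poll(millis()) inside each tight loop. The 10 ms interval used in the check below is only an illustrative value, not a documented requirement.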

Hi,

Thanks for the response!

I hadn’t thought about the network thread not being yielded to by the app thread. I will look into that right away and report back.

Meanwhile, I was wondering if you can comment on my current understanding of what is happening: the device shows as online on Particle’s dashboard and makes posts to our AWS back-end servers, yet it ignores the flash commands, because all of the functionality runs on the app thread, which does not yield to the network thread long enough for it to fulfil its task? And the posts/MQTT work because the message buffers are small and/or are able to cycle fast enough?

Hey, I just re-read the response, and I would like to clarify that we ARE calling Particle.process() every 2 seconds, with a maximum delay of 10 seconds.

That is not often enough to allow for a reliable OTA update.
Normally the cloud task would be executed approximately 100 to 1000 times per second.

You can try SYSTEM_THREAD(ENABLED) to decouple your app from the system thread.

hey @ScruffR,

thanks for the reply!

Just to confirm my understanding: we do have the threads decoupled, as we have
SYSTEM_MODE(SEMI_AUTOMATIC);
SYSTEM_THREAD(ENABLED);

and we manually call Particle.process(). So we are getting caught up somewhere in the loop and NOT getting to Particle.process() in due time (1 ms to 10 ms) to process the call?

@amoghjain, the description in your earlier comment sounds reasonable to me.

I agree with @ScruffR, calling Particle.process() every two seconds is too slow.

From your comments, it sounds like you are unsure as to what is happening timing-wise. I suggest you log each call to process() and log each time loop() is entered.

If loop() is being called too slowly, this will explain why you need to manually call process() as well.
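
As a rough sketch of the timing check suggested above, the helper below records the shortest and longest gap between successive calls. IntervalStats is a hypothetical name, not a Device OS API; it only needs a millisecond timestamp, so it is an assumption that you would feed it millis() on the device.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (NOT a Device OS API): tracks the gaps between
// successive calls to mark(), e.g. one instance for loop() entries and
// another for Particle.process() calls.
struct IntervalStats {
    uint32_t last_ms = 0;
    uint32_t min_ms = UINT32_MAX;  // shortest gap seen so far
    uint32_t max_ms = 0;           // longest gap seen so far
    uint32_t count = 0;            // number of gaps recorded
    bool has_last = false;

    void mark(uint32_t now_ms) {
        if (has_last) {
            // Unsigned subtraction stays correct across a counter wrap.
            uint32_t gap = now_ms - last_ms;
            min_ms = std::min(min_ms, gap);
            max_ms = std::max(max_ms, gap);
            ++count;
        }
        last_ms = now_ms;
        has_last = true;
    }
};
```

Calling mark(millis()) at the top of loop() and logging max_ms every so often should show quickly whether the application is stalling for seconds at a time.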

Beware! It is possible to call Particle.process() too quickly as well; I believe this can cause buffer overflows.

This is the first time I've heard about that :confused:
Also, with SYSTEM_THREAD(ENABLED) there should be no need for Particle.process() at all (unless thread switching is blocked via code).

@ScruffR, re calling Particle.process() too quickly: someone posted a comment about it long ago. I will try to find it and report back.

Re the need to call process(): if one is in a loop and not calling any system calls like delay(), won’t it fail to yield to the system thread?

FreeRTOS will perform a thread switch every 1 ms irrespective of the running code, unless it’s explicitly blocked via SINGLE_THREADED_BLOCK, ATOMIC_BLOCK or noInterrupts(), or there is some kind of deadlock situation.

BTW, you can have a while(1) Particle.process(); without any issue, even with Particle.variable() still being serviced.

Got it. Didn’t realise the task switching was timer based…

This raises the question as to what is going on with @amoghjain’s issue?

True, but without code it's impossible to just guess right.

I fear this calls for a support ticket at support.particle.io

@amoghjain, given @ScruffR’s last comment, suggest you implement the logging to get a feel for what is going on.

hi @UMD and @ScruffR

I reviewed the code and found we have a manual tick rate of 1 second, and we call Particle.process() every 1 second from our application loop.

Secondly, since we already have SYSTEM_THREAD(ENABLED); I was wondering if we even need to include Particle.process() in our application loop.

@amoghjain, I understand that you have manually checked the code to find… but did you actually log the calls to see whether what is happening is what you expect?

Am interested to see how often loop() is being called.

My understanding is that you can’t flash your code reliably, ie the problem still exists.

Sure, you could remove the calls to Particle.process() - it is easier to perform the experiment than to theorise about what might happen!

Hi @UMD,

I went through the documentation but haven’t yet confirmed this, and would really appreciate it if you could clarify what you mean by “log”. Does it mean

  • calling a Particle API function in our firmware and then providing you with the Particle ID so you can tell us what is happening,
  • checking Particle API JS calls (cloud events?), or
  • storing event info in a SQL database via a POST command?

Or can you please explain in a little more detail what you mean by logs?

Thanks
Amogh

@amoghjain,

By logging I meant using the logging functions described here:

https://docs.particle.io/reference/device-os/firmware/photon/#logging

For example:

void loop()
{
    Log.trace("loop() entered"); 
    ...
}

The above example would give you USB serial output (if you configure the logger to use USB serial, for example with a SerialLogHandler set to trace level) something like this:

00000123000 [app] trace: "loop() entered"
00000123500 [app] trace: "loop() entered"
00000124000 [app] trace: "loop() entered"
00000124500 [app] trace: "loop() entered"

The above shows that loop() is being called every 500 ms. Logging is VERY USEFUL in tracking issues in a running system.

I understand that there is some overhead in you setting this up, but in my opinion, it is well worth the effort.
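
Once a log like the one above has been captured to a file, the gaps between entries can be computed on a PC rather than read by eye. The following is a minimal sketch under the assumption that each line starts with a decimal millisecond timestamp, as in the sample output; parse_timestamp and intervals are hypothetical names, not part of any Particle tooling.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads the leading run of decimal digits from a log line.
// Returns false if the line does not start with a digit.
bool parse_timestamp(const std::string& line, uint64_t& out) {
    std::size_t i = 0;
    uint64_t value = 0;
    while (i < line.size() && line[i] >= '0' && line[i] <= '9') {
        value = value * 10 + static_cast<uint64_t>(line[i] - '0');
        ++i;
    }
    if (i == 0) {
        return false;
    }
    out = value;
    return true;
}

// Returns the gap between each pair of consecutive timestamped lines.
// Lines without a leading timestamp are skipped. Assumes timestamps
// never decrease within the capture.
std::vector<uint64_t> intervals(const std::vector<std::string>& lines) {
    std::vector<uint64_t> gaps;
    bool have_prev = false;
    uint64_t prev = 0;
    for (const auto& line : lines) {
        uint64_t t = 0;
        if (!parse_timestamp(line, t)) {
            continue;
        }
        if (have_prev) {
            gaps.push_back(t - prev);
        }
        prev = t;
        have_prev = true;
    }
    return gaps;
}
```

Fed the four sample lines above, this yields three gaps of 500 each, matching the 500 ms reading. Any gap far larger than the rest would point at the part of the application that is holding things up.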