OTA code update slow on P1

About 50% of my OTA code updates on the P1 are slow, i.e., it can take anywhere from minutes to hours for the P1 to stop blinking and restart. Often I see a slow purple blink, sometimes a fast green blink. Often the P1 eventually restarts successfully with the updated code; in other cases I have to reset the device manually. In all cases it has the new code and runs fine afterwards.

The P1 is mounted on my own back board, which supplies the 3.3 VDC and connections to peripherals. The same code runs on a Photon (mounted on a similar board) where I do not see this behavior at all.

Any idea where I should start troubleshooting?

Let me ping someone who might be able to help. @rickkas7, are you able to assist?

Kyle

ping…

I’d install the P1 version of the cloud debug firmware on the device and start capturing a log file. It may or may not show something useful:

If you don’t have a USB connector, there’s also a version that outputs the same data to the TX pin if you can get access to that.


Awright, after much digging, kicking and screaming, here is what I found:

My initial assessment that 50% of P1 firmware updates are slow was incorrect: Photon updates were equally slow. So my focus shifted from what was different between the P1 and the Photon in my implementation to why I see these slow firmware upgrades at all. I noticed they seemed to come and go, i.e., I'd have 10 to 20 slow ones in a row followed by a few fast ones, and vice versa.

Long story short: a lot of my code runs on timers or other timed events. The code in each timer ISR is short, but I do run several at microsecond intervals. I also noticed that these timers keep running while an upgrade is in progress. I had always assumed that all user code would just stop during an upgrade; this is true for loop() but not for interrupts. So now when I get a firmware_update_begin system event, I kill all the timers and other interrupts I have going. Since then, all the upgrades have been smooth and quick.
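For anyone who wants to try the same thing, here is a rough sketch of the pattern, not my actual code. The `IsrSource` table, `registerSource()`, and `otaInProgress` are hypothetical names made up for illustration; the stop/start wrappers would call whatever your timer library or `detachInterrupt()` needs. I am assuming the `PARTICLE` macro is defined in Device OS builds; the `#else` branch only provides placeholder stand-ins so the logic can be compiled off-device. Restarting the sources on `firmware_update_failed` assumes the old firmware keeps running after a failed update.

```cpp
#include <cstddef>

#if defined(PARTICLE)
#include "Particle.h"
#else
// Host-side stand-ins so the logic compiles off-device. On a device these
// names come from Device OS; the values here are placeholders only.
typedef unsigned long long system_event_t;
enum { firmware_update_failed = -1, firmware_update_begin = 0, firmware_update_complete = 1 };
#endif

// One entry per interrupt source the application owns (hypothetical type).
struct IsrSource {
    void (*stop)();   // e.g. wraps a hardware timer's end() or detachInterrupt(pin)
    void (*start)();  // re-arms the same source
};

const size_t MAX_SOURCES = 8;
IsrSource sources[MAX_SOURCES];
size_t sourceCount = 0;
volatile bool otaInProgress = false;  // loop() code can check this flag too

// Returns false if the table is full.
bool registerSource(void (*stop)(), void (*start)()) {
    if (sourceCount >= MAX_SOURCES) return false;
    sources[sourceCount++] = {stop, start};
    return true;
}

// System event handler: quiet every registered source when an OTA begins.
void onFirmwareUpdate(system_event_t /*event*/, int param) {
    if (param == firmware_update_begin) {
        otaInProgress = true;
        for (size_t i = 0; i < sourceCount; i++) sources[i].stop();
    } else if (param == firmware_update_failed) {
        // Assumption: the old firmware keeps running, so re-arm everything.
        otaInProgress = false;
        for (size_t i = 0; i < sourceCount; i++) sources[i].start();
    }
    // firmware_update_complete: the device is about to reset; nothing to do.
}

#if defined(PARTICLE)
void setup() {
    // registerSource(stopSampleTimer, startSampleTimer);  // hypothetical wrappers
    System.on(firmware_update, onFirmwareUpdate);
}

void loop() {}
#endif
```

Keeping the sources in one table means the handler does not need to know about each timer individually; anything that fires at a high rate just registers a stop/start pair in setup().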

To all you Particle geniuses: does this make sense? Was this already known and I just reinvented the wheel? If not, this might just be another little bit of info for your knowledge db.

@joost, what you experienced is essentially processor starvation, a common issue with high-speed, interrupt-driven systems, especially small ones. The Particle firmware doesn't know how you use hardware and interrupt resources, so it can't predictably be proactive when an OTA occurs. Hence the system events. Your approach is the correct one.

BTW, I experienced a similar OTA “sensitivity” with my RGBMatrixPanel library. :slight_smile:


@peekay123: Thanks for confirming my results, I'll sleep easier now :wink: Though I would have thought that the OS firmware would recognize an OTA and kill interrupts during that time. Then again, now that I know, it is easy to resolve in the system handler.

Troubleshooting was only positive for Starbucks, now I am going back to my normal coffee routine…
