Bug Bounty: Electron not booting after battery discharges completely

Louder for the people in the back!

From the pull request:

The idea being that a power glitch could result in the read of the sector 0 write protection bits being misinterpreted as unprotected, which would unlock the Option Bytes register to change these bits to be protected. While writing to the Option Bytes register, if power is lost or the MCU is reset, this can result in the Read Protection (RDP) level being set to 1. This is fairly well understood and easy to reproduce, so not attempting to write protect the bootloader on every boot is a great mitigation technique for avoiding RDP level 1.

Always be very wary of 'un-journaled' (irreversible) writes to flash, especially on boot. Something like this ended up with me spending a week on a project rescue mission in Phoenix, which I do not recommend.
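
To make the "journaled" idea concrete, here is a minimal Python sketch. It is a simulation of the general technique, not Device OS code, and every name in it (`SLOT_SIZE`, `encode`, `decode`, `read_latest`, `write_record`) is made up for illustration. Two slots hold CRC-protected records with a sequence number; a new record always goes into the slot that does not hold the latest good one, so a power cut mid-write can only spoil the copy you were not depending on.

```python
import struct
import zlib

SLOT_SIZE = 64                      # hypothetical slot size in bytes
ERASED = b"\xff" * SLOT_SIZE        # erased flash reads back as 0xFF


def encode(seq: int, payload: bytes) -> bytes:
    """Pack sequence number, payload length and payload, then append a CRC32."""
    body = struct.pack("<IH", seq, len(payload)) + payload
    record = body + struct.pack("<I", zlib.crc32(body))
    if len(record) > SLOT_SIZE:
        raise ValueError("payload too large for slot")
    return record.ljust(SLOT_SIZE, b"\xff")


def decode(slot: bytes):
    """Return (seq, payload) for a complete record, or None if the slot is bad."""
    seq, length = struct.unpack_from("<IH", slot, 0)
    end = 6 + length                # header is 4 + 2 bytes
    if end + 4 > SLOT_SIZE:
        return None                 # erased or truncated header
    (crc,) = struct.unpack_from("<I", slot, end)
    if crc != zlib.crc32(slot[:end]):
        return None                 # torn write
    return seq, slot[6:end]


def read_latest(slots):
    """Pick the valid record with the highest sequence number."""
    valid = [r for r in map(decode, slots) if r is not None]
    return max(valid, key=lambda r: r[0]) if valid else None


def write_record(slots, payload, fail_after=None):
    """Write into the slot that does NOT hold the latest record.

    fail_after simulates losing power after that many bytes were written.
    """
    latest = read_latest(slots)
    seq = 0 if latest is None else latest[0] + 1
    target = 0
    if latest is not None:
        for i, slot in enumerate(slots):
            if decode(slot) == latest:
                target = 1 - i
    record = encode(seq, payload)
    n = len(record) if fail_after is None else fail_after
    slots[target] = record[:n] + ERASED[n:]
```

Starting from `slots = [ERASED, ERASED]`, an interrupted `write_record(slots, b"v2", fail_after=5)` leaves `read_latest(slots)` returning the previous record. A single in-place write to something like the option bytes has no such fallback, which is why it is risky to do on every boot.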

(I'm sure it's a fine place to live, but as a work trip it's a pretty boring town)

I have tried 0.8.0-rc.11 and can make the device not boot reasonably reliably.

The process is:

No battery connected.
Powered via USB only.
Flash new Device-OS and then your application. (I have a batch file that does them all one after the other)

As soon as the particle-cli says “Flash success!”, IMMEDIATELY disconnect the USB, thereby removing the power.

When you try and power it back up, it won’t boot.

What exactly does that mean?

And

have you tried with battery?
When powering via USB only, there are extra demands on the power supply: it has to handle high, fast-changing current draw, which not a lot of USB wall warts can fulfil (and no USB port of any common computer).

When you reconnect the USB to provide a PSU, after having flashed the device with the new Device-OS and application, it no longer boots (AKA it’s bricked and needs to be wiped using the ST-LINK)

I only reflashed the Device-OS and my application via the “particle flash --usb” so there was no need to have the battery connected, I did not connect to the cellular network at all.

Are you doing this right after the first of 4 "Flash success!" replies (3 system + 1 user module, is what I'm presuming your batch file is doing), or after the 4th?

If you do this after the first, you are likely corrupting the DFU process midway through on the second system module. This will leave your device in a state where it will not boot the system firmware and might SOS hard fault as well. But you should be able to get back in to DFU mode. Try removing power for a bit and restoring, while holding the MODE button.

If that's not what you experienced, please add a bit more detail about what you are doing and I'll try to reproduce it. Thanks @marshall !

Hi BDub

I’m doing it after the final (4 of 4).

I was loading code into 30 units and 6 (20%) of them never rebooted. I’m pretty sure that these are the ones I disconnected very quickly after the “Flash success!” message; the LED may not even have had a chance to flash white. I can retest it, but I have to make up a lead for my ST-LINK first. I don’t think it matters what the application code actually is. It’s just the time delay after it has finished flashing and before you remove the power.

Here is my batch file

::extract the filename for the binary from the current directory path.
for %%* in (.) do (
	:: @echo =%%~n*
	set filename=%%~n*
)


echo project name = %filename%

set firmware=0.8.0-rc.11
set binary_extn=bin
set binary_filename=%filename%.%binary_extn%
echo binary_filename = %binary_filename%
call upgrade_fw.bat %firmware%
particle flash --usb %binary_filename%

and this is the upgrade_fw.bat

SET firmware=%1

particle flash --usb c:\system_firmware\v%firmware%\system-part1-%firmware%-electron.bin
particle flash --usb c:\system_firmware\v%firmware%\system-part2-%firmware%-electron.bin
particle flash --usb c:\system_firmware\v%firmware%\system-part3-%firmware%-electron.bin

Thanks for the extra details @marshall. I’ve been trying now for 20 tries of a similar script on Mac that flashes 0.8.0-rc.11 and then tinker, and I’ve not had any problem. I’ve also just tried a bunch of booting and yanking power before and after the white LED boots to simulate that last stage of the flashing script. Basically what happens there is the last particle update command causes a soft reset, and then you are yanking power just before the LED turns on, or right when it turns on. I’ve been plugging the USB in and unplugging it at different times (just before the white LED and just after). I’m trying to vary it a bunch and it continues to boot over and over.

I don’t doubt that it happened to you 6 times out of 30 like you said, but I haven’t had the same results here. What type of Electrons are these (G530, U260, etc.), and how long is your USB cable? Any chance you can recreate this with one of the good ones when you are trying to do it?

I wonder if some slight residual power on the caps when I’m trying over and over keeps them from having the issue, whereas when you did it, maybe you left them powered off for a long time before you tried to plug them back in again and then finally noticed? So if you are actively trying repeatedly without waiting more than 10 seconds in between like I was, maybe you never see it happen?

I’d like to set this up on an automated rig where it can do this test 1000’s of times and vary the timing between “Flash success!” and disconnecting the USB power.
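
As a rough idea of what the test plan for such an automated rig could look like, here is a small Python sketch. It only builds the list of trials; how the rig actually cuts USB power is left out. The function name `build_trials` and the two knobs (delay after “Flash success!” and how long the board stays unpowered, to probe the residual-charge theory above) are assumptions for illustration, not an existing tool.

```python
import itertools
import random


def build_trials(delays_ms, off_times_s, repeats, seed=0):
    """Return a shuffled list of trials covering every delay/off-time pair.

    delays_ms   -- time between "Flash success!" and cutting power
    off_times_s -- how long the board stays unpowered before the next boot
    repeats     -- how many times each combination is tried
    """
    trials = [
        {"delay_ms": delay, "off_time_s": off}
        for delay, off in itertools.product(delays_ms, off_times_s)
        for _ in range(repeats)
    ]
    # Shuffle with a fixed seed so a failing run can be replayed exactly.
    random.Random(seed).shuffle(trials)
    return trials
```

For example, `build_trials(range(0, 500, 50), [1, 10, 60, 600], repeats=25)` gives 1000 trials that mix very short and very long power-off times, instead of only the quick back-to-back retries.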

I might have missed it, but going to 0.8.0-rc.x from an “unknown” other version might also require you to flash the bootloader via YMODEM (aka particle flash --serial) - have you done that?

You can go directly to the latest version with the Electron, since the bootloader is built into the system firmware. For the Photon/P1 you should update to 0.7.0 first, then go to the latest 0.8.0-rc.x.

bko,
I have the dim D7 problem as described in my post of Sep. 3. Can you briefly describe the fix with the JTAG programmer that you mention? Or give a link to a description of the fix?
thanks,
john

It has been a long time since I last visited. We deployed 10 Electron 3Gs in the field in November last year and wrote some application-level checks to ensure we never get to the low battery blue LED problem (a similar approach to @rickkas7, but checking every N hours linearly where N=1, 2, 3…).
We have not had any problems so far. We are now going to deploy 45 more and I would like to know if there have been any advances since I was last here. Has the bounty been collected? Is the issue still manifesting itself in new firmware?
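
One possible reading of "checking every N hours linearly where N=1, 2, 3…" is a linear back-off: wait 1 hour before the first re-check, 2 hours before the next, and so on. The Python sketch below only models that schedule. It is an interpretation of the post, not the poster's firmware, and `check_times_hours` is a made-up name.

```python
from itertools import accumulate, count, islice


def check_times_hours(n_checks):
    """Hours (since the first low-battery event) at which each check happens.

    The wait before check k is k hours, so the times are the running
    sum of 1, 2, 3, ...
    """
    return list(islice(accumulate(count(1)), n_checks))
```

Under that reading the first five checks fall at 1, 3, 6, 10 and 15 hours, so a device with a flat battery wakes less and less often instead of brown-out cycling.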

Nothing has changed and your battery level check code is still required to prevent any potential damage to the Electrons.

The bounty has not been collected due to this being a hard to replicate issue.

I have just suffered this issue on the 1.1.0 firmware.
Fortunately I had the Programmer Shield and could solve it in the day :slight_smile:

I'll leave the command I ran in the Windows 10 terminal here, in case someone needs it:

C:\Users\ [name] \.particle\toolchains\openocd\0.11.1-adhoc6ea4372.0\bin>openocd -f interface/ftdi/particle-ftdi.cfg -f target/stm32f2x.cfg -c "init; reset halt; stm32f2x unlock 0; reset halt; flash protect 0 0 0 off; program C:\Users\bootloader-1.0.1-electron.bin verify 0x08000000;flash protect 0 0 0 on; exit"

Edit: After the recovery the blue light flashed very fast. I found the solution in this post.

I just got this problem, sad. I don’t have any programmer on hand or anything. The interesting part is that I believe that mine was not being provided under voltage.

My setup for data point:
Running on 12V SLA, 5V regulator. 470uF capacitor in. No USB, no LiPo connected.
The voltage of the battery was 12.8V, not low at all; the voltage regulator needs at least 10V to provide 5V. I measured the VIN voltage and it was 4.95V.

Last thing before it died was the red LED blinking red at 1Hz, no OS code. Simply blinking steadily at 1Hz without stopping, this went on for minutes before I unplugged it to reboot it. Dim blue LED showed up after that.

@BDub,

I can reproduce this issue reliably on my custom PCB (I have 1000+ of these in the field) using a programmable power supply with an Electron running tinker or our application and device-os at 1.4.4 or 1.5.2. I accomplish this through the following routine:

  • Power supply set to 19V output
    • Our onboard regulator is a TPS54340 outputting 4.88V
    • No UVLO
    • 19V chosen to mimic a solar panel
  • Current set to 30mA
  • No lipo attached
  • Using a Raspberry Pi controlling a KORAD power supply over USB serial, I run the following routine:
    • Current decreases by 1mA each second until it reaches 1mA, at which point it begins to increase by 1mA each second until it reaches 30mA
    • At 30mA it pauses for 20 seconds
    • It then increases by 1mA each second until it reaches 60mA, at which point it begins to decrease by 1mA each second until it reaches 30mA
    • At 30mA it pauses for 20 seconds
    • Repeat many times
  • I have not isolated whether our watchdog affects this somehow
  • I have not tested a release later than 1.5.2
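
For anyone who wants to replicate the ramp, the routine above can be sketched in Python roughly as follows. This is not the script actually used on the Raspberry Pi. `one_cycle` and `run` are hypothetical names, and `set_current_ma` is a placeholder for whatever call sets the current limit on your supply (no KORAD API is assumed here). The sketch assumes the supply already sits at 30mA when a cycle starts.

```python
import time


def one_cycle(mid=30, low=1, high=60, pause_s=20, step_s=1):
    """Yield (current_mA, hold_seconds) steps for one pass of the routine."""
    for ma in range(mid - 1, low - 1, -1):   # 29 down to 1
        yield ma, step_s
    for ma in range(low + 1, mid):           # 2 up to 29
        yield ma, step_s
    yield mid, pause_s                       # back at 30 mA, pause
    for ma in range(mid + 1, high + 1):      # 31 up to 60
        yield ma, step_s
    for ma in range(high - 1, mid, -1):      # 59 down to 31
        yield ma, step_s
    yield mid, pause_s                       # back at 30 mA, pause


def run(set_current_ma, cycles, sleep=time.sleep):
    """Drive the supply through the routine `cycles` times.

    set_current_ma is supplied by the caller and talks to the real hardware.
    """
    for _ in range(cycles):
        for ma, hold_s in one_cycle():
            set_current_ma(ma)
            sleep(hold_s)
```

With the default values one cycle is 118 setpoints and takes 156 seconds, so the 1-2 hours mentioned below correspond to roughly 23 to 46 passes.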

After 1-2 hours, I will have an Electron with a damaged bootloader. I can recover the Electron relatively simply by following the SWD guide for flashing bootloaders.

For context, our devices experience bootloader damage when either:

  • Have a solar panel directly connected to them (which mimics the above routine)
  • The 12V sealed lead acid battery they are connected to dies and goes to 4V

From viewing the LED patterns and whatnot, I think what is happening is:

  • PMIC turns on because 19V (4.88V) / 30mA (115mA) passes its input source checks
  • STM32 turns on the modem and PMIC turns on battery charging
  • Overdraw of current from modem brings power down to brownout territory for electron
  • Reset triggered
  • Repeat until bootloader is damaged
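
The "19V (4.88V) / 30mA (115mA)" figures in the list above are consistent with simple power conservation through the buck regulator. The Python sketch below is back-of-the-envelope arithmetic only. `output_current_ma` is a made-up helper, and the efficiency value is an assumption, not a measured number for the TPS54340.

```python
def output_current_ma(v_in, i_in_ma, v_out, efficiency=1.0):
    """Current available on the regulated rail for a given input power budget.

    P_out = efficiency * P_in, so I_out = efficiency * V_in * I_in / V_out.
    """
    return efficiency * v_in * i_in_ma / v_out


ideal_ma = output_current_ma(19.0, 30.0, 4.88)   # lossless upper bound
implied_efficiency = 115.0 / ideal_ma            # what the quoted 115 mA implies
```

The lossless upper bound comes out at about 116.8mA, so the quoted 115mA is essentially the whole input budget. Anything the modem draws beyond that has to pull the rail down, which fits the brownout-and-reset loop described above.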

I think the same thing is happening in the other scenarios in this thread.

I’m happy to run other tests on this or supply additional data if someone at Particle needs more information.

FYI @BDub
(I am mentioning you just to make sure you get a notification - thanks!)

19V does not represent the 6V Voltaic panels that most people are using when they see this issue, but it is an interesting find nevertheless.

Also, I can’t remember exactly, but how close is 19V to the input limit for that PMIC chip? 12V is the max input voltage recommended by Particle.

Also good job on getting 1000 units out into the field!

The 19V is coming in through our on board TPS54340 regulator. The PMIC is seeing 4.88v when everything is operating correctly and something between 4V and 4.88V when the Electron is overdrawing current from the 19V via TPS54340 source. Something very similar is likely to be happening with your 6V panel, but directly instead of through another regulator.

It’s in these brief overdraw events that I think the Electron is able to damage its bootloader. It’s my understanding that the point of this thread is to harden the Electron against damaging its bootloader this way. For example, the board also has an ATMEGA on board that is able to survive these events through its BOD (brown-out detection).

@BDub, would it help to have the bins of the bootloaders before and after they are damaged? Any other data need to be collected?

@hwestbrook thank you so much for providing so many details on this. If you can capture the bootloader before and after, that would be useful. You can send those through Support. Better though, if you have a hardware setup that reproduces this, I’d love to take a look personally and see what I can figure out. Particle Support will reach out to you to try to get something sent to me, if you are up for it. Again, thanks for the insight on this… it has not really been reproducible through forced methods. Looking forward to finding the root cause, or at the least a working mitigation :smiley: