'EEPROM' persistence issue

Did you have any luck figuring out what is going on? I’m swapping the power supply in the meantime :smile:

Same here. This issue is new to me: it didn’t seem to happen under 0.4.5, but since updating to 0.4.6 it has happened a few times. I haven’t been able to make a dump, as my program overwrites the EEPROM with 0x00s on an unexpected read.
I’m running mostly on batteries, and it happens on low power. I think it has also happened on a bad Wi-Fi connection, but I’m not 100% sure.

I’m also getting SOS#5 errors now, which did not happen before.

I have tried fw 0.4.7, 0.4.6 and 0.4.5, but it doesn’t seem to change anything. @mdma, is it even possible to roll back firmware, or does .6 or .7 trigger updates of the bootloader or other parts that don’t roll back properly?

Yes, you can roll back (e.g. 0.4.7 -> 0.4.5 without the need for 0.4.6); you just need to apply system parts 1 and 2 in reverse order (2 before 1).

You just need to use Web IDE or local build for your application firmware to target the correct system firmware.

Thanks for the quick reply. I did not know about the reverse order. Does that also apply when building locally using dfu-util?

If you build a monolithic firmware (like on the Core) then no, but when building modules you should build and flash the separate binaries manually using the CLI or dfu-util rather than the make “parameter” program-dfu.

Or you download the prebuilt system parts, flash them and then do as you normally do, but with the correct tag selected.

This issue turned out to be very troublesome for me and some customers already using some devices, so I have put some time into reproducing and dumping the contents like you asked.

I can reproduce it using a 2x AA battery power pack. It has an on-off switch, but in the off position it leaks. In the on position it delivers 5.05V and enough current; in the off position it delivers 2.6V and very little current. The problem occurs almost instantly when I power the device with the switch in the on position, wait for the application to start running and connect to Wi-Fi, and then switch to the off position. The LED slowly fades, blinks red a few times and then dies. If that doesn’t work the first time, playing around with the switch, connecting and disconnecting, will break it eventually.

After that it still boots and runs “normally”, but the EEPROM is empty (I can tell by the behaviour of the device).
Also, when this happens, the device is no longer recognised by the computer: it does not show up in OSX “System Information”, neither in application mode nor in DFU mode. Leaving it without power for a minute or so fixes USB in DFU mode, but it will not always be recognised in application mode. Only after re-flashing is everything back to normal.

It seems like it was very easy to reproduce with 0.4.6, and slightly harder with 0.4.7, but that could be coincidence.

@mdma: I have made a few memory dumps, how do I get them to you?

Addition 1: When viewing what remained of the EEPROM memory, it looks like the EEPROM does not get wiped, but rather corrupted. Some values clearly still contained parts of the original value, e.g. an IP address that should be “192.168.0.254” turned into “0.68.0.254”.

Addition 2: Searching the community for brown-out turns up some related results, especially the ones where keys got lost or corrupted… (I don’t use the cloud and/or keys, so I cannot tell whether it is related in my case).
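Since the stored values end up partially corrupted rather than cleanly wiped, one way to tell the cases apart at boot is to store a marker and a checksum with the data. The following is only a minimal sketch under stated assumptions: the `ConfigRecord` layout, `kMagic`, and the function names are hypothetical, and the storage is a plain byte buffer rather than the real emulated EEPROM. It distinguishes "never written" (all 0xFF), "valid", and "corrupt", so the application can keep a corrupt record around for a dump instead of overwriting it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record layout; field names and sizes are illustrative only.
struct ConfigRecord {
    uint32_t magic;   // fixed marker showing the record was written by this app
    uint8_t  ip[4];   // e.g. 192.168.0.254
    uint16_t crc;     // CRC-16 over every byte before this field
};

constexpr uint32_t kMagic = 0xC0FFEE01u;  // arbitrary value chosen for the sketch

enum class LoadResult { Ok, NeverWritten, Corrupt };

// Bitwise CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF).
uint16_t crc16(const uint8_t* data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; ++i) {
        crc ^= static_cast<uint16_t>(data[i]) << 8;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
        }
    }
    return crc;
}

// Build a record with the marker and checksum filled in.
ConfigRecord makeRecord(const uint8_t ip[4]) {
    ConfigRecord r{};
    r.magic = kMagic;
    std::memcpy(r.ip, ip, 4);
    r.crc = crc16(reinterpret_cast<const uint8_t*>(&r), offsetof(ConfigRecord, crc));
    return r;
}

// Classify the stored bytes; 'out' is only modified when the record is valid.
LoadResult loadRecord(const uint8_t* storage, ConfigRecord& out) {
    const size_t used = offsetof(ConfigRecord, crc) + sizeof(uint16_t);
    bool erased = true;
    for (size_t i = 0; i < used; ++i) {
        if (storage[i] != 0xFF) { erased = false; break; }
    }
    if (erased) return LoadResult::NeverWritten;

    ConfigRecord r;
    std::memcpy(&r, storage, sizeof r);
    const uint16_t expected = crc16(storage, offsetof(ConfigRecord, crc));
    if (r.magic != kMagic || r.crc != expected) return LoadResult::Corrupt;

    out = r;
    return LoadResult::Ok;
}
```

On `LoadResult::Corrupt` the caller could fall back to defaults in RAM and leave the stored bytes untouched, which would also preserve the evidence for a memory dump.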

Please send the memory dumps to mat at particle dot io.

Hi @mdma, I have sent you an email with the dumps. Did you receive them? Any news on this?

Hi again, I found this document from ST:


Which I thought might be of interest in this matter.

@nicodegunst nice catch! Section 3.3 is of specific interest IMO. I’ll flag the folks at Particle. :grinning:

Thanks! That section in particular drew my attention too. I cannot find the mentioned routine “EE_Init()” anywhere in the firmware, so maybe this can be of use somewhere.

I hope it will get a higher priority soon. I am receiving more and more reports from users now, so it is getting very troublesome.

This article also explains a little: https://www.digikey.com/Web%20Export/Supplier%20Content/lattice-semiconductor-220/pdf/lattice-wp-flash-corruption.pdf?redirected=1

This phenomenon potentially affects all flash memory, but I assume firmware space is set to be protected?

@nicodegunst, that is a really good article and quite sobering. The STM32 has brownout voltage detection and I’m not sure it is being used. :smile:

I have tried to turn it on/change its settings, I have added:

FLASH_OB_BORConfig(OB_BOR_LEVEL3);

to setup().

But I could not really notice any difference between one setting and the other. I could still reproduce the problem, regardless of this setting.

I also tried enabling/disabling write protect:

FLASH_OB_WRPConfig(OB_WRP_Sector_All, ENABLE);

But this does not seem to make any difference either…

I am not entirely sure my attempts are valid, but it did not produce any compilation errors, so I assume there is nothing wrong with it…
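One possible reason the calls above appeared to have no effect: as far as ST's STM32F2 standard peripheral driver documents it, the option-byte setters only stage a value, and the change is programmed by the sequence unlock, configure, launch, lock. Below is a minimal sketch of that sequence for the BOR level. It assumes the StdPeriph functions are reachable from user code, which is not confirmed in this thread. The stub section is a host-only stand-in so the sketch compiles off-target; its bodies and numeric values are placeholders, and `ensureBorLevel` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// ---- Host-only stand-ins (assumption) --------------------------------------
// On the device these names come from ST's stm32f2xx_flash.h; remove this
// section there. The stubs only record the call order and model "staged"
// versus "programmed" values. Numeric values are placeholders.
enum FLASH_Status { FLASH_BUSY, FLASH_COMPLETE };
const uint8_t OB_BOR_LEVEL3 = 0x00;
static std::vector<std::string> g_calls;  // call log, used only on the host
static uint8_t g_stagedBor = 0x0C;        // value waiting to be programmed
static uint8_t g_storedBor = 0x0C;        // value currently in the option bytes

void FLASH_OB_Unlock() { g_calls.push_back("unlock"); }
void FLASH_OB_BORConfig(uint8_t level) { g_calls.push_back("config"); g_stagedBor = level; }
FLASH_Status FLASH_OB_Launch() {
    g_calls.push_back("launch");
    g_storedBor = g_stagedBor;
    return FLASH_COMPLETE;
}
void FLASH_OB_Lock() { g_calls.push_back("lock"); }
uint8_t FLASH_OB_GetBOR() { return g_storedBor; }
// -----------------------------------------------------------------------------

// Program the brown-out reset level only when it differs from the stored one,
// so the option bytes are not rewritten on every boot.
bool ensureBorLevel(uint8_t level) {
    if (FLASH_OB_GetBOR() == level) {
        return true;                          // already set, nothing to do
    }
    FLASH_OB_Unlock();                        // allow changes to the option bytes
    FLASH_OB_BORConfig(level);                // stage the new level
    FLASH_Status status = FLASH_OB_Launch();  // program the staged option bytes
    FLASH_OB_Lock();                          // lock the option bytes again
    return status == FLASH_COMPLETE;
}
```

Reading the level back with `FLASH_OB_GetBOR()` after a reset would show whether the setting actually took effect, which a clean compile alone does not.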

That app note talks about recovering from power loss/brown-out during the update window.

Whether or not the Particle code is susceptible to these errors, I have no idea, but this can be fixed so that at worst you lose the last update and have to roll back to the previous version.

I believe the Particle problem is deeper than this, and can result in loss of emulated EEPROM, even without any write accesses active when the power fails.
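The "at worst you lose the last update" behaviour can be sketched with two slots and a commit marker that is written last. This is a hypothetical layout for illustration, not Particle's or ST's actual EEPROM emulation; `Slot`, `Store`, `update`, and `readLatest` are made-up names, and the `stepsBeforePowerLoss` parameter exists only to simulate losing power partway through a write. It covers interrupted updates only, not corruption of data that was sitting idle, and it ignores sequence-number wraparound.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical slot layout for the sketch.
struct Slot {
    uint32_t sequence;    // higher means newer
    uint8_t  payload[8];
    uint32_t commit;      // written last; the slot counts only if == kCommitted
};

constexpr uint32_t kCommitted = 0xA5A5A5A5u;

struct Store {
    Slot slots[2];
};

// Erased flash reads back as 0xFF.
void eraseSlot(Slot& s) { std::memset(&s, 0xFF, sizeof s); }

// Index of the newest committed slot, or -1 if neither slot is committed.
int activeSlot(const Store& st) {
    int best = -1;
    for (int i = 0; i < 2; ++i) {
        if (st.slots[i].commit != kCommitted) continue;
        if (best < 0 || st.slots[i].sequence > st.slots[best].sequence) best = i;
    }
    return best;
}

// Write new data into the slot that is NOT active, committing as the last step.
// stepsBeforePowerLoss < 4 simulates power failing after that many steps.
void update(Store& st, const uint8_t (&data)[8], int stepsBeforePowerLoss = 4) {
    const int cur = activeSlot(st);
    const int target = (cur == 0) ? 1 : 0;
    const uint32_t nextSeq = (cur < 0) ? 1 : st.slots[cur].sequence + 1;

    if (stepsBeforePowerLoss < 1) return;
    eraseSlot(st.slots[target]);                     // step 1: erase inactive slot
    if (stepsBeforePowerLoss < 2) return;
    st.slots[target].sequence = nextSeq;             // step 2: sequence number
    if (stepsBeforePowerLoss < 3) return;
    std::memcpy(st.slots[target].payload, data, 8);  // step 3: payload
    if (stepsBeforePowerLoss < 4) return;
    st.slots[target].commit = kCommitted;            // step 4: commit marker
}

// Copy out the newest committed payload; returns false if nothing is stored.
bool readLatest(const Store& st, uint8_t (&out)[8]) {
    const int cur = activeSlot(st);
    if (cur < 0) return false;
    std::memcpy(out, st.slots[cur].payload, 8);
    return true;
}
```

Because the previously committed slot is never touched during an update, an interrupted write leaves the old value readable.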

I can confirm it happens when no (user code) write actions are being performed. I do not write to “EEPROM” on boot, and it always happens when (un)plugging a device from a (charged) battery power source.
Are there any write actions in the Photon firmware at boot time?

As I suspected, thank you for the confirmation.

A detailed description of how to reproduce this, including test firmware, would be a great first step in our journey towards a fix.

Seems like we have plenty of reproduction listings in this thread. I can ship you a USB wall wart that causes this issue if you want?

I was able to reproduce this pretty easily using a USB cable with a broken micro-USB connector, so I just have to tap it to cut power.

It hardly ever happens when cutting power once the Photon has reached user code. But when cutting power while the LED is white, the EEPROM would be cleared after maybe 2-3 attempts.

I think if you set up a breadboard wired so that a momentary switch cuts power, and write a few lines of code that read the EEPROM, you will have this reproduced in no time. Power it up, cut power a few ms later for a few ms, and then let it boot.

I guess a cheap wall wart could have a bad connector, or bad circuitry that doesn’t power up nicely. Or, as in my case, a 2xAA battery pack with a step-up converter could also give short bursts of power when the batteries are almost flat.
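The "few lines of code that read EEPROM" for such a test rig could look roughly like the sketch below. It is written against generic read/write callables so it can be tried on a desktop first; `expectedByte`, `fillPattern`, `checkPattern`, and `CheckResult` are hypothetical names, and the pattern itself is arbitrary.

```cpp
#include <cstdint>

// Arbitrary test pattern: any fixed, non-trivial function of the address works.
inline uint8_t expectedByte(int address) {
    return static_cast<uint8_t>((address * 31 + 7) & 0xFF);
}

struct CheckResult {
    int mismatches;       // number of bytes that differ from the pattern
    int firstBadAddress;  // -1 if every byte matched
};

// writeByte is any callable taking (int address, uint8_t value).
template <typename WriteFn>
void fillPattern(WriteFn writeByte, int length) {
    for (int a = 0; a < length; ++a) {
        writeByte(a, expectedByte(a));
    }
}

// readByte is any callable taking an int address and returning a byte.
// Nothing is written here, so corrupted contents stay available for a dump.
template <typename ReadFn>
CheckResult checkPattern(ReadFn readByte, int length) {
    CheckResult r{0, -1};
    for (int a = 0; a < length; ++a) {
        if (readByte(a) != expectedByte(a)) {
            if (r.firstBadAddress < 0) r.firstBadAddress = a;
            ++r.mismatches;
        }
    }
    return r;
}
```

On a Photon the callables would presumably wrap `EEPROM.write(address, value)` and `EEPROM.read(address)`: run `fillPattern` once, then flash a build that only calls `checkPattern` at boot and reports the result, so that no user-code writes happen during the power-cut attempts.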

Guys,
Just thought I would throw this in - it might just help someone…

I had this issue way back and decided that the EEPROM storage for long-term use was too ‘iffy’ ;-)).

So - I designed my PCB to include a low-cost I2C FRAM (FerroElectric RAM) chip - works like a charm ;-)).

These chips are WAY longer-life than FLASH or anything else - being measured in the trillions of write cycles :open_mouth: and come in many sizes…

Hope this is helpful to someone else ;-)).

BR
Graham
