Photon stuck in safe mode while upgrading to 0.7.0

We have many Photon devices in the field and want to OTA upgrade them to 0.7.0 to pick up the latest WiFi stack (to work around what I believe is a WiFi-related memory leak in 0.6.3).

To check that the upgrade is reliable, I have been repeatedly flashing a “blank” 0.7.0 app (which publishes the firmware revision, free memory, and local IP address) onto a 0.6.3 system. Unfortunately, far too often the update does not complete and the device gets stuck in safe mode.

I have tried running “particle flash <device_name> <app_name>.bin” again while the device is stuck, and often that pushes the upgrade through. With enough trials, though, even repeated attempts fail and the Photon stays stuck in safe mode.

When stuck, the device is ping-able on the local network and the connection to the cloud is fine, but clearly the device is not able to finish the OTA flash upgrade. I’ve read in this forum about repeated reboots during an upgrade exhausting the available IP addresses, but in my case the same IP address is assigned each time.

Any shared experience / insight / workarounds would be greatly appreciated! It’s a showstopper for us if we can’t make OTA upgrades work seamlessly.

Below is the console log for the case of repeated attempts to refresh where it gets stuck updating system part 1:

event: version-freemem-ipaddr
data: {"data":"0.6.3-57952-192.168.168.58","ttl":60,"published_at":"2018-06-08T17:40:50.707Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:40:56.638Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:40:56.831Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":11,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":109,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":109,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":109,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:40:58.095Z","coreid":"270021000a51353335323536"}

event: spark/device/app-hash
data: {"data":"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1","ttl":60,"published_at":"2018-06-08T17:40:58.096Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"1","ttl":60,"published_at":"2018-06-08T17:40:58.371Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:41:00.636Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:41:00.867Z","coreid":"270021000a51353335323536"}


event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:41:18.688Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:41:18.880Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":11,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":109,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":109,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:41:20.120Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"2","ttl":60,"published_at":"2018-06-08T17:41:20.327Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:41:22.571Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:41:29.289Z","coreid":"270021000a51353335323536"}


event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:41:41.903Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:41:42.062Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":11,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":26,\"f\":\"s\",\"n\":\"2\",\"v\":205,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":205,\"_\":\"\"},{\"f\":\"b\",\"n\":\"0\",\"v\":101,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:41:43.348Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"0","ttl":60,"published_at":"2018-06-08T17:41:43.373Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:41:44.518Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:41:44.951Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:41:49.920Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:41:50.078Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":101,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":205,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":205,\"_\":\"\"},{\"f\":\"b\",\"n\":\"0\",\"v\":101,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:41:51.355Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"1","ttl":60,"published_at":"2018-06-08T17:41:51.574Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:41:53.815Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:41:59.450Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:42:03.827Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:42:04.024Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":101,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":205,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":205,\"_\":\"\"},{\"f\":\"b\",\"n\":\"0\",\"v\":101,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:42:05.265Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"1","ttl":60,"published_at":"2018-06-08T17:42:05.485Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:42:07.746Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:42:13.560Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:42:18.858Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-08T17:42:19.035Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":101,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":205,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":205,\"_\":\"\"},{\"f\":\"b\",\"n\":\"0\",\"v\":101,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"73A78B585FA378B79B815E2CEED078DAE51D08E63C8F7A9C1B189CA4211B24B1\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-08T17:42:20.289Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"1","ttl":60,"published_at":"2018-06-08T17:42:20.533Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-08T17:42:22.781Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:42:28.366Z","coreid":"270021000a51353335323536"}

...
  1. Does it act any different if you try an intermediate upgrade to 0.6.4 followed by another upgrade to 0.7.0? The 0.7.0 release notes recommend doing it that way for Electron OTA. Not sure if it would make any difference on Photon.

  2. Have you tried watching WiFi.RSSI() to see if signal strength correlates with the incomplete updates?

  3. Have you tried starting with a blank 0.6.3 app? You could give it a little more kick by adding a published function that lets you trigger a call to System.reset() before you initiate the upgrade to 0.6.4 or 0.7.0.

link to 0.7.0 release notes:

[edit: Also, there's a note in the 0.7.0-rc.7 release notes about downgrading to 0.6.4 rather than any earlier versions:]

> Note about Downgrading [Electron/Photon/P1] OTA or YModem transfer: You should downgrade to 0.6.4 to ensure that the bootloader downgrades automatically. When downgrading to other versions, you will have to manually downgrade the bootloader as well (see release notes in previous 0.7.0-rc.3 release)

[better edit: The 0.7.0 release notes instructions on downgrading say that 0.6.3 is the right version for Photon, but 0.6.4 is right for Electron:]

> If you need to downgrade, you must downgrade to 0.6.3(Photon/P1), 0.6.4(Electron) to ensure that the bootloader downgrades automatically. When downgrading to older versions, downgrade to 0.6.3(Photon/P1), 0.6.4(Electron) first, then to an older version such as 0.5.3. You will have to manually downgrade the bootloader as well (see release notes in previous 0.7.0-rc.3 release)

Thanks for the suggestions, I will give them a try. I am using the same “blank” app for 0.6.3 and 0.7.0.

Are you familiar with the threads that discuss downgrading from 0.7.0? If not, might be worth a look.

Example links:

Thanks, yes, I have seen those threads. I haven’t had any problems downgrading from 0.7.0 to 0.6.3 over DFU, only upgrading from 0.6.3 to 0.7.0 OTA. For comparison, I will also run repeats upgrading to 0.7.0 over DFU to see if the Photon gets stuck in the same way.

I’ve confirmed that I can get into the same stuck state whether the flash is initiated from the cloud or over DFU. That makes sense, because it typically gets stuck in one of the OTA update phases for system part 0/1/2.

Using Wireshark, I can see that the transmit power from the Photon (as seen in the radiotap header) is consistent across successful and stuck updates, somewhere in the range of -45 dBm to -48 dBm. I’m capturing on my MacBook, and both it and the target Photon are on the same desktop.

Similarly, the received signal strength at the Photon for both stuck and successful updates is typically around -60 dBm, again as viewed in the captured radiotap header. Of course, I can’t see what the Photon itself reports as RSSI in a stuck case, because the update fails before the app can run and publish WiFi.RSSI().

Unfortunately, it’s pretty difficult to tell what’s going wrong in the CoAP protocol, because on the Photon it runs over TCP rather than UDP, and to the best of my knowledge there is no Wireshark decoder for CoAP over TCP yet.

Surely, there must be an error message in a log file on one of the device.spark.io servers that would shed some light on why the update is timing out?

Did you try publishing WiFi.RSSI() from your 0.6.3 "blank" app?

When you've had problems with updates, has the Photon always been on your desk near your MacBook--say maybe within a foot or two? Has the MacBook been awake with its wifi radio turned on? If so, perhaps transmissions from the MacBook are overloading the Photon's receiver from time to time. Maybe it would get more reliable if you move the Photon farther away from other radios.

Another thing to consider is your power supply. How are you powering the Photon? Is the supply voltage staying within the Photon's specifications even during current spikes from the radio? Perhaps you're having brownouts once in a while.

Thank you for the follow up questions.

The RSSI reported from my blank app running on 0.6.3 or 0.7.0 is typically in the range -65 to -70 dBm.

I use ethernet for my MacBook so the target photon is the only device on my desktop using WiFi and I am powering it via the MacBook USB port. I'm confident that both the power supply and the WiFi signal are clean.

But even if there were the occasional glitch, the protocol for the safe-mode-healer needs to be incredibly robust: short of someone unplugging the Photon or disabling WiFi, it should never fail or hang. Others have posted similar problems, and my hunch is that I'm bumping into something similar, where the safe mode healer times out when it shouldn't. Updating field-deployed Photons OTA without physical intervention absolutely needs to be bulletproof!

FYI, below is the console output for the last run that hung and a plot of the corresponding WiFi signal and noise as captured by wireshark on my MacBook.

event: version-freemem-ipaddr-rssi
data: {"data":"0.6.3-58128-192.168.168.58(-70)","ttl":60,"published_at":"2018-06-11T20:33:31.955Z","coreid":"270021000a51353335323536"}

event: version-freemem-ipaddr-rssi
data: {"data":"0.6.3-58084-192.168.168.58(-68)","ttl":60,"published_at":"2018-06-11T20:33:32.959Z","coreid":"270021000a51353335323536"}

event: spark/device/app-hash
data: {"data":"BBA44E20A061685C167941266027F8FD7A6D45D895B572357CA73EA51B3312CD","ttl":60,"published_at":"2018-06-11T20:33:33.282Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-11T20:33:45.704Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"dfu_mode","ttl":60,"published_at":"2018-06-11T20:33:45.845Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":11,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":109,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":109,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":109,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"55B39334E823E8F0ADD82E75205DE1E2596F96E194F2B9FC1E6B96F208325288\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-11T20:33:47.141Z","coreid":"270021000a51353335323536"}

event: spark/device/app-hash
data: {"data":"55B39334E823E8F0ADD82E75205DE1E2596F96E194F2B9FC1E6B96F208325288","ttl":60,"published_at":"2018-06-11T20:33:47.141Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"1","ttl":60,"published_at":"2018-06-11T20:33:47.653Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-11T20:33:49.864Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-11T20:33:50.074Z","coreid":"270021000a51353335323536"}

event: spark/status
data: {"data":"online","ttl":60,"published_at":"2018-06-11T20:34:09.000Z","coreid":"270021000a51353335323536"}

event: spark/device/last_reset
data: {"data":"user","ttl":60,"published_at":"2018-06-11T20:34:09.127Z","coreid":"270021000a51353335323536"}

event: spark/status/safe-mode
data: {"data":"{\"f\":[],\"v\":{},\"p\":6,\"m\":[{\"s\":16384,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"b\",\"n\":\"0\",\"v\":11,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"1\",\"v\":205,\"d\":[]},{\"s\":262144,\"l\":\"m\",\"vc\":30,\"vv\":30,\"f\":\"s\",\"n\":\"2\",\"v\":109,\"d\":[{\"f\":\"s\",\"n\":\"1\",\"v\":109,\"_\":\"\"}]},{\"s\":131072,\"l\":\"m\",\"vc\":30,\"vv\":26,\"u\":\"55B39334E823E8F0ADD82E75205DE1E2596F96E194F2B9FC1E6B96F208325288\",\"f\":\"u\",\"n\":\"1\",\"v\":5,\"d\":[{\"f\":\"s\",\"n\":\"2\",\"v\":207,\"_\":\"\"}]},{\"s\":131072,\"l\":\"f\",\"vc\":30,\"vv\":0,\"d\":[]}]}","ttl":60,"published_at":"2018-06-11T20:34:10.436Z","coreid":"270021000a51353335323536"}

event: spark/safe-mode-updater/updating
data: {"data":"2","ttl":60,"published_at":"2018-06-11T20:34:10.862Z","coreid":"particle-internal"}

event: spark/flash/status
data: {"data":"started ","ttl":60,"published_at":"2018-06-11T20:34:13.069Z","coreid":"270021000a51353335323536"}

event: spark/flash/status
data: {"data":"success ","ttl":60,"published_at":"2018-06-11T20:34:19.851Z","coreid":"270021000a51353335323536"}

The photon is ping-able in the hung update state and running diagnostics from the Particle console shows everything is good:

And here's what particle serial inspector shows after the update gets stuck, before the boot loader could be updated:

Platform: 6 - Photon
Modules
  Bootloader module #0 - version 11, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 205, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #2 - version 205, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: FAIL
      System module #1 - version 205
      Bootloader module #0 - version 101
  User module #1 - version 5, main location, 131072 bytes max size
    UUID: 55B39334E823E8F0ADD82E75205DE1E2596F96E194F2B9FC1E6B96F208325288
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: FAIL
      System module #2 - version 207
  empty - factory location, 131072 bytes max size

Hmm… kinda mysterious. That signal strength sounds fine to me.

One other hardware thing you could try to rule out is heat buildup. If you’re running updates in a loop, your radio may be running at a high duty cycle and warming the module up a lot. Have you done anything to monitor (e.g. temp probe, thermal camera) or manage (e.g. fan) component temperatures?

Assuming no obvious thermal problems, it does sound a lot like a software glitch. Maybe somebody else can offer suggestions from that angle.

The device is out in the open on my desktop running at room temperature.

To summarize, from what I can tell there are two failure modes in the “safe mode healer”. In the first, the update fails to complete but trying again will typically succeed. In the second, the update fails and all successive attempts also fail: the update mechanism loops retrying the update for a while, but eventually times out and just sits quietly in safe mode. This second case is more serious because the only thing that seems to get the device unstuck is reverting to 0.6.3 over DFU and trying again. If this device were in the field it would essentially be bricked.
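One rough way to spot the looping symptom in a captured event stream is to look for the safe-mode updater requesting the same module several times in a row. The Python sketch below does that over the `data` values of the spark/safe-mode-updater/updating events. Treating three identical requests in a row as "stuck" is an arbitrary assumption, and the function names are hypothetical. It cannot catch a run where the events simply stop.

```python
def longest_repeat(targets):
    """Return (target, run_length) for the longest run of identical consecutive targets."""
    best = (None, 0)
    current, run = None, 0
    for t in targets:
        if t == current:
            run += 1
        else:
            current, run = t, 1
        if run > best[1]:
            best = (current, run)
    return best

def looks_stuck(targets, threshold=3):
    """Heuristic: the updater asked for the same module `threshold` times in a row."""
    _, run = longest_repeat(targets)
    return run >= threshold

# "data" values of the updating events in the first console log of this thread.
first_log = ["1", "2", "0", "1", "1", "1"]
print(longest_repeat(first_log), looks_stuck(first_log))
```

On the first log this flags module "1" being requested three times in a row. The second log in the thread only shows "1" then "2" before going quiet, so a timeout on the absence of further events would be needed to catch that case.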

@khp - I second your experience. On the 10 photon/P1s I’ve upgraded from 0.6.3 to 0.7.0, 2 have gotten stuck in safe mode. One was REALLY inconvenient as it’s mounted inside a wall… that one ended up basically bricked. The other I was able to push thru the update again after a reset and it worked. Hoping the 0.7.0 upgrades to 0.8.0 go more smoothly…

I am seeing the same behavior - a photon stuck in safe mode after attempting to OTA upgrade to 0.7.0 from 0.6.3. It gets stuck updating one of the system parts and eventually times out.

Platform: 6 - Photon
Modules
  Bootloader module #0 - version 101, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 207, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #2 - version 205, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 205
      Bootloader module #0 - version 101
  User module #1 - version 5, main location, 131072 bytes max size
    UUID: 55B39334E823E8F0ADD82E75205DE1E2596F96E194F2B9FC1E6B96F208325288
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: FAIL
      System module #2 - version 207
  empty - factory location, 131072 bytes max size

Have you tried an OTA update of the Device OS (i.e. system) firmware only?

Refer:

Thank you for the suggestion. I tried OTA flash of the system parts at some point without success (hard crashes, flashing red) but with your/ScruffR’s suggestion of waiting for the “online” status before flashing the second part, my repeats are proceeding nicely.

I was also worried about conflicts with safe mode healing, but it seems to stay out of the way for the system-partX updates and only kicks in after part 2 to pick up the bootloader.

I have a support ticket in for this issue and I’m hoping to hear about a patch from Particle soon, but in the meantime this workaround seems promising.

@khp,

Excellent!

Can this ticket be marked off now as solved? If so, check the tick button.

@UMD @wsnook - Thanks again for your suggestions. To me this issue will only be resolved when Particle makes the safe-mode-healer 100% bulletproof. However, I’ve been able to convince myself that explicitly updating system parts 1 and 2, and then the app, is a reasonable workaround, at least when upgrading to 0.7.0 from 0.6.3.
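The ordering of that workaround can be sketched in Python as follows. This is only an outline under stated assumptions: `staged_update` and `cli_flash` are hypothetical helper names, the file and device names in the usage comment are placeholders, and the retry limit is an arbitrary choice. The only external command used is `particle flash <device_name> <file>`, the same one quoted at the top of the thread.

```python
import subprocess

def cli_flash(device, path):
    """Flash one binary with the Particle CLI: particle flash <device_name> <file>."""
    subprocess.run(["particle", "flash", device, path], check=True)

def staged_update(steps, flash, wait_online, max_attempts=3):
    """Flash each step in order, moving on only once the device is back online.

    flash(step) sends one binary; wait_online() returns True when the device
    has reported online again (for example by watching spark/status events).
    """
    for step in steps:
        for _ in range(max_attempts):
            flash(step)
            if wait_online():
                break  # device came back, go to the next binary
        else:
            raise RuntimeError(
                f"{step} did not come back online after {max_attempts} attempts"
            )
    return True

# Hypothetical usage (placeholder names, not run here):
# staged_update(
#     ["system-part1.bin", "system-part2.bin", "app.bin"],
#     flash=lambda path: cli_flash("my_photon", path),
#     wait_online=my_wait_function,
# )
```

Passing `flash` and `wait_online` in as functions keeps the ordering logic separate from how the device is actually reached, so the sequence can be dry-run with stand-ins before pointing it at real hardware.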

For what it’s worth, I’m using curl to wait for the spark/status “online” event before proceeding to the next step in the update. I found that occasionally the spark/status event doesn’t show up even though the device is clearly up and running. In these cases I repeat the last step and keep going, and that’s working OK, but I had assumed this event would be more reliable.
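For anyone who wants to do the same check outside of curl and shell tools, here is a minimal Python sketch that scans event-stream text in the "event:" / "data:" format shown in the console logs above and reports whether a given device published spark/status "online". The function names are hypothetical, and where the lines come from (a curl pipe, a saved log) is left open.

```python
import json

def iter_events(lines):
    """Yield (event_name, payload_dict) from pairs of 'event:' and 'data:' lines."""
    name = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:") and name is not None:
            yield name, json.loads(line[len("data:"):].strip())
            name = None  # wait for the next event line

def saw_online(lines, device_id):
    """True if the stream contains spark/status 'online' from device_id."""
    for name, payload in iter_events(lines):
        if (name == "spark/status"
                and payload.get("coreid") == device_id
                and payload.get("data") == "online"):
            return True
    return False

# Two events copied from the first console log in this thread.
sample = [
    'event: spark/flash/status',
    'data: {"data":"success ","ttl":60,"published_at":"2018-06-08T17:41:00.867Z","coreid":"270021000a51353335323536"}',
    '',
    'event: spark/status',
    'data: {"data":"online","ttl":60,"published_at":"2018-06-08T17:41:18.688Z","coreid":"270021000a51353335323536"}',
]
print(saw_online(sample, "270021000a51353335323536"))
```

Since the online event is sometimes missing, a real wait loop would also need a timeout after which the last step is repeated, as described above.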

@khp, excellent!

I like your “is it back online” script. For those that don’t want to use a script to determine online status, the Particle console is your friend.

Has anyone ever gotten an OTA fix for this? I’ve had multiple customer devices get bricked by this issue, and support has not been able to provide any help so far.

@polystyrene - is this still occurring for you? Work has been done in September/October to relieve this particular pain-point and I’d like to ensure it’s resolved. If not, I’d be helped in my efforts by some Device IDs (please PM those to me, for security reasons!) Thanks.

Yep. You and I are already working on it through the support tickets. :)

// stephen
