We’ve been chasing this Cyan Flash of Death (CFOD) issue for a while and have recently learned enough about it to make a post here worthwhile.
We have a P1-based product that sometimes goes into CFOD and does not recover until it’s power cycled. The user application stops running, so I can’t detect the condition in user firmware and trigger a reset manually.
I was finally able to capture some logs and it looks like the issue has to do with a WLAN timeout/reset:
0000246852 [comm.sparkprotocol] WARN: ping ACK not received
0000246852 [system] WARN: Communication loop error, closing cloud socket
0000246952 [system] INFO: Cloud: connecting
0000246953 [system] INFO: Read Server Address = type:1,domain:device.spark.io
0000246974 [system] ERROR: Cloud: unable to resolve IP for device.spark.io
0000246974 [system] WARN: Cloud socket connection failed: -1
... about 5 seconds worth of user application log messages, e.g. free memory: 26356
0000251980 [system] WARN: Internet Test Failed!
0000251980 [system] WARN: Resetting WLAN due to 2 failed connect attempts
0000251980 [system] WARN: Handling cloud error: 2
0000252080 [system] WARN: Resetting WLAN due to SPARK_WLAN_RESET

After that last system log message, the device is in the “fast CFOD” mode and does not recover until power cycled.
Happily, I found a library that lets me tap into the STM watchdog: https://github.com/raphitheking/photon-wdgs. Adding the watchdog has allowed us to at least recover from the CFOD state. We do lose a bit of data, but overall, losing a bit of data is much preferable to staying stuck in the CFOD state.
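For anyone unfamiliar with why a watchdog helps here, below is a rough host-side sketch of the idea. It is not the photon-wdgs API and does not touch the STM hardware; `SoftWatchdog` and `simulate_reset_time` are hypothetical names made up for illustration, and the 10-second timeout is an assumed value. The point is only that the application loop has to keep "feeding" the watchdog, so when the loop stalls (as it does for us after the last WLAN reset message), the watchdog expires and forces a reset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side model of a watchdog timer (NOT the photon-wdgs API).
class SoftWatchdog {
public:
    SoftWatchdog(uint32_t timeout_ms, uint32_t now_ms)
        : timeout_ms_(timeout_ms), last_feed_ms_(now_ms) {
        assert(timeout_ms > 0);
    }

    // Called from the application loop to show it is still running.
    void feed(uint32_t now_ms) { last_feed_ms_ = now_ms; }

    // True once the loop has gone unfed for at least the timeout.
    // Unsigned subtraction keeps this correct across a millisecond-counter rollover.
    bool expired(uint32_t now_ms) const {
        return static_cast<uint32_t>(now_ms - last_feed_ms_) >= timeout_ms_;
    }

private:
    uint32_t timeout_ms_;
    uint32_t last_feed_ms_;
};

// Simulates a loop that feeds the watchdog every step_ms until stall_at_ms,
// then hangs. Returns the simulated time at which a reset would be forced.
uint32_t simulate_reset_time(uint32_t timeout_ms, uint32_t step_ms,
                             uint32_t stall_at_ms) {
    assert(step_ms > 0);
    SoftWatchdog wd(timeout_ms, 0);
    uint32_t now = 0;
    while (!wd.expired(now)) {
        if (now < stall_at_ms) {
            wd.feed(now);  // application loop is still alive
        }
        now += step_ms;    // after the stall, time passes with no feed
    }
    return now;
}
```

With an assumed 10000 ms timeout and a stall at the last logged timestamp, `simulate_reset_time(10000, 100, 252080)` gives the time at which the simulated device would be reset, which is roughly the amount of data we lose each time.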
So we’re doing OK, but it would still be nice to get to the bottom of what’s causing this issue in the first place. I’m not sure what to check; we have plenty of free heap space, as you can see from the log messages. Here are a few other details about our config:
- System thread enabled
- System firmware 0.6.2
- Offline compiler
If I can provide any other information that would be useful, please let me know!