[Workaround identified] Xenon flashing red 10 times after rc.27 upgrade

I just updated my mesh network (1 Argon on Ethernet + 2 Xenons) to rc.27 to test mesh range and reliability. Running the Marco-Polo code by ninjatill, I upgraded the Argon with no issues and then moved on to the Xenons. After upgrading them to rc.27, I observed that flashing Polo made the device immediately enter a panic state, blinking red 10 times between 2 SOS patterns. If I flash tinker onto the Xenon it behaves normally, but the Polo code causes the error on both of my Xenons whether I flash through the web or through DFU. I didn’t change anything in the code from the version that was working on rc.25, and you can find it here.

Yes my Argon and both Xenons are doing this. I updated the Argon via the Cloud and then updated all via USB using the hybrid files. Still all flashing red x 10.

EDIT:
They work in safe mode. Flashing user code via the cloud works but results in the device rebooting with SOS x 10.

Console shows: panic, assert_failed

I did not experience any of this so far. I flashed all Argons (2) and Xenons (5) via OTA. I’m using the MarcoPolo_v0.3.2. Marco on the Argons and Polo on the Xenons. The Argon on my desk is spitting out serial debugging info and the mesh looks very stable.

I just flashed that old version of Polo code you shared… I get the same thing, SOS 10. I’ll take a look at that older code, but you can try 0.3.2 in the meantime if you want.

I’ll try flashing a blank project. I’m using my own Marco/Polo code.

I’ve also tried reflashing tinker but it still does SOS 10.

All devices after power on turn blue then immediately SOS.

I put the Xenon into safe mode and re-flashed the current MarcoPolo_v0.3.2, and it is now functioning properly again. Looking at the old v0.3.0 code, I don’t know what is causing that SOS 10. It’s very reproducible though.

Can confirm that the 3.2 version is working on my devices as well. Pinging @will to raise awareness of this potential firmware issue.

Hey there – reading this thread quickly it sounds like the old version of code (v0.3.0) which was previously working on v0.8.0-rc.25 is producing an SOS, while the new version (v0.3.2) is not.

v0.3.0 source - https://go.particle.io/shared_apps/5c06a9679b02f97f6a0014e9
v0.3.2 source - https://github.com/ninjatill/Mesh_MarcoPoloHeartbeat

This suggests a possible regression. Hoping that @mstanley or @ParticleD can jump in to support here.

@will, good synopsis.

Here’s some interesting behavior:

I attempted to “copy project” in the shared app from @TrikkStar. If I flash that code unadulterated, it produces the SOS 10 immediately after the OTA completes (presumably when user code starts to run). I then added a SerialLogHandler logHandler(LOG_LEVEL_ALL); just after the version variable declaration; it compiles, flashes, goes to breathing cyan and responds to heartbeats as it should. If I delete that one line, so the code is identical to the original shared project, and flash again, it produces the SOS 10. Does the firmware binary for that shared project get stored somewhere, so that it isn’t recompiling?

Does anyone else get an SOS 10 by flashing tinker from the Particle app?

@JumpMaster, no. After getting the SOS 10 with that v0.3.0 I then put the Xenon into safe mode and OTA flashed tinker. It flashed successfully and went to breathing cyan.

During this process of flashing the Xenon node multiple times with various projects, the Argon gateway became unresponsive twice and had to be reset. It was breathing cyan but the D7 LED wasn’t “beating”. As soon as I pressed reset, it came back and performed normally.

I guess, make sure the tinker app actually flashed. I mean, you saw the magenta LED flash a bunch of times and you’re sure it has tinker (not that v0.3.0 or whatever your equivalent code is)?

I think they may be failing to flash. The Particle app still believes all devices are flashing magenta when in fact they’re in an SOS loop.

Hey this is James from Particle.

Please try adding waitUntil(Mesh.ready); before Mesh.subscribe() calls in setup() and see if that gets rid of the SOS.
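For anyone wanting to see where the line goes, here is a minimal sketch of the suggested workaround, not the actual Marco-Polo code. The event name "heartbeat", the handler onHeartbeat and the counter are hypothetical. The FakeMesh block exists only so the sketch also compiles off-device; it is a made-up stand-in that assumes PLATFORM_ID is defined only in Device OS builds, and it is not how Device OS works internally.

```cpp
#ifdef PLATFORM_ID
#include "Particle.h"   // real Device OS build
#else
// Host-only stand-ins (hypothetical) so the sketch compiles off-device.
#include <cassert>
#define SYSTEM_THREAD(mode) static_assert(true, "host stand-in")
#define waitUntil(cond) while (!(cond)()) {}
typedef void (*MeshHandler)(const char *event, const char *data);
struct FakeMesh {
    int polls = 0;          // how many times ready() has been asked
    int subscriptions = 0;  // how many handlers were registered
    bool ready() { return ++polls >= 3; }  // pretend the mesh comes up on the 3rd poll
    bool subscribe(const char *name, MeshHandler handler) {
        (void)name; (void)handler;
        // Models the reported panic: subscribing before the interface is up.
        assert(polls >= 3 && "subscribe before interface is up");
        ++subscriptions;
        return true;
    }
} Mesh;
#endif

// With threading enabled, setup() can run before the mesh interface is up.
SYSTEM_THREAD(ENABLED);

int heartbeatsSeen = 0;  // hypothetical counter, for illustration only

// Mesh event handler: receives the event name and its payload.
void onHeartbeat(const char *event, const char *data) {
    (void)event; (void)data;
    ++heartbeatsSeen;
}

void setup() {
    waitUntil(Mesh.ready);                     // block until the mesh reports ready
    Mesh.subscribe("heartbeat", onHeartbeat);  // only then register the subscription
}

void loop() {}
```

If blocking forever in setup() is a concern, waitFor(Mesh.ready, 30000) with a timeout of your choosing may be an alternative; you would then need to check Mesh.ready() yourself before subscribing.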

We’re looking into these issues.

Yes, that’s the cause: calling Mesh.subscribe() in setup() with SYSTEM_THREAD(ENABLED). Thanks @JamesHagerman

I guess that’s an unintentional change or oversight in Mesh.subscribe()? The Mesh.subscribe() documentation does not carry the same note as Particle.subscribe() about when subscriptions may be registered:

It is OK to register a subscription when the device is not connected to the cloud - the subscription is automatically registered with the cloud next time the device connects.

It works. Thanks.

Thanks everyone for reporting this! We’ve identified the root cause of the issue and already have a PR with the fix. It will be included in the next release of Device OS for Gen 3 devices.

This issue was caused by the OpenThread upgrade to the 2018/12/17 master, which changed the behavior of multicast subscription management. It is no longer possible to modify subscriptions while the Thread network interface is down or not fully up, which is why the waitUntil(Mesh.ready) workaround helps.
