Afternoon All,
We're running an IoT solution that uses a Boron 404X as a general supervisor, which brings a Raspberry Pi 4 online to do some special-purpose tasks throughout the day.
The architecture we use has the Boron configured as an SPI Slave, with the Raspberry Pi having access to a selection of functions over that interface.
It's been working fine for a couple of years now, but I've just discovered a really serious issue. When I power up the Pi, it reaches a stage in the boot process where there's a brief strobe on the Boron's CS line, along with a falling edge on the SCK line. As soon as this glitch happens, the Boron freezes absolutely solid. Even the status light, which was breathing cyan, locks at whatever level it was at when the glitch occurred. The Boron stays in this jammed state until my application launches on the Pi and drives the CS and SCK lines through a normal transaction.
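To make the glitch concrete, here's a rough host-side model of the difference between what the Boron sees at Pi boot and what it sees in a normal transaction. This is plain C++ for illustration only, not our firmware and not Device OS API: `LineSample`, `classifyFirstWindow`, and `makeTrace` are hypothetical names, and the "at least 8 clocks while CS is low" threshold and the idle-high SCK are assumptions on my part.

```cpp
#include <vector>

// One sample of the two SPI lines as seen by the slave.
// Hypothetical model type, not a Device OS type.
struct LineSample {
    bool csLow;    // true while CS is asserted (active low)
    bool sckHigh;  // SCK level at this sample
};

enum class WindowKind { None, Glitch, Transaction };

// Classify the first CS-low window in a trace.
// Assumption: a real transaction clocks at least one full byte
// (8 falling SCK edges) while CS is low; anything shorter is a glitch.
WindowKind classifyFirstWindow(const std::vector<LineSample>& trace) {
    const int kEdgesPerByte = 8;
    bool inWindow = false;
    bool prevSckHigh = false;
    int fallingEdges = 0;

    for (const LineSample& s : trace) {
        if (s.csLow) {
            inWindow = true;
            // Count a falling edge only if SCK was high on the previous sample.
            if (prevSckHigh && !s.sckHigh) {
                ++fallingEdges;
            }
        } else if (inWindow) {
            break;  // CS released: the first window is over
        }
        prevSckHigh = s.sckHigh;
    }

    if (!inWindow) {
        return WindowKind::None;
    }
    return fallingEdges >= kEdgesPerByte ? WindowKind::Transaction
                                         : WindowKind::Glitch;
}

// Build a trace: idle, CS low for `clocks` full SCK cycles, CS released.
// SCK idling high is an assumption of this model.
std::vector<LineSample> makeTrace(int clocks) {
    std::vector<LineSample> t;
    t.push_back({false, true});      // idle: CS high, SCK high
    for (int i = 0; i < clocks; ++i) {
        t.push_back({true, false});  // SCK falls while CS is low
        t.push_back({true, true});   // SCK returns high
    }
    t.push_back({false, true});      // CS released
    return t;
}
```

In this model, `makeTrace(1)` stands in for the boot-time strobe (one falling SCK edge inside a short CS window) and `makeTrace(8)` for a one-byte transaction from my application.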
The issue I've found today is that after some time in the frozen state (it doesn't seem to be deterministic or repeatable, but roughly 5-15 seconds), the Boron resets with a white flash on the status LED. No red flashes at all, nothing printed on the USB serial, just a straight reset. I don't even get a reset reason publish when it reconnects to the cloud, just the spark/status offline and spark/status online publishes.
Previously the Pi would be up and running, and would presumably make an SPI hit fast enough that this Boron reset would never occur, but it's happening more and more often on our systems, possibly correlated with slightly slower Pi boots as I bring more features online.
This one's got me pulling my hair out at the moment, as the hardware is all out in the field. It's been running fine for the last couple of years, with only the occasional reset, but now the systems are dropping data 3-4 times a week on this issue.
Any thoughts on:
- What the cause may be on the Particle side?
- What I can do to remediate it remotely?
Details:
- Boron 404X running Device OS 5.3.0
- Boron WDT is enabled, on a 2-minute timeout, and getting poked regularly.
- Power rails are all 100% stable (Boron's backed by on-board LiPo, and 5V is derived from 130 Ah of LiFe cells)
P.S. I am also looking into whether I can make the Pi not generate these glitches on the SPI interface, but it looks like it would require major OS-level work to achieve anything, which I can't really do on fielded devices.