Best practices for long-term reliability?

So about a month ago I got my cool new Photon, and I already had a use for it: monitoring the water level of our A/C overflow pan, plus temperature at four different locations. We’d had some issues with clogs and overflow, so this was perfect. It’s also running in a pretty “dumb” mode: one webpage for human-readable status and a second with a JSON response. My home-controller Pi queries the Photon once per minute and handles all the logging and alerting.

Anyway, it’s located in a very hard-to-get-to place, so it needs to be reliable. From some reading at the time, I disabled the RSSI query and set it to reboot once every hour, and just by nature of how it’s set up, we’re not using any cloud functions other than the potential for remote reprogramming. That took it from staying alive for about a day to not failing at all. (It’s also running the latest firmware that was available at the time.)

So it ran happily for about one month with no issues.

Last night it stopped reporting, and the Pi whined at me after it had been offline for more than 2 hours. The router says it’s not connected to the WiFi. I let it sit until today, no change.

Power cycled the breaker for the outlet it’s plugged into, and it happily came back online.

But…what causes this, and what can prevent it? (I can’t easily get to it, so I can’t tell you what the lights were doing.)

I’m planning a more remote application, and am considering whether I need some sort of external watchdog that will cycle the power of the Photon if it doesn’t toggle a pin every X amount of time, where the code only toggles that pin if it can get a certain value from a remote server, or something like that. Just not sure if I should “need” to do that…even if it is good practice, and I’d probably do it anyway. (And the regular reboots will be an issue in this future application…I couldn’t make it more than a few days in my testing without them.)
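
To make the idea concrete, here is a rough sketch of the timing logic such an external watchdog might run. It is only an illustration: the `HeartbeatSupervisor` name, the timeout values, and the idea of running this on a separate small micro that switches the Photon’s supply are all assumptions, and it has not been run on real hardware.

```cpp
#include <cstdint>

// Hypothetical logic for an external watchdog (a separate micro, not the Photon).
// It watches a heartbeat pin and decides whether the Photon's power stays on.
class HeartbeatSupervisor {
public:
    HeartbeatSupervisor(uint32_t timeoutMs, uint32_t powerOffMs)
        : timeoutMs_(timeoutMs), powerOffMs_(powerOffMs) {}

    // Call often with the current time (ms) and the heartbeat pin level.
    // Returns true while the Photon's power should be ON.
    bool update(uint32_t nowMs, bool pinLevel) {
        if (!started_) {                       // first call: take a baseline
            started_ = true;
            lastLevel_ = pinLevel;
            lastEdgeMs_ = nowMs;
        }
        if (powerCut_) {
            // Hold power off for powerOffMs_, then restore it and give the
            // Photon a full timeout window to boot and start toggling again.
            if (nowMs - cutStartMs_ >= powerOffMs_) {
                powerCut_ = false;
                lastLevel_ = pinLevel;
                lastEdgeMs_ = nowMs;
            }
            return !powerCut_;
        }
        if (pinLevel != lastLevel_) {          // heartbeat seen
            lastLevel_ = pinLevel;
            lastEdgeMs_ = nowMs;
        } else if (nowMs - lastEdgeMs_ >= timeoutMs_) {
            powerCut_ = true;                  // no toggle for too long
            cutStartMs_ = nowMs;
            ++resetCount_;
        }
        return !powerCut_;
    }

    uint32_t resetCount() const { return resetCount_; }

private:
    uint32_t timeoutMs_;
    uint32_t powerOffMs_;
    uint32_t lastEdgeMs_ = 0;
    uint32_t cutStartMs_ = 0;
    uint32_t resetCount_ = 0;
    bool lastLevel_ = false;
    bool started_ = false;
    bool powerCut_ = false;
};
```

The unsigned subtraction keeps the comparisons correct across a millisecond-counter rollover. On the Photon side, the firmware would toggle the heartbeat pin only after a successful fetch from the remote server, so a hung WiFi stack looks the same to the watchdog as a hung CPU.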

Maybe some firmware release since then has helped; I haven’t messed with it much, since it was just doing its job.

Really just curious what kinds of things could knock a Photon hard offline or lock it up (I presume it locked up, or else the hourly soft resets should have fixed it) in a way that a hard reset fixes, yet still let it run fine for a month first?


I’ve had some issues with Cores in the field going offline and never coming back.
It’s a Core fix, but it’s a strategy I’ll continue to use with P1s/Photons: reset the WiFi module by cycling WiFi.off()/WiFi.on() if your device is disconnected from WiFi for more than X minutes.
I posted some sample code here: “[Core] Getting stuck in flashing green (disconnected) bug and fix”
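
As a rough illustration of that strategy (this is not the code from the linked post), the timing decision can be kept separate from the hardware calls. The `WifiRecovery` class and the 5-minute threshold are made up for this sketch, and it is untested on a real device.

```cpp
#include <cstdint>

// Hypothetical helper: decides when the WiFi module should be power-cycled.
// It only does the bookkeeping; the caller performs the actual off/on.
class WifiRecovery {
public:
    explicit WifiRecovery(uint32_t maxOfflineMs) : maxOfflineMs_(maxOfflineMs) {}

    // Call regularly with the current time (ms) and the connection state.
    // Returns true when the caller should cycle the WiFi module now.
    bool shouldCycle(uint32_t nowMs, bool connected) {
        if (connected) {                 // healthy: forget any outage
            offline_ = false;
            return false;
        }
        if (!offline_) {                 // outage just started: start timing
            offline_ = true;
            offlineSinceMs_ = nowMs;
            return false;
        }
        if (nowMs - offlineSinceMs_ >= maxOfflineMs_) {
            offlineSinceMs_ = nowMs;     // restart the window before retrying
            ++cycles_;
            return true;
        }
        return false;
    }

    uint32_t cycles() const { return cycles_; }

private:
    uint32_t maxOfflineMs_;
    uint32_t offlineSinceMs_ = 0;
    uint32_t cycles_ = 0;
    bool offline_ = false;
};

// Assumed wiring inside a Photon loop(), shown as a comment only:
//   static WifiRecovery recovery(300000u);            // 5 minutes
//   if (recovery.shouldCycle(millis(), WiFi.ready())) {
//       WiFi.off();
//       delay(1000);
//       WiFi.on();
//       WiFi.connect();
//   }
```

Restarting the window after each cycle means the module gets power-cycled again every X minutes for as long as the outage lasts, rather than only once.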

It’s hard to predict what bugs exist or will come up in the future, so it’s good to check return values for critical functions (like connectivity) and reset your device if you get too many errors.
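
One simple way to do that is to count consecutive failures and only reset once a limit is hit, so a single glitch does not reboot the device. This is a sketch; `FailureCounter` and the limit of 3 are invented for the example.

```cpp
#include <cstdint>

// Hypothetical helper: tracks consecutive failures of a critical call.
class FailureCounter {
public:
    explicit FailureCounter(uint8_t limit) : limit_(limit) {}

    // Record the outcome of one attempt.
    // Returns true once `limit` failures have happened in a row.
    bool record(bool ok) {
        if (ok) {                        // any success clears the streak
            failures_ = 0;
            return false;
        }
        if (failures_ < limit_) {
            ++failures_;                 // saturate at the limit, no overflow
        }
        return failures_ >= limit_;
    }

private:
    uint8_t limit_;
    uint8_t failures_ = 0;
};

// Assumed use in firmware, shown as a comment only (sendReading() is a
// placeholder for whatever critical call returns success or failure):
//   static FailureCounter sendErrors(3);
//   if (sendErrors.record(sendReading())) {
//       System.reset();
//   }
```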
