Long Term Photon Connection Stability

So far the root causes of stability issues have been corner-case bugs in the networking stack supplied by the Wi-Fi chip manufacturer. Those are hard to reproduce and track down, but it’s something I’ll keep on doing. Stability is an area that we are always improving.

In terms of architecture, the default system mode is meant to make writing user programs simple. It interleaves system/networking code with user code, so your code can be blocked during reconnection. One alternative is using SYSTEM_THREAD(ENABLED) to have the user code run on its own thread. A third alternative is creating your own user threads and managing concurrency yourself.

Does this address your concerns?

hello @mhazley,

were you able to do your tests? How did the results turn out?

thanks a lot
rifo

So far so good, but I haven’t run it in anger yet.

This week I’m going to run it with a scripted access point to go up and down every few mins… So we’ll see!

great! looking forward to hearing positive results :smile:

thanks a lot one more time
rifo

I’ve bought two photons.

The first I put on a relay shield and made a basic remote-controlled light switch. Zero problems - it has been running for weeks without issue, only resetting when I have a temporary power outage. Absolutely love this board!

The second… worked pretty well for a while, but now I cannot get it to stay connected for longer than 5 minutes. In fact, it’s almost exactly 300 seconds every time I reset it. I’ve put it physically next to the other one (which is surrounded by a plastic enclosure and a nest of 14GA wire) and I have the same issue, so it can’t be the strength of my wifi signal. My router is a Netgear Nighthawk about 10 feet away, should be more than enough signal to go around.

The firmware for this second photon is just sensor readings - I have it on a custom PCB which has pinouts for OneWire, I2C, SPI and analog and all loop() does is cycle through the sensors. I even added a call to Particle.process() to each sensor’s read function even though none of them should be blocking for long. I have an LCD connected to the board via the TX serial pin and it continues to display updated readings after the RGB starts blinking cyan, so the firmware is running even though the WiFi doesn’t work. Nothing special happens at the 5 minute mark, or any other predefined time. According to the IDE I am using the latest firmware (0.5.1).

I have tried @rygh’s suggestion of resetting the WiFi when Particle.connected() returns false at the beginning of loop(), and that doesn’t seem to solve the issue at all. I have similarly used System.reset() and that does work, in the sense that the device will be connected most of the time, but I lose calibration data stored in local memory. Obviously I could reload those values from the cloud but that seems like an overly cumbersome solution - especially considering I know a Photon can run for weeks without needing a reset.

Is it possible this is just defective hardware? For the first Photon, the setup process via the Android app worked the first time without issues - for the second it took a couple of resets and false starts to get it to connect. Not sure if that’s my phone or not but it definitely made me feel like the second board had some boggarts in it.

@bmboucher, you may want to consider isolating the “odd” photon on a breadboard by itself, installing Tinker and letting it run? If this is stable then we can start looking at your code and hardware. :grinning:

@peekay123 Thank you for the quick response! I did as you suggested and put it on a breadboard, powered only by the micro USB and with nothing connected to any pin. Tinker ran fine for 10+ minutes, then I reflashed my firmware and it died at 5 minutes per usual. So the issue is not the board thankfully (don’t want to spend that $$$ again) but I don’t see how my firmware could be the problem.

I have uploaded my .ino to https://github.com/bmboucher/PhotonSensor, would appreciate it if someone can point out what I’m doing wrong.

Have you tried commenting out each of the 5 functions you call in loop() to try to isolate your issue?

@bmboucher, first, I think you should start a new topic for your specific issue. Next, you may want to consider using SYSTEM_THREAD(ENABLED); to separate the system and user threads. This allows WiFi to stay connected without Particle.process() in your code, though it is still needed if you have Particle.function() defined.

You should consider adding Serial.print() debugging output so you can isolate where the issue might come from in your code. Give these ideas a shot and let me know what you find. :wink:

OK, I’m an idiot :slight_smile: Thanks for all your help.

Following @BulldogLowell’s suggestion, I traced the issue eventually to my use of malloc without an accompanying free. I presume that causes a memory leak that grows on each loop and takes about 5 minutes to fill up and cause issues - at some point that must leave the system firmware unable to grab memory it needs, hence the WiFi disconnect.

What I don’t understand is why that doesn’t (a) cause a hard fault the next time malloc is called or (b) cause the RGB to start blinking red indicating that the heap is full… but it was definitely the problem. After adding free I’ve now had it running stably for 10+ minutes. Sorry to clog the thread with an unrelated issue.

I'd say it's because you followed good programming practice and checked the pointer returned by malloc() against null. When malloc() can't find a big enough chunk to allocate, it doesn't crash but courteously returns a null pointer :wink:

Was there any resolution to @ehart01’s code sample? I am seeing the blinking cyan issue with spotty wifi, and I’m wondering if there’s a known resolution to it at this point. The thread doesn’t seem to indicate whether there ever was one.

As far as my experience with long term connectivity issues:
I gave up on the first project I posted far above. Went back to Arduino.
I tried Photon again for a different project about 3 months ago.
(Since I had it leftover from previous project, and was really hoping…)
It does seem to have improved. Perhaps firmware. But it still occasionally has issues.
Basically : Network traffic + spotty internet / wifi still leads to hangs.
With a lot of traffic, it will hang every couple of days.
I checked for memory leaks and a few other things. Nope.
Nothing really makes it long-term reliable for me other than shutting off wifi.
Sorry.

I was never able to get it stable with software either. I had to work around it with hardware - I built a little board that reboots the photon if it hasn’t connected to wifi in a few minutes. The photon toggles a GPIO line whenever it sends something successfully over wifi, and if the board doesn’t see this toggle for a few minutes it resets the photon.
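
A rough sketch of the timing logic such a watchdog board could run, in plain C++. This is not the actual board's code: the class name, the timeout value, and the idea of polling the heartbeat pin with a millisecond timestamp are all assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical model of an external watchdog that expects the monitored
// device to toggle a heartbeat line after each successful send.
class HeartbeatWatchdog {
public:
    explicit HeartbeatWatchdog(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Call periodically with the current time and the heartbeat pin level.
    // Returns true when the reset line of the monitored device should be pulsed.
    bool update(uint32_t nowMs, bool pinLevel) {
        if (!started_) {
            // First sample: just record the starting state.
            started_ = true;
            lastLevel_ = pinLevel;
            lastToggleMs_ = nowMs;
            return false;
        }
        if (pinLevel != lastLevel_) {
            // Heartbeat seen: restart the timeout window.
            lastLevel_ = pinLevel;
            lastToggleMs_ = nowMs;
            return false;
        }
        // Unsigned subtraction stays correct across timer wraparound.
        if (nowMs - lastToggleMs_ >= timeoutMs_) {
            lastToggleMs_ = nowMs;  // give the device a full window after reset
            return true;
        }
        return false;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastToggleMs_ = 0;
    bool lastLevel_ = false;
    bool started_ = false;
};
```

Watching for a level change rather than a fixed high or low level means a device that hangs with the pin stuck in either state still gets reset.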

I have 4 Photons and 1 of the original Spark Cores sensing data and reporting it around my house. The Core (Garage) is fairly unreliable, and I have a watchdog reboot it whenever it goes off the rails. But the 4 photons never get hosed. The 3 that are plugged into the wall only reset when there is a power outage. The outside one (Weather) is running solar/battery and it has been running continuously for 65 days now.

That said, I have seen hangs during my development that I would have liked to blame on the hardware or firmware. However, after digging into these situations in detail, I usually found the mistake in my code.

My Wifi environment is your basic home router. I do everything through this router, like watch Netflix etc., so there is lots of traffic contention. I’ve also had internet connectivity issues with Verizon, power outages, and updates of the router firmware all occur, and the connected Photons just keep on chugging along.

Granted my sample size of 4 is pretty small. And my WiFi sample size of 1 is even smaller. However, my conclusion is that the Photon platform is very stable.

Any other success stories? Running for months without failure? Considering using a Photon at work

I’ve run a Photon for temp and humidity logging for like 6 months without needing to touch it.

It would connect to my phone’s wifi when I was home, lose it while I was gone daily, and then automatically reconnect when my phone’s wifi hotspot was back in range, for like 6 months straight.

It’s rock solid now. I remember when that was only a dream when the first Spark Core was released after the Kickstarter campaign. Now Particle is exactly where they wanted to be and so much more with even more to come.

Hi all,

I recently (26 August 2017) had a photon forget its wifi credentials. My router credentials were not changed. This photon had been up and connected to the same router for over 18 months.

I re-loaded the WiFi credentials and the photon came back on-line. The other two photons at the same location remained on-line.

Has anyone else had to redo a Photon’s WiFi setup, without an AP (router) change, after it had been attached to the Particle Cloud for 1 year+?

I am very interested in finding the root cause.

ChipMonk

The Photon platform is rock solid. I first thought it was flaky, but it turned out to be a bug in a library. I have had 8 in operation for over 1 year without a single hiccup!

Which mode are you using? Manual or semi-automatic? Are you saving data when wifi is not available?