Long Term Photon Connection Stability


#61

Does the failure seem to scale with network congestion?

When you say “under load”, do you mean the Wi-Fi is congested, or that the internet connection is bogged down?

If we can zero in on the exact scenario, we can create a test case and get the failure in captivity.


#62

Sorry for the late response.

I don’t think the network is failing in any way, though internet connection latency surely increases around noon when personnel are working in the office (using the same Wi-Fi network), and it is usually just before or after noon that the Photon is most likely to go into a repeated restart cycle due to an error in the connect call.

Below is a graph of sensor data collected by a Photon; it sends about 500 bytes of data, containing count values, twice a minute. The larger downward spikes (data loss) are caused by the connection faults (a separate device logs the Photon’s serial output, which tells us where the faults originate). The system firmware was relatively recent.

I’ll update the system firmware to the latest dev branch state and collect another 48 hours of data to get a clearer graph. I’ll also try to get some kind of network load graph out of the router, so that cross-referencing it with the data might reveal some regularity in the faults.


#63

Just wanted to say thank you to @AndyW and everyone else who worked on the fix for this. I can at least say that I haven’t had to reset either my P1 module or my Photon, both of which have been running 24/7, since the release of 0.4.4. I’ve had multiple drops of connection, mainly due to my internet connection, but the system always comes back online within a few seconds of the internet coming back on. So cheers! :slight_smile:


#64

I’ve had the same problem with my Photons regarding long term stability and thanks to the remarkable work of the individuals on this post there seems to be a solution in a new release (0.4.4). To be honest I am not sure what this is, but I assume it’s a new firmware release that needs to be flashed to the Photon. Could anyone tell me how I find it and flash it?

Many thanks.


#65

The Web IDE and Particle Dev default to 0.4.4 (or rather, the latest version) if you do not specify one, so when you flash new user firmware, the system firmware will be updated to 0.4.4 as well :wink:

If you want to do it via USB, the CLI has the particle update command, which you run with the device in DFU mode :smiley:


#66

Great. Thanks, Kenneth.


#67

(New user here, so hopefully I am posting this to a pertinent thread).

The short story is that my Photon continues to lose cloud connection randomly (though infrequently … perhaps every day or two) even after upgrading to 0.4.4 (about a week back now). Due to the very intermittent nature I have not been able to correlate it to particular events. As I believe reported by others, the observation is rapidly flashing cyan, with the connection quickly established by a reset or power cycle. (I will add however that qualitatively 0.4.4 does seem to lose connection less frequently than I experienced with 0.4.3).

I’d appreciate feedback as to anything I might be overlooking, also if it’s reasonable to expect automatic re-connection without manual intervention or other workaround to effect the needed reset. Thanks for any advice.


#68

I am seeing similar connectivity issues, and my log is a continuous stream of the device going online and then offline. I have my Photon set up as a garage door controller with an external antenna. It will be fine for hours at a time but then randomly go out. One thing I know for sure is that when my microwave is turned on, it knocks out the Photon 100% of the time. The RSSI is around -57dB, so I don’t believe it’s low signal strength. I did read that there were some old bugs with the WiFi.RSSI() command causing lockups, but they appear to be fixed.


#69

Thanks for providing your observations @PM.

My issue isn’t so much an occasional loss of connectivity (I believe this is inevitable) as the seeming inability to automatically re-establish a connection in some instances. I’ve run a few basic loss-of-signal tests (router outage, low RSSI), and the device will reconnect fine, as desired. However, as mentioned, I also see the occasional random, persistent fast-flashing cyan pattern, indicating the device has not reconnected. (BTW, I haven’t been able to knock it out by setting it next to my microwave oven so far.)

It’s not clear from your feedback if you’re taking any action in your application to re-establish a lost connection … (your log time stamps would suggest an automatic recovery). In my case I will have to tap the reset button or cycle power.

Again, right now I’m just trying to understand what others are experiencing with the Photon (with rev 0.4.4) re: its ability to maintain a persistent (or automatically re-establish a) cloud connection, and whether other application measures might be required to achieve this.

Thanks again.


#70

I believe I am seeing the same issue, but wasn’t clear in my last post. The continuous connect and disconnect happens, but eventually it disconnects and will not reconnect until I manually press the reset button on the Photon. I get the same fast flashing cyan of death (FFCOD) you do. I am using 0.4.4 and currently have no code to try to force a reconnect.
My router is simultaneous dual band 2.4/5GHz and I have not tried to change any of the settings, but all other WiFi devices in my house are rock solid. I read some older posts and disabling 5GHz may help reduce interference.


#71

Gotcha, check @PM … we are indeed on the same page. (My routers all happen to be 2.4GHz). Again, even without persistently solid WiFi (an unachievable scenario) it would certainly be desirable for the device to auto-reconnect when the network is available. (Not really even being sure this is the cause for the disconnects we are experiencing).

This would appear to be similar to the early CFoD I have read much about for the Core (though perhaps due to a different root issue). Hopefully there are add’l upcoming firmware revs that improve upon the device ability to reinitiate connection when necessary.


#72

Quick update. Taking “WiFi.RSSI();” out of my main loop and putting it on a 1 minute read cycle appears to have really helped the uptime. The log now shows only 1 offline event, with a successful recovery, in the past hour. We’ll see how long it lasts.
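For anyone wanting to try the same thing, here is a minimal sketch of the "read once a minute" idea. IntervalGate is a hypothetical helper name (not part of the Particle API), and the device-side usage in the trailing comment is an untested assumption about how it would be wired into loop().

```cpp
#include <cstdint>

// Hypothetical helper (not part of the Particle API): reports when a fixed
// interval has elapsed, so an expensive call runs once per interval instead
// of on every pass through loop().
class IntervalGate {
public:
    explicit IntervalGate(uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Returns true on the first call and then at most once per interval.
    // Unsigned subtraction keeps the comparison correct when a
    // millis()-style counter wraps around.
    bool due(uint32_t nowMs) {
        if (!primed_ || (nowMs - lastMs_) >= intervalMs_) {
            primed_ = true;
            lastMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
    bool primed_ = false;
};

// Assumed device-side usage (untested on hardware):
//   static IntervalGate rssiGate(60000);      // 1 minute
//   void loop() {
//       if (rssiGate.due(millis())) {
//           int rssi = WiFi.RSSI();           // read signal strength here
//       }
//   }
```

This keeps loop() non-blocking, unlike a delay(60000), so the rest of the application keeps running between reads.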


#73

I found that if I completely removed WiFi.RSSI(), I only drop the connection about once per day or once every two days, judging by monitoring the API that publishes an event on change.


#74

@mdma, I had similar issues with WiFi.RSSI() appearing to cause loss of cloud connectivity and flashing cyan. It would work a few times and then fail with flashing cyan. Removing WiFi.RSSI() from the loop cured the flashing cyan.

I also have another situation that causes flashing cyan. I am using the SparkFun Photon Weather Shield to collect weather, soil temperature, and soil moisture data. I have three functions that execute in the loop: print data out to the serial line, upload data to Weather Underground, and upload data to www.dweet.io (so I can make custom plots on freeboard.io). All three functions work and upload data the first time through the loop (I see new data on both servers), but very soon after the first pass the Photon starts flashing cyan. If I comment out one of the uploads, stability returns for an extended period. This flashing cyan is a real pain, as it is not self-correcting and requires a hard reset to get back online. It sure would be nice if this was fixed or at least self-corrected. I should not need to program in a check for loss of cloud connection followed by a System.reset(). Unless flashing cyan is crushed, I can’t use the Photon in a finished product.


#75

Just an fyi @wmjenk … I’ve also encountered the ‘fast flashing cyan’ issue occasionally while just running the basic Tinker app long term … (Photon firmware rev 0.4.4).

That said, I’ve decided to give it a go once again, this time using the Photon Weather Shield, after upgrading to 0.4.5. I’m just using the SF example with minimal modifications to post to Phant (once/hour). So far things appear stable, but it’s only been running 48 hours. Will post results again (positive or negative).

+1 on your final comments … the Photon would also not be usable for me if I continued to observe the fast flashing cyan. Hard reset is not an attractive option.


#76

I made a test app that was

void loop()
{
   WiFi.RSSI();
}

It ran for hours without issue. Please double-check on 0.4.5 since connection stability was specifically addressed in that release.

I totally agree - hard resets should not be necessary.


#77

The only thing I might add to the discussion (and to my earlier comments) @mdma is that the issue for me wasn’t a basic app (Tinker in my case) running for hours without issue. This was not a problem. The issue was running for days without encountering the fast flashing cyan, thereby requiring a hard reset. (I believe others have posted similar observations on other threads).

As mentioned, that was all with 0.4.4 … I’ve recently switched to 0.4.5 and am trying it again (with the Weather Shield and Phant posting app as previously mentioned). So far so good for about 36 hours.

To your last comment … is it a reasonable expectation (or the goal) that the device should be able to maintain a persistent cloud connection indefinitely?


#78

I am using 0.4.5. OK, so I don’t think that WiFi.RSSI() is causing the problem. I am using a Photon in a SparkFun battery shield along with an Adafruit SI1145 UV breakout board (I2C), a SparkFun soil moisture probe (analog read), and a SparkFun soil temperature probe (Dallas OneWire, no parasitic power). The program takes sensor readings, prints the values to the serial line (which also reports upload communication data and success), then uploads the data to Phant (SparkFun), then uploads the same data to Dweet.io. After about 6-8 times through the loop it goes into flashing cyan and requires a hard reset. Including WiFi.RSSI() readings in the data stream does not seem to affect the number of times through the loop before it fails with flashing cyan. However, one interesting data point is that the WiFi.RSSI() value is around -59 until the last successful upload set, where the WiFi.RSSI() reading is 2. At this point it flashes cyan. I believe the problem is related to uploading to two separate servers.
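On that reading of 2: as far as I recall from the firmware reference, WiFi.RSSI() returns a negative value for a real signal strength, while positive values are error codes (1 for a Wi-Fi chip error, 2 for a time-out). If that is right, the 2 is not a signal level but a sign that the Wi-Fi query itself timed out just before the failure. A minimal sketch of separating the two cases before logging, where the enum and function names are hypothetical and the code meanings are my reading of the docs, so please verify them for your firmware version:

```cpp
// Hypothetical classification of a WiFi.RSSI() return value.
// Assumption (from the firmware docs as I recall them): negative values are
// signal strength, 1 is a Wi-Fi chip error, 2 is a time-out error.
enum class RssiStatus { Ok, ChipError, Timeout, Unknown };

RssiStatus classifyRssi(int rssi) {
    if (rssi < 0) return RssiStatus::Ok;          // e.g. -59: a real reading
    if (rssi == 1) return RssiStatus::ChipError;  // assumed meaning of 1
    if (rssi == 2) return RssiStatus::Timeout;    // assumed meaning of 2
    return RssiStatus::Unknown;                   // 0 or any other value
}
```

Logging the status separately from the value would make it easier to see whether a time-out consistently precedes the flashing cyan.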

For this program and hardware I can get around the problem by putting it into deep sleep after a data upload. It wakes up and then uploads correctly as I never get the flashing cyan after one data upload to both servers.

Unfortunately, I see the same problem when trying to upload data from the SparkFun Photon Weather Shield to two separate servers. I can’t use deep sleep with the weather shield, as I am continuously collecting rain, wind speed, and wind direction data. The flashing cyan is a show stopper for the weather shield; for the time being I am limited to uploading to a single server (Wunderground), and that does seem more stable.


#79

@mdma, so my SparkFun Photon Weather Shield has been running a little less than 24 hours, uploading ONLY to Weather Underground every 5 minutes, NOT uploading to www.dweet.io, and NOT using WiFi.RSSI(). About two hours ago it stopped uploading to Weather Underground. It also does not show as online via Build or Dev, yet the Photon is STILL breathing cyan. The only way to get it showing in Build or Dev again and uploading to Weather Underground is a hard reset. Gosh, this is frustrating. Any advice on how to keep these Photons online continuously? BTW, I have seen this state (breathing cyan, but not available via Build or Dev and not uploading data) several times in the past week. Do I need to program in a hard reset every 12-24 hours? Given that it is a weather station and will be deployed outside, not in a convenient location for manual resets, there really needs to be a solution here: better stability, or some remediation programming on my part. Suggestions welcome.
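As a stopgap while the firmware improves, the "remediation programming" could be a small software watchdog rather than a blind reset every 12-24 hours. The sketch below is only that, a sketch: LinkWatchdog is a hypothetical name, the 5 minute limit is arbitrary, and it assumes loop() keeps running while the device is in the bad state, which may depend on system mode and firmware version. Note also that in the breathing-cyan-but-offline case described here, Particle.connected() might still report true, so "the last upload succeeded" may be a better health signal than the cloud flag.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the Particle API): tracks how long a
// health signal has been false and reports when a limit is exceeded.
class LinkWatchdog {
public:
    explicit LinkWatchdog(uint32_t limitMs) : limitMs_(limitMs) {}

    // Call regularly with the current health signal and a millis()-style
    // timestamp. Returns true once the signal has been unhealthy for
    // limitMs or longer. The first call only starts the clock.
    bool expired(bool healthy, uint32_t nowMs) {
        if (healthy || !started_) {
            started_ = true;
            lastHealthyMs_ = nowMs;
            return false;
        }
        // Unsigned subtraction stays correct across counter wrap-around.
        return (nowMs - lastHealthyMs_) >= limitMs_;
    }

private:
    uint32_t limitMs_;
    uint32_t lastHealthyMs_ = 0;
    bool started_ = false;
};

// Assumed device-side usage (untested on hardware):
//   static LinkWatchdog watchdog(300000);     // 5 minutes, arbitrary
//   void loop() {
//       bool healthy = Particle.connected();  // or: last upload succeeded
//       if (watchdog.expired(healthy, millis())) {
//           System.reset();
//       }
//   }
```

This does not fix the underlying fault; it only replaces the manual reset for a device that is deployed somewhere inconvenient.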


#80

I have a Photon sending data to an emoncms server every 10 seconds and it currently has 11 days uptime since the 0.4.5 release.

Is it possible a memory leak or something similar in your code is causing the failure?
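One way to check would be to log free memory periodically and see whether it keeps falling. A minimal sketch follows, assuming the firmware in use exposes System.freeMemory() (worth confirming for your version); FreeMemoryTrend is a hypothetical helper name, not part of the Particle API.

```cpp
#include <cstdint>

// Hypothetical helper: remembers the first free-memory sample and the
// lowest one seen since, so a steady decline becomes visible in the logs.
class FreeMemoryTrend {
public:
    void sample(uint32_t freeBytes) {
        if (!primed_) {
            baseline_ = freeBytes;
            lowest_ = freeBytes;
            primed_ = true;
            return;
        }
        if (freeBytes < lowest_) {
            lowest_ = freeBytes;
        }
    }

    // Bytes lost between the first sample and the lowest sample so far.
    uint32_t dropFromBaseline() const {
        return primed_ ? (baseline_ - lowest_) : 0;
    }

private:
    uint32_t baseline_ = 0;
    uint32_t lowest_ = 0;
    bool primed_ = false;
};

// Assumed device-side usage (untested on hardware):
//   static FreeMemoryTrend trend;
//   trend.sample(System.freeMemory());        // e.g. once per upload cycle
//   Serial.println(trend.dropFromBaseline());
```

A drop that levels off after start-up is probably normal; one that grows with every upload cycle would point towards a leak in the application or a library.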