Photons failing to publish data beyond 30-60 min / breathing green(?)

Hello-

I am using 30 Photons to publish wind speed and direction data every 2 seconds on a single network. Everything works great at power-on, but devices begin to stop publishing and drop off the wifi completely within 15 minutes, and all of them cease transmitting within a few hours. I have increased / decreased the data packet size, tried publishing every 6 seconds instead of every 2, switched to static IP addresses, and added two more routers to the network. None of these actions have made a noticeable difference. All devices went through the latest firmware update yesterday, and the problem seems to have gotten much worse. Before the recent firmware update, the devices would breathe green when they dropped off.

I am publishing at half the stated 255-byte publish limit, and at half the stated maximum rate of one publish event per second. Has anyone successfully used Photons for networked data collection? Is there an issue with the Particle cloud / servers? Do I need an enterprise account to operate 30 Photons in publish mode? Do the Photons somehow max out their onboard memory over time through data collection? My current ‘solution’ is to power cycle every 30 minutes. Any and all help would be greatly appreciated.

Thanks!

Ping @Dave.

One thing that it might be is a memory leak. Could you add

Serial.println(System.freeMemory());

and monitor that value over time. For testing, you might want to speed up the rate of publishing to try to get the problem to occur more quickly.
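
To make a slow leak easier to spot than by eyeballing raw numbers, the readings could be fed into a small tracker like the one below. This is only a sketch: `FreeMemoryTracker` is a hypothetical helper, not part of the Particle API, and it assumes the readings come from `System.freeMemory()` as suggested above.

```cpp
#include <cstdint>

// Hypothetical helper (not a Particle API): remembers the first, latest and
// lowest free-memory readings so a leak shows up as a growing "lost" value.
class FreeMemoryTracker {
public:
    // Feed one reading, e.g. the value returned by System.freeMemory().
    void record(uint32_t freeBytes) {
        if (!hasBaseline_) {
            baseline_ = freeBytes;
            hasBaseline_ = true;
        }
        latest_ = freeBytes;
        if (freeBytes < lowest_) {
            lowest_ = freeBytes;
        }
    }

    // Bytes lost since the first reading; 0 if nothing was lost or recorded.
    uint32_t lostSinceStart() const {
        return (hasBaseline_ && latest_ < baseline_) ? baseline_ - latest_ : 0;
    }

    // Smallest reading seen so far (UINT32_MAX until the first reading).
    uint32_t lowest() const { return lowest_; }

private:
    bool hasBaseline_ = false;
    uint32_t baseline_ = 0;
    uint32_t latest_ = 0;
    uint32_t lowest_ = UINT32_MAX;
};
```

On the device, calling `record(System.freeMemory())` every few seconds from `loop()` and printing `lostSinceStart()` over Serial should show whether the number keeps climbing before a device drops off.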

Hi @cek,

If you’re pulling wind speed and direction, I’m guessing you’re using interrupts? My first guess would be you’re running into some kind of thread / variable contention, is everything declared volatile, etc, etc? Any chance you could share your code, or send me the device id of one of the devices that’s not doing what you expect?
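
To illustrate the kind of contention I mean, here is a minimal sketch of a counter shared between an interrupt handler and the main loop. It assumes the wind sensor is read by counting pulses in an interrupt, which may not match your hardware. `PulseCounter` is a hypothetical name, and using `std::atomic` is my assumption; a `volatile` variable read with interrupts briefly disabled is the more common pattern in Wiring-style code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical pulse counter shared between an interrupt handler and loop().
class PulseCounter {
public:
    // Call from the interrupt handler: one increment per sensor pulse.
    void onPulse() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Call from loop(): returns the pulses seen since the last call and
    // resets the count in one indivisible step, so a pulse arriving in
    // between is neither lost nor counted twice.
    uint32_t take() { return count_.exchange(0, std::memory_order_relaxed); }

private:
    std::atomic<uint32_t> count_{0};
};
```

The point of `take()` is that reading and clearing happen together; a separate read followed by a separate reset is where pulses tend to go missing.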

Thanks!
David

Hi David-

Thanks so much for the response. The code is posted here:

The sensor side is wsd_sensor.ino, in the wsd_sensor folder. If you have time to take a quick look, that would be great. I can also post the device IDs if that’s helpful.

Cheers!

Hi @cek,

Hmm, I’m seeing that the wsd_sensor app does some work to manage the IP of the device, and the connection, and potentially blocks forever if the sensor doesn’t initialize. I wonder if that’s what is happening? Maybe getting a conflict on the static IP of the device, or blocking forever trying to reconnect / initialize the sensor?

I’m not sure about the order of operations with regard to connecting to WiFi and setting the IP address: does the WiFi connection need to be established before setting an IP? Flashing green for a long time might indicate a problem connecting to that WiFi network, so falling back to the automatic behaviors would be a good way to narrow down the problem.

What happens if you run the app on AUTOMATIC mode, without any IP management, and maybe reporting / publishing an error if the sensor doesn’t initialize?
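
As a rough illustration of initializing without blocking forever, something along these lines could replace an endless wait. It is a sketch under assumptions: `initWithTimeout` and `InitResult` are hypothetical names, `tryInit` stands in for whatever call starts your sensor, and `nowMs` stands in for a millisecond clock such as `millis()`.

```cpp
#include <cstdint>
#include <functional>

enum class InitResult { Ok, TimedOut };

// Retries `tryInit` until it succeeds or `timeoutMs` has elapsed, instead of
// looping forever. `nowMs` supplies the current time in milliseconds.
InitResult initWithTimeout(const std::function<bool()>& tryInit,
                           const std::function<uint32_t()>& nowMs,
                           uint32_t timeoutMs) {
    const uint32_t start = nowMs();
    do {
        if (tryInit()) {
            return InitResult::Ok;
        }
        // Unsigned subtraction stays correct if the counter wraps around.
    } while (nowMs() - start < timeoutMs);
    return InitResult::TimedOut;
}
```

On `TimedOut` the app could publish an error event or blink the LED and carry on, rather than hang. The retry loop itself still blocks for up to `timeoutMs`, so the timeout should be kept short.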

Thanks,
David

Hi Dave,

I’m working with cek on this project. I’m Dave also but sometimes go by tgidave.

A couple things in response to your message. First, when the compass hardware does not initialize, the photon device hangs with a solid blue LED. We are not seeing that indication. I agree with you that there is a potential for problems there and am working on flashing the user LED if that occurs. I will see if I can get that going soon.

Concerning managing the IP addresses, when you say “AUTOMATIC mode” I am assuming that is dynamic ip address assignment, is that correct? We have tried both dynamic and static IP addresses and both seem to fail in a similar manner.

I will see what I can do about reporting error messages. The problem with publishing the error messages is that with 30 devices publishing every 2 seconds, error messages get lost very rapidly. I am looking at other options like setting a variable in the cloud to an error code. In this situation though, that may be a problem because there is no cloud connectivity. I am also not sure how our code can recognize when the problem occurs. I put in code at the beginning of the loop function that checks to see if the device is disconnected and, if so, tries to reconnect. That did not seem to make a difference.

I will continue debugging this problem and appreciate any and all help.

Thanks,

tgidave

Dave,
I’m working with @cek on the network side of things. As it stands now, right after the devices are power-cycled, all 30 authenticate successfully and join the wifi network with valid IP addresses. I can also see through my network monitor that they are successfully transmitting/receiving packets with zero retry failures. However, after about 10-15 minutes they start slowly dropping off one by one. Sometimes 5 or 6 seem to hang on, other times they all eventually drop off.

It really looks to me like the problem resides in the devices themselves and may not be related to wifi. Are they running out of memory? Going into power save mode? Error state? My network monitor would report both wifi authentication/network errors, and I see none. Beyond that, and TX/RX stats, I can’t see much else. Any way to get a diagnostic log file off of the devices? It might be useful to verify whether or not, at a given time (say 1:05 PM), all 30 devices uploaded something to the Cloud or otherwise communicated with Particle.

thanks

Hi David-

Thanks again for working with us on this. An update from the past few days:

The breathing green mode seems to have disappeared since the firmware update on Tuesday / Wednesday. The devices still come offline, and seemingly more quickly than before the firmware update, but the breathing green indicator does not show up anymore. My devices will now remain breathing blue, but cease transmitting data. Does this change seem congruent with changes implemented in the latest firmware update?

I have commented out the static IP code and re-flashed everything. Is AUTOMATIC mode the default? No discernible changes in behavior with the static IP request removed.

tgidave will write something to blink an LED if the sensor doesn’t initialize.

I now have a way to cycle power remotely for all sensor devices. Below are the IDs for the devices involved. Let me know if you want to set up a time for me to cycle everything, and you can take a look from your end.

Cheers!

270022001347343339383037
340037000447343233323032
3f0022000e47343432313031
1b003e000347343233323032
330039001347343432313031
2b0021000b47343138333038
1d0022001047343339383037
240035000d47343432313031
290034000d47343432313031
2c001f000d47343233323032
1b0029000d47343233323032
1f0036000747343232363230
440033000f47343432313031
1d0020000947343432313031
2a0028001347343339383037
3f003a000447343232363230
3c001e000447343233323032
260026000447343337373737
41003e000b47343337373738
2c0025000947343337373738
34003a001347343432313031
1c002f000a47343337373738
34002c000c47343233323032
43003b000a47343432313031
35003e000f47343339383037
420026000a47343432313031
310021000447343233323032
3b003e000f47343339383037
290020001347343339383037
380026000647343232363230

If the main loop blocks for more than 10 seconds, that will knock the device off the cloud. This could be the behavior you are seeing. We have taken steps in the latest release to ensure the LED shows the state of the device when it is disconnected from the cloud while the main loop is blocked.

As a first step to troubleshooting, please check your app for any places that may block for longer than 10s. Alternatively you might want to try enabling threading which will allow your application loop to block without disrupting the cloud connection. https://docs.particle.io/reference/firmware/photon/#system-thread
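
One way to hunt for blocking code is to time each pass through `loop()` and log the worst case. The sketch below is only an illustration: `LoopStallMonitor` is a hypothetical helper, not a Particle API, and it assumes you pass in a millisecond timestamp such as the value of `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: measures how long each pass through loop() takes and
// remembers the worst case, so a pass near the 10 s limit can be spotted.
class LoopStallMonitor {
public:
    explicit LoopStallMonitor(uint32_t limitMs) : limitMs_(limitMs) {}

    // Call at the top of loop() with the current time in milliseconds.
    void begin(uint32_t nowMs) { startMs_ = nowMs; }

    // Call at the bottom of loop(); returns true if this pass took longer
    // than the limit. Unsigned subtraction tolerates counter wrap-around.
    bool end(uint32_t nowMs) {
        const uint32_t elapsed = nowMs - startMs_;
        if (elapsed > worstMs_) {
            worstMs_ = elapsed;
        }
        if (elapsed > limitMs_) {
            ++stalls_;
            return true;
        }
        return false;
    }

    uint32_t worstMs() const { return worstMs_; }  // longest pass seen
    uint32_t stalls() const { return stalls_; }    // passes over the limit

private:
    uint32_t limitMs_;
    uint32_t startMs_ = 0;
    uint32_t worstMs_ = 0;
    uint32_t stalls_ = 0;
};
```

Printing `worstMs()` over Serial now and then should show whether any pass gets close to 10000 ms. A limit well below that, a few seconds for example, gives earlier warning.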

When you say the devices get knocked off the cloud, would this also cause them to disconnect from the local network? That is what we are seeing. All 30 will authenticate successfully and join the wifi network with valid IP addresses, transmitting/receiving packets. But then they start disconnecting one by one and don’t rejoin - the only way to get them to reconnect is another power cycle.

Hi guys, the same thing was happening to one of my Photons yesterday… After about 10 seconds, the Photon would breathe green, but after a power cycle, the Photon would then reconnect and breathe cyan, but would then repeat the process of breathing green.

After some troubleshooting, I discovered that I had forgotten to make one solder bridge on a board I made for a DS18B20 application. As a result, the Photon kept trying to poll the DS18B20 while it was unpowered. This effectively froze my application in the polling loop, which caused the breathing green LED on the Photon.

I mention all of this just to encourage you guys to investigate the code you wrote for your 30 Photons. I’m thankful that the Particle team incorporated the RGB LED. I would never have noticed the forgotten solder bridge otherwise.

Thanks for the recommendations everyone. We re-flashed yesterday with threading enabled. This seems to have solved our system-wide problems with the Photons stopping data transmission and pulling themselves off the wifi.

As a side note, it seems that breathing green went away with the last firmware update, but solid blue is now a thing? Does this make sense with what everyone else is seeing?

Thanks again for the input!
