Code Not Working after 15 Days in the Field

I fixed that issue with the power, and it is still connecting with the cloud, but logging no data.

@dcliff9 Yeah. I’m pretty sure it’s the Broker issue and something on the particle cloud end. It’s a weird error.

If you read through this community post, they have the same issue… (won’t publish). It sounds like a cloud issue they still haven’t fixed completely.

Cool, just got an email that the cellular is down for now.

I have seen the Electron stop sending data before, even though it looks like it is still sending. A battery pull fixed the problem.

Adding an external watchdog circuit to reset the unit every so often is not a bad idea either for these remotely located units, since a reset will usually fix a lot of problems.

I left a unit outside last year for like 3 months in the winter, and the only problem I ever saw was the data stopping, even though the Electron looked like it was working fine. That unit was sending to Ubidots directly, with no Particle Cloud involved. A battery pull always fixed it.

If you subscribed to your own publishes to make sure each one was posted, you could trigger a reset whenever you did not receive the publish response, to try to fix the problem.

Yes, whenever I disconnect all power for 5+ seconds, and then press reset and boot up the device again, it seems to start working again (but if I have 30,000 devices I can’t be doing that) and also sometimes it will only continue reporting the data for a few more days.

What I was able to figure out was that it had nothing to do with the Particle Cloud since I was not using the cloud but sending data to Ubidots via their MQTT code. So it must be a cellular or network issue.

Subscribing to your publishes if you’re using the Particle Cloud is a way of catching if the events did not get received and then you can trigger an automatic reset.
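Here is a rough sketch of the bookkeeping for that idea, kept separate from the Particle API so it can be tested off-device. The class name `PublishAckTracker`, its methods, and the timeout value are all made up for illustration; the caller supplies the millisecond tick (for example from `millis()`).

```cpp
#include <cstdint>

// Hypothetical helper (not part of Device OS): remembers when the last
// publish went out and reports when it has gone unacknowledged too long.
class PublishAckTracker {
public:
    explicit PublishAckTracker(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Call right after publishing an event.
    void onPublish(uint32_t nowMs) {
        waiting_ = true;
        sentAtMs_ = nowMs;
    }

    // Call from the subscribe handler when your own event comes back.
    void onAck() { waiting_ = false; }

    // True once a publish has been outstanding for longer than the timeout.
    bool timedOut(uint32_t nowMs) const {
        // Unsigned subtraction stays correct if the tick counter wraps.
        return waiting_ && (nowMs - sentAtMs_) > timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t sentAtMs_ = 0;
    bool waiting_ = false;
};
```

On the device you would presumably call `onPublish(millis())` after `Particle.publish()`, call `onAck()` inside the `Particle.subscribe()` handler, and run whatever reset routine you trust when `timedOut(millis())` returns true, before going back to sleep.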

Okay… That’s interesting and is starting to make some sense. But just doing a normal reset doesn’t fix the problem (I’ve tried that), and from what I understand, going to Deep Sleep resets the device upon wakeup (which hasn’t fixed the problem either). Or am I missing something?

It only happens on some devices, or after a period of time. It happens with certain devices more frequently as well.

Yes, a normal reset will not reset the modem, but there is some code you can run that resets the modem as if you had pulled the battery, and that's what you want to run.

Check out this great post for more info:

I tried using the code from that post to reset the modem.
However, I can see noticeable differences between that and a hard reset by pulling the battery (basically it didn’t work for me).

Here is the code I used (just like @rickkas7 used):

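void smartReboot() {
    // Drop the cloud session before touching the modem.
    Particle.disconnect();

    // Ask the modem to reset itself; wait up to 30 s for the response.
    Cellular.command(30000, "AT+CFUN=16\r\n");
    Cellular.off();
    delay(1000);

    // Restart the microcontroller.
    System.reset();
}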

Should work but let’s see what @rickkas7 has to say about this.

@liddlem I feel like you are always one step ahead of me.
I am starting to understand more about this problem and the one correlation I have been able to find is that the events begin showing on the stream again once the device does a new handshake. But this only seems to happen if I pull the battery for a bit and plug it back in.
As a side note, I’m also having a problem getting an OTA update (automated through a product update, not by pushing the OTA button on the web IDE). This also seems to only happen if/when the device does a handshake with the particle cloud.
So I am back to trying to figure out a way to force a handshake. Doing a Particle.disconnect() and then a Particle.connect() does not force a handshake. I was hoping that killing the modem and reconnecting would do it.
Have you found that to be true?

I’m wondering if using these 3rd party SIM cards is somehow causing an issue with handshakes to the Particle Cloud. Is it possible that the shared keys just remain the same until the device is completely powered down? And if so, is there a surefire way to re-establish (or force) a handshake?

@rickkas7 I tried that code as mentioned above, but it is not acting like removing the battery and a ‘hard reset.’

@dcliff9 I know man… It’s kinda frustrating. I see as well what you mentioned, if you force a handshake by removing the battery for a bit and plugging it back in it seems to always publish and work.

For OTA updates, you need to have it handshake, and then you have to keep the device alive long enough so that the update has enough time to get downloaded and installed into the device as the new firmware.

I’m not sure what forces a handshake. It sometimes works with a Particle.disconnect() and then a Particle.connect() on the next wakeup. It’s really confusing what is actually working, though. I was hoping that killing the modem and reconnecting would force a handshake, and the Cellular.off() commands do seem to kill the modem, but it doesn’t act the same as removing the battery.

Maybe it’s something with the backend cloud receiving the devices, or it’s a firmware issue or a 3rd party SIM issue. I don’t know what to say, but I’m submitting a support request.

I have seen the same issue in the past, needing to pull the battery to get data sent via MQTT to Ubidots, so I think we can rule out the Particle Cloud as the issue, since I was not using it and saw the same issue you're having.

I was using the Particle SIM cards.

@liddlem, have you seen my post about the skipped handshake in the other thread where you mentioned this issue?

Hey all,
I hate to open this back up, but this issue has reared its ugly head again and I’m not having much luck figuring it out. Wondering if anyone else is having this same issue again.

Rundown of issue :

  • Particle Electron running 0.6.0.
  • Hologram (3rd party) SIM.
  • Typical sleep, turn on, take sensor readings, publish, go back to sleep scenario.
  • Device stopped reporting on Apr 19th and has not reported since.
  • Currently the device turns on, flashes green, flashes blue, strobes cyan, disconnects, then goes back to sleep as intended. Serial debugging shows everything running according to plan and particle publish occurring during breathing cyan with plenty of time to publish.
  • Hologram (SIM) dashboard shows the connection with bytes out as expected.
  • Talked to @rickkas7 and he said the broker issue has been resolved so he doesn’t believe that is the issue. Haven’t been able to reach support for a while.
  • Last handshake occurred on April 15th. Even though I am breathing cyan, I can’t get the device to handshake again.
  • Particle functions and variables are currently unavailable, so even though the device is breathing cyan, I have no way of remotely interacting with it.

I assume some combination of unplugging the battery or reflashing the firmware will force the device to handshake. From this thread, I know I can add code to my firmware to force a handshake each time the device connects, but I would rather not incur the extra data usage if possible. I am also worried about shipping these devices out into the field until I have a solid fix for this, as they will be sealed.

Any help would be appreciated.

@ScruffR has some code to do just that. You could subscribe to your publish, and if you do not receive a reply indicating the published event was received, you could trigger a device + modem reset, which would fix the issue.

Updating to the latest firmware may also help.

Opening this one back up again as I have another device showing the same issue.
We are now on version 0.7.0.

Our code now watches for a successful publish by way of Particle.subscribe.
If not successful, we force a handshake with what @rickkas7 suggested (Particle.publish("spark/device/session/end", "", PRIVATE);). The device then resets and runs again to hopefully handshake.

To recap, the device is waking, taking sensor measurements, turning on the modem, connecting to the cell network (via hologram sim) successfully, sending packets (according to both serial monitor and the hologram dashboard), and then going to sleep. But the publish is not being seen on the particle cloud. In fact, even though I can see the forced handshake seemingly publish (via serial monitor), the particle cloud is never registering a handshake.

Any other ideas?

This honestly sounds like either the UDP packets are getting lost over the air and not making it to particle, or there is an issue in the Particle backend. I’m guessing something in the backend that they need to fix, but I don’t know. I just know it’s consistently the same devices dealing with this problem.

Have you tried removing the battery for at least 5 seconds and then plugging it back in? Does it then handshake? If it doesn’t, then something is really messed up. I tried doing a modem reset as suggested earlier, and even included code to force a handshake, but it only works sometimes.

Yes. An unplug of the battery will force a new handshake. In fact, a press of the reset button will even force a new handshake. Then everything works again as intended. Unfortunately, my devices are in enclosures. We do have a reset button that grounds the reset pin but this does not seem to work the same as the reset button on the board.
At any rate, after forcing a handshake it will run fine for maybe a month, coming out of sleep, checking sensors, and publishing several times a day like clockwork. Then suddenly I will find devices in this state. So far I have seen this issue on at least 4 devices.
As a failsafe, I have a function that runs the publish end session that Rick suggested, but that only works if the Particle Cloud is already recognizing the connection, so it sort of defeats the purpose. What I would love is to replace that particular function with the equivalent of what happens after a battery pull or actual reset. Although not ideal, that would be a great failsafe for this issue. But I haven’t been able to figure out what exactly is happening that is different from a soft reset or reset pin grounding.
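One way to structure a failsafe like that is an escalation ladder: try the cheap recovery first and move to harsher ones as misses pile up. The sketch below is only the decision logic, and everything in it is an assumption for illustration: the `Recovery` names, the thresholds, and the idea that the last rung hands off to an external watchdog or power switch like the one suggested earlier in the thread.

```cpp
#include <cstdint>

// Hypothetical recovery steps, ordered from cheapest to most drastic.
enum class Recovery { None, EndSession, ModemReset, PowerCycle };

// Pick a recovery step from the number of consecutive unconfirmed publishes.
// The thresholds are placeholders, not tested values.
Recovery chooseRecovery(uint8_t consecutiveMisses) {
    if (consecutiveMisses == 0) return Recovery::None;
    if (consecutiveMisses == 1) return Recovery::EndSession;  // session/end publish
    if (consecutiveMisses <= 3) return Recovery::ModemReset;  // e.g. smartReboot()
    return Recovery::PowerCycle;  // needs external hardware to cut power
}

// Update the miss counter after each wake cycle; saturates instead of wrapping.
uint8_t updateMisses(uint8_t misses, bool ackReceived) {
    if (ackReceived) return 0;
    return misses < 255 ? static_cast<uint8_t>(misses + 1) : misses;
}
```

The counter would have to survive sleep and resets (retained memory or EEPROM, for instance) for this to work, and the power-cycle rung only helps if there is hardware in the enclosure that can actually interrupt the battery.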

This sounds a bit like a well known effect when running into heap fragmentation (often caused by extensive use of String objects).


I've seen this said a lot on Particle's forums, and there was certainly some truth to it back in the AVR Arduino days, but I wonder if today it's really a red herring. The major issue with String dates back to an avr-libc bug with free(). This was resolved (see the issue on Google Code Archive) and merged into the main Arduino codebase in "Backported malloc and realloc from avr-libc 1.8.0 (without test code)" by cmaglie, Pull Request #1329, arduino/Arduino on GitHub.

Of course, Particle being an STM32 would never have pulled from avr-libc, so perhaps it has its own free() bug, or there's some other issue. However, the Arduino Forum thread "The HATRED for String objects - 'To String, or not to String'" (Programming Questions) has a very nice conversation about the true pitfalls, and there seems to be a firm consensus that String isn't inherently a problem, just that the implementation can be.

The issue is not that you are running out of free heap memory as such (as may have been the case with a buggy free() implementation), but that, due to the lack of garbage collection, the heap will fragment over time. Eventually, while there is still enough free space in total, no single block big enough for a particular allocation request can be found.
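To make that concrete, here is a toy model of a first-fit heap. It is not how the Device OS allocator works internally; `ToyHeap` and `makeFragmentedHeap` are invented for illustration, and the sizes are arbitrary.

```cpp
#include <cstddef>
#include <vector>

// Toy first-fit heap model: one flag per byte, true = in use.
class ToyHeap {
public:
    explicit ToyHeap(std::size_t size) : used_(size, false) {}

    // Returns the start offset of the new block, or -1 if no contiguous run fits.
    long alloc(std::size_t n) {
        if (n == 0) return -1;
        std::size_t run = 0;
        for (std::size_t i = 0; i < used_.size(); ++i) {
            run = used_[i] ? 0 : run + 1;
            if (run == n) {
                std::size_t start = i + 1 - n;
                for (std::size_t j = start; j <= i; ++j) used_[j] = true;
                return static_cast<long>(start);
            }
        }
        return -1;
    }

    // Marks n bytes starting at start as free again.
    void release(std::size_t start, std::size_t n) {
        for (std::size_t j = start; j < start + n; ++j) used_[j] = false;
    }

    // Total number of free bytes, contiguous or not.
    std::size_t totalFree() const {
        std::size_t count = 0;
        for (bool b : used_) if (!b) ++count;
        return count;
    }

    // Length of the longest contiguous free run.
    std::size_t largestFreeBlock() const {
        std::size_t best = 0, run = 0;
        for (bool b : used_) {
            run = b ? 0 : run + 1;
            if (run > best) best = run;
        }
        return best;
    }

private:
    std::vector<bool> used_;
};

// Fill a 64-byte heap with eight 8-byte blocks, then free every other one.
ToyHeap makeFragmentedHeap() {
    ToyHeap heap(64);
    long offsets[8];
    for (int i = 0; i < 8; ++i) offsets[i] = heap.alloc(8);
    for (int i = 0; i < 8; i += 2) heap.release(static_cast<std::size_t>(offsets[i]), 8);
    return heap;
}
```

After `makeFragmentedHeap()`, half of the heap is free but only in 8-byte holes, so a 16-byte request fails even though twice that much memory is available in total.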

But I agree, the problem is neither the implementation of String itself nor the mere use of it. Because String objects mutate, using them without care and letting them grow and relocate tends to leave a trail of smallish free blocks on the heap, which may eventually lead to issues.
