Photon locked up

Hi all,
Just wanted to report that one of my Photons, the one I use as my main testing device, just locked up. The RGB LED was solid cyan, neither breathing nor blinking. My code was not executing and no cloud functions/variables were accessible; for all intents and purposes, the Photon was completely offline to the cloud. Hitting the reset button on the Photon brought everything back, and things are now working normally.
I do use a 60 second watchdog timer that is supposed to manually reset the Photon in case of a firmware hangup, but to my chagrin, this lockup was so “severe” that the watchdog was unable to reset it.
Besides integrating an external piece of hardware that would poll the Photon for normal operation, does anyone have any suggestions for making the Photon more resilient to this kind of lockup? I don’t want my customers to have to manually reset their devices if this happens to them.
By the way, I do run in manual mode, and do not use system threading.

https://docs.particle.io/reference/firmware/#application-watchdog
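A minimal usage sketch based on that page (60 s timeout with System.reset as the callback, which I believe matches what you described):

// If checkin() isn't seen within 60 s, the callback (System.reset) fires
// from the watchdog's own thread.
ApplicationWatchdog wd(60000, System.reset);

void loop()
{
    // ... application work ...
    wd.checkin();   // reset the watchdog count on every pass through loop()
}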

Hi @kennethlimcp. I do use the watchdog timer already in my code. Thanks for the suggestion, though. Have you ever encountered a scenario similar to mine (i.e., where the Photon locks up despite using the watchdog timer)?

Hmm… I don’t have lockup issues.

What is it that you are running on the Photon?

@tommy_boy, for the application watchdog to fail, your app is crashing FreeRTOS. This means your app may be overwriting memory or causing a similar issue.
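For example, something as simple as this (purely illustrative) can silently trample adjacent RAM, including RTOS structures, long before anything visibly crashes:

char label[8];                            // purely illustrative buffer
strcpy(label, "temperature_sensor_01");   // 21 characters plus NUL written into 8 bytes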

Thanks @peekay123
One of the Particle variables I use tracks freeMemory() in the system.
The Photon has been running for 9 hours, and freeMemory() usually reads around 48k bytes, but right now it’s reading around 38k bytes. So I think I may have overlooked some poor memory management that one of my updates introduced. I’ll update the thread once I’ve confirmed whether that’s the cause.


I haven’t nailed down the cause of the lockup yet. I suspect that I may be running out of memory for the Photon to perform basic tasks like maintaining its cloud connection, servicing the watchdog, etc. So, I wrote this short test code…

// this will be the Particle variable
uint32_t freemem;

// this array of Strings is only here to consume memory space
String test[100];

void setup()
{
    // expose the free-memory reading as a Particle variable
    Particle.variable("Memory", freemem);
}

void loop()
{
    // update the Particle variable
    freemem = System.freeMemory();
}

I used the following sizes for the array of Strings, and here are the results:
100… 59308 free memory
1000… 34108 free memory
1500… 20108 free memory
2000… 6108 free memory
2100… When connecting to the cloud, the RGB LED was fast-blinking cyan with an occasional red blink.
2500… Hard fault SOS

So, it seems that when the system’s free memory drops to roughly 5000 bytes, the Photon can no longer function. This may or may not be the root cause of my issue. The Photon currently running the code that locks up has 46616 bytes available, and it has been running for 1.5 days so far.
I’ll keep updating this thread with findings.

@tommy_boy, system firmware will allocate memory dynamically on the heap, which explains the 5KB or so boundary. If you are using Arduino String objects in your code, I strongly suggest that you DO NOT. The class dynamically allocates and releases memory from the heap, causing heap fragmentation. This leaves smaller and smaller contiguous blocks for the system firmware to allocate from, eventually leading to failure. If you are not already doing so, I suggest using cstring functions and fixed char[] allocations instead. Even Arduino programmers recommend not using Arduino Strings!
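For example (names and values here are just placeholders), a fixed buffer plus snprintf() keeps the heap out of the picture entirely:

char msg[64];   // fixed buffer, reused on every pass; nothing allocated on the heap

void loop()
{
    int tempC = 23;                                // placeholder sensor value
    // instead of: String msg = "temp=" + String(tempC);
    snprintf(msg, sizeof(msg), "temp=%d", tempC);  // bounded write into the fixed buffer
    Particle.publish("reading", msg);              // fine anywhere a const char* is accepted
    delay(10000);
}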

Wow, thanks @peekay123. I used to use fixed char[] for all my strings, but started using Arduino-style Strings because they felt so much easier and faster for what I was trying to do. Thanks for the tip.
Does the heap fragmentation still occur if the String is declared globally?


Ok, I just went line by line and converted every usage of the String class to char arrays.
System.freeMemory() is now holding steady at 43984 bytes most of the time, although I see it briefly dip to 42720 bytes. I’m thinking this 1264-byte difference comes from background work the Particle firmware does during tasks such as Particle.process(). I noticed in this thread that another user was seeing the same 1264-byte difference.
I didn’t see this reported issue explicitly fixed in the firmware updates for 0.4.9 or otherwise. Is the 1264-byte allocation just a section of memory that the firmware stack uses to update things like Particle variables, Particle functions, and the cloud connection? If so, is that memory freed back to the heap/stack, or could it introduce heap fragmentation?

@tommy_boy, not sure where the 1264 bytes are going and would have to dig through your code and the system firmware to find out. :fearful:

Me neither. I’m committed to finding out, though. I’ll be sure to keep updating the thread with anything I discover.


OK, the Photon running my application is still freezing after about 1.5 days. Actually, the last reset, which occurred just an hour ago, was caused by a watchdog reset: the RGB LED was breathing green, and after the 60 second timeout the watchdog issued a System.reset().

Before I got rid of the String class usage, the watchdog would not reset the Photon after the 1.5-day hang, most likely because of heap fragmentation from all the dynamic memory allocation.

So, I started commenting out large swathes of code in my application, and I kept seeing the 1264-byte difference come and go in freeMemory() even after everything was commented out.
So, I wrote the following test code:

SYSTEM_MODE(MANUAL);
uint32_t free_memory = 0;

void setup()
{
    // bring up Wi-Fi and the cloud connection manually
    WiFi.on();
    WiFi.connect();
    Particle.connect();
    Serial.begin(9600);
}

void loop()
{
    free_memory = System.freeMemory();
    if (free_memory < 61500)    // just under the normal idle reading for this build
    {
        Serial.println(free_memory);
        Serial.println("Where is the memory going?");
        delay(5000);
    }
    else
    {
        Serial.println(free_memory);
    }
    // note: no Particle.process() here, so the cloud connection eventually drops
}

Using a console to read the serial output, after about 20 seconds you’ll see the “Where is the memory going?” string appear, and the amount of available memory will drop by 1264 bytes, or even some multiple of 1264; I’ve seen it drop by 2528 a couple of times. It looks like this coincides with the Photon losing its cloud connection, since I’m not calling Particle.process(). Even if you add Particle.process() at the end of loop(), you’ll still see the 1264 bytes disappearing and reappearing in freeMemory().
It wasn’t until my application grew to the point where freeMemory() normally reads around 43 KB that I started seeing intermittent issues with the Photon freezing. So I’m guessing most folks using Photons probably won’t hit this unless their application’s memory needs get large.

I think I probably have a memory leak somewhere in my application that breaks it after about 1.5 days. However, I’d also like to understand what’s going on in the background that causes freeMemory() to fluctuate.
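In the meantime, here is roughly the kind of minimal trend logger I have in mind for catching a slow leak over a couple of days (the event name and one-minute interval are arbitrary):

// Rough sketch: publish the free-memory reading once a minute so a slow
// downward trend shows up in the event log over a day or two.
unsigned long lastReport = 0;

void loop()
{
    if (millis() - lastReport >= 60000UL)
    {
        lastReport = millis();
        char buf[16];
        snprintf(buf, sizeof(buf), "%lu", (unsigned long)System.freeMemory());
        Particle.publish("mem/free", buf);   // "mem/free" is just an example name
    }
    // ... rest of the application ...
}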
@KyleG, could you forward this along to the appropriate Particle team member?

@tommy_boy, can you run it with SYSTEM_THREAD(ENABLED) and report back?

Yes. I had to adjust the threshold in the if statement because system threading seems to use more memory.
Here’s the code I just tested:

SYSTEM_MODE(MANUAL);
SYSTEM_THREAD(ENABLED);

uint32_t free_memory = 0;

void setup()
{
    WiFi.on();
    WiFi.connect();
    Particle.connect();
    Serial.begin(9600);
}

void loop()
{
    free_memory = System.freeMemory();
    if (free_memory < 58156)    // lower threshold to allow for the system thread's overhead
    {
        Serial.println(free_memory);
        Serial.println("Where is the memory going?");
        delay(5000);
    }
    else
    {
        Serial.println(free_memory);
    }
    Particle.process();
}

I’m still seeing the 1264 byte difference appear periodically.

Converted every usage of sprintf to snprintf, passing sizeof() of the destination char array as the size argument, just in case sprintf was accidentally writing past the end of a buffer.
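The conversion pattern looks roughly like this (function, buffer, and values are just placeholders):

void buildStatus()   // placeholder function
{
    int temperature = 0, humidity = 0;   // placeholder values
    char status[32];                     // placeholder destination buffer

    // before: sprintf(status, "temp=%d hum=%d", temperature, humidity);
    // after: the write is bounded by the buffer size, so it truncates instead of overrunning
    snprintf(status, sizeof(status), "temp=%d hum=%d", temperature, humidity);
}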
Will report back if/when my test photons reset or hang.

@tylercpeacock, can you take a look at this?

Updating the thread…
After running through different iterations of my code with certain features enabled/disabled, I’ve narrowed down the portion of the code that was causing the lockup. I recently added the PietteTech_DHT library to regularly poll a DHT22 sensor. When I polled the DHT22 every 10 seconds, my Photon would lock up after running for about 2.5 days. Without the regular polling, the Photon ran for 5.5 days without a hitch.
I noticed this at line 34 of the header file:

 // There appears to be a overrun in memory on this class.  For now please leave DHT_DEBUG_TIMING enabled
#define DHT_DEBUG_TIMING        // Enable this for edge->edge timing collection

I’ve seen other users on the forum have success with this library, so my lockup issue may be an outlier, perhaps caused by interactions with other features in my code. The lockup only occurs after about 2.5 days, so debugging is taking some time. :sweat:

Another update.
Turns out the PietteTech_DHT library was most definitely the cause of the lockup my Photon was experiencing. Without any use of the DHT objects or any polling of the sensor, the Photon ran fine for over 7 days.
So my best guess for the cause of the lockups is heap fragmentation due to the interrupt-driven method of polling the humidity sensor. Even though freeMemory() always returns the same high-water mark, the PietteTech_DHT library interacting with the rest of my code must be slowly consuming more and more usable memory. I may write my own low-level library for my DHT22s. I’ll update the thread again if that happens.
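If I do dig into it further, the first thing I’ll probably try is a rough fragmentation probe, since (as far as I understand) System.freeMemory() reports total free bytes rather than the largest contiguous block. Just a sketch using plain malloc/free:

// Estimate the largest single block the allocator can still hand out.
// Total free memory can look healthy while the heap is fragmented into
// pieces too small for the system firmware to use.
size_t largestFreeBlock()
{
    size_t size = System.freeMemory();
    while (size > 0)
    {
        void *p = malloc(size);
        if (p != NULL)
        {
            free(p);
            return size;                           // biggest allocation that succeeded
        }
        size = (size > 256) ? size - 256 : 0;      // step down and try again
    }
    return 0;
}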