Argon - locks up periodically - don't know what else to try

Hi,

We have a Particle-based product out in the field. It has a serial interface to an Atmel 328P and I2C connections to an energy-measuring chip, an RTC and a digital I/O expander.

Of our deployed devices (500+), a small number periodically “lock up” with solid cyan and need a power-on reset to recover. The lock-ups often occur after 24+ hours of operation.

Device OS: 1.4.2

Free memory is 34312 bytes.

Config
SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);
STARTUP(System.enableFeature(FEATURE_ETHERNET_DETECTION));

The WDT is enabled in code:
static ApplicationWatchdog s_wd(30000, System.reset);

We run one application thread, which services the serial interface to the Atmel 328P.

We have a unit on soak test in the office which never exhibits the lock-up, so I am wondering whether it is an environmental issue such as the network.

I am running out of ideas on what to try, as I cannot reproduce the problem.

Particle support have said that the WDT won’t fire if we are in a condition where interrupts are disabled, but none of the application code or libraries disable interrupts. They also said to limit the use of CStrings (which I have done), and I’ve moved a lot of memory allocation from stack to heap.

I’m not expecting anyone to pinpoint the problem; I would just be grateful if you could suggest a debugging route.

Many thanks
Ian

Welcome. A couple of thoughts:

Is it always the same devices that lock up?

  • if so (or the affected devices are physically close to others that have frozen), then:
    I’d check the power supply (noise, dropouts, surges).
    Electrical noise, radio signals or high-power equipment nearby (EMP-style noise).
    Mechanical issues such as vibration or sudden forces, which might point to poor connections or under-specified components.

  • if not:
    Instrument the code heavily. Use a function to publish an event each time a function is entered (or each time a serial message is received, etc.) to start to establish whether the failures occur at regular points in the code or are just random. Even a heartbeat every 5 s with the free-memory count would help - you get the idea.
    Check power consumption when everything is fully loaded and running - perhaps the power supply is unstable at a certain load level.
    Does the power connection over the serial link share a common ground?
    How long are the serial cables, and could they be close to anything that is “noisy”?
    Same for the I2C connection - is it close to the Argon? If not, is it shielded or protected?
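
To make the instrumentation idea more concrete, here is a minimal sketch in plain C++ of a breadcrumb trail plus a heartbeat string. It is only an illustration, not tested firmware: the names Crumb, BreadcrumbLog and formatHeartbeat, the breadcrumb IDs and the payload layout are all made up for this example and would need adapting to the real code.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical breadcrumb IDs: one per place in the firmware you want to trace.
enum class Crumb : uint8_t { None = 0, LoopStart, SerialRx, I2cRead, CloudPublish };

// Fixed-size ring buffer holding the most recent breadcrumbs (no heap use).
struct BreadcrumbLog {
    static constexpr uint32_t kSize = 8;
    uint8_t entries[kSize] = {};
    uint32_t count = 0;  // total number of breadcrumbs ever recorded

    // Store a breadcrumb, overwriting the oldest one once the buffer is full.
    void record(Crumb c) {
        entries[count % kSize] = static_cast<uint8_t>(c);
        ++count;
    }

    // Most recent breadcrumb, or Crumb::None if nothing has been recorded.
    Crumb last() const {
        if (count == 0) return Crumb::None;
        return static_cast<Crumb>(entries[(count - 1) % kSize]);
    }

    // Comma-separated IDs of the retained breadcrumbs, oldest first.
    std::string trail() const {
        std::string out;
        uint32_t n = count;
        if (n > kSize) n = kSize;
        for (uint32_t i = count - n; i < count; ++i) {
            if (!out.empty()) out += ',';
            out += std::to_string(entries[i % kSize]);
        }
        return out;
    }
};

// Build a compact heartbeat payload: uptime, free memory, crumb count, trail.
std::string formatHeartbeat(uint32_t uptimeSec, uint32_t freeBytes,
                            const BreadcrumbLog& log) {
    char buf[128];
    std::snprintf(buf, sizeof(buf), "up=%lu;free=%lu;n=%lu;trail=%s",
                  static_cast<unsigned long>(uptimeSec),
                  static_cast<unsigned long>(freeBytes),
                  static_cast<unsigned long>(log.count),
                  log.trail().c_str());
    return std::string(buf);
}
```

On the Argon you would call record() at the top of each function of interest and, every 5 s or so, pass the result of formatHeartbeat(millis() / 1000, System.freeMemory(), log) to Particle.publish(). Batching the breadcrumbs into one heartbeat, rather than publishing on every function entry, keeps you inside the publish rate limit (roughly one event per second on average), and the last heartbeat received before a lock-up shows where the code had got to.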

Post a picture of the setup if possible; that might help us to help you diagnose the problem.
