Photon freezing randomly during operation

We have a product in production that uses the Photon. Some of these devices freeze after a couple of hours of normal operation (in normal operation the Photons are connected to the cloud). When the Photons freeze, they become unresponsive, and the LED is stuck either solid green or off. After a power cycle, the Photons appear to work fine.

According to this, the firmware is getting stuck while connecting to the router (because it gets stuck while blinking green).

We are running system version 1.2.1 and are connecting to a WPA2-PSK network.

Has anyone experienced a similar issue, or does anyone have any insight into what is going on here?

Seeing your code would help a lot.
Any chance you’re using a lot of strings?

As there is no code to go on, some basic diagnostic steps could be:

If you flash Tinker to the module, does it freeze?

If not, then it's probably your code. In that case, remove all functions from setup() and loop() and test, then start adding code back section by section.

If you want more help on this matter, you will need to share at least those parts of your application pertaining to WiFi.connect(). If the devices are freezing and getting stuck, this suggests you are either running without SYSTEM_THREAD(ENABLED) or waiting for WiFi.ready() with no timeout?

Neither of these approaches is realistic for a product in production because loss of WiFi signal/connection is a fact of life.
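
To illustrate the "wait with a timeout" point, here is a minimal host-side C++ sketch of a bounded wait. The helper name waitWithTimeout and the poll interval are hypothetical and not part of Device OS; on a Photon you would pass something like WiFi.ready as the condition, and I believe Device OS's own waitFor() helper covers the same need, so check the reference docs before rolling your own.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Device OS API): polls `ready` until it
// returns true or until `timeout` has elapsed, whichever comes first.
// Returns true if the condition became true, false on timeout.
bool waitWithTimeout(const std::function<bool()>& ready,
                     std::chrono::milliseconds timeout,
                     std::chrono::milliseconds pollInterval =
                         std::chrono::milliseconds(10)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!ready()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // give up so the caller can keep running
        }
        std::this_thread::sleep_for(pollInterval);
    }
    return true;
}
```

The caller can then carry on with offline work (or schedule a retry) when the function returns false, instead of blocking forever on a connection that never comes up.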

@Moors7 we are only using C-style strings (char arrays).

We are using SYSTEM_THREAD(ENABLED) and SYSTEM_MODE(SEMI_AUTOMATIC). Unfortunately, I can’t share the code, as it is both proprietary and an extremely large project. What I can say with confidence is that when the device freezes, the firmware is not calling any WiFi functions; all WiFi activity is controlled by the system thread.

The difficulty is that we can’t reliably replicate the issue. It shows up seemingly at random in approximately 5% of 1000 devices, and after a reset it does not return.

We have had issues connecting to WiFi, as we have about 50 Particle devices connected to a single router. Is it possible that this could create a situation that would cause a freeze like the one we’re experiencing?

You could identify one or two of the experts here and see if they would sign an NDA. Failing that, I would look at the following, based on some recent experiences I have had:

Look for code that may access non-thread-safe resources from separate threads, e.g.

  1. Software timers doing a Serial.print() in their callback function.
  2. If threads are used for functions, check them as per 1. above.
  3. Any callbacks that do things other than set flags; likewise interrupt functions.
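
As a rough illustration of the "callbacks only set flags" idea in the list above, here is a minimal host-side C++ sketch. The names onTimerTick, serviceFlags and logLines are hypothetical stand-ins (logLines plays the role of Serial output); this is an assumption about how such code could be structured, not the original poster's firmware.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Flag shared between the callback context and the main loop.
std::atomic<bool> reportDue{false};

// Stand-in for Serial output (hypothetical, for illustration only).
std::vector<std::string> logLines;

// Stand-in for a software timer callback or interrupt handler:
// it only sets a flag and does no I/O or other shared-resource work.
void onTimerTick() {
    reportDue.store(true);
}

// Called from the main loop: consumes the flag and does the real work
// (printing, in this example) in a single, known context.
// Returns true if a pending report was handled.
bool serviceFlags() {
    if (reportDue.exchange(false)) {
        logLines.push_back("timer fired");
        return true;
    }
    return false;
}
```

In firmware, loop() would call serviceFlags() on every pass, so the printing always happens on the application thread rather than inside the timer or interrupt context.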

Not much help, but then not much to go on :)

@shanevanj This is helpful. I will take a look at those issues, and I appreciate the help! I know it’s not much to go on, but unfortunately company policy restricts me from sharing much of the code base. If, based on your advice, I can narrow the issue down to a smaller section of code, I would be able to share that.

Hey Jking,
Were you able to recreate and fix this issue?