Mesh Sleep Issues

We have a system in which a Xenon spends most of its time asleep and sends notifications to a Boron working as a gateway. It sleeps for a defined time (SLEEP_TIME) and is woken up either by the timer or by an external trigger. Since we have noticed that the mesh connection isn’t very reliable, we have enabled the watchdog with a timeout (WATCHDOG_TIMER) to prevent the Xenon from freezing when it tries to connect to the Boron. From time to time we get a panic when the Xenon is sleeping (we catch and log the reset reason at boot time). Could the watchdog somehow be causing the panics? Additionally, we have noticed that when the device wakes up it sometimes takes a long time to connect to the Boron even if it is nearby. This happens more often when there is more than one Xenon connected to the Boron’s mesh network.

Hi - not much to go on here - please post (minimalist) code that duplicates the issue - the use case described seems very trivial so there must be other factors at play here.

I can understand what you are describing and I am seeing something similar with >1 sleepy end node and a Xenon in an ethernet featherwing. As @shanevanj has said, there are a number of things at play which might be causing the panics. I have ruled out low memory/heap and stack with my code. The timing of the panic appears quite random: it can be very soon after a restart or a while later. I have some reason to believe that the handling of the mesh publish by the end node might be the cause, in that a collision might occur and this then puts the publish in a spin leading to the panic. I have just started to investigate the Workbench debugger but to properly use that I need a reproducible panic. Waiting for hours is just not feasible.

If you are unable to share your code then any observations on the factors that are relevant to causing the panics would be gratefully received.

Unfortunately I can't share the code due to NDA reasons, but the idea is something like this:

We use SYSTEM_THREAD(ENABLED) and SYSTEM_MODE(MANUAL) to be more in control of the mesh functionality. We also use STARTUP(System.enableFeature(FEATURE_RESET_INFO)) to be able to tell why the Xenon reboots. Then on boot we set D2 as an input and read it. If it is HIGH (our device was woken up externally) we send an alarm to the Boron. To send the alarm we turn on the mesh, and connect to the mesh. This part usually fails a couple of times and triggers the watchdog. If there was no external signal we just send a “ping” to the Boron (following the same procedure as the alarm). After we send the alarm or ping, our device goes to sleep. The code is very simple so I doubt we are corrupting the stack. We have some logging in place to help us debug further, and on every panic we have noticed that the device panics while sleeping.

This sounds rather unlikely. Since no code is executed while sleeping there is no reason for a panic condition to occur.
It may either panic on sleep (aka going to) or on wake but I cannot imagine a condition that would cause this during sleep.

Great - so why not create a new, simple application that shows the problem and then post this (no NDA issues) - we can then really help - the problem with the above descriptions is that they just lead to guesswork and waste time that could be used to help you solve your problem.

We need things like:

  • Error producing code (as above)
  • Hardware module used
  • Device OS version

Sorry but your observations raise a ton of questions about how you have coded this. I will try and deal with what you have shared - as @shanevanj has explained you really need to share code snippets. @ScruffR has hopefully disabused you of any thought that a device will panic once it is sleeping, whilst in stop. You are not describing which sleep mode you are using; I assume not SLEEP_MODE_DEEP.

To send the alarm we turn on the mesh, and connect to the mesh. This part usually fails a couple of times and triggers the watchdog.

How do you turn on the Mesh radio and connect to the network? What timeout are you giving Mesh.connect() and how are you testing for a connection? Are you using the Application Watchdog and how have you got this set up? Are you tripping the WD because you are not tickling it whilst waiting for Mesh to connect? Mesh gateways cannot sleep; if you use the Boron as a gateway and you are putting it to sleep then the architecture you described will not work. IMO the Application Watchdog is not necessary on a Xenon mesh node.
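To make the watchdog question concrete, here is a minimal sketch of a bounded wait that tickles the watchdog on every poll, so a slow mesh connect ends in a clean timeout rather than a watchdog reset. It is written as plain C++ so it can be tried off-device. ConnectHooks and waitForReady are hypothetical names, not part of Device OS, and the mapping to Mesh.ready(), the watchdog check-in, millis() and delay() in the comments is my assumption about how you would wire it up.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical hooks (not a Device OS type). On a Xenon they would
// presumably wrap the calls named in the comments.
struct ConnectHooks {
    std::function<bool()> isReady;                // e.g. Mesh.ready()
    std::function<void()> kickWatchdog;           // e.g. the Application Watchdog check-in
    std::function<std::uint32_t()> nowMs;         // e.g. millis()
    std::function<void(std::uint32_t)> sleepMs;   // e.g. delay(ms)
};

// Polls isReady() until it returns true or timeoutMs has elapsed.
// The watchdog is kicked before each pause, so the watchdog period only
// needs to exceed pollMs, not the whole connect timeout.
bool waitForReady(const ConnectHooks& hooks,
                  std::uint32_t timeoutMs,
                  std::uint32_t pollMs) {
    assert(pollMs > 0);  // a zero poll interval would spin without pausing
    const std::uint32_t start = hooks.nowMs();
    while (!hooks.isReady()) {
        // Unsigned subtraction stays correct across a millisecond counter rollover.
        if (hooks.nowMs() - start >= timeoutMs) {
            return false;  // give up cleanly; the caller decides what to do next
        }
        hooks.kickWatchdog();
        hooks.sleepMs(pollMs);
    }
    return true;
}
```

If the wait returns false the node can log the failure and go back to sleep until the next wake, instead of relying on the watchdog to reset it.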

If there was no external signal we just send a “ping” to the Boron (following the same procedure as the alarm) and after we send the alarm or ping our device goes to sleep.

Is this external signal the D2 on the Xenon? How do you "ping" the Boron? And which device is going to sleep?

Have you tried using SYSTEM_MODE(SEMI_AUTOMATIC)? Also, after wake from sleep you will need to call Mesh.connect() and check for Mesh.ready() before Mesh publishing.
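As a rough illustration of that ordering (connect, confirm ready, publish, then sleep), here is a small host-side sketch in plain C++. MeshHooks and runWakeCycle are hypothetical names, not Device OS APIs, and the "alarm" and "ping" event names are only placeholders taken from the description above. On a Xenon the hooks would presumably wrap Mesh.connect(), a bounded wait on Mesh.ready(), Mesh.publish() and System.sleep().

```cpp
#include <cassert>
#include <functional>
#include <string>

enum class WakeResult { Published, NotConnected };

// Hypothetical hooks (not a Device OS type); the comments name the
// calls they are assumed to wrap on the device.
struct MeshHooks {
    std::function<void()> connect;                     // e.g. Mesh.connect()
    std::function<bool()> waitReady;                   // bounded wait for Mesh.ready()
    std::function<void(const std::string&)> publish;   // e.g. Mesh.publish(event)
    std::function<void()> sleep;                       // e.g. System.sleep(...)
};

// One wake cycle: connect, publish only once the mesh reports ready,
// then sleep whether or not the publish happened.
WakeResult runWakeCycle(const MeshHooks& mesh, bool externalTrigger) {
    assert(mesh.connect && mesh.waitReady && mesh.publish && mesh.sleep);
    mesh.connect();
    WakeResult result = WakeResult::NotConnected;
    if (mesh.waitReady()) {
        // Placeholder event names: alarm on external wake, ping on timer wake.
        mesh.publish(externalTrigger ? "alarm" : "ping");
        result = WakeResult::Published;
    }
    mesh.sleep();  // sleep even after a failed connect and retry on the next wake
    return result;
}
```

The point of the shape is that a failed connect never reaches the publish call and never waits for a watchdog reset; the node just sleeps and tries again on the next wake.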