Many unresponsive Xenons with Ethernet

xenon
Tags: #<Tag:0x00007fe2240b6900>

#1

I have many Xenons with an Ethernet Featherwing, not using mesh network, that will stop responding seemingly all at the same time. These Xenons are on different networks but all running the same code and on firmware 0.9.0.

The Xenons will be working great for days, then all of a sudden stop responding yet they are still breathing cyan. A power cycle will bring it right back

The latest even was around 1:30pm PST May 8th.

I know about the May 3rd outage, my Xenons have connected back since then, but have gone offline/unresponsive a couple of times since then.


#2

Do the device all run the same code?
Do they behave the same if they’re running Tinker/Blink an LED?
What’s the gateway doing?

Mind sharing the code that you have running on the devices, since that can cause issues as well?


#3

Yes, all run the same code.
I can test if Tinker also causes problems (stay tuned)
Monitoring an input and flipping relays.

The code is very simple, but would welcome any critiques,

int photoresistor = A0;
int power = A5;
int relay1 = D7; 
int relay2 = D6; 
bool beamBroken = false;
bool isLockOrOpen = false;
int analogvalue = 0;
String localIP;

void setup() {
    pinMode(photoresistor,INPUT);
    pinMode(power,OUTPUT);
    
    pinMode(relay1,OUTPUT);
    pinMode(relay2,OUTPUT);

    digitalWrite(power,HIGH);
    digitalWrite(relay1,LOW);
    digitalWrite(relay2,LOW);

    localIP = Ethernet.localIP().toString();

    Particle.variable("beamIntact", &beamBroken, BOOLEAN);
    Particle.variable("isLockOrOpen", &isLockOrOpen, BOOLEAN);
    Particle.function("cabinet",cabinetToggle);
    Particle.variable("localIP", localIP);
    
    Particle.keepAlive(20);
}


void loop() {
    analogvalue = analogRead(photoresistor);
 if (analogvalue>3400) {
    if (beamBroken==true) {
        delay(200);
        if (analogRead(photoresistor)>3400) {
            Particle.publish("beamStatus","broken",60,PRIVATE);
            beamBroken=false;
            isLockOrOpen=true;
        }
        
    }
    else {
        
    }
  }

  else if (analogvalue<2800){
      if (beamBroken==false) {
       delay(200);
        if (analogRead(photoresistor)<2800){
			Particle.publish("beamStatus","intact",60,PRIVATE);
			beamBroken=true;
			isLockOrOpen=false;
        }
        
      }
      else {
         
      }
  }
    
}


int cabinetToggle(String command) {
    if (command=="unlock") {
        digitalWrite(relay1,HIGH);
        delay(250);
        digitalWrite(relay1,LOW);
        return 1;
    }
    else if (command=="lock") {
        digitalWrite(relay2,HIGH);
        delay(500);
        digitalWrite(relay2,LOW);
        return 0;
    }
    else {
        return -1;
    }
}


#4

This is old syntax it would now be

    Particle.variable("beamIntact", beamBroken);
    Particle.variable("isLockOrOpen", isLockOrOpen);

I don’t think Particle.keepAlive() is required with Ethernet - unless your IP keeps changing :wink:

How have you got the photoresistor connected? Is it part of a voltage divider (intended use) or just set between A0 and A5 (seen somewhere else but not ideal)?

Can you expose System.freeMemory() via a Particle.variable()?


#5

@ScruffR I update the variable syntax, thanks.

I have found that Particle.keepAlive() is required because most corporate firewalls have a UDP timeout from 30-500 seconds. Before I added this the Xenons would only stay connected when idle for as long as the UDP timeout setting was for.

Yes, I have a voltage divider using a 10K resistor.

The results of System.freeMemory() from randomly checking every 5-10 minutes are
96584
96544
96584
96584
96552
96584
96584


#6

@amillen, you may want to add SYSTEM_THREAD(ENABLED); to decouple the user thread from the DeviceOS thread.


#7

@peekay123 I have made this change. I’m trying to learn why this would help?


#8

I believe keepAlive in fw v0.9.0 is still bugged. Everytime the device reconnects to the Particle Cloud it loses the keepAlive value. Thus, you should implement it as follow:

void cloud_status_handler(system_event_t event, int param) {
   if (param == cloud_status_connected) {
            #if Wiring_Cellular
            Particle.keepAlive(20);
            #endif
   }
}

// in setup

        System.on(cloud_status, cloud_status_handler);

It’s possible your issue is simply the keepAlive. Of course it may be something else entirely.


#9

@amillen - I’ve seen this behavior with Xenons, too. Symptoms are similar to what was happening with Argons which, I’m told, was what led to interim OS version 0.9.1. I believe that was a wifi issue (someone please confirm!), but again, same symptoms.


#10

@justicefreed_amper I’m almost certain that keepAlive is still working after reconnects or I would be having more problems. Regardless, I set up a new Xenon with your code addition.

@retupmoc1 I don’t see 0.9.1 as an option, but I did set up a new Xenon with 1.1.0-rc.2 on it.


#11

@amillen - We got 0.9.1 from Particle Support when we began seeing this behavior in Argons.


#12

I wanted to give everyone an update on the status, since my post 13 days ago I have not had a system wide outage.

I have 5 Xenons on my desk…
My original code
Updated code with @ScruffR recommendations
Updated code on separate network
Updated code on 1.1.0-rc2
Updated code with cloud_status_handler from @justicefreed_amper

The only one that I cannot communicate with is the one with the cloud_status_handler although it is breathing cyan.

The output from freeMemory over the last two weeks is
96584 - 5/9 7:33
89792 - 5/9 9:38
89792 - 5/9 10:18
89792 - 5/9 11:50
89744 - 5/9 12:37
89696 - 5/9 14:27
96576 - 5/13 10:21
96608 - 5/13 10:38
96008 - 5/20 10:21
96608 - 5/21 10:26

There have been many disconnects, 14-17, but I can’t say for sure that isn’t my ISP dropping out. I am using 2 different ISPs (Comcast & AT&T) so it is unlikely.


#13

@amillen Did you find a cause and solution - I am guessing by now you will have implemented 1.1.0?

I am running a small mesh network of xenon endnodes and a xenon with ethernet featherwing as a gateway. Everything is on 1.1.0 and I have been noticing the gateway go offline (spark offline) but the LED remain as breathing cyan. The mesh network drops and similar to what you describe the mesh and internet connections recover if the gateway is repowered. I haven’t experimented with all the components, the ethernet LAN cable from the main router over a powerline link and split in an AP - so there could be something there.


#14

I have moved to 1.2.1-rc.2 and my xenons have been MUCH more stable and I have very few disconnects now.


#15

I think that I have 3 different problems.

  1. All Xenons disconnected at 1:30pm PST May 8th.
  2. Some networks will cause the Xenon to disconnect and immediately reconnect several times an hour. See How to Diagnose Disconnects with Xenon on Ethernet
  3. Some networks will cause the Xenon to disconnect and will never reconnect until the unit is power cycled. This (seemingly) randomly happens every 1-5 days.

#1. I truly think this was caused by a Particle system update that happened at the same, but I’m only speculating. It has not happened since.
#2. It is looking like 1.2.1-rc.2 may fix this, but I’m waiting on confirmation, I should know more in a day or two.
#3. It is looking like 1.2.1-rc.2 may fix this, but I’m waiting on confirmation, I need 5-7 days to know for sure.

For problem #3, 1.2.1-rc.2 may not fix the problem but cover it up since there is a new retry policy in the OS. While I’m fine with this as a solution, I would really like to know why this is happening because if it is a networking problem then I would know what to fix or avoid.


#16

Is it possible that for #3 that the network router or AP is causing this - re-leasing of the IP address. What change in OS 1.2.1-rc.2 are you thinking has potentially fixed these issues?


#17

I’m wondering if “expires a session after 3 unsuccessful attempts at connecting to the cloud #1776” will cause the Xenon to re-establish a connection.


#18

On 1.1.0 I was getting a permanent failure after 3 attempts

0051327229 [comm.dtls] WARN: mbedtls_ssl_write returned ffffffff
0051327237 [comm.protocol] ERROR: Event loop error 3
0051327241 [system] WARN: Communication loop error, closing cloud socket
got event 536964888 with value 32
got event 536964888 with value 64
got event 536964888 with value 64
got event 536964888 with val0051327355 [system] ERROR: Failed to load session data from persistent storage
got event 536964888 with value 32
got event 536964888 with value 32
got event 536964888 with value 32
got event 536964888 with value 32
got event 536964888 with value 32
got event 536964888 with value 32
0051337477 [comm.dtls] ERROR: handshake failed -6800
0051337484 [comm.protocol.handshake] ERROR: handshake failed with code 17
0051337490 [system] WARN: Cloud handshake failed, code=17
got event 536964888 with value 64
got event 536964888005 with value 64
1338550 [system] ERROR: Failed to load session data from persistent storage
0051340450 [comm.protocol] ERROR: Channel failed to send message with error-code <0>

Once this sequence Event Loop error 3 appeared - then [system] WARN: Cloud handshake failed, code=17 seemed to be inevitable and a hard reset was needed - even the application watchdog will not bring it back. In 1.2.1 this has gone away and the device reconnects consistently.

I have now got to 7 days on the bench with no reconnect issues at all.


#19

I can now confirm on 1.2.1-rc.2 that neither #1 (Some networks will cause the Xenon to disconnect and immediately reconnect several times an hour) or #2 (Some networks will cause the Xenon to disconnect and will never reconnect until the unit is power cycled), has shown any improvements.

I was able to connect and log the serial output of a device going offline and not reconnecting, my problem #2. Below is the output

[19:57:22.448287 5.175596] 0108954053 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:22.461926 0.013641] 0108954054 [comm.protocol] INFO: message id 21237 complete with code 0.00
[19:57:22.475860 0.013936] 0108954055 [comm.protocol] INFO: rcv'd message type=13
[19:57:27.553103 5.077238] 0108959157 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:27.569657 0.016556] 0108959158 [comm.protocol] INFO: message id 21238 complete with code 0.00
[19:57:27.584530 0.014875] 0108959159 [comm.protocol] INFO: rcv'd message type=13
[19:57:32.655909 5.071374] 0108964260 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:32.670928 0.015022] 0108964261 [comm.protocol] INFO: message id 21239 complete with code 0.00
[19:57:32.685790 0.014863] 0108964262 [comm.protocol] INFO: rcv'd message type=13
[19:57:37.758134 5.072342] 0108969363 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:37.775279 0.017146] 0108969364 [comm.protocol] INFO: message id 21240 complete with code 0.00
[19:57:37.790570 0.015294] 0108969365 [comm.protocol] INFO: rcv'd message type=13
[19:57:42.864894 5.074319] 0108974469 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:42.878761 0.013869] 0108974469 [comm.protocol] INFO: message id 21241 complete with code 0.00
[19:57:42.893918 0.015160] 0108974470 [comm.protocol] INFO: rcv'd message type=13
[19:57:47.968239 5.074267] 0108979572 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[19:57:47.982731 0.014543] 0108979573 [comm.protocol] INFO: message id 21242 complete with code 0.00
[19:57:47.997849 0.015119] 0108979574 [comm.protocol] INFO: rcv'd message type=13
[19:58:22.973421 34.975564] 0109014578 [comm.protocol] ERROR: Event loop error 1
[19:58:22.985245 0.011832] 0109014579 [system] WARN: Communication loop error, closing cloud socket
[19:58:23.000231 0.014984] 0109014580 [system] INFO: Cloud: disconnecting
[19:58:23.010027 0.009801] 0109014581 [system] INFO: Cloud: disconnected
[19:58:23.077631 0.067583] 0109014682 [system] INFO: Cloud: connecting
[19:58:23.087917 0.010306] 0109014684 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[19:58:23.106103 0.018183] 0109014686 [system] INFO: Loaded cloud server address and port from session data
[19:58:23.122519 0.016420] 0109014687 [system] TRACE: Address type: 1
[19:58:23.131438 0.008918] 0109014687 [system] TRACE: Cloud socket=0, family=2, type=2, protocol=17
[19:58:23.146245 0.014807] 0109014688 [system] INFO: Cloud socket=0, connecting to 54.166.181.191#5684
[19:58:23.161701 0.015457] 0109014689 [system] TRACE: Cloud socket=0, connected to 54.166.181.191#5684
[19:58:23.177137 0.015436] 0109014691 [system] TRACE: Updating cloud keepalive for AF_INET: 30000 -> 30000
[19:58:23.193386 0.016249] 0109014692 [system] TRACE: Applying new keepalive interval now
[19:58:23.203764 0.010380] 0109014693 [system] INFO: Cloud socket connected
[19:58:23.208622 0.004858] 0109014694 [system] INFO: Starting handshake: presense_announce=0
[19:58:23.215128 0.006507] 0109014695 [comm.protocol.handshake] INFO: Establish secure connection
[19:58:23.468351 0.253217] 0109015072 [comm.dtls] WARN: session has 0 uses
[19:58:23.480950 0.012602] 0109015084 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=0
[19:58:23.488124 0.007175] 0109015085 [comm.dtls] INFO: out_ctr 0,1,0,0,0,0,9,113, next_coap_id=52fb
[19:58:23.495348 0.007223] 0109015086 [comm.dtls] INFO: app state crc: cached: ad350c27, actual: ad350c27
[19:58:23.503382 0.008035] 0109015087 [comm.dtls] WARN: skipping hello message
[19:58:23.508505 0.005124] 0109015088 [comm.dtls] INFO: restored session from persisted session data. next_msg_id=21243
[19:58:23.517441 0.008936] 0109015089 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 2
[19:58:23.523864 0.006423] 0109015089 [comm.protocol.handshake] INFO: resumed session - not sending HELLO message
[19:58:23.532236 0.008373] 0109015095 [system] INFO: Send spark/hardware/max_binary event
[19:58:23.538303 0.006066] 0109015100 [system] INFO: Send subscriptions
[19:58:23.542684 0.004382] 0109015105 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
[19:58:23.549098 0.006413] 0109015106 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
[19:58:23.555480 0.006382] 0109015107 [comm] INFO: Sending TIME request
[19:58:23.559868 0.004389] 0109015111 [system] INFO: Cloud connected
[19:58:58.529353 34.969476] 0109050133 [comm.protocol] ERROR: Event loop error 1
[19:58:58.541264 0.011914] 0109050133 [system] WARN: Communication loop error, closing cloud socket
[19:58:58.556156 0.014890] 0109050134 [system] INFO: Cloud: disconnecting
[19:58:58.565837 0.009686] 0109050136 [system] INFO: Cloud: disconnected
[19:58:58.632101 0.066260] 0109050236 [system] INFO: Cloud: connecting
[19:58:58.643446 0.011317] 0109050239 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[19:58:58.661698 0.018283] 0109050240 [system] INFO: Loaded cloud server address and port from session data
[19:58:58.678178 0.016480] 0109050241 [system] TRACE: Address type: 1
[19:58:58.687097 0.008919] 0109050242 [system] TRACE: Cloud socket=0, family=2, type=2, protocol=17
[19:58:58.697830 0.010736] 0109050243 [system] INFO: Cloud socket=0, connecting to 54.166.181.191#5684
[19:58:58.705292 0.007463] 0109050244 [system] TRACE: Cloud socket=0, connected to 54.166.181.191#5684
[19:58:58.712765 0.007472] 0109050245 [system] TRACE: Updating cloud keepalive for AF_INET: 30000 -> 30000
[19:58:58.720607 0.007842] 0109050246 [system] TRACE: Applying new keepalive interval now
[19:58:58.726785 0.006179] 0109050248 [system] INFO: Cloud socket connected
[19:58:58.731660 0.004875] 0109050248 [system] INFO: Starting handshake: presense_announce=0
[19:58:58.738318 0.006657] 0109050250 [comm.protocol.handshake] INFO: Establish secure connection
[19:58:59.022225 0.283903] 0109050627 [comm.dtls] WARN: session has 1 uses
[19:58:59.035123 0.012901] 0109050640 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=0
[19:58:59.042901 0.007779] 0109050641 [comm.dtls] INFO: out_ctr 0,1,0,0,0,0,9,139, next_coap_id=5303
[19:58:59.050296 0.007395] 0109050642 [comm.dtls] INFO: app state crc: cached: ad350c27, actual: ad350c27
[19:58:59.058113 0.007817] 0109050643 [comm.dtls] WARN: skipping hello message
[19:58:59.063610 0.005496] 0109050644 [comm.dtls] INFO: restored session from persisted session data. next_msg_id=21251
[19:58:59.072836 0.009226] 0109050646 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 2
[19:58:59.079310 0.006476] 0109050647 [comm.protocol.handshake] INFO: resumed session - not sending HELLO message
[19:58:59.087852 0.008541] 0109050651 [system] INFO: Send spark/hardware/max_binary event
[19:58:59.094059 0.006208] 0109050660 [system] INFO: Send subscriptions
[19:58:59.098532 0.004473] 0109050664 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
[19:58:59.105024 0.006490] 0109050665 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
[19:58:59.111522 0.006499] 0109050666 [comm] INFO: Sending TIME request
[19:58:59.116081 0.004560] 0109050668 [system] INFO: Cloud connected
[19:59:34.087462 34.971369] 0109085702 [comm.protocol] ERROR: Event loop error 1
[19:59:34.098943 0.011487] 0109085702 [system] WARN: Communication loop error, closing cloud socket
[19:59:34.114131 0.015188] 0109085702 [system] INFO: Cloud: disconnecting
[19:59:34.123698 0.009569] 0109085704 [system] INFO: Cloud: disconnected
[19:59:34.189565 0.065866] 0109085804 [system] INFO: Cloud: connecting
[19:59:34.200108 0.010529] 0109085806 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[19:59:34.217989 0.017897] 0109085808 [system] INFO: Loaded cloud server address and port from session data
[19:59:34.234151 0.016162] 0109085809 [system] TRACE: Address type: 1
[19:59:34.242825 0.008674] 0109085812 [system] TRACE: Cloud socket=0, family=2, type=2, protocol=17
[19:59:34.257104 0.014281] 0109085812 [system] INFO: Cloud socket=0, connecting to 54.166.181.191#5684
[19:59:34.264480 0.007377] 0109085812 [system] TRACE: Cloud socket=0, connected to 54.166.181.191#5684
[19:59:34.271806 0.007325] 0109085814 [system] TRACE: Updating cloud keepalive for AF_INET: 30000 -> 30000
[19:59:34.279480 0.007674] 0109085814 [system] TRACE: Applying new keepalive interval now
[19:59:34.285588 0.006109] 0109085816 [system] INFO: Cloud socket connected
[19:59:34.290369 0.004780] 0109085816 [system] INFO: Starting handshake: presense_announce=0
[19:59:34.296737 0.006368] 0109085816 [comm.protocol.handshake] INFO: Establish secure connection
[19:59:34.583917 0.287177] 0109086198 [comm.dtls] WARN: session has 2 uses
[19:59:34.596557 0.012642] 0109086211 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=0
[19:59:34.604012 0.007455] 0109086211 [comm.dtls] INFO: out_ctr 0,1,0,0,0,0,9,167, next_coap_id=530b
[19:59:34.611248 0.007237] 0109086213 [comm.dtls] INFO: app state crc: cached: ad350c27, actual: ad350c27
[19:59:34.619269 0.008021] 0109086213 [comm.dtls] WARN: skipping hello message
[19:59:34.624531 0.005262] 0109086213 [comm.dtls] INFO: restored session from persisted session data. next_msg_id=21259
[19:59:34.633476 0.008945] 0109086215 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 2
[19:59:34.639865 0.006389] 0109086215 [comm.protocol.handshake] INFO: resumed session - not sending HELLO message
[19:59:34.648193 0.008328] 0109086221 [system] INFO: Send spark/hardware/max_binary event
[19:59:34.654301 0.006108] 0109086229 [system] INFO: Send subscriptions
[19:59:34.658680 0.004379] 0109086231 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
[19:59:34.665071 0.006392] 0109086233 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
[19:59:34.671470 0.006398] 0109086233 [comm] INFO: Sending TIME request
[19:59:34.675842 0.004372] 0109086236 [system] INFO: Cloud connected
[20:00:09.643954 34.968057] 0109121258 [comm.protocol] ERROR: Event loop error 1
[20:00:09.655430 0.011526] 0109121258 [system] WARN: Communication loop error, closing cloud socket
[20:00:09.670719 0.015290] 0109121258 [system] INFO: Cloud: disconnecting
[20:00:09.680432 0.009715] 0109121260 [system] INFO: Cloud: disconnected
[20:00:09.744860 0.064426] 0109121360 [system] INFO: Cloud: connecting
[20:00:09.754951 0.010093] 0109121365 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:00:09.773972 0.019020] 0109121367 [system] INFO: Loaded cloud server address and port from session data
[20:00:09.790513 0.016538] 0109121367 [system] TRACE: Address type: 1
[20:00:09.799315 0.008806] 0109121370 [system] TRACE: Cloud socket=0, family=2, type=2, protocol=17
[20:00:09.814452 0.015132] 0109121370 [system] INFO: Cloud socket=0, connecting to 54.166.181.191#5684
[20:00:09.830071 0.015621] 0109121372 [system] TRACE: Cloud socket=0, connected to 54.166.181.191#5684
[20:00:09.845565 0.015496] 0109121372 [system] TRACE: Updating cloud keepalive for AF_INET: 30000 -> 30000
[20:00:09.861880 0.016316] 0109121374 [system] TRACE: Applying new keepalive interval now
[20:00:09.873058 0.011180] 0109121374 [system] INFO: Cloud socket connected
[20:00:09.877893 0.004835] 0109121374 [system] INFO: Starting handshake: presense_announce=0
[20:00:09.884410 0.006517] 0109121376 [comm.protocol.handshake] INFO: Establish secure connection
[20:00:10.138194 0.253780] 0109121752 [comm.dtls] WARN: session has 3 uses
[20:00:10.143690 0.005498] 0109121752 [comm.dtls] WARN: session has expired after 3 uses
[20:00:10.149727 0.006039] 0109121754 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=2
[20:00:10.156511 0.006785] 0109121754 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
[20:00:19.148258 8.991738] 0109130763 [comm.dtls] ERROR: handshake failed -6800
[20:00:19.160151 0.011896] 0109130765 [comm.protocol.handshake] ERROR: handshake failed with code 17
[20:00:19.175220 0.015070] 0109130767 [system] WARN: Cloud handshake failed, code=17
[20:00:19.402043 0.226821] 0109131017 [system] INFO: Cloud: disconnecting
[20:00:19.412344 0.010303] 0109131017 [system] INFO: Cloud: disconnected
[20:00:20.204445 0.792097] 0109131819 [system] INFO: Cloud: connecting
[20:00:20.214738 0.010297] 0109131821 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:00:20.232525 0.017787] 0109131823 [system] ERROR: Failed to load session data from persistent storage
[20:00:20.248537 0.016014] 0109131823 [system] INFO: Discarding session data
[20:00:20.258822 0.010285] 0109131825 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684
[20:00:36.165056 15.906209] 0109147777 [system] ERROR: Failed to determine server address
[20:00:36.178405 0.013368] 0109147779 [system] TRACE: Address type: 3
[20:00:36.187352 0.008951] 0109147779 [system] WARN: Cloud socket connection failed: -230
[20:00:36.200257 0.012906] 0109147780 [system] WARN: Internet available, Cloud not reachable!
[20:00:36.213880 0.013623] 0109147781 [system] WARN: Handling cloud error: 3
[20:00:37.166293 0.952408] 0109148781 [system] INFO: Cloud: connecting
[20:00:37.179818 0.013526] 0109148783 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:00:37.197521 0.017703] 0109148785 [system] ERROR: Failed to load session data from persistent storage
[20:00:37.213590 0.016072] 0109148785 [system] INFO: Discarding session data
[20:00:37.223874 0.010286] 0109148786 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684
[20:00:53.165114 15.941234] 0109164777 [system] ERROR: Failed to determine server address
[20:00:53.178446 0.013335] 0109164779 [system] TRACE: Address type: 3
[20:00:53.187363 0.008920] 0109164779 [system] WARN: Cloud socket connection failed: -230
[20:00:53.200262 0.012899] 0109164780 [system] WARN: Internet available, Cloud not reachable!
[20:00:53.213907 0.013644] 0109164781 [system] WARN: Handling cloud error: 3
[20:00:54.166369 0.952455] 0109165781 [system] INFO: Cloud: connecting
[20:00:54.176723 0.010358] 0109165784 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:00:54.194446 0.017725] 0109165784 [system] ERROR: Failed to load session data from persistent storage
[20:00:54.210439 0.015994] 0109165786 [system] INFO: Discarding session data
[20:00:54.220771 0.010332] 0109165787 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684
[20:01:10.164885 15.944089] 0109181777 [system] ERROR: Failed to determine server address
[20:01:10.178203 0.013341] 0109181779 [system] TRACE: Address type: 3
[20:01:10.187153 0.008952] 0109181779 [system] WARN: Cloud socket connection failed: -230
[20:01:10.200058 0.012906] 0109181781 [system] WARN: Internet available, Cloud not reachable!
[20:01:10.213685 0.013626] 0109181781 [system] WARN: Handling cloud error: 3
[20:01:11.166386 0.952658] 0109182781 [system] INFO: Cloud: connecting
[20:01:11.176607 0.010262] 0109182783 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:01:11.194373 0.017767] 0109182785 [system] ERROR: Failed to load session data from persistent storage
[20:01:11.210509 0.016137] 0109182785 [system] INFO: Discarding session data
[20:01:11.220768 0.010259] 0109182788 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684
[20:01:27.164549 15.943777] 0109198777 [system] ERROR: Failed to determine server address
[20:01:27.178135 0.013589] 0109198779 [system] TRACE: Address type: 3
[20:01:27.187000 0.008866] 0109198779 [system] WARN: Cloud socket connection failed: -230
[20:01:27.199667 0.012666] 0109198781 [system] WARN: Internet available, Cloud not reachable!
[20:01:27.213103 0.013437] 0109198783 [system] WARN: Handling cloud error: 3
[20:01:28.168942 0.955832] 0109199783 [system] INFO: Cloud: connecting
[20:01:28.179007 0.010070] 0109199785 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:01:28.196394 0.017388] 0109199786 [system] ERROR: Failed to load session data from persistent storage
[20:01:28.212238 0.015844] 0109199787 [system] INFO: Discarding session data
[20:01:28.222481 0.010243] 0109199790 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684
[20:01:44.164588 15.942062] 0109215777 [system] ERROR: Failed to determine server address
[20:01:44.178016 0.013473] 0109215779 [system] TRACE: Address type: 3
[20:01:44.186857 0.008841] 0109215779 [system] WARN: Cloud socket connection failed: -230
[20:01:44.199630 0.012773] 0109215781 [system] WARN: Internet available, Cloud not reachable!
[20:01:44.213194 0.013565] 0109215783 [system] WARN: Handling cloud error: 3
[20:01:46.169157 1.955961] 0109217783 [system] INFO: Cloud: connecting
[20:01:46.173825 0.004672] 0109217787 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
[20:01:46.182653 0.008827] 0109217787 [system] ERROR: Failed to load session data from persistent storage
[20:01:46.190432 0.007780] 0109217789 [system] INFO: Discarding session data
[20:01:46.195308 0.004877] 0109217790 [system] TRACE: Resolving e00fce68f8cf6601fa0e2fce.udp-mesh.particle.io#5684

#20

I added this into my app to aid with some insight into what is happening in the background without going the full debugger route

In setup()

System.on(all_events, handle_all_the_events);

Added

// ---------------------------------------------------------------------------
void handle_all_the_events(system_event_t event, int param )
// ---------------------------------------------------------------------------
{
    const int maxSizeOfEventText = 59;
    const int maxSizeOfEventMsg = 60;

    char systemParamMsg[maxSizeOfEventMsg] = "Non defined event"; // undefined event in DeviceOS
    
    switch (event) {
        case setup_begin: // signals the device has entered setup mode
            snprintf(systemParamMsg, maxSizeOfEventText, "Setup started");   
            break;
        case setup_update: // periodic event signaling the device is still in setup mode.
            snprintf(systemParamMsg, maxSizeOfEventText, "Setup busy for last %d ms", param);
            break;
        case setup_end: // signals setup mode was exited
            snprintf(systemParamMsg, maxSizeOfEventText, "Setup ended after %d ms", param);
            break;
        case network_credentials: // network credentials were changed
            switch(param) {
                case network_credentials_added:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network credentials added" ); 
                    break;   
                case network_credentials_cleared:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network credentials cleared" ); 
                    break;   
                default:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Unknown network_credentials event ID [%d]", event ); // unknown event so just show value
                    break;
            } 
            break;
        case network_status: // network connection status
            switch(param) {
                case network_status_powering_on:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network starting" ); 
                    break;   
                case network_status_on:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network on" ); 
                    break;   
               case network_status_powering_off:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network stopping" ); 
                    break;   
                case network_status_off:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network off" ); 
                    break;   
               case network_status_connecting:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network connecting" ); 
                    break;   
                case network_status_connected:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Network connected" ); 
                    break;   
                default:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Unknown network_status event ID [%d]", event ); // unknown event so just show value
                    break;
            } 
            break;
        case cloud_status: // cloud connection status
            switch (param) {
                case cloud_status_connecting:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Cloud connecting" ); 
                    break;
                case cloud_status_connected:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Cloud connected" ); 
                    break;
                case cloud_status_disconnecting:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Cloud disconnecting" ); 
                    break;
                case cloud_status_disconnected:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Cloud disconnected" ); 
                    break;
                default:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Unknown cloud_status event ID [%d]", event ); // unknown event so just show value
                    break;
            }    
            break;
        case button_status: // button pressed or released
            snprintf(systemParamMsg, maxSizeOfEventText, "Mode button pressed for %d ms", param);
            break;
        case firmware_update: // firmware update status
            switch (param) {
                case firmware_update_begin:
                    snprintf(systemParamMsg, maxSizeOfEventText, "F/W update started" ); 
                    break;
                case firmware_update_progress:
                    snprintf(systemParamMsg, maxSizeOfEventText, "F/W update in progress" ); 
                    break;
                case firmware_update_complete:
                    snprintf(systemParamMsg, maxSizeOfEventText,"F/W update done" ); 
                    break;
                case firmware_update_failed:
                    snprintf(systemParamMsg, maxSizeOfEventText, "F/W update failed" ); 
                    break;
                default:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Unknown firmware_update event ID [%d]", event ); // unknown event so just show value
                    break;
            }  
            break;
        case firmware_update_pending: //  notifies the application that a firmware update is available. 
            snprintf(systemParamMsg, maxSizeOfEventText, "Firmware update available");
            break;
        case reset_pending: // notifies the application that the system would like to reset
            snprintf(systemParamMsg, maxSizeOfEventText, "Reset requested");
            break;
        case reset: // notifies that the system will reset once the application has completed handling this event
            snprintf(systemParamMsg, maxSizeOfEventText, "Reset started");
            tbDisconnect();
            break;
        case button_click: // event sent each time setup button is clicked
            snprintf(systemParamMsg, maxSizeOfEventText, "Mode button clicked %d times", system_button_clicks(param) );
            break;
        case button_final_click: // at the end of a series of clicks.
            snprintf(systemParamMsg, maxSizeOfEventText, "Mode button clicked %d times in total", system_button_clicks(param) );
            break;
        case time_changed: // device time changed
            switch (param) {
                case time_changed_manually:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Time updated manually" ); 
                    break;
                case time_changed_sync:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Time synchronised" ); 
                    break;
                default:
                    snprintf(systemParamMsg, maxSizeOfEventText, "Unknown time_changed event ID [%d]", event ); // unknown event so just show value
                    break;
            }
            break;
        case low_battery: // low battery condition is detected
            checkBattV(); // sends particle.publish immediately with battery level
            break;
        case out_of_memory: // out of memeory
            snprintf(systemParamMsg, maxSizeOfEventText, "Out of memory : %d bytes required", param );
            break;
        default: // undefined event in DeviceOS
            snprintf(systemParamMsg, maxSizeOfEventText, "Unknown event ID [%d]", event ); // unknown event so just show value
            break;
    }
    
    Log.warn(systemParamMsg);
    
    // special cses where the Log message needs to be shown before the next action i.e. Reset]
    switch (event) {
        case firmware_update_pending: //  notifies the application that a firmware update is available. 
            gracefullReset();
            break;
        case reset_pending: // notifies the application that the system would like to reset
            gracefullReset();
        default:
            break;
    }
}

Its a bit long winded and could probably be improved - but it does break out a lot of info and helped me understand what was happening without copious amounts of trace information being displayed. YMMV :slight_smile: