Boards continuously waking from sleep mode due to mysterious network messages, draining battery

This is a repost of an earlier post I made but with new information and I have consolidated all the information. I've been struggling with this problem for a while and it is causing a big issue for my current project. Any help is appreciated!

I have a fleet of 30 boards. I have them all set up to sleep for most of the time but wake up from network messages, a GPIO pin, or a timeout of 15 minutes (to keep them all cloud connected). Only the GPIO pin wakeup method keeps the board awake for an extended period of time, the other wake up methods wake the board up for a very short amount of time and then puts them back to sleep. This is becoming a big issue for me because my boards are battery powered and these continuous wakeups are draining the batteries much faster than I had hoped.

This mostly works except I've noticed some of the boards will wake up from mysterious network messages very frequently. In fact, there seems to be a relationship between the amount of boards in my fleet that are powered on and the amount of boards that wake frequently from their sleep mode. I am trying to figure out the cause of the network message that is waking them up and determine if there is a way to prevent these messages from happening.

Here is a screenshot of my event logs during a short snippet of time when all 30 of my boards were powered on. The board names are omitted for project privacy but the first two letters were kept to indicate that this is happening to multiple different boards, not just one.

Here is the extended version of one of those captures:

{
"device":{
"network":{
"cellular":{
"radio_access_technology":"LTE"
"operator":"AT&T Wireless Inc."
"cell_global_identity":{
"mobile_country_code":310
"mobile_network_code":"410"
"location_area_code":21766
"cell_id":199695120
}
}
"signal":{
"at":"LTE Cat-M1"
"strength":76.66
"strength_units":"%"
"strengthv":-84
"strengthv_units":"dBm"
"strengthv_type":"RSRP"
"quality":37.5
"quality_units":"%"
"qualityv":-13.5
"qualityv_units":"dB"
"qualityv_type":"RSRQ"
}
"connection":{
"status":"connected"
"error":0
"disconnects":0
"attempts":1
"disconnect_reason":"unknown"
}
}
"cloud":{
"connection":{
"status":"connected"
"error":0
"attempts":1
"disconnects":0
"disconnect_reason":"none"
}
"coap":{
"transmit":17
"retransmit":4
"unack":0
"round_trip":296
}
"publish":{
"rate_limited":0
}
}
"power":{
"battery":{
"charge":{
"err":-210
}
"state":"disconnected"
}
"source":"USB adapter"
}
"system":{
"uptime":140
"memory":{
"used":71152
"total":165960
}
}
}
"service":{
"device":{
"status":"ok"
}
"cloud":{
"uptime":124
"publish":{
"sent":3
}
}
"coap":{
"round_trip":195
}
}
}

Here is a log of the SerialLogHandler(LOG_LEVEL_TRACE) of one of these boards during a wakeup:

0000140860 [comm.coap] TRACE: Sending CoAP message
0000140860 [comm.coap] TRACE: CON 0.00  size=4 token= id=17
0000140869 [ncp.at] TRACE: > AT+COPS=3,2
0000140870 [ncp.at] TRACE: < +CEREG: 5,"5506","0be71b10",7
0000140875 [ncp.at] TRACE: < OK
0000140881 [ncp.at] TRACE: > AT+COPS?
0000140885 [ncp.at] TRACE: < +COPS: 0,2,"310410",7
0000140886 [ncp.at] TRACE: < OK
0000140888 [ncp.at] TRACE: > AT+UCGED?
0000141375 [ncp.at] TRACE: < +UCGED: 2
0000141376 [ncp.at] TRACE: < 6,4,310,410
0000141378 [ncp.at] TRACE: < 5110,12,50,50,5506,be71b10,181,cccb4d98,ff01,13,57,13,-1.43,3,255,255,56,7,-12,300,255,255,80
0000141389 [ncp.at] TRACE: < OK
0000141394 [ncp.at] TRACE: > AT+COPS=3,2
0000141398 [ncp.at] TRACE: < OK
0000141399 [ncp.at] TRACE: > AT+COPS?
0000141405 [ncp.at] TRACE: < +COPS: 0,2,"310410",7
0000141406 [ncp.at] TRACE: < OK
0000141411 [ncp.at] TRACE: > AT+CEREG?
0000141416 [ncp.at] TRACE: < +CEREG: 2,5,"5506","0be71b10",7
0000141421 [ncp.at] TRACE: < OK
0000141422 [comm.coap] TRACE: Sending CoAP message
0000141427 [comm.coap] TRACE: CON POST /d?\x04 size=197 token=6b id=18
0000141435 [comm.coap] TRACE: Received CoAP message
0000141436 [comm.coap] TRACE: ACK 0.00  size=4 token= id=17
0000141639 [comm.coap] TRACE: Received CoAP message
0000141640 [comm.coap] TRACE: ACK 0.00  size=5 token=6b id=18
0000142371 [system.sleep] TRACE: Entering system_sleep_ext()
0000142371 [system.sleep] TRACE: Interface 3 is off already

For more context, here is my sleep config settings:

config.mode(SystemSleepMode::STOP)
    .gpio(LIMIT_SWITCH, CHANGE) 
    .duration(15min)
    .network(NETWORK_INTERFACE_CELLULAR);

Any clue here as to why my boards are receiving these messages? Any way to stop them? Any help is greatly greatly appreciated. Thank you!

What's before that log? The previous sleep?

From that log, it looks like it's sending something, not to receive something. It sends two CoAP messages and receives two ACKs.

When message 16 was sent (before this section of the log), was the ACK received for that?

Unfortunately, I took this capture a long time ago and I don't remember what the log looked like before this.

But your response just prompted me to check into something. I think I figured out the issue - at some point in my code I am publishing vitals. Then, the ACK is waking it up, and I have a bug that publishes the vitals again upon wakeup, and then this manifests in a seemingly endless loop.

I'm making a fix right now. Hopefully that is all it is.

2 Likes

That sounds like a plausible cause!

2 Likes

Hopefully! I can't believe it took me so long to recognize this.

1 Like