Hello Particle Community!
This is my first post! So, while I’ve been reading through the guidlines etc. I might get some things wrong, so please gently point me in the right direction if I misbehave I’m also a web developer by heart and super fresh into the Particle sphere, so I apologize in advance for potentially using the wrong vocabulary, etc.
To the issue!
We’re experiencing spontaneous freezing (see below for my definition of freezing) in my company’s Electrons (2G) that are being used by our customers, requiring manual restart through the onboard-reset button which entails opening of the devices’ chassis – not ideal since the devices are business critical to our customers, and customers themselves are located all over the country we’re operating in.
After a bunch of debugging I believe I’ve narrowed the freezing down to the devices loosing cellular/cloud connection and then trying (but seemingly failing) to reconnect. The freeze MIGHT then suddenly occur (i.e. sometimes it reconnects successfully, although I have yet to confirm this).
I managed to reproduce the issue while having the device connected to a terminal. The following logs were the last ones printed before this particular device froze:
0000761078 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 0
0000761178 [comm.protocol] ERROR: Event loop error 3
0000761178 [system] WARN: Communication loop error, closing cloud socket
0000761278 [system] INFO: Cloud: connecting
0000761278 [system] TRACE: sparkSocket Now =0
0000761278 [system] TRACE: Close Attempt
socketClose(0)
761.277 AT send 12 "AT+USOCL=0\r\n"
761.329 AT read OK 6 "\r\nOK\r\n"
socketFree(0)
0000761330 [system] TRACE: socket_close()=success
0000761330 [system] TRACE: HAL_FLASH_Read_ServerAddress() = type:1,domain:$id.udp.particle.io,ip: 46.100.105.36, port: 65535
761.331 AT send 56 "AT+UDNSRN=0,\"33003e000351353337353037.udp.particle.io\"\r\n"
763.025 AT read + 28 "\r\n+UDNSRN: \"54.86.250.117\"\r\n"
763.037 AT read OK 6 "\r\nOK\r\n"
0000763039 [system] INFO: Resolved host 33003e000351353337353037.udp.particle.io to 54.86.250.117
socketSocket(UDP)
763.038 AT send 18 "AT+USOCR=17,5684\r\n"
763.080 AT read + 13 "\r\n+USOCR: 0\r\n"
763.090 AT read OK 6 "\r\nOK\r\n"
Socket 0: handle 0 was created
0000763091 [system] TRACE: socketed udp=1, sparkSocket=0, 1
0000763091 [system] TRACE: connection attempt to 54.86.250.117:5684
0000763093 [system] INFO: Cloud socket connected
0000763093 [system] INFO: Starting handshake: presense_announce=0
0000763095 [comm.protocol.handshake] INFO: Establish secure connection
0000763117 [comm.dtls] TRACE: restore size mismatch 1: 0/220
0000763119 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=2
0000763121 [system] TRACE: send 107
socketSendTo(0,54.86.250.117,5684,,107)
763.121 AT send 37 "AT+USOST=0,\"54.86.250.117\",5684,107\r\n"
0000763122 [app] INFO: Disconnected. Number of disconnects: 1
763.172 AT read > 3 "\r\n@"
763.222 AT send 107 "\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00^\x01\x00\x00R\x00\x00\x00\x00\x00\x00\x00R\xfe\xfdI\x18Q7I\xf2\x0e\xfc\xb2\xe1_c\xe8\x9e|\xd4]Bk[\x06\xb9\x88B\x94\xac\xec\xe1\x89\x10\xe28\x00\x00\x00\x04\xc0\xae\x00\xff\x01\x00\x00$\x00\r\x00\x06\x00\x04\x04\x03\x03\x03\x00\n\x00\x04\x00\x02\x00\x17\x00\v\x00\x02\x01\x00\x00\x13\x00\x02\x01\x02\x00\x14\x00\x02\x01\x02"
763.268 AT read + 17 "\r\n+USOST: 0,107\r\n"
763.278 AT read OK 6 "\r\nOK\r\n"
0000763673 [app] INFO: *********************** System.resetReason(): 0
0000763673 [app] INFO: *********************** System.freeMemory(): 87512
764.208 AT read + 17 "\r\n+UUSORD: 0,60\r\n"
Socket 0: handle 0 has 60 bytes pending
764.218 AT send 16 "AT+USORF=0,813\r\n"
764.268 AT read + 98 "\r\n+USORF: 0,\"54.86.250.117\",5684,60,\"\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00/\x03\x00\x00#\x00\x00\x00\x00\x00\x00\x00#\xfe\xff Y\x0fN\x9f\xdb\x7f\x99\xb4\r\x1a\xa4\x9b}\xaa\x9f\xca\x13\nM\xaf\xbf\xfe\x81HN\xe9\xe5\xc0\xb7*\x8aW\""
764.284 AT read UNK 2 "\r\n"
764.294 AT read OK 6 "\r\nOK\r\n"
0000764295 [system] TRACE: received 60
0000764295 [system] TRACE: send 139
socketSendTo(0,54.86.250.117,5684,,139)
764.296 AT send 37 "AT+USOST=0,\"54.86.250.117\",5684,139\r\n"
764.348 AT read > 3 "\r\n@"
764.398 AT send 139 "\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x01\x00~\x01\x00\x00r\x00\x01\x00\x00\x00\x00\x00r\xfe\xfdI\x18Q7I\xf2\x0e\xfc\xb2\xe1_c\xe8\x9e|\xd4]Bk[\x06\xb9\x88B\x94\xac\xec\xe1\x89\x10\xe28\x00 Y\x0fN\x9f\xdb\x7f\x99\xb4\r\x1a\xa4\x9b}\xaa\x9f\xca\x13\nM\xaf\xbf\xfe\x81HN\xe9\xe5\xc0\xb7*\x8aW\x00\x04\xc0\xae\x00\xff\x01\x00\x00$\x00\r\x00\x06\x00\x04\x04\x03\x03\x03\x00\n\x00\x04\x00\x02\x00\x17\x00\v\x00\x02\x01\x00\x00\x13\x00\x02\x01\x02\x00\x14\x00\x02\x01\x02"
764.446 AT read + 17 "\r\n+USOST: 0,139\r\n"
764.456 AT read OK 6 "\r\nOK\r\n"
0000767297 [system] TRACE: send 139
socketSendTo(0,54.86.250.117,5684,,139)
767.296 AT send 37 "AT+USOST=0,\"54.86.250.117\",5684,139\r\n"
767.348 AT read > 3 "\r\n@"
767.398 AT send 139 "\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x02\x00~\x01\x00\x00r\x00\x01\x00\x00\x00\x00\x00r\xfe\xfdI\x18Q7I\xf2\x0e\xfc\xb2\xe1_c\xe8\x9e|\xd4]Bk[\x06\xb9\x88B\x94\xac\xec\xe1\x89\x10\xe28\x00 Y\x0fN\x9f\xdb\x7f\x99\xb4\r\x1a\xa4\x9b}\xaa\x9f\xca\x13\nM\xaf\xbf\xfe\x81HN\xe9\xe5\xc0\xb7*\x8aW\x00\x04\xc0\xae\x00\xff\x01\x00\x00$\x00\r\x00\x06\x00\x04\x04\x03\x03\x03\x00\n\x00\x04\x00\x02\x00\x17\x00\v\x00\x02\x01\x00\x00\x13\x00\x02\x01\x02\x00\x14\x00\x02\x01\x02"
767.448 AT read + 17 "\r\n+USOST: 0,139\r\n"
767.458 AT read OK 6 "\r\nOK\r\n"
0000767793 [app] INFO: *********************** System.resetReason(): 0
0000767794 [app] INFO: *********************** System.freeMemory(): 87444
0000771945 [app] INFO: *********************** System.resetReason(): 0
0000771945 [app] INFO: *********************** System.freeMemory(): 87444
lib/mbedtls/library/ssl_tls.c:2334: handshake timeout
lib/mbedtls/library/ssl_tls.c:3939: mbedtls_ssl_fetch_input() returned -26624 (-0x6800)
lib/mbedtls/library/ssl_cli.c:1526: mbedtls_ssl_read_record() returned -26624 (-0x6800)
0000773461 [comm.dtls] ERROR: handshake failed -6800
0000773463 [comm.protocol.handshake] ERROR: handshake failed with code 17
0000773463 [system] WARN: Cloud handshake failed, code=17
0000773713 [system] INFO: Cloud: disconnecting
0000773713 [system] TRACE: Close Attempt
socketClose(0)
773.712 AT send 12 "AT+USOCL=0\r\n"
773.754 AT read OK 6 "\r\nOK\r\n"
socketFree(0)
0000773755 [system] TRACE: socket_close()=success
0000773755 [system] INFO: Cloud: disconnected
0000774557 [system] INFO: Cloud: connecting
0000774557 [system] TRACE: sparkSocket Now =-1
0000774557 [system] TRACE: HAL_FLASH_Read_ServerAddress() = type:1,domain:$id.udp.particle.io,ip: 46.100.105.36, port: 65535
774.558 AT send 56 "AT+UDNSRN=0,\"33003e000351353337353037.udp.particle.io\"\r\n"
775.820 AT read + 28 "\r\n+UDNSRN: \"54.164.79.148\"\r\n"
775.832 AT read OK 6 "\r\nOK\r\n"
0000775834 [system] INFO: Resolved host 33003e000351353337353037.udp.particle.io to 54.164.79.148
socketSocket(UDP)
775.833 AT send 18 "AT+USOCR=17,5684\r\n"
775.875 AT read + 13 "\r\n+USOCR: 0\r\n"
775.885 AT read OK 6 "\r\nOK\r\n"
Socket 0: handle 0 was created
0000775886 [system] TRACE: socketed udp=1, sparkSocket=0, 1
0000775888 [system] TRACE: connection attempt to 54.164.79.148:5684
0000775888 [system] INFO: Cloud socket connected
0000775888 [system] INFO: Starting handshake: presense_announce=0
0000775890 [comm.protocol.handshake] INFO: Establish secure connection
0000775912 [comm.dtls] TRACE: restore size mismatch 1: 0/220
0000775914 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=2
0000775916 [system] TRACE: send 107
socketSendTo(0,54.164.79.148,5684,,107)
775.917 AT send 37 "AT+USOST=0,\"54.164.79.148\",5684,107\r\n"
775.969 AT read > 3 "\r\n@"
776.019 AT send 107 "\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00^\x01\x00\x00R\x00\x00\x00\x00\x00\x00\x00R\xfe\xfdT*\x1a\x1c%\b\x817\x93\x85\x1f\xef\x1e\xf6rl\x81w\x9f\x1c\xca\xad\x18\x82`:\xe6bo(\xc9\v\x00\x00\x00\x04\xc0\xae\x00\xff\x01\x00\x00$\x00\r\x00\x06\x00\x04\x04\x03\x03\x03\x00\n\x00\x04\x00\x02\x00\x17\x00\v\x00\x02\x01\x00\x00\x13\x00\x02\x01\x02\x00\x14\x00\x02\x01\x02"
776.067 AT read + 17 "\r\n+USOST: 0,107\r\n"
776.077 AT read OK 6 "\r\nOK\r\n"
0000776195 [app] INFO: *********************** System.resetReason(): 0
0000776195 [app] INFO: *********************** System.freeMemory(): 87512
776.857 AT read + 17 "\r\n+UUSORD: 0,60\r\n"
Socket 0: handle 0 has 60 bytes pending
776.869 AT send 16 "AT+USORF=0,813\r\n"
776.919 AT read + 98 "\r\n+USORF: 0,\"54.164.79.148\",5684,60,\"\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x00\x00/\x03\x00\x00#\x00\x00\x00\x00\x00\x00\x00#\xfe\xfd Y\x0fN\xac\x04\xfail\x83C\xc1zA\x8f\x8f^\x10>\x81jv\xf44\x17\xe4\xce\x8b\xefs\xa47\x1a\""
776.935 AT read UNK 2 "\r\n"
776.945 AT read OK 6 "\r\nOK\r\n"
0000776946 [system] TRACE: received 60
0000776946 [system] TRACE: send 139
socketSendTo(0,54.164.79.148,5684,,139)
776.947 AT send 37 "AT+USOST=0,\"54.164.79.148\",5684,139\r\n"
776.999 AT read > 3 "\r\n@"
777.049 AT send 139 "\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x01\x00~\x01\x00\x00r\x00\x01\x00\x00\x00\x00\x00r\xfe\xfdT*\x1a\x1c%\b\x817\x93\x85\x1f\xef\x1e\xf6rl\x81w\x9f\x1c\xca\xad\x18\x82`:\xe6bo(\xc9\v\x00 Y\x0fN\xac\x04\xfail\x83C\xc1zA\x8f\x8f^\x10>\x81jv\xf44\x17\xe4\xce\x8b\xefs\xa47\x1a\x00\x04\xc0\xae\x00\xff\x01\x00\x00$\x00\r\x00\x06\x00\x04\x04\x03\x03\x03\x00\n\x00\x04\x00\x02\x00\x17\x00\v\x00\x02\x01\x00\x00\x13\x00\x02\x01\x02\x00\x14\x00\x02\x01\x02"
777.099 AT read + 17 "\r\n+USOST: 0,139\r\n"
777.109 AT read OK 6 "\r\nOK\r\n"
777.949 AT read + 18 "\r\n+UUSORD: 0,118\r\n"
Socket 0: handle 0 has 118 bytes pending
777.961 AT send 16 "AT+USORF=0,813\r\n"
778.011 AT read + 157 "\r\n+USORF: 0,\"54.164.79.148\",5684,118,\"\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x01\x00i\x02\x00\x00]\x00\x01\x00\x00\x00\x00\x00]\xfe\xfdY\x0fN\xad\x05C=\x14\x8c@\xc7\xdbg\x8b\xb2(MX\x93\xd9\x13\xac8\xbe\x10Zi^\x10X\xdf\x1a \x00z-\xc3\x86\xb3\x91D\xa0\xea\x04\xd4\\\xa9dS\x03/N\xc6\xc5*\xb7\x92\xd1=\xd7r7y-V\xc0\xae\x00\x00\x15\xff\x01\x00\x01\x00\x00\v\x00\x02\x01\x00\x00\x13\x00\x01\x02\x00\x14\x00\x01\x02\""
778.031 AT read UNK 2 "\r\n"
778.041 AT read OK 6 "\r\nOK\r\n"
0000778042 [system] TRACE: received 118
778.041 AT read + 18 "\r\n+UUSORF: 0,228\r\n"
Socket 0: handle 0 has 228 bytes pending
778.053 AT send 16 "AT+USORF=0,813\r\n"
778.113 AT read + 209 "\r\n+USORF: 0,\"54.164.79.148\",5684,170,\"\x16\xfe\xfd\x00\x00\x00\x00\x00\x00\x00\x02\x00\x9d\f\x00\x00\x91\x00\x02\x00\x00\x00\x00\x00\x91\x03\x00\x17A\x04\xb2\xa1\x1a\x03\xd45Hv\x85H\xf27\x9cD\x03\x86\x04\x0e\xd1K\x00+oG\n\xdf\x9f\xb9\xe1\xdf)v\x1eBC\x11\x81p\x80,\xc6\xacF\x1a5\xc7}\xae\xa9pj\r\xde\xee<\xbaJ\x02\x89\x9e\x85MIY\x04\x03\x00H0F\x02!\x00\xe0pO0\xb2\xd5d8\xedre.\xe3\xfe\xe6\xf7\x8a\x8c9\x9f\x1e\xcd\xae\xe5\x85\xbf:;\xa0\xad\xf6\xdc\x02!\x00\x9e\tZ\xe9\xd1l\xda2\xb6B\x16\xb0\xbeYA\xb1\x8a\xa7\xd9\f\xe2w\x87u6\xe2\x1b\x17\x11\xcdu\xca\""
778.137 AT read UNK 2 "\r\n"
778.147 AT read OK 6 "\r\nOK\r\n"
0000778148 [system] TRACE: received 170
I am looking for advice in how to fix this, or of how to better debug the problem. It is quite difficult to reproduce the issue, as the devices can run flawlessly for long periods of time . After hours of scouring the community forum (and Google) I’ve found several helpful advice, although but one reference to the same problem. As far as I can tell, this is the same issue as posted on github here (by @BDub?): https://github.com/spark/firmware/issues/1175, although what we’re experiencing seems to be more severe since the devices are effectively bricked in the eyes of our customers.
If this is indeed a known problem, are there other potential solution than as outlined in the github issue? Unfortunately, restricting the devices only to 3G (as suggested, if I understood it correctly) is not an option. [Edit: Although, possibly as last resort]
Thank you in advance!
Ps.
We’re running:
- Firmware 0.6.1
- SYSTEM_THREAD(ENABLED), and
- SYSTEM_MODE(SEMI_AUTOMATIC).
To clarify what I mean by freezing (please correct me if freezing is not the correct term to use here) and what the setup is – the Electron is enclosed in a plastic casing, and attached to the Electron are a display and a few buttons for user interaction. The display still shows “reasonable” data after the freeze (i.e. no glitches – numbers from a valid state when the freeze occurred is shown), but the device does not respond to button clicks or data sent from the Particle cloud. Nothing is logged, and of course nothing changes on the display. Also, the cyan LED is solid on the Electron (red LED is off), rather than breathing (although this might be coincidental, sometimes the LED is actually completely off while the device/display is still on). [UPDATE: managed to produce another freeze, this one had it’s LED stuck in solid Green]. The serial monitor does not get disconnected when this freeze happens, rather the logs simply stops flowing.
Also, I suspect the code running on our Electrons needs to be posted, although I’m quite paranoid and reluctant to do so straight away (even though it is definitely not rocket science). If the issue can be solved without posting the code it would be tremendous (if not possible, well then by all means I’ll post necessary code).
Futhermore, I do have more logs as well as photographs etc. if needed.
Ds.