Hello!
I have a 2G Electron (SARA G350) that has stopped being able to complete OTA updates.
We used to be able to flash code OTA with this same 3rd party SIM (and same application code) as of about 2 months ago. Since trying it at the beginning of this week, it has not worked once.
Here are some specifics
Device system firmware: 0.6.4 # Although we have also tested with 0.8.0-rc.4
Electron: 2G Sara-G350
SYSTEM_THREAD: not enabled # although we have tried both flavors
SYSTEM_MODE: MANUAL
Other maybe relevant info
- We are mainly doing work in Africa, but are in the United States (Bay Area CA) currently, where we are experiencing this issue.
- We’ve been able to flash the exact same application code in the past; this has stopped working recently
- Unfortunately, the exact time this stopped working is hard to say. All I can say for sure is that OTA flash updating was working before April 10th, and is not working as of June 25th
- Tiny code-bases (for example, tinker but changed for our 3rd party APN – as found in the example here: 3rd party SIM tips – successfully update sometimes, but fail other times (I didn’t test this extremely systematically, i.e. I don’t remember the device system FW version, but we saw both successes and failures)
Configurations
We have 2 different 3rd party SIMs on hand, from GeoSIM and eseye. We tested each of these, as well as a Particle SIM card.
We have tried this with all combinations of:
- SYSTEM_THREAD(ENABLED) and not
- 0.6.4 and 0.8.0-rc.4 (some testing with 0.7.0 but not systematically so I won’t report on that)
- Particle, geosim, and eseye SIM cards.
Results
- Every iteration with the Particle SIM flashed new code successfully (tested by watching output, and also by using a changed static
Particle.variable
to ensure new code was running). So, with all threading and FW version combos, particle SIM worked every time. - Every iteration with both the GeoSIM and the eseye SIM failed to flash.
Failed to flash
When I say failed to flash, it always starts successfully, but then does not complete.
output from device subscribe either looks like this:
{"name":"spark/flash/status","data":"started ","ttl":60,"published_at":"2018-06-28T22:46:56.830Z","coreid":"123"}
{"name":"spark/device/last_reset","data":"update_timeout","ttl":60,"published_at":"2018-06-28T22:48:44.513Z","coreid":"123"}
…or, it simply prints the first “started” event, but then never prints another update-related event.
Serial output
I logged output from every single iteration of this, but will just post 2 failure cases and 1 success case. The output (as far as I can tell, and I spent a lot of time looking through these logs) basically falls into one of 2 camps. I am only printing what I think is relevant (but let me know if more info would be useful vs just noisy).
Failure 1 - Single thread
One note: When using SYSTEM_THREAD(DISABLED); (not including anything related to system threading), it always stops downloading at the chunk with idx=15
.
... [ update starts, and starts receiving chunks ]
...
0000139911 [comm] TRACE: chunk
0000139912 [comm] TRACE: chunk idx=14 crc=1 fast=1 updating=1
0000139914 [comm.protocol] INFO: rcv'd message type=7
139.913 AT read + 18 "\r\n+UUSORF: 0,557\r\n"
Socket 0: handle 0 has 557 bytes pending
139.924 AT send 16 "AT+USORF=0,813\r\n"
140.024 AT read + 597 "\r\n+USORF: 0,\"54.152.148.189\",5684,557,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x00w\x02 \x00\x01\x00\x00\x00\x00\x00w\x10\"H\xf1\x97\xad\xbc\xe3\xa8\x95 \xd2\x9d#4\x87\x95V{Y\xee)[.\xdd|1%\xfd\xf5~\xc9=rP\xf3AH=\xb8]M\xa2f\xbb'\xd2UX\xe3\x92\xcd\xe5\xb4\x014\v*$:\x11E\xdf\xd1\xf1\x13\xc3\xab\xd2\xbcF\xb4L\xd1\xf7\x12\xc8\xa4^Qv6\x1eZ\x1a\xa7\xfdl3\x93\xa0^\xf4\xa6\x8d\x958\xa2d\x8e\x9aI\x81\x82\xd9(\x90\xc5P\xb0l \xb6\xbe\xc1\"\xcb\xad\x04\x81\xf2\xa8g\r\xcb\xc7\xd1\xf1\xfa\xe3\xd0\xed\xa4-\x8c\xd0\xcbQ\x03\xba\x05\xd6\xa9\x98\x83f*\xa5\xde\xbb\xb8\x16\xc1e\x04s\xaa\xde\x80\x90\x7fo6\xe9\xdf\x16\x95\xa8>\x05W9\xdb\x13\xc9\xa7\xbd\xa3\xed\xffaQ\xb6\x1bb\x8ey\x83\x91{\xc0\xc3\xb3\x7f\x00(X\x99\x9a)4:\b^\x909\xe9\xd2\"\xe6'\xc2!\xc1Nc\x06\xa4\x87\xc3I\x87\xe0@I\xa5\xa6\xe6\x02\xec\x83\xaf\xa5v\x1a\xee\x8d\xdbM\x91\x153g\xd7\x8c\xeer\xb9\xf0\xca1u&\xa4\xe3\xe5\xb9o\xcd\xd4B\x94\x91\x11\xf4\x94\xff@\xcb@XRQ\xb3\x8f*\x05vL\xc8z\xd9\xa2X~z\x8d\x87\xe96\xee(2\xe8ny\x0e}yd|\xe1V\xe7G\x1d\v\x0eU,\x86,f\xbf|\xbcu\x90S\x03\xed\x86\x12&\xc1\x1f\x00\x00v\x83\xbd\xcf\x97v\x1eG\x02Y\"\xc5\xee\x93\xc3:|\x96\x1e|\xd1\xc7\xd2z\xf1\xa2\xe2R=6\xfa\xcbD\xef\x98\x97Y\x02\xc9<\xc2J\\<\x80\x0eE\xda\x9d)\xe4Z]J(\x8fkf\xb1CJ\r\x15'\xee\xf9\x111-8\x7fH\xd3J}YH\xbe\x00\xd7$\xb1\xc4F\xdfj\xd8\xdag\x050\b\xa1nQ\x0eh\x11\x9cz3,I8\x02o1\xe4\xeb\xba\xd5\xa7\xf4h\xbd\xa6\x1f\xe1?#\x924\x19\xa5\b\xa4K6\xda\x8e\xcf\x9c-gB\x05z\a\xd2\xe9\x92\x9e\xa1\nU\x94\t\x01\x8e\x8d\xc3.\xe6\xcb\x18Lw\xea\x9f\x8fe\xc8Tj\xcf[\xd9Q\xd6#OQ\x80\xe8\xe6\v\xbb\xcb\x93!\xb6\xfd\x1c\x15\x04Z\x18_\xd05Q\xf4\x13\xe1\x8c6W\x04\xbf\xcc\x10\xe3\xabG\xf4\xcc\xbc\v\""
140.054 AT read UNK 2 "\r\n"
140.064 AT read OK 6 "\r\nOK\r\n"
0000140065 [system] TRACE: received 557
0000140066 [comm] TRACE: chunk
0000140067 [comm] TRACE: chunk idx=15 crc=1 fast=1 updating=1
0000140069 [comm.protocol] INFO: rcv'd message type=7
[ There is no more output. It simply hangs after idx=15 ]
Failure 2: System Threading ENABLED
This failure is similar, with 2 differences:
- when threaded, the application output continues to print out after the chunk receiving stops
- where the
idx
stops:- when threaded in 0.6.4, it always stops at
chunk idx=15
- when threaded in 0.8.0-rc.4, it always gets to
idx=15
, then jumps to some non-sequential index, at which point it stops (such aschunk idx=31
in the log output below)
- when threaded in 0.6.4, it always stops at
... [ update starts, we download some chunks ]
...
0000709960 [system] TRACE: received 557
0000709962 [comm] TRACE: chunk
0000709964 [comm] TRACE: chunk idx=15 crc=1 fast=1 updating=1
0000709968 [comm.protocol] INFO: rcv'd message type=7
710.066 AT read + 18 "\r\n+UUSORF: 0,557\r\n"
Socket 0: handle 0 has 557 bytes pending
710.078 AT send 17 "AT+USORF=0,1024\r\n"
710.178 AT read + 596 "\r\n+USORF: 0,\"54.205.74.202\",5684,557,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x003\x02 \x00\x01\x00\x00\x00\x00\x003\xdcw\xa8)\x1b\xde\xff\xa81\xb1b\xf6\xb6?\xe4U\xf2^\xcd\xa6g2qA\nB6u\xben\xc9\x9e\xc5/K\xd2\xcc\xddA\x8d\xaeb\xfd\x80\xad\xe2h|\xdb)\xaa\xa2f\xae\x10\xd9H9\xd5vh\xe5\xfa\xb2m\xedE\v\xaec+w\xf47\xbf\x93\x16\\&\a1\xc4\xbb\xecv\xe9\xd1\x93\x82M\xf2Z\xe2\x86\xb0m|>O\xf3(\xdc:\x1db8S\xbe\x98([0\xa3\xe6\x91\xae\xb8\x1d%\xca\xe0\x93\xf47\x99Wt\x8e\xb1\x8f\x7fR\b\xe4\xb31\xbbDF\x9e\x1d#\x19\x9a\x9a\x1e\xd9\x004E\x89\xfe\x84#\xbd\xe4\x16\xccw\r\x04\x86\xc3Z\x85R\x14/\xd4\xbe|\xac=\xf0\xe4Z\x91e\xe5+\x13\xe0\xd6\xe9\xc3\xbe\xad\xc7\xe3\"\xcd\x95b\v\x7f-w-\x84\xa33P(\x96Eo:,\x94\xdeD[g#\xb1\xc0\xc2}a\xe8\xb6=\xb9\xd3\xa6\xc5\xc5\x89\a)\x87\x85\xe9\xa4h\xc4v\x81q'\xd2\xcf\xd1tF`NS\x9aEMV\xd0\x9a\x11;\xa7\x92L8 \x82\xb7\x157\n8\xcdtS\x16\xe5\xf7\xad\xda\xfd\x17\xe5\xa8G\xf3:\xed\tU<\xff\x1b\xa7\xb6k]\ft\xe7k\xaff}\xd8|\x03'\"\x93\x11\xe4\xd2\xc4\xf4\xf9V\x10vl1p\x8e\x0f\x00\xf4\x80H\xa3\xec\x8f\x8by\x13~\xd4\x8d\xaa\xf7:\x005\x8e\xd3\xbf\xcfIz\xd8\xc5\x13{\xae\xdf\xd8\x00\xa3\xc1\x05/k\xf46\xcd\xcf\x81\xbf\xf3\x15\xfc\x93\x7f5\x1c5$\x11\x1a\xdcZW\xe6L\xe0\x11\xbf\xcf`\x1f)R\xc0\xdd\x9f!\x92\x81\xffa\xa4e\xf9!\xe3\xae\x95x)<\\r3$\xeea\xf2X\xf2\x18J\xa8\xf8#BdZ\xcegW\xa3\xd1\xfbb\xe7\x13\x97]a$\xd2v\xb07y\xdfz\xd2\xc4\xf4\xffo/\xb4\xc8\xcflGGs)\x84\x8c\xc2<\x9d\xe8\x1d\xb0\xa69\xf1O\xb9\xb5\x86\x06d\x84Z\xfaR\x8f\x84[\xe7\xb8\x18\xf7KQ\xbb\xac\xae\xceL0\xec1\x9f\x1e\xda,\xbeB\xbd<\xc0wS\x92\n\x15\xd6\x15\xa5\xe5\x1av\xff\x020\xa3Z\x15BakM\x06\x00F\f\xd7\x9d ;1\x8e*}:\""
710.226 AT read UNK 2 "\r\n"
710.236 AT read OK 6 "\r\nOK\r\n"
710.236 AT send 14 "AT+USORF=0,0\r\n"
710.278 AT read + 15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
710.290 AT read OK 6 "\r\nOK\r\n"
0000710292 [system] TRACE: received 557
0000710294 [comm] TRACE: chunk
0000710296 [comm] TRACE: chunk idx=31 crc=1 fast=1 updating=1
[ NOTE: output from application code continues to run here, which is different than single threaded case, but nothing related to cloud or cellular ]
Success
Come to think of it, I’m not going to print the success logs, unless someone wants to see them specifically. It doesn’t seem like it would be super useful. But, it succeeds in downloading all the chunks in order (I think there are 94 in our code total), starts to say downloaded successfully, and then closes serial output.
Last couple notes
I’ve looked at 2 similar posts, and tried the suggested things in each:
- Electron OTA Flash Times Out, Even in Safe Mode
- Automatic over-the-air for cellular devices not working for 3G electron
To answer a couple commonly asked questions:
- yes we do have software timers (I have tried disabling on
firmware_update_pending
, but to no avail) - yes we do particle publish. We have one specifically which in my view could be problematic, in a “cloud_status_connected” handler. I have commented this out at some points, but I don’t remember what other changes were in effect when I was trying that (and I never got OTA with a 3rd party SIM to work with different mutant versions of our code)
I have also run @rickkas7 's test code here. Notably, the modem output (with default times) would usually come up like this:
0000009370 [app] INFO: manufacturer=u-blox
0000009410 [app] INFO: model=
0000009450 [app] INFO: firmware version=
0000009530 [app] INFO: ordering code=SARA-G350-01S-00
0000009580 [app] INFO: IMEI=[redacted, but present and correct]
0000009620 [app] INFO: IMSI=
0000009670 [app] INFO: ICCID=[present and correct]
0000009670 [app] INFO: attempting to connect to the cellular network...
0000019501 [system] INFO: Sim not inserted? Detecting...
0000020831 [system] INFO: Sim Ready
0000020831 [system] INFO: ARM_WLAN_WD 1
0000021671 [system] INFO: ARM_WLAN_WD 2
0000021671 [system] INFO: CLR_WLAN_WD 1, DHCP success
0000021671 [app] INFO: connected to the cellular network in 12001 milliseconds
0000021672 [app] INFO: running cellular tests
However, if I extended the MODEM_ON_WAIT_TIME_MS
to 8000 instead of 4000, I get correct output:
0000065227 [app] INFO: manufacturer=u-blox
0000065277 [app] INFO: model=SARA-G350
0000065327 [app] INFO: firmware version=08.70
0000065377 [app] INFO: ordering code=SARA-G350-01S-00
0000065427 [app] INFO: IMEI=[correct]
0000065477 [app] INFO: IMSI=u-blox
0000065527 [app] INFO: ICCID=[correct]
Thanks in advance for any help or insights!