Automatic over-the-air for cellular devices not working for 3G electron

The new OTA updates rolled out for cellular devices seem to be causing problems with my Electrons. When flashing OTA, the device goes through the magenta blinking, but then the RGB LED either goes solid magenta for 2-3 minutes or turns off completely for 2-3 minutes. After that the device restarts and connects normally, except that it has not loaded the new firmware.

I have manually checked the device version with particle identify in the CLI, and it is version 7.0. I also checked that the Web IDE targets the matching version 7.0 when compiling. When I then flash OTA, it goes through the same sequence: timeout, blinking magenta, solid magenta or RGB off, and a restart without the new firmware.

Is anyone else having problems with the new OTA updates? Before the new OTA update was implemented, everything was running smoothly for me using the same procedure. I have also tried sending a previous version of firmware (6.2) just to see what happens, and I get the same result. Flashing OTA from the CLI gives the same problem.

For more information: I am running my Electrons on 3rd party SIMs with a 30 second keep-alive and Blynk firmware. I am not sure if that has anything to do with it.

****EDIT not sure if this has anything to do with the new automatic updates, but am leaving original post for reference

Sounds similar to another guy on here who is also using a 3rd party SIM and having issues with the update.

@rickkas7 is going to test this on his end to see if the 3rd party SIM is causing issues. For him, the Particle SIM worked.

Do you have a link to the thread? I did a search and did not see it.

Thank you very much @RWB!


I have tracked this down somewhat more. I found that this problem only happens when my code contains functions. For example:

if (pump1 == true){
        pump1timer();
}

If I comment out all functions throughout my code, the problem goes away and my OTA updates work again as normal. Also note that this same code was running, and updating OTA, without problems before the cellular update changes.

I have also manually downgraded to firmware 6.4 (verified with the CLI), compiled my code for 6.4 with the Web IDE, and flashed it locally over serial. After that, the OTA updates still have the same problem. (I also tried 6.2 after trying 6.4.)

The symptoms you are experiencing are usually rooted in the application firmware locking shared resources when using multithreading (e.g. SYSTEM_THREAD(ENABLED) or Software Timers).
In such cases the attempted OTA update and one of the application threads demand a shared resource and deadlock.

Without seeing your actual code it’s difficult to say and listing all the possible reasons that may lead to such a behaviour doesn’t fit the scope of this topic.

If you have located some suspect functions, you could subscribe to the system event that signals a pending update and set a flag that prevents the call of these functions.
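A rough sketch of that flag idea, written as plain C++ so the logic can be checked off the device. The names updatePending, onUpdatePending, loopOnce and pump1Calls are made up for illustration, and the body of pump1timer() is only a placeholder. On an Electron the handler would presumably be registered in setup() with System.on() for the pending-update event, and would take the event arguments the Device OS reference specifies; treat that registration as an assumption to verify against the docs.

```cpp
#include <atomic>

// Hypothetical flag: set once the system announces a pending OTA update.
std::atomic<bool> updatePending{false};

// Stand-ins for the application state mentioned in the post.
bool pump1 = true;
int pump1Calls = 0;  // counts how often the suspect function actually ran

// Suspect function from the post; this body is only a placeholder.
void pump1timer() { ++pump1Calls; }

// Handler for the "update pending" system event.
// Assumption: on a device this would be hooked up in setup() via System.on()
// with the event and handler signature given in the Device OS reference.
void onUpdatePending() { updatePending = true; }

// One pass of loop(): skip the suspect call once an update is pending.
void loopOnce() {
    if (updatePending) {
        return;  // leave the system free to process the OTA transfer
    }
    if (pump1) {
        pump1timer();
    }
}
```

If OTA succeeds with the guard in place, that points at the guarded functions; if it still fails, the cause is probably elsewhere.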


@ScruffR thank you for your help. Maybe I’m not understanding the way code is flashed to the Electron, but I ran some tests:

If I flash the suspect code to the Electron locally with the CLI and then flash a different user code OTA, it works.
If I flash the suspect code locally with the CLI and then try to flash the same code again OTA, it does not work.
If I have Tinker or other user code loaded and flash a different user code OTA, it works.
If I have Tinker or other user code running and try to flash the suspect code OTA, it fails.

Because I cannot flash the suspect code OTA even while Tinker is running, I believe it is not a shared resource problem. Or am I missing something about how OTA flashing works?

Also, I am not using SYSTEM_THREAD(ENABLED).

How big is the binary of your “suspect” code, and how big were your other test binaries?
Try flashing this sketch and then watch what the serial log output shows during an OTA update:

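// Needed for cellular_credentials_set()
#include "cellular_hal.h"

// Placeholder: replace with the APN of your 3rd party SIM
#define APN "your.apn.here"

STARTUP(cellular_credentials_set(APN, "", "", NULL));
SYSTEM_THREAD(ENABLED)

// Send trace-level system logs to USB serial
SerialLogHandler traceLog(LOG_LEVEL_TRACE);

void setup()
{
  // 30 second keep-alive, as already used with the 3rd party SIM
  Particle.keepAlive(30);
}

void loop() { }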

@ScruffR the serial log length is too large to post. How can I post the file?

Here is where I think it fails:

0000120142 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 0
0000120142 [comm] WARN: handle received message failed - aborting transfer
0000120144 [system] INFO: Send spark/device/ota_result event
{"r":"error"}
0000121144 [comm] WARN: handle received message failed - aborting transfer
0000121144 [system] INFO: Send spark/device/ota_result event
{"r":"error"}
0000122146 [comm] WARN: handle received message failed - aborting transfer
0000122146 [system] INFO: Send spark/device/ota_result event
{"r":"error"}
0000123148 [comm] WARN: handle received message failed - aborting transfer
0000123148 [system] INFO: Send spark/device/ota_result event
     1.636 AT read  +   14 "\r\n+CIEV: 2,2\r\n"
     1.646 AT read OK    6 "\r\nOK\r\n"
     1.646 AT send      11 "AT+CMEE=2\r\n"
     1.656 AT read OK    6 "\r\nOK\r\n"
     1.656 AT send      19 "AT+CMER=1,0,0,2,1\r\n"
     1.668 AT read OK    6 "\r\nOK\r\n"
     1.668 AT send      15 "AT+IPR=115200\r\n"
     1.678 AT read OK    6 "\r\nOK\r\n"
     1.778 AT send      10 "AT+CPIN?\r\n"
     1.788 AT read  +   16 "\r\n+CPIN: READY\r\n"
     1.798 AT read OK    6 "\r\nOK\r\n"
0000001798 [system] INFO: Sim Ready
0000001798 [system] INFO: ARM_WLAN_WD 1

Here is the file in its entirety

The missing chunks seem to point to the underlying issue that eventually caused the update to fail.
Maybe @rickkas7 has some input on that.

@rickkas7 @ScruffR here is another attempt with a better formatted serial log. Tera Term is acting strange for me for some reason and keeps crashing my PC, so I used the Arduino serial monitor.