Updating from DeviceOS 1.4.4 to 1.5.0-rc.1 -> SOS Panic 1

@avtolstoy,

Have updated an existing production P1 device from DeviceOS 1.4.4 to the just-released 1.5.0-rc.1 by simply compiling against 1.5.0-rc.1 and OTA downloading in the usual way.

The outcome is SOS 1 blink - HARD FAULT, which comes up immediately upon reset.

As I could enter safe mode, I recompiled targeting v1.4.4 and was able to download ok.

My program then successfully ran (good), but it was reporting back as running 1.5.0-rc.1.

Next I recompiled targeting v1.5.0-rc.1 and OTA downloaded. It reported back the last reset reason as Firmware Update Timeout.

Seems to be working okay now.

Will report back on any issues encountered.

Harry,

Thank you for the report!

The outcome is SOS 1 blink - HARD FAULT , which comes up immediately upon reset.

Did the hardfault come immediately after the user application built for 1.5.0-rc.1 was applied to the device? Or somewhere after the safe mode healing was started?

Would you also mind sharing a bit more info about your application? What APIs are used etc?

I am unfortunately unable to reproduce this issue, so it may potentially be caused by whichever functionality your application is using compared to mine.

@avtolstoy, yes, the hard fault came up immediately after the application was downloaded.

The interesting thing was that a hard reset (i.e. pressing the reset button) did not cure the fault; it kept flashing red. This suggests that the fault occurs very early in my app's startup, or that the app does not start at all.

In relation to my app, the hardware is interfaced to the usuals: OLED display, Neopixel LEDs, PN7150 NFC reader, keypad, SD CARD, barcode reader (via UART) and has many software features.

The app compiles with a text segment of 106412 bytes and has 17K bytes of RAM remaining when running.

Good news is that I have other devices I can repeat the exercise with. Back soon with the second unit's results.

That’s very curious, because the application does not run if it has unmet dependencies or for example it doesn’t pass the integrity check. You can also replicate this behavior by putting your device into safe mode using the SETUP/MODE button (as if entering the bootloader, but holding just until it starts blinking magenta and releasing).

See earlier note about safe mode - was able to enter this. Am about to run the test again on a different unit.

Repeated the experiment on another device running DeviceOS 1.4.4 which had the same outcome after OTA updating to 1.5.0-rc.1 -> SOS #1

Here is output from the Particle console events tab during the update process:

spark/flash/status	started	EVT-00017A	2/7/20 at 5:03:33 pm
spark/status/safe-mode	{"f":[],"v":{},"p":8,"m":[{"s":16384,"l":"m","vc":30,"vv":30,"f":"b","n":"0","v":400,"d":[]},{"s":262144,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":1500,"d":[{"f":"s","n":"2","v":207,"_":""}]},{"s":262144,"l":"m","vc":30,"vv":30,"f":"s","n":"2","v":1406,"d":[{"f":"s","n":"1","v":1406,"_":""},{"f":"b","n":"0","v":400,"_":""}]},{"s":131072,"l":"m","vc":30,"vv":26,"u":"5F0A2C296207EDC7E791668DA836700445DAB425FD45144B78EC8126129D7CE1","f":"u","n":"1","v":6,"d":[{"f":"s","n":"2","v":1500,"_":""}]}]}	EVT-00017A	2/7/20 at 5:03:30 pm
spark/device/diagnostics/update	{"device":{"system":{"uptime":5,"memory":{"total":82944,"used":29000}},"network":{"connection":{"status":4,"error":0,"disconnects":0,"attempts":1,"disconnect":0},"signal":{"rssi":-57,"strength":86,"quality":83.87,"qualityv":35,"at":1,"strengthv":-57}},"cloud":{"connection":{"status":1,"error":0,"attempts":1,"disconnect":0},"disconnects":0,"publish":{"rate_limited":0},"coap":{"unack":0}}},"service":{"device":{"status":"ok"},"coap":{"round_trip":506},"cloud":{"uptime":0,"publish":{"sent":2}}}}	EVT-00017A	2/7/20 at 5:03:29 pm
spark/device/last_reset	user	EVT-00017A	2/7/20 at 5:03:29 pm
particle/device/updates/enabled	true	EVT-00017A	2/7/20 at 5:03:29 pm
particle/device/updates/forced	false	EVT-00017A	2/7/20 at 5:03:29 pm
spark/status	online	EVT-00017A	2/7/20 at 5:03:29 pm
spark/flash/status	success	EVT-00017A	2/7/20 at 5:03:19 pm
spark/flash/status	started	EVT-00017A	2/7/20 at 5:03:06 pm
spark/status/safe-mode	{"f":[],"v":{},"p":8,"m":[{"s":16384,"l":"m","vc":30,"vv":30,"f":"b","n":"0","v":400,"d":[]},{"s":262144,"l":"m","vc":30,"vv":30,"f":"s","n":"1","v":1406,"d":[{"f":"s","n":"2","v":207,"_":""}]},{"s":262144,"l":"m","vc":30,"vv":30,"f":"s","n":"2","v":1406,"d":[{"f":"s","n":"1","v":1406,"_":""},{"f":"b","n":"0","v":400,"_":""}]},{"s":131072,"l":"m","vc":30,"vv":26,"u":"5F0A2C296207EDC7E791668DA836700445DAB425FD45144B78EC8126129D7CE1","f":"u","n":"1","v":6,"d":[{"f":"s","n":"2","v":1500,"_":""}]}]}	EVT-00017A	2/7/20 at 5:03:03 pm
spark/device/diagnostics/update	{"device":{"system":{"uptime":6,"memory":{"total":82944,"used":29000}},"network":{"connection":{"status":4,"error":0,"disconnects":0,"attempts":1,"disconnect":0},"signal":{"rssi":-52,"strength":96,"quality":100,"qualityv":40,"at":1,"strengthv":-52}},"cloud":{"connection":{"status":1,"error":0,"attempts":1,"disconnect":0},"disconnects":0,"publish":{"rate_limited":0},"coap":{"unack":0}}},"service":{"device":{"status":"ok"},"coap":{"round_trip":517},"cloud":{"uptime":0,"publish":{"sent":2}}}}	EVT-00017A	2/7/20 at 5:03:02 pm
particle/device/updates/forced	false	EVT-00017A	2/7/20 at 5:03:02 pm
spark/device/last_reset	user	EVT-00017A	2/7/20 at 5:03:02 pm
particle/device/updates/enabled	true	EVT-00017A	2/7/20 at 5:03:02 pm
spark/status	online	EVT-00017A	2/7/20 at 5:03:02 pm
spark/flash/status	success	EVT-00017A	2/7/20 at 5:02:53 pm
spark/flash/status	started	EVT-00017A	2/7/20 at 5:02:30 pm

Do you want me to try something?

More information: Powered down, then powered up again via the USB connection.

The LED goes white for less than 0.5s and then flashes RED with SOS.

It continues to cycle through the same pattern, i.e.:

  • white for < 0.5s
  • SOS 1
  • SOS 1
  • repeat

As far as I can see, the device started to get healed and went through the system-part1 update at least. I can’t tell whether system-part2 was applied to the device or not.
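To make that reading of the second safe-mode event concrete, here is a rough sketch that models its module table and lists any dependency that is not met. The struct and function names are made up for illustration, and the field meanings ("f" = module type, "n" = module number, "v" = version, "d" = dependencies) are inferred by matching the event against the serial inspect dumps in this thread. It also assumes a dependency is satisfied when the installed version is at least the required one, which is consistent with the PASS results shown in those dumps.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one dependency entry ("d") in the safe-mode event.
struct Dependency {
    char function;   // 'b' bootloader, 's' system part, 'u' user app
    int index;       // module number ("n")
    int minVersion;  // required version ("v")
};

// Hypothetical model of one module entry ("m") in the safe-mode event.
struct Module {
    char function;
    int index;
    int version;
    std::vector<Dependency> deps;
};

// Returns one description per dependency that no installed module satisfies.
std::vector<std::string> unmetDependencies(const std::vector<Module>& modules) {
    std::vector<std::string> unmet;
    for (const Module& m : modules) {
        for (const Dependency& d : m.deps) {
            bool satisfied = false;
            for (const Module& candidate : modules) {
                if (candidate.function == d.function && candidate.index == d.index &&
                    candidate.version >= d.minVersion) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                unmet.push_back(std::string(1, m.function) + std::to_string(m.index) +
                                " needs " + std::string(1, d.function) +
                                std::to_string(d.index) + " >= " +
                                std::to_string(d.minVersion));
            }
        }
    }
    return unmet;
}

// Module table copied from the 5:03:30 pm safe-mode event.
std::vector<Module> snapshotAt050330() {
    return {
        {'b', 0, 400, {}},
        {'s', 1, 1500, {{'s', 2, 207}}},
        {'s', 2, 1406, {{'s', 1, 1406}, {'b', 0, 400}}},
        {'u', 1, 6, {{'s', 2, 1500}}},
    };
}
```

Calling unmetDependencies(snapshotAt050330()) reports a single entry: the user module still needs system part 2 at version 1500 while that part is at 1406, which matches system-part1 having been updated and system-part2 not yet.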

See earlier note about safe mode - was able to enter this

Can you provide particle serial inspect output while in safe mode (i.e. with the device in the state where it would crash if it were not in safe mode)?

I’m in the same position (P1 device running fine on 1.4.4, instant Hard Fault on 1.5.0-rc.1) except I did a DFU update instead of OTA.
Whatever is happening is happening extremely early, as my STARTUP code is never executed when compiled for 1.5.0-rc.1

STARTUP(
  pinMode(WKP, INPUT);
  if(!digitalRead(WKP)){
    System.sleep(SLEEP_MODE_DEEP);
  }
  WiFi.selectAntenna(ANT_AUTO);
);

Per your request @avtolstoy, here’s the serial inspect info from Safe Mode

Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: *** #Think this needs censored, can edit it in if it's not identifying info
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1500

@Trikkstar, glad I am not the only one with this issue!

There is a workaround, which is to place the device in SAFE MODE, then download via OTA. Have you tried that?

I think there is a clue with the fact that even a DFU firmware update has a bad outcome.

I have had similar issues in the past with OTA updates where the only way to get one to work was to download TINKER first, then follow up with the OTA update of the application.

By the way, how do you know that your STARTUP code is not executed? This could be a critical piece of the puzzle. The time it takes for the SOS flashing to start after power up in my situation is pointing to your conclusion.

Wondering if anyone out there can put the jigsaw puzzle together on this?

@avtolstoy, apologies for the delay in responding.

As requested, output from serial inspect from listening mode.

Here is serial inspect from the same type of device, running DeviceOS 1.4.4 (which should be the same build level as the errant device prior to the DeviceOS upgrade):

Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1406, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1406, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1406
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: E8184.....
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1406

Here is serial inspect from the errant device, running DeviceOS 1.5.0-rc.1:

Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: FF158...
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1406

I would like to emphasise that DeviceOS 1.5.0-rc.1 is working fine for me. It is just the upgrade process which was problematic.

I haven’t tested OTA updating but will do that as soon as I can (likely Tuesday).
I know my STARTUP code isn’t running as I have it set to deep sleep the device if the power switch is off (WKP read low), and this behavior isn’t happening when running the 1.5.0-rc1 compiled firmware.

@TrikkStar @UMD Thanks for providing additional info.

I think we are dealing with two separate problems here.

@TrikkStar I have reproduced your issue. It is related to our changes for better SPI thread-safety between system and application on Gen 3; however, some of those changes affected Gen 2 devices as well. The key problem on Gen 2 devices is the undefined order in which the global C++ constructors are called (the STARTUP() macro creates an object in the global namespace whose constructor contains the code in the macro). The application-specific mutex for the SPI class is not yet initialized at the point pinMode() is called in your STARTUP() macro, and it needs to be for the safety checks on a pin configuration to succeed.

We’ll fix this in 1.5.0-rc.2. As a temporary workaround, please move any code that interacts with GPIO out of STARTUP() macro and perhaps into setup().
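For anyone who wants to see the constructor-ordering hazard in isolation, here is a desktop-only sketch. It is not Device OS code: FAKE_STARTUP, FakeSpiLock, fakePinMode and fakeSetup are hypothetical stand-ins, and the macro only imitates the behaviour described above (code wrapped in the constructor of a global object). Within a single file, globals are constructed in the order they are defined, so the hook is deliberately placed first to reproduce the unlucky order every time; across separate files the order is not something the application can rely on.

```cpp
// Plain flags with constant initializers are set before any constructor runs,
// so they are always safe to read from a global constructor.
bool g_lockReady = false;
bool g_startupSawLock = false;
bool g_setupSawLock = false;

// Hypothetical stand-in for a pin call that needs the lock to exist first.
// It simply reports whether the lock had been constructed at call time.
bool fakePinMode() { return g_lockReady; }

// Imitation of a STARTUP-style macro: the code runs inside the constructor
// of a global object, i.e. during global construction.
#define FAKE_STARTUP(code)          \
    struct FakeStartupHook {        \
        FakeStartupHook() { code; } \
    };                              \
    FakeStartupHook g_fakeStartupHook

// Defined first in this file, so it is constructed before the lock below.
FAKE_STARTUP(g_startupSawLock = fakePinMode());

// Hypothetical stand-in for the SPI lock; its constructor runs afterwards.
struct FakeSpiLock {
    FakeSpiLock() { g_lockReady = true; }
};
FakeSpiLock g_fakeSpiLock;

// Shape of the workaround: do the pin work after all globals are constructed.
void fakeSetup() { g_setupSawLock = fakePinMode(); }
```

In this sketch the hook runs while the lock does not exist yet, so g_startupSawLock stays false, whereas a later call to fakeSetup() sees the lock and sets g_setupSawLock to true. That is the reason moving GPIO calls from STARTUP() into setup() sidesteps the problem.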

@UMD, I will still need a bit more information about your application to figure this out, unfortunately. It’s also somewhat surprising that according to serial inspect you are running an application built for 1.4.4, even though supposedly your device went through a normal OTA update. If your device is experiencing hard faults in this state (1.4.4 application, 1.5.0-rc.1 system parts), we have something that broke backwards compatibility. I would appreciate it if you could provide a minimal application that reproduces the behavior.
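As a rough illustration of how that reading can be automated, the sketch below pulls the system-part version that the user module depends on out of a pasted serial inspect dump. The function name is hypothetical, and the parsing assumes exactly the text layout shown in the dumps in this thread (a "User module" header, then a "Dependencies:" line, then an indented "System module #N - version V" line); it is not an official parser.

```cpp
#include <sstream>
#include <string>

// Returns the system-part version the user module depends on, or -1 if the
// expected lines are not found. Layout assumptions are described above.
int userModuleSystemDependency(const std::string& inspectOutput) {
    std::istringstream lines(inspectOutput);
    std::string line;
    bool inUserModule = false;
    bool inDependencies = false;
    const std::string marker = "- version ";

    while (std::getline(lines, line)) {
        // Strip leading indentation so prefixes can be compared directly.
        const std::size_t first = line.find_first_not_of(" \t");
        if (first == std::string::npos) {
            continue;  // blank line
        }
        const std::string text = line.substr(first);

        if (text.rfind("User module", 0) == 0) {
            inUserModule = true;
            inDependencies = false;
        } else if (inUserModule && text.rfind("Dependencies:", 0) == 0) {
            inDependencies = true;
        } else if (inUserModule && inDependencies &&
                   text.rfind("System module", 0) == 0) {
            const std::size_t pos = text.find(marker);
            if (pos != std::string::npos) {
                return std::stoi(text.substr(pos + marker.size()));
            }
        }
    }
    return -1;
}
```

Going by the dumps in this thread, a result of 1406 corresponds to an application built for 1.4.4 and 1500 to one built for 1.5.0-rc.1, so comparing the result with the reported system module versions shows which combination is actually on the device.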

@avtolstoy,

Not sure what is going on with the app built for 1.4.4 - would have been my testing. Starting again to ensure that we are on the same page, apologies for this.

FYI: here is my startup:

// ----------------------------
// Startup code that runs early
// ----------------------------

STARTUP(
        Keyboard.begin();           // Allows the HID device attach for the first time after boot with *both* Serial and Keyboard
        Serial1.begin(115200L);     // 2017-10-25 To stop, or at least reduce, spurious output upon start up
                                    // High baud rate seems to help too.

        // If EVT_II is using GPIO P1S6 for KEYPAD_ROW_1, must add
        // But we are not, but leave it in anyway
        System.disableFeature(FEATURE_WIFI_POWERSAVE_CLOCK);

        // Refer https://community.particle.io/t/application-softap-http-pages-issue/22499/4
        // Be sure to initialize the softAP pages in a STARTUP() macro so they
        // are setup *before* the device connects to the internet.
        //
        // If it is initialized in the setup() method, then SoftAP pages
        // won’t be available until the device has connected to the cloud.
        softap_set_application_page_handler(myPage, nullptr);
        );

Here is what I am doing right now on the errant device:

Serial inspect

Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: FF158A...
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1406
  • Device is now at a different location (ie different Access Point), was breathing white (expected)
  • Could not enter Listening mode… (bad)
  • Put it into Safe Mode
  • Could enter Listening mode (good)
  • Programmed in the new Access Point credentials
  • Safe Mode could connect with Particle - witnessed connection in Particle console (good)
  • Reset
  • Program started ok (good)
  • Device would NOT connect with access point, breathing white (bad) <-- Is this a clue?
  • Put the device back to Safe Mode so that I could OTA
  • Problem - flashing green - so could not connect with access point (but it did before)… (bad)
  • Could enter Listening mode (good)
  • Programmed in the new Access Point credentials again
  • Safe Mode could connect with Particle - witnessed connection in Particle console (good)
  • Did not reset this time as suspect that the program is doing something bad to the WiFi credentials
  • Compiled my app to target 1.5.0-rc.1
  • OTA
  • Immediate SOS #1 (bad)
  • Reset to Safe Mode
  • Entered Listening Mode
  • Serial Inspect now shows:
Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: 13BB56...
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1500
  • Reset (so as to go back to the application)
  • Immediate SOS #1 (bad)
  • Recompiled my app targeting v1.4.4
  • Reset to Safe Mode
  • App now works ok.
  • Back to safe mode
  • Back to listening mode
    Serial inspect
Platform: 8 - P1
Modules
  Bootloader module #0 - version 400, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: 05B40464D3B67D8FAF37C21203099EA8CBFA38E35E08B241DAC9D11366172F73
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1406

Conclusion

My app compiled for 1.5.0-rc.1 will SOS #1 with DeviceOS: 1.5.0-rc.1.
My app compiled for 1.4.4 will work okay with DeviceOS: 1.5.0-rc.1.

Hope this amount of detail is helpful.

I will now work on experiments with the app source code to see if I can get it to start okay when targeting 1.5.0-rc.1…

@avtolstoy, I commented out via #ifdef blocks

  • STARTUP() - made no difference…
  • setup() - made no difference…
  • loop() - made no difference…

You mentioned SPI - this is used by the OLED display and SDCARD interfaces.

I will see if I can reproduce the fault using minimal code.

@avtolstoy as I could not reproduce the fault, am thinking that it is best that I give you the following two binaries via PM (if you are still curious):

  • app compiled for 1.5.0-rc.1 which SOS #1 with DeviceOS: 1.5.0-rc.1.
  • app compiled for 1.4.4 which works okay with DeviceOS: 1.5.0-rc.1.

Of course my app won’t do much for you as all the I/O is different, but it should allow you to see the immediate panic fault for yourself. Hopefully you can apply some debugging tools to see what is going on.

Will send the code in a couple of hours.

PS - Am going to try updating the bootloader from version 400 to 501 as per the thread “Particle Photon Flash successful but nothing happening” and will report back on this too.

@UMD, I had this happen on an upgrade of an Electron 3G. For some reason, everything but the bootloader seemed to update. When I flashed the new bootloader and tinker via USB (the device is local) everything worked as expected.

HOWEVER, when I flashed my app compiled to v1.5.0-rc.1 it went straight to hard fault. Like you, when I flashed the SAME code but compiled to v1.4.4, it worked just fine. The code in this case is @rickkas7’s AssistNow example.

@peekay123, this is why we all need to get stuck into this issue - “breaking” upgrades need to be understood and squashed asap otherwise we could be stuck at a particular version!

The evidence (app does not even start) could well point to the compiler when it is targeting 1.5.0-rc.1…

Here is the list from Particle Device OS Updates Thread of the compiler/environment changes:

  • Enables C++14 chrono string literals for wiring APIs #1709
  • [Gen 3] Implements persistent antenna selection ( Mesh.selectAntenna() ) #1933
  • GCC 8 support #1971
  • Implements EnumFlags class for bitwise operations on C++ enum classes #1978
  • Adds missing platforms to manifest.json #1959
  • Prevents expansion of EXTRA_CFLAGS variable #2012
  • Fixes a call to objdump when the compiler is not in PATH #1961
  • [gcc] Ensure GPIO pinmap is initialized on first use #1963

Is it possible that one or more of these changes has an undesirable side effect?

Have just PM’d my two binaries to @avtolstoy for research.

@peekay123 I don’t know how complicated “AssistNow” is (nor what it does), but it strikes me that we may have an opportunity with its source code to work out the cause of the issue…

Updated bootloader from version 400 to 501:

Platform: 8 - P1
Modules
  Bootloader module #0 - version 501, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 207
  System module #2 - version 1500, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1500
      Bootloader module #0 - version 400
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: 4962D89....
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 1500

Issue remains…