Flash via Serial or OTA fails but OK via DFU


#12

Hard to say.
I guess you haven’t set LOG_LEVEL_ALL in the current firmware?
Having that on would be key to also see what’s going on behind the scenes in system during OTA too.


#13

@ScruffR, drats, have set default to LOG_LEVEL_WARN… but will change moving forward!


#14

@ScruffR, (am still ingesting the mesh kit announcements… way to go!)

The errant Photon has come in from the customer, and confirmed that it could NOT be flashed OTA.

Performed the following experiments:

  • It has my app v3.02.5 installed
  • OTA failed <=== BAD
  • Flashed the same app v3.02.5 via SERIAL
  • OTA failed <=== BAD
  • Flashed new app v3.09.0 via SERIAL
  • App v3.02.5 still installed <=== BAD
  • Flashed new app v3.09.0 via DFU
  • Long time cycling through flashing magenta then momentarily flashing cyan
  • Eventually went into safe mode <=== BAD
  • Flashed new app v3.09.0 via DFU again
  • Went into safe mode <=== BAD
  • Flashed earlier app v3.06.0 via DFU
  • Long time cycling through flashing magenta then momentarily flashing cyan
  • Eventually went into safe mode <=== BAD

So that is the story so far.

To me it looks like the Photon is stuffed as we say in the vernacular and that it is not a misbehaved application per se.

What next?

I really want to recover this Photon if possible so as to develop a procedure for the next time this happens (which looking back on it, have suffered before a few times).


#15

How about flashing some other firmware (e.g. Tinker) and then try the OTA?
Also adding a LOG_LEVEL_ALL logging and posting the output would be good.

BTW, is this a product device? Is it marked as developer device in the product?
If “Yes” to first and “No” to second question, I’d guess the magenta phase may be reverting to the product firmware after flashing “unauthorised” firmware.


#16

Updating its system firmware might not hurt either, should you not have tried that already.


#17

@Moors7,

Flashed firmware 0.8.0-rc.1 via DFU
Flashed new app v3.09.0 via OTA
Safe mode <=== BAD
Flashed tinker-0.6.0-rc2 via DFU
Safe mode <=== BAD
Flashed firmware 0.7.0-rc.7 via DFU
Safe mode <=== BAD
Flashed tinker-0.6.0-rc2 via DFU
Safe mode <=== BAD

Flashed tinker-0.6.0-rc2 via DFU
Safe mode <=== BAD but this time looked at the console:

{“data”:"{“f”:[],“v”:{},“p”:6,“m”:[{“s”:16384,“l”:“m”,“vc”:30,“vv”:30,“f”:“b”,“n”:“0”,“v”:100,“d”:[]},{“s”:262144,“l”:“m”,“vc”:30,“vv”:30,“f”:“s”,“n”:“1”,“v”:206,“d”:[]},{“s”:262144,“l”:“m”,“vc”:30,“vv”:26,“f”:“s”,“n”:“2”,“v”:206,“d”:[{“f”:“s”,“n”:“1”,“v”:206,"":""},{“f”:“b”,“n”:“0”,“v”:101,"":""}]},{“s”:131072,“l”:“m”,“vc”:30,“vv”:0,“d”:[]},{“s”:131072,“l”:“f”,“vc”:30,“vv”:0,“d”:[]}]}",“ttl”:60,“published_at”:“2018-02-13T23:47:52.341Z”,“coreid”:“xxxxxxxxxxxxxxxxxxxxxxxx”,“name”:“spark/status/safe-mode”}
spark/device/last_resetpin_resetFebruary 14th at 10:47:51 am
{“data”:“pin_reset”,“ttl”:60,“published_at”:“2018-02-13T23:47:51.077Z”,“coreid”:“xxxxxxxxxxxxxxxxxxxxxxxx”,“name”:“spark/device/last_reset”}
device came onlineno dataFebruary 14th at 10:47:50 am
{“data”:“online”,“ttl”:60,“published_at”:“2018-02-13T23:47:50.541Z”,“coreid”:“xxxxxxxxxxxxxxxxxxxxxxxx”,“name”:“spark/status”}

Flashed tinker-0.6.0-rc2 via OTA and got this console entry:

{“data”:“8BBF72A9B422C8599918B1B7B15FE5B13CB8F1125BDB13BF32629490A39CB99A”,“ttl”:60,“published_at”:“2018-02-3T23:55:07.065Z”,“coreid”:“xxxxxxxxxxxxxxxxxxxxxxxx”,“name”:“spark/device/app-hash”}

Does these console entries give us any clues? It is smelling very much like a flash memory fault…

@ScruffR, in answer to question, it is not part of a product. Witness lack of flashing magenta above.

Note am not persisting to save $ but to find a troubleshooting method!


#18

Hi @UMD,

I cant downgrade the redbear duo as its on 3.0.0 and downgrading to 2.4.0 requires access to dfu-util, My devices are in production, in the real world.

I used to be able to update them, so i’m still not sure whats changed.

Im going to try and update the firmware to 3.1.0 and see if that makes any difference.


#19

A bit more info,

i got stack trace logging on the device now , so see the image, The error is logged after the flash status of success is

    {  
   "data":{  
      "r":"error",
      "u":{  
         "s":524288,
         "l":"t",
         "vc":94,
         "vv":30,
         "u":"FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
         "f":"u",
         "n":"1",
         "v":8,
         "p":88,
         "d":[  
            {  
               "f":"s",
               "n":"2",
               "v":8,
               "_":""
            }
         ]
      }
   }
}

spark/trace/device-event-spark/device/ota_result

Any ideas ???


#20

This is a very peculiar behaviour. These status reports do bear some info, but whether or not they indicate some incompatibility is difficult for me to say as I don’t know of a comprehensive doc that lists good vs. bad combinations of firmware modules (@rickkas7?).

Maybe you could also provide the output of particle serial inspect (forgot serial before :blush:) with your device in Listening Mode.

Another thing you could/should try is flashing the bootloader for 0.8.0-rc.1 (and 0.7.0-rc.4+) which has to be done via particle flash --serial not --usb.

However, 0.8.0-rc.1 (and I guess rc.2 too) has an open issue reported by me about connection loss detection


Electron OTA Flash Times Out, Even in Safe Mode
#21

@ScruffR,

Spooky thing is that I had spotted this gem from @bko (the bit that starts “One thing that trips …” last night:

which got me thinking - the evidence (returning to safe mode) was pointing to what @bko was saying.

So tried this:

Flashed firmware 0.8.0-rc.1 via DFU (ie back to a known position)
Safe Mode
Recompiled app against 0.8.0-rc.1 to and flashed new app v3.10.0 via OTA
Safe mode <=== BAD

Then tried @ScruffR suggestion:

Flashed bootloader-0.8.0-rc.1-photon via SERIAL
It reset (good) to breathing CYAN! But app was not running
Flashed flashed new app v3.10.0 built against 0.8.0-rc.1 via OTA
Breathing CYAN and app running! <=== SUCCESS

Lessons:

  • Beware of system and app mismatch
  • Beware of system and bootloader mismatch
  • Back to safe mode is indicative of a mismatch
  • When saving app binary files, include the system version, eg “0.8.0-rc.1+APP-3v100.bin
  • Persistence
  • Ask for help!

So, case closed with thanks to the team!


Redbear OTA update not persistant
#22

-PS

@ScruffR, you mentioned “particle inspect” - I had never hear of it and just tried it from the command line, but “inspect” was not recognised. Wondering how I get to use it?


#23

Have you made sure the CLI is on the latest version?


#24

So ,

The ONLY way i can get OTA updates to work on the red-bear is to Put it into DFU mode, Upload the latest firmware (0.3.1) and then it will work.

However , this means I can to visit every site we have installed at , remove the units and reinstall them (obviously a very expensive exercise)


#25

Sorry, it should be particle serial inspect (corrected in original post too)


#26

@ScruffR, got it.

Moving right along, I now have another opportunity to chase the issue on a different Photon… This time it is not in production, so have an opportunity to dig in even more now.

I was able to run the serial inspect once on this unit:

    Platform: 6 - Photon
    Modules
      Bootloader module #0 - version 200, main location, 16384 bytes max size
        Integrity: PASS
        Address Range: PASS
        Platform: PASS
        Dependencies: PASS
      System module #1 - version 300, main location, 262144 bytes max size
        Integrity: PASS
        Address Range: PASS
        Platform: PASS
        Dependencies: PASS
          System module #2 - version 204
      System module #2 - version 300, main location, 262144 bytes max size
        Integrity: PASS
        Address Range: PASS
        Platform: PASS
        Dependencies: PASS
          System module #1 - version 300
          Bootloader module #0 - version 101
      User module #1 - version 5, main location, 131072 bytes max size
        UUID: 00DF321072740C53930XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX96871
        Integrity: PASS
        Address Range: PASS
        Platform: PASS
        Dependencies: PASS
          System module #2 - version 300
      empty - factory location, 131072 bytes max size

@ScruffR, did you notice this line in the above:

System module #2 - version 204

Whereas all the other lines are:

System module #2 - version 300

Anyhow thereafter, “particle serial inspect” was returning “serial: Serial timed out” and could not program via serial nor OTA, just DFU.

I think this is issue with serial flashing may be a piece in the puzzle…

Thinking about the serial issue… the unit is soft configured to use USBSerial1 for logging. I disabled logging and reset, which meant that only Serial is exposed. I can now get output from “particle serial inspect”.

Next, without changing anything else, I *enabled logging on USBSerial1 and reset; “particle serial inspect” was returning “serial: Serial timed out” and, I assume, would not be able to program via serial.

I disabled logging and reset, which meant that only Serial is exposed. I can now get output from “particle serial inspect”. Unfortunately serial and OTA still FAILED, but DFU worked (of course). So drats, there goes that theory.

Anyhow, will raise a bug report regarding USBSerial1 and access to the Serial port in listening mode.


#27

The dependencies block just states what the minimum requirement for this module is.
So module #1 - version 300 needs at least module #2 - version 204 and since the installed version of #2 is 300 that dependency is satisfied.

The user module version 5 seems a bit old tho’

What’s your CLI version?
What OS are you running?
If you are using Windows and had serial drivers installed before USBSerial1 was published, then you might have a driver problem.


#28

@ScruffR,

The PC at work is Windows 7. Not at work today, so can’t reply re particle-cli version, but it should be pretty well up to date.

I wrote a quick program to test the “listen mode” + USBSerial1 no go theory at home using my MacBook. It looked like the same issue was raised, but then realised that I may have not reset after opening and closing a terminal. So, let’s leave the “serial theory” aside for now…

Can you expand on this comment:

The user module version 5 seems a bit old tho’

Is that part of a system update?


#29

I just checked and you can forget that comment about v5. I get the same version with my device running 0.8.0-rc.2 for system and application and 0.8.0-rc.1 bootloader (no rc.2 available).

This is my output

PS C:\Particle\Photon> particle serial inspect
Platform: 6 - Photon
Modules
  Bootloader module #0 - version 200, main location, 16384 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 301, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 204
  System module #2 - version 301, main location, 262144 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 301
      Bootloader module #0 - version 101
  User module #1 - version 5, main location, 131072 bytes max size
    UUID: A0EBA9C53D14AFEF5EA6BE4ECE11DEC775FA80938B959499F70E323C5AF702B3
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #2 - version 301
  empty - factory location, 131072 bytes max size

And I can confirm that OTA still works.

But with Windows 7 I’d recommend you remove the serial drivers from the system (even for non-present devices) and reinstall the most recent drivers (not the ones in the linked post)
https://github.com/particle-iot/windows-device-drivers/releases/download/v6.1.0.51/particle_drivers_6.1.0.51.exe

Also flash a really minimal app firmware to avoid any trouble from that side
e.g.

void setup() {
  pinMode(D7, OUTPUT);
}

void loop() {
  digitalWrite(D7, (millis() >> 2) & 0x88);
}

BTW, do you call USBSerial1.begin() in STARTUP() or in setup()?

I also had some problem with USBSerial1 activated where it killed the default Serial connection. But that too was a driver problem as both drivers requested the same COM port, once I uninstalled the device-port relation and let Windows assign the ports anew, I got two seperate ports for the two interfaces which worked as expected.


#30

@ScruffR, FYI on Window am using particle_drivers_6.1.0.68.exe because it handles two USB serial ports.

Interesting finding: with the (second) errant device, I could not get it to connect to my home WiFi… (and could not clear WiFi credentials ascertained by debug).

It was running 0.8.0-rc.1 bootloader and system:

Platform: 6 - Photon
Modules
Bootloader module #0 - version 200, main location, 16384 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 300, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #2 - version 204
System module #2 - version 300, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 300
Bootloader module #0 - version 101
User module #1 - version 5, main location, 131072 bytes max size
UUID: B952EEB1E22D62FB4E517307803CD6FF57C4E3EAB512C62EA63728EBFB7F3D76
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #2 - version 300
empty - factory location, 131072 bytes max size

To overcome the WiFi credential issue, I decided to downgrade to 0.6.3:

Platform: 6 - Photon
Modules
Bootloader module #0 - version 11, main location, 16384 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 109, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #2 - version 109, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 109
User module #1 - version 5, main location, 131072 bytes max size
UUID: B952EEB1E22D62FB4E517307803CD6FF57C4E3EAB512C62EA63728EBFB7F3D76
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: FAIL
System module #2 - version 300
empty - factory location, 131072 bytes max size

Then set the WiFi to my home AP via the serial console and it was only now that I was able to connect successfully (good).

Upgraded back to 0.8.0-rc.1 bootloader and system. And it could still connect to my home WiFi (good).

It was now back to being able to be programmed OTA (good).

Thinking that we might have a clue to the issue. Is the inability to set a new WiFi credential a side effect or root cause of the OTA/Serial flashing issue?

This ticket is getting quite convoluted - am not expecting an answer to “Why?”. Will continue to report back on findings the next time I get the issue so as to gather evidence.


#31

Have good news.

Code was using I2C routines for an NFC module. As I could enable and disable use of this module via remote config, I was able to see if OTA worked with or without the reader routines.

Findings: OTA failed when the reader was enabled and worked when disabled, ie the NFC routines were not playing well with the OTA process - no doubt starving the main loop.

I fixed the issue by simply calling Particle.process() in a wait loop which would have been entered a lot during I2C comms with the NFC module.

Case closed!