I’ve been working on automating the flashing and testing of Electrons. I’m having a problem where every device is consistently stuck flashing cyan until I power cycle it.
Here is the manufacturing process leading to this:
1. Plug the Electron into the computer via USB (the device is also connected to the sensor peripherals needed for testing)
2. Electron is automatically put into DFU mode via po-util
3. Electron is flashed with the three parts of System Firmware v0.6.4
4. Electron is flashed with my Application Firmware
5. Electron is automatically put into Serial mode via po-util
6. Particle Identify is automatically run and parsed for the device ID and SIM ICCID
7. SIM is activated
8. Electron is automatically pulled back out of Serial mode via po-util
9. Script waits for the device to send valid data to my MQTT server
10. More things…
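For anyone scripting the identify step, here is a rough sketch of how the output could be parsed in Python. It does not depend on the exact wording the CLI prints: it just looks for a 24-character hex device ID and a 19 to 20 digit ICCID starting with 89. The function name `parse_identify` and the sample text are made up for illustration and are not taken from the original script.

```python
import re

# Particle device IDs are 24 hex characters; ICCIDs start with 89 and
# are 19 or 20 digits long.
DEVICE_ID_RE = re.compile(r"\b[0-9a-fA-F]{24}\b")
ICCID_RE = re.compile(r"\b89\d{17,18}\b")


def parse_identify(output):
    """Return (device_id, iccid) found in the identify output text."""
    device = DEVICE_ID_RE.search(output)
    iccid = ICCID_RE.search(output)
    if device is None or iccid is None:
        raise ValueError("device ID or ICCID not found in identify output")
    return device.group(0).lower(), iccid.group(0)


# Illustrative sample only; the real output may be worded differently
# and these identifiers are invented.
SAMPLE = """Your device id is 1f0034000747343232363230
Your IMEI is 353162070000001
Your ICCID is 8934076500002589174
"""

device_id, iccid = parse_identify(SAMPLE)
print(device_id, iccid)
```

Raising on a missing field lets the script stop before the SIM activation step instead of activating with a bad ICCID.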
In the Waiting For Validation state (#9), the device connects to MQTT just fine and starts sending data.
However, the device enters and remains in the flashing cyan state (not connected to the Particle cloud, but in theory still trying). The interesting thing is that the timestamps being sent to my server are all negative integers.
Normally, before the device syncs for the first time since power-down, the timestamp starts at an epoch time in the vicinity of 944006421. However, the timestamps I receive are along the lines of -1141883740.
If I reset the device, this state doesn’t change. However, once I power cycle the device, it immediately begins working as expected, returning times such as 1531520655.
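On the test-script side, the three kinds of timestamp described here can be told apart with a simple check against the server clock. This is only a sketch: the name `classify_timestamp`, the labels, and the 300 second tolerance are my own assumptions, not part of the original script.

```python
import time


def classify_timestamp(ts, now=None, max_skew_s=300):
    """Classify a device timestamp (seconds since the Unix epoch).

    "corrupt"  : zero or negative, like the -1141883740 values above
    "synced"   : within max_skew_s of the server clock
    "unsynced" : positive but far from the server clock, for example
                 the pre-sync values around 944006421
    """
    if now is None:
        now = int(time.time())
    if ts <= 0:
        return "corrupt"
    if abs(ts - now) <= max_skew_s:
        return "synced"
    return "unsynced"


# Values from the post, checked against a server clock of 1531520700.
print(classify_timestamp(-1141883740, now=1531520700))
print(classify_timestamp(944006421, now=1531520700))
print(classify_timestamp(1531520655, now=1531520700))
```

A "corrupt" result is the case that so far has only been cleared by a full power cycle, so the script could use it as the trigger for one.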
So, what I’m thinking is that somehow the time isn’t syncing properly when the device is hitting the network, and it’s causing the devices to get rejected from beginning the handshake with the Particle cloud. This happens with every single device using my procedure above. Any thoughts on what might be happening to cause this and how I could ensure a good handshake without a power-cycle?
I’m still interested in learning why this happens, but for now I have repurposed a Photon as a cloud-controlled power reset switch for whatever device I have connected, so it’s not an immediate issue anymore. This lets me add a full power reset to my Python programming/testing script.
It’s strange, though: this doesn’t seem to happen when I flash manually, so something in my automated process seems to be triggering the issue.
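A minimal sketch of how the script could trigger such a switch, assuming the Photon exposes a cloud function. The function name `powerCycle` and the argument `cycle` are hypothetical; the URL shape is the Particle Cloud API's device function call endpoint. The code only builds the request, so it can be inspected without touching the network.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.particle.io/v1"


def build_function_call(device_id, function_name, arg, access_token):
    """Build a POST request that calls a cloud function on a device."""
    url = f"{API_BASE}/devices/{device_id}/{function_name}"
    body = urllib.parse.urlencode({"arg": arg}).encode("ascii")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", f"Bearer {access_token}")
    return req


# "powerCycle" and "cycle" are placeholder names for whatever the
# switch firmware actually registers; the ID and token are dummies.
req = build_function_call(
    "0123456789abcdef01234567", "powerCycle", "cycle", "ACCESS_TOKEN"
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```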
Resuscitating an old thread: I have found a similar issue with Device OS 2.0.1. The process is slightly different, but the symptoms are the same.
I updated the firmware on one of my production devices remotely via the product’s interface, after which the Time.now() function started returning negative and 0 values, which messed up the timestamps of the data uploaded by it.
I tried reflashing and restarting remotely with no success. Calling Time.isValid() and Particle.syncTime() did not fix it, even after resetting the device.
Finally, I had to get someone to go to the device and disconnect and reconnect power, after which everything worked well with the same firmware.
Is there any way to delete/reset the Time Sync registers of the OS to something that actually forces the device to update the Time.now from the cloud? Why is it that only disconnecting power fixed the issue?
I’d guess that the STM32 RTC gets into a bad state and can’t be set. The time should be sent by the cloud after connecting; it doesn’t check the RTC first. Getting a log with LOG_LEVEL_TRACE would indicate whether the time is being received from the cloud. But in any case, manually synchronizing the time should do the same thing.
The reason I’m thinking the RTC is confused is that the RTC is powered in all sleep modes and across resets. It only gets reset when removing power completely.
I don’t know of a way to reset the RTC under program control, and it appears to be a rare occurrence.
Yeah as said above, sounds like the power reset was able to get the RTC out of the bad state it was in. If you’re really worried about this issue I’d imagine your only real production “fix” would be to do an external hardware watchdog connected to a power switch to power cycle the Electron completely (on 3rd gen you can simply pull down the EN pin instead).
But as stated, should be a super rare issue - I don’t think I’ve ever seen that happen in well over 1000 devices.
That said, I actually implemented my own basic timeserver to use as a backup in case the Particle cloud went down. I’m not sure whether you checked if setting the time manually worked, but you can even store your own offset between your millis() value and the epoch time in milliseconds, and use your personal time source whenever the device timestamp doesn’t seem valid. (I generally make sure it’s within the past year since the firmware release date and that Time.isValid() returns true; otherwise I use my own service to sync.)
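To make the fallback idea concrete, here is a small model of that logic in Python (on the device it would be C++ in the application firmware). Everything here is an assumption for illustration: the names `BackupClock` and `choose_timestamp`, and my reading of the validity window as "no earlier than the firmware release and no more than a year after it".

```python
ONE_YEAR_S = 365 * 24 * 3600


class BackupClock:
    """Keeps the offset between a millis() counter and epoch milliseconds."""

    def __init__(self):
        self.offset_ms = None

    def sync(self, server_epoch_ms, millis_now):
        # Store the offset once, when the backup timeserver answers.
        self.offset_ms = server_epoch_ms - millis_now

    def now_s(self, millis_now):
        if self.offset_ms is None:
            raise RuntimeError("backup clock has not been synced yet")
        return (self.offset_ms + millis_now) // 1000


def choose_timestamp(rtc_s, rtc_is_valid, release_s, backup, millis_now,
                     horizon_s=ONE_YEAR_S):
    """Use the RTC time if it looks plausible, else the backup clock."""
    plausible = release_s <= rtc_s <= release_s + horizon_s
    if rtc_is_valid and plausible:
        return rtc_s
    return backup.now_s(millis_now)


clock = BackupClock()
clock.sync(server_epoch_ms=1_600_000_000_000, millis_now=5_000)

# A negative RTC reading falls back to the backup clock.
print(choose_timestamp(-1141883740, True, 1_599_000_000, clock, 65_000))
# A plausible RTC reading is used as-is.
print(choose_timestamp(1_600_000_050, True, 1_599_000_000, clock, 65_000))
```

The window check matters because, as reported earlier in the thread, the validity flag alone did not catch the bad timestamps.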