For a week or so (possibly since the 1.0.0 release) we have noticed some inconsistencies in both handshaking and OTA update confirmations with Electrons.
Handshaking:
On the Particle console, a device’s connections do not always appear even though it connects and sends messages, which makes the console unreliable for showing the actual state of the devices.
OTA updates:
We update devices, they reset, connect, and report their current version (which has been updated) to our DB, but Particle keeps trying to update them anyway, as if they hadn’t updated… so the updates keep failing and the devices keep resetting.
Detail: we’re using SYSTEM_MODE(MANUAL) with SYSTEM_THREAD(ENABLED).
Additional questions:
Does a firmware update cause a reboot, or should we handle that ourselves?
Is resetReason still RESET_REASON_UPDATE (if the reset is caused by the system)?
OTA updates seem to have always been a sensitive topic with quite a few inconsistencies, specifically with SYSTEM_THREAD enabled, which we HAVE to use in order to avoid getting blocked on connection-related operations. Is there any plan to prioritize this and have it working consistently and efficiently? I fear Particle’s credibility on the market may be seriously impacted by consistent upgrade failures…
Firmware update will call a System reset from the system thread context. So unless you have resets disabled or do something funky in a reset handler for the system reset event, it should reset as normal.
However, the resetReason for firmware updates has always been RESET_REASON_USER in my experience. I would call it a bug, but it has never impacted me enough to complain about it.
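For anyone who wants to check what their own devices report after an OTA update, a minimal sketch along these lines logs the reset reason at boot. It assumes a Device OS version that supports FEATURE_RESET_INFO and only builds with the Particle toolchain. The serial messages are illustrative, and which value you actually see after an update may differ between versions.

```cpp
#include "Particle.h"

SYSTEM_MODE(MANUAL);
SYSTEM_THREAD(ENABLED);

// Reset info is only recorded if this feature is enabled early in boot.
STARTUP(System.enableFeature(FEATURE_RESET_INFO));

void setup() {
    Serial.begin(9600);

    // Compare the stored reset reason against the two values discussed above.
    int reason = System.resetReason();
    if (reason == RESET_REASON_UPDATE) {
        Serial.println("Reset reason: firmware update");
    } else if (reason == RESET_REASON_USER) {
        Serial.println("Reset reason: user-requested reset");
    } else {
        Serial.printlnf("Reset reason code: %d", reason);
    }
}

void loop() {
}
```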
I also use System Thread enabled. When devices are getting multiple updates, are they already in a Product, or is this after you add them to a Product? I’ve experienced adding a device to a Product causing a reflash even if the firmware version is already correct, at least recently.
I also have experienced handshakes showing up less consistently on the console.
Unfortunately, without being able to be more specific about your OTA issues, I think it will be hard for Particle to prioritize fixing any of them. Most people seemingly don’t use System Thread, so folks like you and me end up being slightly on the edge case side. It’s still not an officially / fully supported use case to my knowledge.
What other connections are you using? Do you have a TCP connection to other services or do you only use Particle cloud? Other services using the modem can sometimes present challenges in OTA stability.
I agree that not showing code will be taken as not being precise enough about my issue, but I was not really looking for help here, just mentioning a defect in the product, mostly.
That said, it’s a bit regrettable that Particle doesn’t consider System Thread a major goal, since it’s imperative for any real-time design. My Electrons drive sensors that are sampled every minute, but data is generally only sent once an hour, and saving energy between samples is essential… we’re working with large companies in the oil and water industries, and efficiency and reliability are key. The issues with the system so far have cast a shadow over the Particle brand…
The dependency on Particle is so annoying at times that I’ve already thought several times about keeping the hardware but rebuilding the whole system firmware to be more open and less dependent, or not dependent at all, on Particle… the only blocking point so far being the lack of time… but that may change in the future…
Since you used the category for Troubleshooting I had assumed you were hoping to find some solutions to those things.
System Thread is currently in a solid spot, but it requires you to have a reasonably deep understanding of the System Firmware to use it effectively in a production environment (which is the intent of not formally fully supporting it as far as I know).
You talk about dependency on Particle being problematic, and that is something I don’t really understand given the context you have provided. Is it just OTA updates? If you are having meaningful, repeatable problems with your OTA updates, the root cause is very likely to be something in your Application Firmware.
If you have other problems they may be solvable as well. If you can provide some context and details I and others can help out. I have hundreds of devices in industrial environments that are using System Thread and OTA updates and, while it’s not perfect, I’ve learned how to use it in an effective way.
The community can be very powerful in making progress towards solutions, and Particle Support is always there too if you want to work directly with them on a specific issue.
System Thread is definitely a supported mode and, I agree, essential for any real-time system. You may find additional Timer() or Thread() objects necessary to hit that goal, as certain functions you might call from the Application Thread can still block while waiting for the System Thread to become available.
For your devices which are stuck in an update loop, please see the following link.
The firmware version that you enter into this screen must match what you just compiled into your binary. Madness will ensue otherwise!
You may be in the madness case. If the underlying firmware binary is coded as V10 but listed as V11 in the product console, that could explain the behavior you are seeing.
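To make the mismatch concrete, here is a toy model of the loop in plain C++. It is only a sketch under a stated assumption: that the cloud keeps flashing while the version the device reports differs from the version the release is listed under in the console. The function names and the attempt cap are hypothetical, and this is not Particle’s actual cloud logic.

```cpp
// Toy model only (an assumption, not Particle's actual cloud logic):
// the cloud keeps flashing while the version the device reports differs
// from the version the release is listed under in the console.
bool cloudWantsToFlash(int reportedVersion, int consoleVersion) {
    return reportedVersion != consoleVersion;
}

// Returns how many flash attempts happen before the cloud is satisfied,
// capped at maxAttempts so a mismatch does not loop forever in the model.
int countFlashAttempts(int compiledVersion, int consoleVersion, int maxAttempts) {
    int reportedVersion = 0;  // hypothetical version of the old firmware
    int attempts = 0;
    while (attempts < maxAttempts &&
           cloudWantsToFlash(reportedVersion, consoleVersion)) {
        // After a flash and reset, the device reports whatever was compiled
        // into the binary (e.g. via PRODUCT_VERSION), not the console label.
        reportedVersion = compiledVersion;
        ++attempts;
    }
    return attempts;
}
```

With a matching pair, countFlashAttempts(11, 11, 5) returns 1: one flash and the cloud is satisfied. With the binary coded as V10 but listed as V11, countFlashAttempts(10, 11, 5) runs into the cap of 5, which is the endless update-and-reset loop described above.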