OTA update fails in Argon

So, several months later I'm back with this issue. We are moving to a P2-based product, and we are seeing the same issue, only more often.

Sometimes after flashing a few different versions over USB (when developing and testing), a device will recover and start taking OTA updates for a while and then fail again.

As I mentioned before, on startup we check a flag in a config file stored in internal flash, and depending on that flag we either disable updates with System.disableUpdates() or leave them enabled. If the flag indicates that we have to update firmware, we do not call disableUpdates() and we skip most of the code in the main loop. This way, we create an "Update Mode" where our product does not perform any function, but just waits for the update to complete.
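To make the startup logic concrete, here is a minimal sketch of the flow described above, written so that it compiles on a desktop. It is not our actual firmware: `FakeSystem` is a hypothetical stand-in for the Device OS `System` object, and `App`, `updateRequested`, and `workCycles()` are names made up for illustration. On the device, the flag would come from the config file and the call would go to the real `System.disableUpdates()`.

```cpp
// Hypothetical stand-in for the Device OS `System` object (assumption:
// only the two calls used here are modeled; updates start out enabled).
struct FakeSystem {
    bool updatesOn = true;
    void disableUpdates() { updatesOn = false; }
    bool updatesEnabled() const { return updatesOn; }
};

// Illustrative model of the application's setup()/loop() split.
class App {
public:
    explicit App(FakeSystem& sys) : sys_(sys) {}

    // updateRequested models the flag read from the config file on startup.
    void setup(bool updateRequested) {
        updateMode_ = updateRequested;
        if (!updateMode_) {
            // Normal operation: block OTA updates.
            sys_.disableUpdates();
        }
        // In Update Mode, disableUpdates() is never called,
        // so updates stay enabled.
    }

    void loop() {
        if (updateMode_) {
            // Update Mode: skip the product logic and just wait for the OTA.
            return;
        }
        ++workCycles_;  // placeholder for the normal product work
    }

    bool inUpdateMode() const { return updateMode_; }
    int workCycles() const { return workCycles_; }

private:
    FakeSystem& sys_;
    bool updateMode_ = false;
    int workCycles_ = 0;
};
```

With the flag set, `setup()` leaves updates enabled and `loop()` does nothing. With the flag cleared, updates are disabled and the normal work runs.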

In this case, the LED blinks magenta (as it should while it is updating), and in the console I see
spark/flash/status started
and a few seconds later
spark/flash/status success

The P2 reboots, but it starts up in the old version. It shows that it has an update pending, and if we enable the update on the unit (either by our normal method or by Force Enable OTA), the cycle repeats. This time, the blinking magenta during the "update" is just a blip (less than 0.5 seconds), and then the device reboots, again into the original version. I still get spark/flash/status started and then spark/flash/status success, but they are at most 1 second apart (as per the timestamps).

So it looks like the firmware downloads, it somehow verifies OK (I'm assuming that's what "success" means), but then upon reboot, some other check fails and it decides to boot to the older version. When doing it again, it doesn't look like it's even trying to download, but checks something (a hash?) and decides that it is OK ("success") and reboots. But again, after rebooting it seems to ignore the newly downloaded image and starts the old one.

If I comment out the disableUpdates() line, the device gets stuck in a boot loop (it reboots almost as soon as it connects to WiFi), similar to what is described here:

In that post, the problem mysteriously fixed itself (as I have occasionally seen happen to me too). Because this product has several design changes, I expect that some code tweaking will be necessary, and therefore I will have to push OTA updates. Needless to say, I need this to work.