OTA Repeatitly updating device when device is already at latest version

Hi, I’ve setup a simple OTA test product to test the OTA capability and have stumbled on a issue where the Cloud and Device continually update to the latest version even when the latest version is already on the device.
I’ve put the latest firmware (Version 12) on a marked as development device and then released the firmware.
Then the other production device connects and gets an event that it need to update and OTA’s successfully.
I see on the serial output that the device is now version 12. The device disconnects pretends to do other stuff and the reconnects.
When it reconnects I can see in the event viewer and the Product console that the cloud thinks its the older version and starts a new OTA.
Is there an issue with the Cloud or Device System firmware? The device is currently running version 1.2.1-RC2

Below is the firmware that I have loaded

PRODUCT_ID(9668);
PRODUCT_VERSION(12);

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);

bool isUpdating=false;
uint32_t startUpdate;
uint8_t cnt=0;
uint8_t a=10;

void doingUpdates(system_event_t event, int param){
    if (!isUpdating) {
      startUpdate=millis();
      isUpdating=true;
      Serial1.println("\nSystem updating\n");
    }
}

void setup() {
  Serial1.begin(57600);
  System.on(firmware_update, doingUpdates);
  Serial1.println("\rVersion 12");
}

String cycle[]={"-","\\","|","/"};

void displayDelay(int dly){
  int id=0;
  for (int l=0;l<dly;l++){
    if (l%5==0) Serial1.printf("\r%s",cycle[id++%4].c_str());
    delay(100);
    Particle.process();
  }
}

void loop() {
  
  displayDelay(80);
  Particle.connect();
  a=0;
  while(!Particle.connected()) {
    Particle.process();
    if (a%10==0)Serial1.print(">");
    delay(100);
    a++;
  }
  Serial1.println("\rConnected");
  if (cnt==8) {
    Particle.publish("spark/device/session/end", "", PRIVATE);
    Particle.connect();
    while(!Particle.connected()) {
      Particle.process();
      delay(100);
    }
  }
  a=100;
  Serial1.print("\r    ");
  while (!isUpdating && a>0) {
    if (a%10==0)Serial1.printf("\r%d ",a/10);
    Particle.process();
    delay(100);
    a--;
  }
  if (isUpdating) Serial1.println("\rNow Updating");
  a=0;
  while (isUpdating && (millis()-startUpdate)<120000) {
    if (a%10==0) Serial1.print(".");
    Particle.process();
    delay(100);
    a++;
  }
  displayDelay(10);
  Particle.disconnect();
  while(!Particle.disconnected()) {
    Particle.process();
    delay(100);
  }
  Serial1.println("\rDisconnected");
  WiFi.off();
  isUpdating=false;
  startUpdate=0;
  cnt++;
}

I think someone else just posted about your issue a few days ago.

It's currently being tracked internally and I've added your post to the issue. Nothing to report yet but if you have any more details feel free to post here or send me a PM. I am most interested if you can let me know when a device is actively in this state in case we can get some useful diagnostic information out of it.

Device is currently running with v12 firmware installed OTAing every 20seconds. PM with Device ID.

Latest observations.
So last night the Firmware settled down and stayed at Version 12 all night without repeatedly updating. Thinking maybe Particle fixed something in the background, I proceeded this morning to release Version 13. The production unit OTA successfully from V12 to V13, but have now returned to the repeating update state, where the cloud is reporting that the device is still on V12 and wants to update to V13, but the device itself has V13 installed and running.
Seems that there is some caching going on or something.

I’ve left the Device running again and will observer if it stabilizes.

Adding SerialDebug output maybe something in there might help.

0000000220 [system] INFO: Device exxxxxxxxxxxxxxxxxxxxxxb started
0000000221 [system] TRACE: Last reset reason: 20 (data: 0x00)
0000000228 [net.ifapi] TRACE: LwIP started
0000000250 [ot.api] INFO: OpenThread version: OPENTHREAD/0.01.00; Xenon; Jun  4 2019 13:26:22
0000000251 [ot.api] INFO: Max transmit power: 8
0000000265 [net.th] INFO: Creating new LwIP OpenThread interface
0000000266 [net.ifapi] INFO: Netif th1 added
0000000274 [net.ifapi] INFO: Netif dm2 added
0000000287 [net.ifapi] INFO: Netif dm2 deleted
0000000295 [net.esp32ncp] INFO: Creating Esp32NcpNetif LwIP interface
0000000310 [net.ifapi] INFO: Netif wl3 added
0000000318 [hal] TRACE: Heap: 76/170 Kbytes used
0000000332 [net.th] TRACE: OpenThread state changed: e0200
0000000336 [system.ctrl.ble] INFO: Device name: AxxxxxxxxxxX
0000000340 [system.nm] INFO: State changed: NONE -> DISABLED
Version 13
-0000001638 [ncp.at] TRACE: > AT
0000001639 [ncp.at] TRACE: < OK
0000002639 [hal] TRACE: NCP ready to accept AT commands
0000002639 [ncp.at] TRACE: > AT+CMUX=0
0000002641 [ncp.at] TRACE: < OK
0000002643 [gsm0710muxer] INFO: Starting GSM07.10 muxer
0000002665 [gsm0710muxer] INFO: Openning mux channel 0
0000002666 [gsm0710muxer] INFO: GSM07.10 muxer thread started
0000002738 [gsm0710muxer] INFO: Resuming channel 0
0000002738 [gsm0710muxer] INFO: Openning mux channel 1
0000002839 [gsm0710muxer] INFO: Resuming channel 1
0000002839 [gsm0710muxer] INFO: Resuming channel 1
0000002841 [ncp.at] TRACE: > AT
0000002891 [ncp.at] TRACE: < OK
0000002892 [ncp.at] TRACE: > AT+CWDHCP=0,3
0000002941 [ncp.at] TRACE: < OK
0000002941 [hal] TRACE: NCP state changed: 1
0000002942 [net.esp32ncp] TRACE: NCP event 1
0000002942 [ncp.at] TRACE: > AT+GETMAC=0
0000002991 [ncp.at] TRACE: < +GETMAC: "xx:xx:xx:xx:xx:xx"
0000002992 [ncp.at] TRACE: < OK
0000002992 [gsm0710muxer] INFO: Stopping GSM07.10 muxer
0000003014 [gsm0710muxer] INFO: Gracefully stopping GSM07.10 muxer
0000003014 [gsm0710muxer] INFO: Closing all muxed channels
0000003036 [gsm0710muxer] INFO: Closing mux channel 1
0000003036 [gsm0710muxer] INFO: Muxed channel 2 already closed
0000003039 [gsm0710muxer] INFO: Muxed channel 3 already closed
0000003058 [gsm0710muxer] INFO: Muxed channel 4 already closed
\0000003161 [gsm0710muxer] INFO: Sending CLD (multiplexer close down)
0000003211 [gsm0710muxer] INFO: Received response to CLD or timed out, exiting multiplexed mode
0000003211 [gsm0710muxer] INFO: GSM07.10 muxer thread exiting
0000003222 [gsm0710muxer] INFO: GSM07.10 muxer stopped
0000003238 [hal] TRACE: NCP state changed: 0
0000003238 [net.esp32ncp] TRACE: NCP event 1
/
>0000010693 [system.nm] INFO: State changed: DISABLED -> IFACE_DOWN
0000010697 [system.nm] INFO: State changed: IFACE_DOWN -> IFACE_REQUEST_UP
0000010700 [net.ifapi] INFO: Netif wl3 state UP
0000010713 [system.nm] INFO: State changed: IFACE_REQUEST_UP -> IFACE_UP
0000012018 [ncp.at] TRACE: > AT
0000012019 [ncp.at] TRACE: < OK
0000013019 [hal] TRACE: NCP ready to accept AT commands
0000013019 [ncp.at] TRACE: > AT+CMUX=0
0000013021 [ncp.at] TRACE: < OK
0000013023 [gsm0710muxer] INFO: Starting GSM07.10 muxer
0000013045 [gsm0710muxer] INFO: Openning mux channel 0
0000013046 [gsm0710muxer] INFO: GSM07.10 muxer thread started
0000013117 [gsm0710muxer] INFO: Resuming channel 0
0000013117 [gsm0710muxer] INFO: Openning mux channel 1
0000013218 [gsm0710muxer] INFO: Resuming channel 1
0000013218 [gsm0710muxer] INFO: Resuming channel 1
0000013220 [ncp.at] TRACE: > AT
0000013270 [ncp.at] TRACE: < OK
0000013271 [ncp.at] TRACE: > AT+CWDHCP=0,3
0000013320 [ncp.at] TRACE: < OK
0000013320 [hal] TRACE: NCP state changed: 1
0000013321 [net.esp32ncp] TRACE: NCP event 1
0000013324 [gsm0710muxer] INFO: Openning mux channel 2
0000013420 [gsm0710muxer] INFO: Resuming channel 2
0000013421 [hal] TRACE: Connecting to "particle"
>>0000015322 [ncp.at] TRACE: < WIFI CONNECTED
>0000016322 [ncp.at] TRACE: < OK
0000016322 [hal] TRACE: NCP connection state changed: 2
0000016323 [net.esp32ncp] TRACE: NCP event 2
0000016326 [net.esp32ncp] TRACE: State changed event: 2
0000016339 [net.ifapi] INFO: Netif wl3 link UP
0000016349 [system.nm] INFO: State changed: IFACE_UP -> IFACE_LINK_UP
>>0000017872 [hal] INFO: DNS server list changed
0000017873 [net.ifapi] TRACE: Netif wl3 ipv4 configuration changed
0000017874 [system.nm] TRACE: br_enabled=1
0000017879 [system.nm] INFO: State changed: IFACE_LINK_UP -> IP_CONFIGURED
0000017968 [system.nm] INFO: Checking gateway status with the device cloud
0000017968 [system] INFO: Cloud: connecting
0000017971 [system] INFO: Read Server Address = type:1,domain:$id.udp-mesh.particle.io
0000017995 [system] INFO: Loaded cloud server address and port from session data
0000018000 [system] TRACE: Address type: 1
0000018017 [system] TRACE: Cloud socket=0, family=2, type=2, protocol=17
0000018017 [system] INFO: Cloud socket=0, connecting to 34.237.176.160#5684
0000018039 [system] TRACE: Cloud socket=0, connected to 34.237.176.160#5684
0000018045 [system] TRACE: Updating cloud keepalive for AF_INET: 30000 -> 30000
0000018062 [system] TRACE: Applying new keepalive interval now
0000018083 [system] INFO: Cloud socket connected
0000018084 [system] INFO: Starting handshake: presense_announce=0
0000018089 [comm.protocol.handshake] INFO: Establish secure connection
0000018312 [ncp.at] TRACE: > AT+MVER
0000018349 [ncp.at] TRACE: < 5
0000018350 [ncp.at] TRACE: < OK
0000018352 [comm.dtls] WARN: session has 0 uses
0000018358 [comm.dtls] INFO: (CMPL,RENEG,NO_SESS,ERR) restoreStatus=0
0000018369 [comm.dtls] INFO: out_ctr 0,1,0,0,0,0,17,181, next_coap_id=3e6d
0000018375 [comm.dtls] INFO: app state crc: cached: fc057c31, actual: d0226356
0000018392 [comm.dtls] INFO: restored session from persisted session data. next_msg_id=15981
0000018414 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 2
0000018420 [comm.protocol.handshake] INFO: Sending HELLO message
>0000018800 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000018801 [comm.protocol.handshake] INFO: Handshake completed
0000018806 [system] INFO: Send spark/hardware/max_binary event
0000018817 [system] INFO: Send spark/device/last_reset event
0000018829 [system] INFO: Send subscriptions
0000018840 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000018849 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000018861 [comm] INFO: Sending TIME request
0000018873 [ncp.at] TRACE: > AT+CWJAP?
0000018900 [ncp.at] TRACE: < +CWJAP:"particle","xx:xx:xx:xx:xx:xx",1,-44
0000018901 [ncp.at] TRACE: < OK
0000018902 [comm.protocol] INFO: Posting 'M' describe message
0000018915 [comm.protocol] INFO: rcv'd message type=1
0000018927 [system] INFO: Cloud connected
0000018979 [system.nm] INFO: Checking gateway status with the device cloud
Connected
10 0000019080 [comm] INFO: Forcing a cloud ping
0000019181 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019181 [comm.protocol] INFO: message id 15983 complete with code 0.00
0000019186 [comm.protocol] INFO: rcv'd message type=13
0000019308 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019308 [comm.protocol] INFO: message id 15984 complete with code 0.00
0000019312 [comm.protocol] INFO: rcv'd message type=13
0000019428 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019429 [comm.protocol] INFO: message id 15985 complete with code 0.00
0000019433 [comm.protocol] INFO: rcv'd message type=13
0000019555 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019555 [comm.protocol] INFO: message id 15986 complete with code 0.00
0000019560 [comm.protocol] INFO: rcv'd message type=13
0000019673 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019673 [comm.protocol] INFO: message id 15987 complete with code 0.00
0000019677 [comm.protocol] INFO: rcv'd message type=13
0000019800 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000019800 [comm.protocol] INFO: message id 15988 complete with code 0.00
0000019805 [comm.protocol] INFO: rcv'd message type=13
0000019916 [comm.protocol] TRACE: Reply recieved: type=2, code=69
0000019916 [comm.protocol] INFO: message id 15989 complete with code 2.05
0000019920 [comm.protocol] INFO: Received TIME response: 1561062491
0000019939 [comm.protocol] INFO: rcv'd message type=12
9 0000020042 [comm.protocol] TRACE: Reply recieved: type=2, code=0
0000020042 [comm.protocol] INFO: message id 15990 complete with code 0.00
0000020065 [comm.protocol] INFO: rcv'd message type=13
0000020369 [ncp.at] TRACE: > AT+MVER
0000020401 [ncp.at] TRACE: < 5
0000020401 [ncp.at] TRACE: < OK
0000020402 [comm.protocol] INFO: Posting 'S' describe message
0000020405 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000020633 [ncp.at] TRACE: > AT+MVER
0000020651 [ncp.at] TRACE: < 5
0000020651 [ncp.at] TRACE: < OK
0000020652 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000020655 [comm.protocol] INFO: rcv'd message type=1
0000020778 [comm.protocol] INFO: Posting 'A' describe message
0000020779 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 4
0000020783 [comm.dtls] INFO: session cmd (CLS,DIS,MOV,LOD,SAV): 3
0000020800 [comm.protocol] INFO: rcv'd message type=1
\0000030005 [system] INFO: Cloud: disconnecting
0000030005 [comm] INFO: Waiting for Confirmed messages to be sent.
0000031006 [comm] INFO: All Confirmed messages sent: client(yes) server(yes)
0000031006 [system] INFO: Cloud: disconnected
Disconnected
-0000031105 [net.ifapi] INFO: Netif wl3 state DOWN
0000031106 [ncp.at] TRACE: > AT+CWQAP
0000031107 [system.nm] TRACE: br_enabled=0
0000031110 [system.nm] INFO: State changed: IP_CONFIGURED -> IFACE_REQUEST_DOWN
0000031122 [system.nm] INFO: State changed: IFACE_REQUEST_DOWN -> IFACE_DOWN
0000031144 [system.nm] INFO: State changed: IFACE_DOWN -> DISABLED
0000031154 [hal] TRACE: NCP connection state changed: 0
0000031166 [net.esp32ncp] TRACE: NCP event 2
0000031167 [ncp.at] TRACE: < WIFI DISCONNECT
0000031176 [ncp.at] TRACE: < OK
0000031188 [net.ifapi] INFO: Netif wl3 link DOWN
0000031189 [gsm0710muxer] INFO: Stopping GSM07.10 muxer
0000031199 [gsm0710muxer] INFO: Gracefully stopping GSM07.10 muxer
0000031211 [gsm0710muxer] INFO: Closing all muxed channels
0000031221 [gsm0710muxer] INFO: Closing mux channel 1
0000031233 [gsm0710muxer] INFO: Closing mux channel 2
0000031243 [gsm0710muxer] INFO: Muxed channel 3 already closed
0000031255 [gsm0710muxer] INFO: Muxed channel 4 already closed
0000031365 [gsm0710muxer] INFO: Sending CLD (multiplexer close down)
0000031415 [gsm0710muxer] INFO: Received response to CLD or timed out, exiting multiplexed mode
0000031415 [gsm0710muxer] INFO: GSM07.10 muxer thread exiting
0000031421 [gsm0710muxer] INFO: GSM07.10 muxer stopped
0000031442 [hal] TRACE: NCP state changed: 0
0000031442 [net.esp32ncp] TRACE: NCP event 1
\

This is a capture from the console event. You can see that the cloud can OTA successfully and reports that it’s the latest version, then suddenly after the HASH it reverts back to the previous version on the device. Maybe the HASH generated is not matching what the system expects?

spark/status
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:53 am
Firmware version: 13
Private Event
auto-update

spark/device/app-hash
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:29 am
Firmware version: 15
Private Event
11111F364E24DFEC04D4F7D51A6657512E9DDB673B30CF174B4ACF4214A4C57C

spark/device/diagnostics/update
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:27 am
Firmware version: 15
Private Event
{json}

particle/device/updates/forced
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:27 am
Firmware version: 15
Private Event
false

particle/device/updates/enabled
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:27 am
Firmware version: 15
Private Event
false

spark/device/last_reset
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:27 am
Firmware version: 15
Private Event
user

spark/flash/status
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:08 am
Firmware version: 13
Private Event
success 

spark/flash/status
Published by eXXXXXXXXXXXXXXXXXXXb on 6/21/19 at 9:21:07 am
Firmware version: 13
Private Event
started