Interrupts causing problem with flashing

I have some interrupts set up on hardware digital pins, and they seem to interfere with the flashing process. While the interrupts are triggering, the flash never completes successfully; when they aren’t triggering, the update happens normally. The interrupts don’t really do much (they do not block).

Is there some way to know that the flash is taking place so that we don’t do anything in the loop or the interrupt handler?

When working on interrupts and the OTA deep update, I imagined this scenario might occur, and so I have been kind of waiting for someone to hit this problem.

Ideally the firmware should disable all user interrupts before attempting an OTA update (those added via attachInterrupt and via other interrupt sources, e.g. @peekay123’s timer library, etc.).

Having an active interrupt handler shouldn’t disrupt the flash in itself; it’s the actions performed by the handler that are causing the problem. Can you say a bit more about what kind of things your interrupt handler is doing?

Agreed. Our interrupt reads digital I/O from the D lines into memory. This in turn could cause further processing and serial/network I/O from the “main” loop() function. To be honest, I don’t know that the interrupt is the problem so much as the processing done later. To be safe, in my opinion, while flashing, the “OS” should pre-empt everything else:

  1. disable all interrupts
  2. stop calling the loop()

On a related note, the high-level socket I/O classes are littered with 5-second timeouts (on select and such), which sounds like suicide since it often results in the core being kicked off the cloud. I would strongly push to switch to some asynchronous, interrupt-driven mechanisms for I/O.

Totally agree. Total separation of user code and the system code would be ideal, but getting there will take some effort, since any shared resources used by both system and user code then need to be appropriately guarded and managed.

loop() isn’t called while OTA is happening, but there’s nothing to stop interrupts.

Disabling all user interrupts is what’s needed, but this can’t be done simply by shutting off interrupts, since interrupts are also needed by the system. Instead, if the user ISRs are appropriately managed, then we can effectively stop any user ISR from running. I sketched an outline for better management of user interrupts by the system as issue 257.
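
To make the idea concrete, here is a minimal host-side sketch of such a layer. It is not the design from issue 257, only an illustration of the principle: the system keeps its own ISR on each line and decides whether to delegate to the user handler. Every name in it (flash_update_in_progress, register_user_isr, system_dispatch, kMaxLines) is hypothetical and not part of the firmware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the system's "OTA update in progress" flag.
volatile std::uint8_t flash_update_in_progress = 0;

using UserIsr = void (*)();

// Assumed number of interrupt lines, chosen only for illustration.
constexpr int kMaxLines = 16;
static UserIsr user_handlers[kMaxLines] = {};

// Hypothetical registration call: the system stores the user handler
// instead of wiring it directly to the hardware vector.
bool register_user_isr(int line, UserIsr handler) {
    if (line < 0 || line >= kMaxLines) return false;
    user_handlers[line] = handler;
    return true;
}

// What the system-owned ISR for a line would call. The hardware interrupt
// stays enabled; only the delegation to user code is suppressed.
void system_dispatch(int line) {
    if (line < 0 || line >= kMaxLines) return;
    if (flash_update_in_progress) return;  // skip user code during an update
    UserIsr handler = user_handlers[line];
    if (handler != nullptr) handler();
}

// Small demonstration of the intended behaviour.
static int demo_hits = 0;
static void demo_handler() { ++demo_hits; }

void demo() {
    register_user_isr(3, demo_handler);
    system_dispatch(3);                // normal operation: user handler runs
    assert(demo_hits == 1);
    flash_update_in_progress = 1;
    system_dispatch(3);                // during an update: handler is skipped
    assert(demo_hits == 1);
    flash_update_in_progress = 0;
}
```

On real hardware the table and the flag would be shared between interrupt and non-interrupt context, so they would need whatever guarding the platform requires; the sketch ignores that.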

Is there perhaps some global or method that can be used to know that an OTA upgrade is taking place? That way I would do literally nothing, or remove my interrupts?

I have a feeling that the simple fact that an interrupt happens is the problem, not what is actually done in the interrupt (unless digitalRead, micros, or memcpy cause problems?).

Hah! Great success:

extern volatile uint8_t SPARK_FLASH_UPDATE;

void interruptHandler() 
{
    if (SPARK_FLASH_UPDATE)
    {
        detachInterrupt(KEYBUS_CLK);
        writeDebug("FOTA in progress, disable interrupts\r\n");
        return;
    }
    ....
}

Great! I was going to point you in that direction, but with the caveat that these flags are most likely not part of the “official” wiring api - it just happens there’s no clear separation between the user api and the system implementation - so you see everything! :blush:

Can I ask you to try something - rather than disabling the interrupt, please try just returning immediately from the interrupt handler if SPARK_FLASH_UPDATE is true. That way, we can know if it’s simply the presence of the ISR or what it is doing that causes trouble. Thanks!

Works if I blindly return, even with an interrupt that fires at 2 kHz. So it seems one of these functions in the interrupt handler causes the FOTA to fail:

  • digitalRead()
  • delayMicroseconds(200)
  • micros()
  • memset()
  • memcpy()
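
The thread does not pin down which of these calls is responsible. One rough check, assuming the handler really spends the full 200 µs in delayMicroseconds() on each of the 2 kHz firings and ignoring the cost of the other calls, is to work out how much CPU time the busy-wait alone takes. The helper names below are made up for this calculation.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds out of every second spent inside the ISR,
// counting only the busy-wait and nothing else the handler does.
constexpr std::uint32_t isr_busy_us_per_second(std::uint32_t rate_hz,
                                               std::uint32_t busy_us) {
    return rate_hz * busy_us;
}

// The same figure as a whole-number percentage of CPU time.
constexpr std::uint32_t isr_busy_percent(std::uint32_t rate_hz,
                                         std::uint32_t busy_us) {
    return isr_busy_us_per_second(rate_hz, busy_us) / 10000;  // 1,000,000 us = 100 %
}

// Numbers taken from the thread: 2 kHz interrupt, 200 us delay.
void check_thread_numbers() {
    assert(isr_busy_us_per_second(2000, 200) == 400000);
    assert(isr_busy_percent(2000, 200) == 40);
}
```

Under those assumptions the busy-wait alone occupies about 40% of the CPU. That might be enough to starve the update, though nothing here confirms it.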

Thanks for that. That confirms that my proposed management layer will solve the problem by not delegating to the user ISR, but leaving the interrupt still active.

However, for completeness, I may extend the proposal to also allow the system to unregister the user handlers, so the system has full control of its interrupt resources.

How can this fix be used in the 0.4.7 firmware version?
Currently this method leads to the error “undefined reference to `SPARK_FLASH_UPDATE'”.

@Melx, are you having problems with flashing and interrupts?

Sorry for the late reply.

Currently, flashing devices that run system 0.4.8-rc6 with multithreading and custom interrupt routines will often lead to a hard fault (this is probably due to some user application code). I was able to reduce the fault probability to a manageable level by using the following snippets:

void usr_isr() {
    // An update is waiting: allow it and stop this timer interrupt.
    if (System.updatesPending()) {
        System.enableUpdates();
        sensorProcessTimer.interrupt_SIT(INT_DISABLE); // SparkIntervalTimer
        return;
    }

    // ...
}

void setup() {
    System.disableUpdates();
    sensorProcessTimer.begin(usr_isr, SENSOR_PROCESS_PERIOD, uSec, TIMER4);
}