Orange Flashing after leaving listen mode


#1

I think we have read all the recent (2016 or later) posts on this, so my apologies if we missed something. We had a Photon flashing cyan (fast), then flashing orange twice (slow), and repeating this sequence. I had not seen flashing orange before, so we started trying to figure out what was going on. We found some threads (thanks, community) and ran "particle keys doctor" and then "particle keys server" in DFU mode. I figured the issue would be solved, and it sort of was…

We have a feature on our product to enter and exit listen mode. Now this problem seems to occur consistently after exiting listen mode, but never on system reset (or power cycle).

We tried keys doctor again. Once I found that keys doctor may be case sensitive, we tried the device ID in both lower and upper case, but the behavior continues as above. Our exit code from listen mode is:
WiFi.listen(false);
delay(50); //probably not needed anymore
Particle.connect();
Firmware is 0.6.1.
Whether powered from a 5 W adapter or from computer USB, the issue still persists.

I would like to track this issue down, as things I don’t understand or can’t explain scare me a lot. We thought maybe not all Particle servers got the key update, and maybe it depended on which server the device connected to… But the fact that so far (we have cycled at least 20 times and counting) this issue only happens when we exit listen mode using the code above has me concerned that there is a bug, or that we are not exiting listen mode properly. Just to restate, this NEVER happens on system reset, only when we enter and exit listen mode. It only happens on this particular Photon; we have lots of other Photons running this code that do not exhibit this issue.


#2

That’s useful information. We have seen reports of keys getting corrupted, but I don’t recall having a way to reproduce that.

When you say consistent, how often does that happen? Every time?

Also, do you have a test code that will reproduce the behavior?

@BDub


#3

Hi @jerome, thanks for the info. Just to clarify, are you finding that you need to run particle keys server to restore the connection after your particular photon starts the orange flashing, or can you just press reset and have it connect again?


#4

Not sure I would say every time but it may be. It seemed to do this every time for fifteen minutes before I posted this. I will get more data on Monday. Right now it is resetting and staying connected for 3 minutes, going into listen mode, exiting listen mode, trying to connect, then starting the process over. It will do this all weekend.

I will post the code on Monday if it is helpful; exiting listen mode is as written above.

I don’t think the code is the root cause. I am not saying it isn’t part of the cause, but this code works fine on other Photons. Maybe it is the code plus whatever is going on with this particular Photon.


#5

I’m not entirely sure that the two orange flashes actually do relate to corrupted keys.
https://docs.particle.io/support/troubleshooting/troubleshooting-support/photon/#error-codes

There are other reasons that could cause a bad handshake too.
I guess it might be related to what’s happening with the WiFi module at the very time you call WiFi.listen(false).
What is your code doing while in Listening Mode?

While this should be investigated further, as a temporary workaround you could try

  WiFi.listen(false);
  WiFi.off();
  delay(100);
  Particle.connect();
  waitFor(Particle.connected, 30000);

#6

My code is talking to the control board of a dehumidifier. Every few seconds (non-blocking interval timer) it grabs data and puts it in a cloud variable, so it is not doing a whole lot.

I will give your workaround code a whirl and let you know what happens. It kind of defeats the benefit of threading though.

What do you mean by temporary, temporary until what?

Thanks for the response. This is the reason I have stayed with particle, the community😀


#7

If there is an intrinsic issue with falling out of Listening Mode that way, it ought to be a bug to be squashed - until then.
If it’s something in connection with your own code, until that’s located and corrected.

In what way does it defeat the benefit of threading?
waitFor() is just a precaution to make sure there is nothing else causing issues till you’re reconnected. Once you can be sure of that it can also go.
BTW, Software Timers and interrupt driven SparkIntervalTimer will even run while waitFor() blocks the application thread.


#8

Thanks scruff, you're a ninja.

Just to be clear, in case someone is tracking down a bug: this only happens on one of my Photons. 4 others don’t have this issue. And a few hundred have the same “exit listen mode code” but perform other things in the normal application.


#9

So we applied this code and we are still getting failures, but the failures seem to have diminished in frequency. I don’t have anything quantitative, but the guy who implemented this change had been seeing the failures often; when I was looking at it, I saw one out of 7 attempts produce the orange LED failure. This is the test code we are running:

bool did_listen = false;
int nc_time = 0;

#define RESULT_STR_LEN 100
char result_str[RESULT_STR_LEN];
char web_str[RESULT_STR_LEN];

void second(void);
Timer timer_second(1000, second);

void setup()
{
    Serial.begin(9600);
    delay(1000);
    Particle.variable("test", web_str, STRING);

    WiFi.on();
    delay(50);
    Particle.connect();

    timer_second.start();
    Serial.println("Initializing.");
}

void loop()
{
    if(WiFi.listening()) {
        return;
    }

    char new_character;
    while(Serial.available() > 0) {
        new_character = Serial.read();
        Serial.print(new_character);
        if(new_character == '#') {//Put photon into listening mode.
            nc_time = 0;
            Serial.println("Entering listen mode.");
            WiFi.listen();
        } else if(new_character == '+') {//Reset Photon.
            System.reset();
        }
    }
}

//name: second
//desc: used to encapsulate things that need to be executed once per second
void second(void)
{
    if(WiFi.listening()) {
        if(nc_time > 10) {//Stay in listen mode for 10 seconds.
            did_listen = true;
            Serial.println("Exiting listen mode.");
            WiFi.listen(false);
            WiFi.off();
            delay(100);
            Particle.connect();
            //waitFor(Particle.connected, 10000);
        } else {
            nc_time++;
        }
    }

    for(int j = 0; j < RESULT_STR_LEN; j++) {
        result_str[j] = 0;
    }
    sprintf(result_str, "{did_listen: %d}", did_listen);
    memcpy(web_str, result_str, RESULT_STR_LEN);
}

I am struggling to think of next steps since we are still seeing this failure. Any thoughts on how to get to the bottom of this one?

-Jerome


#10

I can’t see SYSTEM_THREAD(ENABLED) in your code, but in order to keep loop() running, you should need that, I thought :confused:
Otherwise I’d expect your code to just return back to the instruction after WiFi.listen().

Also avoid anything that might cause the timer callback to take longer than 1000ms

As a side note:

  memset(result_str, 0, sizeof(result_str)); // short for 
//    for(int j = 0; j < RESULT_STR_LEN; j++) {
//        result_str[j] = 0;
//    }
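
If the only reason for zeroing the buffer is to guarantee termination, a bounded snprintf() may make the clearing step unnecessary. A minimal sketch, assuming the same format string as the timer callback above; formatStatus is a made-up helper name, not a Particle API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats the status text into `out` without ever
// writing more than `len` bytes. Returns true if the whole text fit,
// false if it was truncated or formatting failed.
bool formatStatus(char* out, std::size_t len, bool didListen) {
    assert(out != nullptr && len > 0);
    const int n = std::snprintf(out, len, "{did_listen: %d}", didListen ? 1 : 0);
    return n >= 0 && static_cast<std::size_t>(n) < len;
}
```

snprintf() always null-terminates when len is greater than zero, so the return value only tells you whether the text was truncated.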


#11

Ah man, I will check on this tomorrow thanks


#12

I’m currently trying to reproduce the orange flash, but I can’t.
I’ve entered and dropped out of Listening Mode at least 30 times without any orange flash.
It might be interesting to have a memory dump of that one device.
Are all your devices running the same system and application firmware?


#13

@ScruffR thanks for the advice so far. We are still actively working on this, and these are our current findings. On the unit that flashes orange, all the credentials live in the DCT2 memory region (DCT1 is filled with 0xFF). On all the other units we looked at, the credentials live in the DCT1 region (DCT2 is full of 0xFF). In the datasheet, DCT2 is called a “swap” region. If it is helpful, I can share the .bin files of working and non-working units.

We found the memory region definitions in the Photon datasheet.

We tried to copy DCT2 to DCT1 using dfu-util, but it is not allowed (“last page page at 0x0800xxxxx is not writable”), so we figured this is read-only memory… So how did we screw it up in the first place? In an ideal world I would like answers to the following questions.

  1. Is this the root cause of the orange flashing LED?

  2. How did this corruption occur? Was it something we did, or a factory issue? I don’t really care which; I am just trying to understand the issue.

  3. How can I test for this issue? In our example the unit connects briefly, and then stops connecting to the cloud (flashing cyan and flashing orange).

  4. How can I fix this issue? Does this require programming via JTAG?

Any help would be great.


#14

Ping @BDub


#15

Normally you don’t want to mess with DCT memory unless you are specifically setting something that you know the offset to, like adding new device keys, public server key, country code, antenna selection, etc…

DCT1 and DCT2 are swapped back and forth when a write needs to change bits from a 0 to a 1, because that requires erasing the sector. So we erase the non-valid DCT, copy everything up to the new data, write the new data, copy everything after the new data, and then mark that DCT valid and the previous one invalid. Usually they both contain remnants of the same data, and one is not normally completely erased. One is marked valid, and one is not. If you see one completely erased, that might be a good indicator that your Photon has a damaged sector that can’t be written to. I typically attribute this to ESD because that’s the most likely cause.
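
To make the swap sequence easier to follow, here is a toy host-side model in C++. It is only a sketch of the description above, not the actual firmware implementation: the 64-byte sector size and the names Sector, Dct, needsErase and write are all invented for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSectorSize = 64;  // toy size, chosen only for the example

struct Sector {
    std::array<std::uint8_t, kSectorSize> data;
    bool valid = false;
    Sector() { data.fill(0xFF); }
    // Erasing sets every bit back to 1.
    void erase() { data.fill(0xFF); valid = false; }
    // Programming flash can only clear bits (1 -> 0), hence the AND.
    void program(std::size_t i, std::uint8_t v) { data[i] &= v; }
};

struct Dct {
    Sector sectors[2];
    int active = 0;  // index of the sector currently marked valid
    Dct() { sectors[0].valid = true; }

    // True if writing `bytes` at `offset` would need some bit to go 0 -> 1.
    static bool needsErase(const Sector& s, std::size_t offset,
                           const std::vector<std::uint8_t>& bytes) {
        for (std::size_t i = 0; i < bytes.size(); ++i) {
            if ((bytes[i] & ~s.data[offset + i]) != 0) return true;
        }
        return false;
    }

    void write(std::size_t offset, const std::vector<std::uint8_t>& bytes) {
        assert(offset <= kSectorSize && bytes.size() <= kSectorSize - offset);
        Sector& cur = sectors[active];
        if (!needsErase(cur, offset, bytes)) {
            // Only 1 -> 0 changes: program in place, no swap.
            for (std::size_t i = 0; i < bytes.size(); ++i) cur.program(offset + i, bytes[i]);
            return;
        }
        Sector& next = sectors[1 - active];
        next.erase();  // erase the non-valid copy
        for (std::size_t i = 0; i < offset; ++i) next.program(i, cur.data[i]);
        for (std::size_t i = 0; i < bytes.size(); ++i) next.program(offset + i, bytes[i]);
        for (std::size_t i = offset + bytes.size(); i < kSectorSize; ++i) next.program(i, cur.data[i]);
        next.valid = true;   // mark the new copy valid
        cur.valid = false;   // and the previous one invalid
        active = 1 - active;
    }
};
```

In this model the previously active sector keeps its old contents after a swap, which matches the point that both copies usually hold remnants of the same data.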

If you’ve been trying to DFU things to DCT, then all bets are off on the state of DCT naturally… so I don’t think it would be worth looking at a flash dump of this unit. But just in case, you might as well take a backup like this; it is handy for restoring your Photon with JTAG as well. This is a complete 1MB image.

dfu-util -d 2b04:d006 -a 0 -s 0x8000000:0x100000 -U photon_backup.bin
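
Once you have photon_backup.bin, one quick check is whether either DCT region in the image is fully erased. A rough host-side sketch in C++, assuming the dump starts at 0x08000000 as in the command above and that DCT1 and DCT2 are the 16 KB regions at 0x08004000 and 0x08008000 (offsets assumed from the Photon datasheet memory map; verify them before relying on the result). The function and enum names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// ASSUMPTION: offsets relative to a dump that starts at 0x08000000,
// with DCT1 at 0x08004000 and DCT2 at 0x08008000, 16 KB each.
constexpr std::size_t kDct1Offset = 0x4000;
constexpr std::size_t kDct2Offset = 0x8000;
constexpr std::size_t kDctSize = 0x4000;

enum class DctState { ImageTooShort, BothErased, OnlyDct1Used, OnlyDct2Used, BothUsed };

// Reads the whole file into memory; returns an empty vector if it cannot be opened.
std::vector<std::uint8_t> loadImage(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// True if every byte in [offset, offset + size) is 0xFF, i.e. erased flash.
bool regionErased(const std::vector<std::uint8_t>& image, std::size_t offset, std::size_t size) {
    assert(offset <= image.size() && size <= image.size() - offset);
    for (std::size_t i = 0; i < size; ++i) {
        if (image[offset + i] != 0xFF) return false;
    }
    return true;
}

// Reports which of the two DCT regions contain any programmed bytes.
DctState classifyDct(const std::vector<std::uint8_t>& image) {
    if (image.size() < kDct2Offset + kDctSize) return DctState::ImageTooShort;
    const bool dct1Erased = regionErased(image, kDct1Offset, kDctSize);
    const bool dct2Erased = regionErased(image, kDct2Offset, kDctSize);
    if (dct1Erased && dct2Erased) return DctState::BothErased;
    if (dct2Erased) return DctState::OnlyDct1Used;
    if (dct1Erased) return DctState::OnlyDct2Used;
    return DctState::BothUsed;
}
```

For example, classifyDct(loadImage("photon_backup.bin")) returning OnlyDct2Used would match what was reported for the unit that flashes orange.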

If you want to use JTAG, you can erase the DCT1/DCT2 sectors (sectors 1 and 2, not sector 0).
Then reset into DFU mode and put the public server key back with "particle keys server".
Then put into Listening Mode and enter your Wi-Fi credentials.

If flash memory is not corrupted, things should work normally. You could try to force different reset reasons (deep sleep vs hard reset) and see if those reasons change in your code. If not, it might be an indicator that DCT is not swapping.