Listen Mode (possible bug)

I have discovered what must be a bug in our firmware, but I would like it confirmed by others. For a while now, we’ve had problems related to setting up our P1 based boards. We did not have this problem before and I have confirmed it’s not a hardware issue.

The core problem is that if I’m running Tinker, I can setup wifi using either CLI, app on iOS/Android, via commandline by passing credentials or setup.particle.io. If however I’m running our firmware, the setup will timeout or fail on all of the above methods. So this MUST be related to our firmware in some way, but how? What in our firmware could possibly prevent wifi setup from working?

Some project facts: we’re using quite a bit of memory, but I think we are below the limits:

text	data	bss		dec		hex		filename_
92620   2148	46388	141156	22764	/workspace/target/workspace.elf_

We’re using SYSTEM_MODE(MANUAL) in the code to decide when to start wifi (or to disable it on startup if desired). We’re using the 0.70.0 firmware. I initially thought this had to be related to SYSTEM_MODE, but I’ve tried making a file for reproducing this, but I can’t seem to do so. I guess the problem is deeper than just that. Could it be that wifi setup requires a certain amount of free memory?

I suppose you are refering to SoftAP setup (either via mobile app or custom SoftAP HTTP page).
If so, yes engaging the SoftAP will require some RAM to work with.

You memory stats do tell what the static memory usage will be (at least) but doesn't reveal anything about heap utilisation as this will only be happening at run time.

1 Like

I actually do not mean the SoftAP in any way. I just mean “Listen mode” (blue blink, setup mode). Does this require a lot of memory?

Listening Mode does engage SoftAP on the device. This is how the mobile app would contact the device before it has made it to the cloud - so yes it does use some memory, but what is “a lot”?

Aha. I didn't know that Listen Mode = SoftAP. I've trimmed the code down by commenting out bits and bobs, and I'm now down to "bss" being only 30k and it still happens.

I was told that this could be related to a firmware issued that had now been resolved, but the problem exists with the 0.8.0-rc.4 also. It's very obvious that something that has to do with Serial communications is affected since when setting up via CLI you'll get random (not repeatable) errors like this:

I have detected a P1 connected via USB.
? Would you like to continue with this one? Yes
| Getting device information...Serial err: Error: Error: Resource busy, cannot open /dev/tty.usbmodem14511
Serial problems, please reconnect the device.
Potentially unhandled rejection [3] Serial problems, please reconnect the device. (WARNING: non-Error used)
\ Getting device information...

Here's something crucial that I discovered: clearing credentials does not work and the module shows this clearly:

Instead of clearing the credentials and returning to setup mode, the device blinks fast red. I presume this could be key information that someone might know what means?

That’s an iteresting find :+1:
Could you file a GitHub issue for this, please?

BTW, does reverting back to Tinker cure that behaviour?
How about first engaging Safe Mode and then perform the clear credential procedure?

1 Like

Flashing Tinker (or other firmware) works well both OTA and via USB. With other firmware, it all works as normal.
I’ve logged it as a bug https://github.com/particle-iot/firmware/issues/1550

If I first engage Safe Mode and then perform the clearing procedure, the clearing works! It appears that I can also set the wifi credentials this way, so that’s actually great?!? Now we have a workaround while the bug can be solved!

I’m sitting here diff’ing various commits. I know that the problem was introduced after April 5th. It’s a lot of commits there, so I’ll check out each of them, compile and test. When I know exactly when it was introduced, I’ll analyse the diff to try to understand what is causing this.

5 Likes

@jenschr are you using Serial USB logging by any chance? I believe the bug could be related to this potentially.

4 Likes

I am. The error is there if a USB serial is connected or not though. What should I test? Turn off all Serial?

I would say it affects the CLI setup more over USB, but it would be interesting to know what you find when you disable Serial USB debugging log handlers.

I’ve now removed ALL references to Serial. We do not use the LOG class. This produced an interesting effect: the device was no longer capable of deleting all passwords by holding Setup-button for 10 seconds? It could enter Setup, but holding the button did not trigger the deletion of the passwords. This was using 0.8.0-r4 on the device.

I then downgraded to 0.7.0 and now clearing the passwords by holding Setup button displayed the exactly same bug as before - fast blinking blue followed by fast blinking red. Setup still hangs (through both iOS/Android and CLI as before) :-/

@jenschr Hmm, that’s odd, can you check if you have the bootloader installed with the CLI:

particle serial inspect

Also, one nice thing about the logging API is that if you use Log.info("my debugging %s", "is awesome!"); in your code, you can simply comment out the Logging handler to disable it SerialLogHandler logHandler(115200, LOG_LEVEL_ALL); and that’s just the most basic benefit.

If I do “particle serial inspect”, I get this:

Could not get inspect device: Serial timed out

?

You need to be in listening mode. If you were, try safe listening mode. Go into safe mode, then while in breathing magenta, hold the SETUP/MODE button to go into safe listening mode (also blinking blue). If you need to go into safe listening mode with your firmware installed to do a particle serial inspect, that’s interesting in itself because your firmware is preventing those commands as well.

So if I go into Listening mode by holding Setup for 3 seconds, I get this:

$ particle serial inspect
Platform: 8 - P1

That's it. If I go into Listen mode from Safe mode, I get this:

particle serial inspect
Platform: 8 - P1
Modules
Bootloader module #0 - version 101, main location, 16384 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 207, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #2 - version 207, main location, 262144 bytes max size
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #1 - version 207
Bootloader module #0 - version 7
User module #1 - version 5, main location, 131072 bytes max size
UUID: C2D5F69A8BA2C7403A159519014D6DCFF9E8620968EDE0CEA9E5D36F2FB16FB0
Integrity: PASS
Address Range: PASS
Platform: PASS
Dependencies: PASS
System module #2 - version 207
empty - factory location, 131072 bytes max size

This is repeatable. So yeah... I think @rickkas7 will find this interesting?

It seems your firmware is using SYSTEM_THREAD(ENABLED) and is probably still doing stuff to Serial (or other resources shared with system) while the device is in LM - since your firmware is not running in Safe LM, things work as expected.

@ScruffR SYSTEM_THREAD(ENABLED) is not used.

I know because we tried using it about a year ago and it caused stability issues. It does not exist in our code.

Any idea? Do you need code to replicate? Cc @BDub

Hey @jenschr! Can you throw a call to System.freeMemory() in your firmware and let us know what you get back? It doesn’t tell the whole story since it’s reporting heap size and not accounting for fragmentation, but it’s at least another clue. In the meantime, I’ll investigate how large a chunk of memory listening mode needs.

1 Like

Weekend caught me here, but I now did a test and System.freeMemory() reports free memory to be 17296. This is after removing a 10k buffer that we suspect to be related to the problem. The error still persists with this much free memory though.

I’ve gone back in our GIT repository and found that the error started occurring late April. At this point in time our freemem reported 23704 bytes and the compilation stats were like this:

text data bss dec hex filename
74220 2124 27028 103372 193cc /workspace/target/workspace.elf

On purpose, our code does very little dynamic allocation. If we remove features in our firmware so that we approach 21Kb free memory, it’ll start to work. With 20448 bytes, it still exhibits the problem, so it looks like there’s a limit somewhere near that number in our case. Does that fit with what you’ve found @bsatrom?

This is the compilation output at the point in time when things start working:

text data bss dec hex filename
80716 2144 29500 112360 1b6e8 /workspace/target/workspace.elf