I have discovered what must be a bug in our firmware, but I would like it confirmed by others. For a while now, we’ve had problems related to setting up our P1 based boards. We did not have this problem before and I have confirmed it’s not a hardware issue.
The core problem is that if I’m running Tinker, I can setup wifi using either CLI, app on iOS/Android, via commandline by passing credentials or setup.particle.io. If however I’m running our firmware, the setup will timeout or fail on all of the above methods. So this MUST be related to our firmware in some way, but how? What in our firmware could possibly prevent wifi setup from working?
Some project facts: we’re using quite a bit of memory, but I think we are below the limits:
text data bss dec hex filename_
92620 2148 46388 141156 22764 /workspace/target/workspace.elf_
We’re using SYSTEM_MODE(MANUAL) in the code to decide when to start wifi (or to disable it on startup if desired). We’re using the 0.70.0 firmware. I initially thought this had to be related to SYSTEM_MODE, but I’ve tried making a file for reproducing this, but I can’t seem to do so. I guess the problem is deeper than just that. Could it be that wifi setup requires a certain amount of free memory?
I suppose you are refering to SoftAP setup (either via mobile app or custom SoftAP HTTP page).
If so, yes engaging the SoftAP will require some RAM to work with.
You memory stats do tell what the static memory usage will be (at least) but doesn't reveal anything about heap utilisation as this will only be happening at run time.
Listening Mode does engage SoftAP on the device. This is how the mobile app would contact the device before it has made it to the cloud - so yes it does use some memory, but what is “a lot”?
Aha. I didn't know that Listen Mode = SoftAP. I've trimmed the code down by commenting out bits and bobs, and I'm now down to "bss" being only 30k and it still happens.
I was told that this could be related to a firmware issued that had now been resolved, but the problem exists with the 0.8.0-rc.4 also. It's very obvious that something that has to do with Serial communications is affected since when setting up via CLI you'll get random (not repeatable) errors like this:
I have detected a P1 connected via USB.
? Would you like to continue with this one? Yes
| Getting device information...Serial err: Error: Error: Resource busy, cannot open /dev/tty.usbmodem14511
Serial problems, please reconnect the device.
Potentially unhandled rejection [3] Serial problems, please reconnect the device. (WARNING: non-Error used)
\ Getting device information...
Here's something crucial that I discovered: clearing credentials does not work and the module shows this clearly:
Instead of clearing the credentials and returning to setup mode, the device blinks fast red. I presume this could be key information that someone might know what means?
If I first engage Safe Mode and then perform the clearing procedure, the clearing works! It appears that I can also set the wifi credentials this way, so that’s actually great?!? Now we have a workaround while the bug can be solved!
I’m sitting here diff’ing various commits. I know that the problem was introduced after April 5th. It’s a lot of commits there, so I’ll check out each of them, compile and test. When I know exactly when it was introduced, I’ll analyse the diff to try to understand what is causing this.
I would say it affects the CLI setup more over USB, but it would be interesting to know what you find when you disable Serial USB debugging log handlers.
I’ve now removed ALL references to Serial. We do not use the LOG class. This produced an interesting effect: the device was no longer capable of deleting all passwords by holding Setup-button for 10 seconds? It could enter Setup, but holding the button did not trigger the deletion of the passwords. This was using 0.8.0-r4 on the device.
I then downgraded to 0.7.0 and now clearing the passwords by holding Setup button displayed the exactly same bug as before - fast blinking blue followed by fast blinking red. Setup still hangs (through both iOS/Android and CLI as before) :-/
@jenschr Hmm, that’s odd, can you check if you have the bootloader installed with the CLI:
particle serial inspect
Also, one nice thing about the logging API is that if you use Log.info("my debugging %s", "is awesome!"); in your code, you can simply comment out the Logging handler to disable it SerialLogHandler logHandler(115200, LOG_LEVEL_ALL); and that’s just the most basic benefit.
You need to be in listening mode. If you were, try safe listening mode. Go into safe mode, then while in breathing magenta, hold the SETUP/MODE button to go into safe listening mode (also blinking blue). If you need to go into safe listening mode with your firmware installed to do a particle serial inspect, that’s interesting in itself because your firmware is preventing those commands as well.
It seems your firmware is using SYSTEM_THREAD(ENABLED) and is probably still doing stuff to Serial (or other resources shared with system) while the device is in LM - since your firmware is not running in Safe LM, things work as expected.
Hey @jenschr! Can you throw a call to System.freeMemory() in your firmware and let us know what you get back? It doesn’t tell the whole story since it’s reporting heap size and not accounting for fragmentation, but it’s at least another clue. In the meantime, I’ll investigate how large a chunk of memory listening mode needs.
Weekend caught me here, but I now did a test and System.freeMemory() reports free memory to be 17296. This is after removing a 10k buffer that we suspect to be related to the problem. The error still persists with this much free memory though.
I’ve gone back in our GIT repository and found that the error started occurring late April. At this point in time our freemem reported 23704 bytes and the compilation stats were like this:
text
data
bss
dec
hex
filename
74220
2124
27028
103372
193cc
/workspace/target/workspace.elf
On purpose, our code does very little dynamic allocation. If we remove features in our firmware so that we approach 21Kb free memory, it’ll start to work. With 20448 bytes, it still exhibits the problem, so it looks like there’s a limit somewhere near that number in our case. Does that fit with what you’ve found @bsatrom?
This is the compilation output at the point in time when things start working: