WiFi reconnection issue - what does this trace mean?

@Elco, now that is very interesting indeed! Makes sense.

Have you raised a support ticket for your finding, as per @ScruffR’s comments here: WiFi.ready() == FALSE but WiFi is connected ?

@avtolstoy, any comment on @Elco’s statement:

I think the interaction with SPI is that the SPI class uses a recursive os mutex. If there is no free memory, allocation fails, the mutex cannot get a lock and a deadlock occurs.

Re accessing the HAL functions directly, how is this done? I did not think that this could be done from the application. If so, great!

Yes, I also sent in a ticket and got a reply.

Well, I don’t know how you develop, but I use vscode and makefiles. So I have the entire particle device-os repo as a dependency in the same directory. So I just browse the device-os files to see how they implemented the SPI class (“spark_wiring_spi.h” / “spark_wiring_spi.cpp”) and re-used code from it.

But this is only for experienced embedded software developers that can fully understand what the particle code is doing. This is not documented and particle doesn’t expect you to use it this way.

The documentation often leaves out crucial details, like the exact function prototypes with the type of arguments and possible overloads. So I end up browsing the code anyway.

I think my next hardware revision will not use a Particle device anymore, but a bare ESP32 instead. I don’t use their cloud, and the easy-to-use framework often gets in my way more than it helps me. I’m just not their target market. I’m an early adopter who got in during the Spark Core days, when ESP32s were not even a thing. I have the skills to develop on bare metal and don’t need Arduino-style libs.

@Elco, I now understand how you are linking to the HAL layers: you are compiling the whole code base, i.e. DeviceOS and application, and hence have access to all the DeviceOS functions.

I use Particle Dev as I don’t want the overhead of maintaining the environment, so this avenue is not available to me.

I understand your frustration…

I still use the system layers, not a monolithic build. But yes, I go deep into the framework.
Our code is also built in the cloud by Azure and automatically tested and deployed, so that’s why we use makefiles too.

Can you elaborate on what kind of maintenance effort you are anticipating that keeps you from transitioning to Workbench?

@ScruffR, good question. The inertia is there due to convenience (you know, devil you know vs the devil you don’t).

I am by no means versed in using repositories, and so have been assuming that there was an overhead in keeping in sync with the latest DeviceOS incarnations.

From your question I take it that Workbench is the preferred option for serious development. I shall give it a go in a few months.

@no1089,

Thought it appropriate that I respond to the communications that we had under the now closed post:

The issue that I was complaining about in the above post, and this post:

has been (re) solved today.

In short:

  • Implement a WiFi connection state machine
  • Ensure that there is enough heap memory
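
For anyone wanting a starting point, the two points above could be sketched roughly as below. This is not the state machine from the linked thread, only a minimal, hardware-independent sketch. The names `WifiHooks`, `WifiState` and `WifiStateMachine`, the timeout and backoff parameters, and the "do not connect while heap is low" policy are all assumptions for illustration. On a Photon the hooks would presumably wrap calls such as WiFi.connect(), WiFi.ready() and System.freeMemory().

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical hooks so the logic can be exercised off-device.
struct WifiHooks {
    std::function<void()> startConnect;   // begin a non-blocking connect
    std::function<bool()> isReady;        // true once the link is usable
    std::function<void()> resetRadio;     // e.g. power-cycle the WiFi module
    std::function<uint32_t()> freeHeap;   // free heap in bytes
};

enum class WifiState { Idle, Connecting, Connected, Backoff };

class WifiStateMachine {
public:
    WifiStateMachine(WifiHooks hooks, uint32_t connectTimeoutMs,
                     uint32_t backoffMs, uint32_t minFreeHeap)
        : hooks_(std::move(hooks)),
          connectTimeoutMs_(connectTimeoutMs),
          backoffMs_(backoffMs),
          minFreeHeap_(minFreeHeap) {}

    WifiState state() const { return state_; }

    // Call regularly from loop() with the current time in milliseconds.
    void update(uint32_t nowMs) {
        switch (state_) {
        case WifiState::Idle:
            // Assumed policy: only attempt a connection with enough free heap.
            if (hooks_.freeHeap() >= minFreeHeap_) {
                hooks_.startConnect();
                enter(WifiState::Connecting, nowMs);
            }
            break;
        case WifiState::Connecting:
            if (hooks_.isReady()) {
                enter(WifiState::Connected, nowMs);
            } else if (nowMs - enteredMs_ >= connectTimeoutMs_) {
                // Attempt timed out: reset the radio and wait before retrying.
                hooks_.resetRadio();
                enter(WifiState::Backoff, nowMs);
            }
            break;
        case WifiState::Connected:
            if (!hooks_.isReady()) {
                enter(WifiState::Idle, nowMs);  // link dropped, start over
            }
            break;
        case WifiState::Backoff:
            if (nowMs - enteredMs_ >= backoffMs_) {
                enter(WifiState::Idle, nowMs);
            }
            break;
        }
    }

private:
    void enter(WifiState s, uint32_t nowMs) {
        state_ = s;
        enteredMs_ = nowMs;
    }

    WifiHooks hooks_;
    uint32_t connectTimeoutMs_;
    uint32_t backoffMs_;
    uint32_t minFreeHeap_;
    WifiState state_ = WifiState::Idle;
    uint32_t enteredMs_ = 0;
};
```

The unsigned subtraction `nowMs - enteredMs_` keeps the timeouts correct across a millisecond-counter rollover. Keeping the hardware calls behind hooks also means the transitions can be tested on a desktop before flashing.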

It seems that DeviceOS 1.5.2 has a larger memory requirement, and that is what unexpectedly threw up my recent woes (hence the frustration).

I believe that Particle are working on reducing DeviceOS memory requirements and if this can be improved, that would be great moving forward.

Thanks!

I have seen the same issue you noted, a Photon not reconnecting to an AP if the AP is power cycled, and I agree, it’s frustrating to deal with.
I will check internally if this is specifically being worked on.

Yes, the 2.0 LTS release thread memory reduction will hopefully address many issues.

From your post, you are unblocked for the moment?

@no1089, confirming that I am all good.

The implemented “WiFi state machine” works a treat when I don’t run too low on memory.

Thanks for your results @UMD!
We have come across this problem recently, so I’m looking at implementing the WiFi state machine (from [SOLVED] TCPCLIENT intranet connection fails if no cloud connection, right?) and I wondered if you arrived at a rough figure for how much heap memory to keep free?

@dan.s, good question re memory.

Memory is especially an issue with later versions of DeviceOS.

Here are some posts for you to ponder:

etc…

Mention was made here that 10K of free memory was required when using SoftAP:

My guess - at least 10K of free memory is required. You need to instrument your code using freeMemory() to test.
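
One way to do that instrumenting is to track the lowest free-memory reading seen so far. This is a minimal sketch only: `HeapWatermark` is a hypothetical helper, not part of DeviceOS, and it assumes the readings come from System.freeMemory() sampled in loop().

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: remembers the lowest free-heap reading it was given.
class HeapWatermark {
public:
    // Record one reading, e.g. the value returned by System.freeMemory().
    void sample(uint32_t freeBytes) {
        if (freeBytes < lowest_) {
            lowest_ = freeBytes;
        }
        ++samples_;
    }

    bool hasSamples() const { return samples_ > 0; }

    // Lowest reading so far; only meaningful once hasSamples() is true.
    uint32_t lowest() const { return lowest_; }

    // True if any reading so far fell below the given threshold.
    bool everBelow(uint32_t threshold) const {
        return samples_ > 0 && lowest_ < threshold;
    }

private:
    uint32_t lowest_ = std::numeric_limits<uint32_t>::max();
    uint32_t samples_ = 0;
};
```

Logging `lowest()` periodically should show how close the application gets to the 10K guess under real load. Sampling only catches what is visible at the moment of the call, so short-lived allocation spikes between samples can be missed.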

IIRC you shouldn’t use SoftAP with less than 20KB free memory (before first time entry into Listening Mode).

Yes, the softAP memory issue is a particular thorn for us; you can see my easy fix at Listening mode on the Photon cannot work reliably in current implementation

Is this reconnection issue related to softAP?

@dan.s, suggest that you answer the question by disabling SoftAP and seeing if this resolves the problem. Please report back!

I had so many issues with SoftAP with my specific environment that I disabled it in the end.

A workaround I have used successfully is to check that free memory is >38800 before starting SoftAP WiFi setup. If the memory is less than that, odd things happen. Where it happens varies, but generally it will hang at some point. The exact value of free memory changes with Device OS: as I understand it, 17K (heap and stack) is required to keep the Device OS working correctly and 21-22K is needed by SoftAP to load the pages.
Device OS 1.5.X just uses 10-11K more memory than 1.4.4.
Fortunately, 2.0.0-rc.4 fixes this issue (but increases App flash space by 1800 bytes).
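
The check described above might look something like this. The 38800 figure is taken from the post (17K for Device OS plus 21-22K for the SoftAP pages gives roughly 38-39K) and will differ between Device OS versions. `safeToStartSoftAp` and `kSoftApMinFreeBytes` are hypothetical names used only for this sketch.

```cpp
#include <cstdint>

// Threshold quoted in the post above; treat it as a starting point,
// not a guaranteed value for every Device OS version.
constexpr uint32_t kSoftApMinFreeBytes = 38800;

// Returns true only when free memory is strictly above the threshold.
bool safeToStartSoftAp(uint32_t freeBytes,
                       uint32_t minFree = kSoftApMinFreeBytes) {
    return freeBytes > minFree;
}

// Assumed usage on the device:
//   if (safeToStartSoftAp(System.freeMemory())) {
//       WiFi.listen();
//   }
```

What to do when the check fails (free some buffers, reset, or skip setup) is application specific and is left out here.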

@armor,

Regarding your comment:

Which issue is it fixing? Is it specifically in the rc.4 release? I admit to not having tried the latest DeviceOS because it did not seem to have any great relevance to the Photon/P1 operation.

2.0.0-rc.4 fixes the memory issue that was introduced in 1.5.X and fixes other issues with WiFi behaviour that have been broken since 1.0.1
I can only attest to rc.4 behaviour.

How do you disable softAP?

I don’t really understand how it could be affecting it. When running normally, until the user clears credentials, it should never reach softAP again.
It is during this normal operation that I’m seeing strange connection/reset issues due to memory, even though free memory is at 26kB (running v1.5.0).
As @armor suggests, maybe there is a solution in v2.0 when the production release comes out…

@dan.s,
To disable SoftAP I used #ifdef’s - here are the snippets of code.

#ifdef SOFTAP_ENABLED
#include "softap.h"
#endif

STARTUP(
...
#ifdef SOFTAP_ENABLED
    // Refer https://community.particle.io/t/application-softap-http-pages-issue/22499/4
    // Be sure to initialize the softAP pages in a STARTUP() macro so they
    // are setup *before* the device connects to the internet.
    //
    // If it is initialized in the setup() method, then SoftAP pages
    // won’t be available until the device has connected to the cloud.
    softap_set_application_page_handler(myPage, nullptr);
#endif
);

#ifdef SOFTAP_ENABLED
    //
    // Set up SoftAP pages
    //
    softap_setup(); // NOTE - calls WiFi.on()
#endif              // SOFTAP_ENABLED

Ah it looks like you’re setting up your own custom softAP page. Not sure this would work for us as we just use the default, so no explicit calls to softAP are ever mentioned - I guess the references would be in WiFi.listen()