My theory is that it has something to do with the SPI interface. I am now using the SDFAT 0.7 library, but previously was not. More to follow on this as I progress.
In the meantime, to assist with the investigation, can anyone explain what causes the following [hal] and [hal.wlan] traces?
The issue is when a connection drops for whatever reason.
I am simulating this by removing the ethernet cable from the access point (i.e. loss of internet) and also by turning off the access point (loss of WiFi), then reinstating each. The problem for me is that the cloud connection does not come back, even though the access point is operational again.
I am now circling around my own code being at issue (isn't that the case 90+% of the time!) because I resurrected this solution, which works with the tests above:
I will report back once I have more to say. Any insights are always appreciated!
I tried devices this last weekend (they are on Photon and have SPI to a TFT and SD card) on 1.5.0 and had a number of issues related to SPI, so I have reverted to 1.4.4. I will sit out 1.5.0 until it is clear that the SPI and connection issues have been solved, or at least until it is clear what has changed to invalidate the program models that were working and how they need to be remediated.
@armor, I did say that the SPI / SD CARD was a theory in relation to my issue, but too early to be definitive. I note that my issue pre-dates DeviceOS 1.5.0.
With regard to your SPI issues, refer to Updating from DeviceOS 1.4.4 to 1.5.0-rc.2 -> Broken SPI Functionality. There was an issue of Display vs SD CARD in the release candidates, but with DeviceOS 1.5.0 I have not had an issue with these two peripherals, so I am not sure what is going on. I suggest that you raise a separate topic for this.
I think the interaction with SPI is that the SPI class uses a recursive os mutex. If there is no free memory, allocation fails, the mutex cannot get a lock and a deadlock occurs.
I have seen it before when hardware debugging. The code hangs on a mutex in the app thread, but the real issue is running out of memory, which makes the mutex deadlock.
I have removed any use of the SPI class now and use raw HAL functions instead. If you know how you use SPI, you don’t need all these mutexes in between on every SPI call.
The Particle classes are very defensive and oriented at beginners, so in many places in my application I have just looked up what they do and implemented the same using HAL functions without all the bloat.
I prefer managing my own mutex using std::mutex and std::unique_lock<std::mutex> if I need one, to manage ownership and automatic unlocking on destruction.
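To illustrate the pattern, here is a minimal sketch in standard C++. `SharedBus`, `write` and `tryWrite` are hypothetical names invented for this example; they are not Device OS or HAL functions, and the byte counter only stands in for a real transfer.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a peripheral shared between threads (e.g. an SPI bus).
struct SharedBus {
    std::mutex mtx;              // owned by the application, not by a framework class
    std::size_t bytesWritten = 0;

    // Blocks until the bus is free, then "writes" len bytes.
    void write(const std::uint8_t* data, std::size_t len) {
        std::unique_lock<std::mutex> lock(mtx);  // lock acquired here
        (void)data;                              // a real transfer would go here
        bytesWritten += len;
    }   // lock released automatically when 'lock' is destroyed

    // Non-blocking variant: gives up immediately if another thread holds the bus.
    bool tryWrite(const std::uint8_t* data, std::size_t len) {
        std::unique_lock<std::mutex> lock(mtx, std::try_to_lock);
        if (!lock.owns_lock()) {
            return false;                        // caller decides when to retry
        }
        (void)data;
        bytesWritten += len;
        return true;
    }   // released here on the success path as well
};
```

Because the lock object owns the mutex, every return path unlocks it, including early returns. The non-blocking variant lets the caller back off and retry instead of hanging the application thread.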
Re accessing the HAL functions directly, how is this done? I did not think that this could be done from the application. If so, great!
Well, I don’t know how you develop, but I use VS Code and makefiles. So I have the entire Particle device-os repo as a dependency in the same directory. I just browse the device-os files to see how they implemented the SPI class (“spark_wiring_spi.h” / “spark_wiring_spi.cpp”) and re-use code from it.
But this is only for experienced embedded software developers who can fully understand what the Particle code is doing. This is not documented and Particle doesn’t expect you to use it this way.
The documentation often leaves out crucial details, like the exact function prototypes with the type of arguments and possible overloads. So I end up browsing the code anyway.
I think my next hardware revision will not use a Particle device anymore, but a bare ESP32 instead. I don’t use their cloud, and the easy-to-use framework often gets in my way more than it helps me. I’m just not their target market. I’m an early adopter who got in during the Spark Core days, when ESP32s were not even a thing. I have the skills to develop on bare metal and don’t need Arduino-style libs.
@Elco, I now understand how you are linking to the HAL layers: you are compiling the whole code base, i.e. DeviceOS and Application, and hence have access to all the DeviceOS functions.
I use Particle Dev as I don’t want the overhead of maintaining the environment, so this avenue is not available to me.
I still use the system layers, not a monolithic build. But yes, I go deep into the framework.
Our code is also built in the cloud by Azure and automatically tested and deployed, so that’s why we use makefiles too.
@ScruffR, good question. The inertia is there due to convenience (you know, devil you know vs the devil you don’t).
I am by no means versed in using repositories, and so have been assuming that there was an overhead in keeping in sync with the latest DeviceOS incarnations.
From your question I take it that Workbench is the preferred option for serious development. I shall give it a go in a few months.
I have seen the same issue you noted, a Photon not reconnecting to an AP if the AP is power cycled, and I agree, it’s frustrating to deal with.
I will check internally if this is specifically being worked on.
Yes, the 2.0 LTS release thread memory reduction will hopefully address many issues.
Thanks for your results @UMD!
We have come across this problem recently, so I’m looking at implementing the WiFi state machine (from [SOLVED] TCPCLIENT intranet connection fails if no cloud connection right?) and I wondered if you came to a rough figure for how much heap memory to keep free?
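For anyone following along, here is a rough sketch of what a reconnect state machine with a free-heap guard could look like. It is not the code from the linked thread. All type and function names are hypothetical, and the heap threshold is a constructor parameter because no figure is given in this thread. The logic is kept free of Device OS calls so it can be tested off-device. On a Photon the inputs would presumably come from `WiFi.ready()`, `Particle.connected()`, `System.freeMemory()` and `millis()`, and the returned action would be mapped to `WiFi.connect()`, `Particle.connect()` or `WiFi.off()`.

```cpp
#include <cstdint>

// All names below are hypothetical; none of them come from Device OS.
enum class NetState { WifiOff, WifiConnecting, WifiReady, CloudConnecting, CloudConnected };
enum class NetAction { None, WifiConnect, CloudConnect, WifiOff };

// Snapshot of the outside world, sampled by the caller once per loop.
struct NetInputs {
    bool wifiReady;
    bool cloudConnected;
    std::uint32_t freeHeapBytes;
    std::uint32_t nowMs;
};

class NetStateMachine {
public:
    NetStateMachine(std::uint32_t minFreeHeapBytes, std::uint32_t timeoutMs)
        : minFreeHeap_(minFreeHeapBytes), timeoutMs_(timeoutMs) {}

    NetState state() const { return state_; }

    // Advances at most one state per call and returns what the caller should do.
    NetAction step(const NetInputs& in) {
        switch (state_) {
        case NetState::WifiOff:
            return enter(NetState::WifiConnecting, in, NetAction::WifiConnect);
        case NetState::WifiConnecting:
            if (in.wifiReady) return enter(NetState::WifiReady, in, NetAction::None);
            if (timedOut(in)) return enter(NetState::WifiOff, in, NetAction::WifiOff);
            return NetAction::None;
        case NetState::WifiReady:
            if (!in.wifiReady) return enter(NetState::WifiOff, in, NetAction::WifiOff);
            // Heap guard: do not start a cloud connection while memory is low.
            if (in.freeHeapBytes < minFreeHeap_) return NetAction::None;
            return enter(NetState::CloudConnecting, in, NetAction::CloudConnect);
        case NetState::CloudConnecting:
            if (!in.wifiReady) return enter(NetState::WifiOff, in, NetAction::WifiOff);
            if (in.cloudConnected) return enter(NetState::CloudConnected, in, NetAction::None);
            if (timedOut(in)) return enter(NetState::WifiOff, in, NetAction::WifiOff);
            return NetAction::None;
        case NetState::CloudConnected:
            if (!in.wifiReady || !in.cloudConnected)
                return enter(NetState::WifiOff, in, NetAction::WifiOff);
            return NetAction::None;
        }
        return NetAction::None;
    }

private:
    NetAction enter(NetState next, const NetInputs& in, NetAction action) {
        state_ = next;
        enteredMs_ = in.nowMs;  // restart the per-state timeout
        return action;
    }

    // Unsigned subtraction keeps this correct across a millisecond counter rollover.
    bool timedOut(const NetInputs& in) const {
        return (in.nowMs - enteredMs_) >= timeoutMs_;
    }

    NetState state_ = NetState::WifiOff;
    std::uint32_t enteredMs_ = 0;
    std::uint32_t minFreeHeap_;
    std::uint32_t timeoutMs_;
};
```

The idea is that any drop or timeout sends the machine back to `WifiOff`, so the radio is restarted from a known state rather than left waiting on a connection that never recovers. This assumes the application manages the connection itself instead of leaving it to automatic mode.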