Turning off the Wireless Access Point causes the grief which is the issue of this ticket:
It does not want to reconnect to the Access Point when it is turned back on (a problem in itself).
More importantly in my use case, every time I get the log line: ERROR: wiced_join_ap_specific(), result: 1024
it blocks loop() and I miss packets of data from the serial port (Serial1).
No doubt I am not the only one to suffer with this! Any suggestions?
There might be issues open on GitHub that touch on that. Make sure to report your findings there to increase the visibility of these issues.
If you don't find an issue that fits well enough, file one yourself.
But to make sure a similar issue hasn't already been addressed and solved, try the latest version (0.8.0-rc.8) first.
Some explanation: I use this NetworkSerialMuxer class to stream from either WiFi or Serial, whichever is available, with a preference for Serial.
To get it stable I had to:
Only look for a new TcpClient when the old one has dropped. I think there are some issues with how they are destroyed. TcpServer keeps a reference to the client. I think it should probably only create it and pass it on. Not keep a copy itself.
Don't call Particle.connect() with no WiFi. It will trigger listening mode, unlike WiFi.connect(WIFI_CONNECT_SKIP_LISTEN). Listening mode sends messages over serial. This interferes with the application's use of Serial! Holy shit, I missed this and it totally messed up the application's Serial reliability.
So now the system accepts one TcpClient at a time and will only start looking for a new one if the client disconnects.
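The "one client at a time" rule above can be sketched in plain C++ roughly as follows. This is only an illustration, not the actual NetworkSerialMuxer code: `Client` and `SingleClientSlot` are hypothetical names, and on a device the accept callback would presumably wrap something like the TCP server's `available()` call.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a TCP client handle. On a device this role would
// be played by the firmware's TCP client type and its connected() method.
struct Client {
    bool alive;
    bool connected() const { return alive; }
};

// Holds at most one client. The accept callback is only invoked when the
// current client is not connected, so a live client is never replaced.
class SingleClientSlot {
public:
    explicit SingleClientSlot(std::function<Client()> accept)
        : accept_(std::move(accept)) {}

    // Call once per loop() pass; returns the client currently in the slot.
    Client& poll() {
        if (!current_.connected()) {
            current_ = accept_();  // look for a new client only after a drop
        }
        return current_;
    }

private:
    std::function<Client()> accept_;
    Client current_{false};  // starts empty (not connected)
};
```

The point of the design is that the slot owns the only copy of the client, so there is no second reference left behind when the connection drops.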
@Elco, agree with your strategy to check WiFi.connected() before calling Particle.connect(), this has worked well for me too!
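As a rough sketch of that ordering (WiFi first, cloud only once WiFi is up), the decision can be written as a small pure function. `LinkState`, `Action` and `nextAction` are made-up names for illustration; on a device the two flags would presumably come from `WiFi.ready()` and `Particle.connected()`, and the actions would map to `WiFi.connect(WIFI_CONNECT_SKIP_LISTEN)` and `Particle.connect()`.

```cpp
// Hypothetical snapshot of the connection state, sampled once per check.
struct LinkState {
    bool wifiReady;       // assumed to mirror WiFi.ready()
    bool cloudConnected;  // assumed to mirror Particle.connected()
};

// What the application should ask the system to do next.
enum class Action {
    ConnectWiFiSkipListen,  // i.e. WiFi.connect(WIFI_CONNECT_SKIP_LISTEN)
    ConnectCloud,           // i.e. Particle.connect()
    None
};

// Never request a cloud connection while WiFi is down, so that listening
// mode (and its serial chatter) is not triggered.
Action nextAction(const LinkState& s) {
    if (!s.wifiReady) {
        return Action::ConnectWiFiSkipListen;
    }
    if (!s.cloudConnected) {
        return Action::ConnectCloud;
    }
    return Action::None;
}
```

Keeping the decision separate from the actual system calls also makes it easy to rate limit how often the chosen action is issued.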
I will have to digest your PiLink code and see if this will help. Note that my application is looking at serial and is also a TCP Client, not a TCP Server, so will have to see if your code can assist with this because your code is a TCP Server.
To reiterate, the issue happens when I purposely turn off the WiFi router, ie no WiFi. Is your (yet to be looked at) code helpful in this situation?
Asynchronous system functions do not block the application thread, even when the system thread is busy, so these can be used liberally without causing unexpected delays in the application. (Exception: when more than 20 asynchronous system functions are invoked, but not yet serviced by the application thread, the application will block for 5 seconds while attempting to put the function on the system thread queue.)
So, if you fire too many async functions and saturate the system thread, the main loop will be blocked.
WiFi.hasCredentials() is synchronous, I'm not sure whether this could cause problems. I have gone through so many iterations that it's hard to remember what caused unreliability in the past. I'm just glad that I found something that works.
But perhaps it's a good idea to remove the hasCredentials() call, because I think it is superfluous with SKIP_LISTEN.
@Elco, I think us two have been going down the same rabbit holes!
I always use (WiFi.ready() && WiFi.localIP()), have not used WiFi.hasCredentials().
Another thing to note is that I am using Serial1 (ie the physical serial port) and not the USB virtual ports Serial or USBSerial1.
I really wish that interrupt driven serial input was implemented because this would have circumvented the blocking issue!
It could well be that your "Asynchronous system functions…" paragraph is the lead that I am looking for, because I am suffering from this pattern:
good
good
good
bad
bad
wait some time...
good
good
bad
bad
ie it looks like the system clogs up after some activity (which points to your theory) or it could be a regular thing that is causing the blocks.
Try rate limiting your system thread calls: how often you try to reconnect to WiFi or the TCP server. My guess is that it is indeed caused by overloading the system thread.
I have found that WiFi.ready() is enough since the bug fixes in 0.8.0.
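A minimal sketch of such rate limiting, assuming a millisecond tick like the one `millis()` provides. `RateLimiter` is a hypothetical helper, not part of the firmware API, and the interval is whatever suits the application.

```cpp
#include <cassert>
#include <cstdint>

// Lets an action through at most once per interval. The first call always
// passes. Unsigned subtraction keeps the comparison correct when the
// millisecond counter wraps around.
class RateLimiter {
public:
    explicit RateLimiter(uint32_t intervalMs) : intervalMs_(intervalMs) {
        assert(intervalMs > 0);
    }

    // nowMs would be millis() on a device. Returns true when the caller may
    // issue its system call (reconnect attempt etc.) on this pass.
    bool ready(uint32_t nowMs) {
        if (started_ && (nowMs - lastMs_) < intervalMs_) {
            return false;
        }
        lastMs_ = nowMs;
        started_ = true;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
};
```

In loop() this would be used along the lines of `if (limiter.ready(millis())) { /* try to reconnect */ }`, so the rest of loop() keeps running at full speed while reconnect attempts are spaced out.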
I just confirmed that removing WiFi.hasCredentials in my code prevents a 4 second block on WiFi loss.
Got it, I missed the WiFi.disconnect() (hence the need to WiFi.connect() again). Nice move and symmetry - I am using a similar strategy with Particle.connect() and disconnect(), but not WiFi… onto it.
@Elco, I have neatened up the WiFi strategy code and moved it out to a function which is now only called every 200 ms, so as to avoid possibly overwhelming the system thread.
During this change, I added some extra logging lines which have shown that loop() is NOT being blocked as I had first thought. This is good news (but embarrassing)!
I am still left with the tricky situation of Serial1 (ie the physical port) not receiving characters reliably. Which got me thinking… I am now pretty sure that I have a hardware issue… I added test code which sends a command to the device in play to elicit a serial response on a regular basis - no issue found, with or without WiFi…
I will use my trusty logic analyser some time next week to test the hardware-issue theory and check that the serial traffic itself is OK.
Apologies to both you and @ScruffR for this wild goose chase…
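For anyone wanting to check the same thing, one way to tell whether loop() is really being blocked is to log the gap between consecutive passes. The sketch below is an assumption about how such a check could look, not the logging actually used here; `LoopWatch` is a hypothetical name and the timestamps would come from `millis()` on a device.

```cpp
#include <cassert>
#include <cstdint>

// Measures the time between consecutive loop() passes and flags long gaps.
class LoopWatch {
public:
    explicit LoopWatch(uint32_t thresholdMs) : thresholdMs_(thresholdMs) {
        assert(thresholdMs > 0);
    }

    // Call once at the top of every loop() pass with the current time in ms.
    // Returns the gap since the previous pass if it exceeded the threshold,
    // otherwise 0. The first call has no previous pass and returns 0.
    uint32_t tick(uint32_t nowMs) {
        uint32_t gap = started_ ? nowMs - lastMs_ : 0;
        lastMs_ = nowMs;
        started_ = true;
        if (gap > worstMs_) {
            worstMs_ = gap;
        }
        return gap > thresholdMs_ ? gap : 0;
    }

    // Longest gap seen so far, in ms.
    uint32_t worstMs() const { return worstMs_; }

private:
    uint32_t thresholdMs_;
    uint32_t lastMs_ = 0;
    uint32_t worstMs_ = 0;
    bool started_ = false;
};
```

A non-zero return value is the thing to log: if it never shows up while the router is off, loop() is not the part that is stalling.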
Great, thanks for the update.
Depending on the baud rate and data rate you are using, you have to ensure that you are reading the serial port often enough. The serial buffer is small and easily overrun.
I have used this code as a quick and dirty USB to RS485 transceiver:
I can run it at 256000 baud rate.
If I print all of the output to a Python terminal though, Python cannot keep up and the data is lost there. It's the printing to the terminal that's too slow. You have to make sure that you empty the buffers regularly. If they fill up, data is discarded. Even the USB buffer of the desktop is tiny.
@Elco, I will add some clarity to my "serial reliability" comment: the problem is that no packet data at all is received for a transaction, not that chars are missing within a packet.
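To put a number on "regularly", here is a small worked calculation of how long a receive buffer lasts at full line rate. The 64-byte buffer size and the 10 bits per byte (8N1 framing: start bit, 8 data bits, stop bit) are assumptions for illustration, and `microsToFill` is a made-up helper name; check the real buffer size for your platform.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds until a receive buffer of bufferBytes is full when data
// arrives back to back at the given baud rate. bitsPerFrame defaults to 10,
// which assumes 8N1 framing.
uint64_t microsToFill(uint32_t bufferBytes, uint32_t baud,
                      uint32_t bitsPerFrame = 10) {
    assert(baud > 0);
    return 1000000ULL * bufferBytes * bitsPerFrame / baud;
}
```

With those assumptions, `microsToFill(64, 256000)` comes out at 2500, ie the buffer has to be read at least every 2.5 ms during a continuous stream at 256000 baud, while at 9600 baud there is roughly 66 ms of slack.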
My initial thoughts were that serial1 reception was blocked, but it now looks very much like the transaction (actually an event indication) was simply not happening due to the hardware issue.