Hey guys - we’ve been doing some WiFi testing on the Photon with the v0.4.8-rc1 tag prior to deployment and we have seen a couple of interesting issues. I wanted to raise them here to see if anyone has experienced the same, or just to track them in case this is something to be fixed.
The method for testing was as follows:
- Set up the Photon on the programmer shield, connect with GDB and run our codebase.
- Connect the Photon to an Access Point that we can bring up and down at will.
- Repeatedly shut down and restart the Access Point to simulate reconnects.
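For anyone wanting to repeat the test, a small counter like the one below makes it easier to say how many drops it took before a failure. This is only a sketch: LinkTracker and its fields are hypothetical names, not part of the Particle firmware, and it is written as plain C++ so it can be checked off-device.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the Particle firmware): counts link
// transitions so a stuck reconnect can be tied to a specific drop number.
struct LinkTracker {
    bool up = false;           // last observed link state
    uint32_t connects = 0;     // down -> up transitions (includes the first connect)
    uint32_t drops = 0;        // up -> down transitions
    uint32_t downSinceMs = 0;  // timestamp of the most recent drop (0 = since boot)

    // Feed one sample of the link state together with a millisecond timestamp.
    void sample(bool linkUp, uint32_t nowMs) {
        if (up && !linkUp) {
            ++drops;
            downSinceMs = nowMs;
        } else if (!up && linkUp) {
            ++connects;
        }
        up = linkUp;
    }

    // How long the link has currently been down; 0 while it is up.
    uint32_t downForMs(uint32_t nowMs) const {
        return up ? 0 : nowMs - downSinceMs;
    }
};
```

On the Photon this would presumably be fed from loop() with something like tracker.sample(WiFi.ready(), millis()), printing the counters over serial; a downForMs value that keeps growing while the Access Point is back up would correspond to Problem 1 below.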
While carrying out this test, two different problems occurred:
Problem 1
After a lot of reconnects, the Photon will disconnect and not be able to reconnect even when the Access Point appears again. Our code is still running but it never reconnects. We have been seeing this a lot with a test running in an area with bad WiFi as well.
This issue may be related to the new threading capability of the Photon - we have inferred this for the following reason:
From what we can see in the debugger, there is an important function, manage_network_connection, which is called from the function Spark_Idle_Events, which should be handled by the system thread. Previously, we think this was handled by the Particle.process function, but following the call stack down we now see that when the platform has threading enabled, it will process the application thread but no longer process the idle events (as they should be handled on the system thread).
This works normally, but after a seemingly large number of reconnects it seems to stop working. Setting a breakpoint on manage_network_connection shows that it is never called. As a test, a colleague built a small wrapper function into the Particle firmware to call manage_network_connection just after the application thread was processed, BUT without calling the full Spark_Idle_Events.
This seemed to stop the issue, as we could not reproduce it thereafter. It is unlikely to be a solid fix though, so I was hoping to garner more info here if possible.
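To make the call flow above a bit easier to follow, here is a simplified host-side model of what we think is happening. The function names come from the debugger session described above, but every body is a stand-in and the flags (threading, systemThreadRunning, workaround) are assumptions made for illustration, so please don't read this as actual firmware source.

```cpp
// Simplified model of the call flow; NOT firmware source.
struct FlowModel {
    bool threading = true;            // platform built with threading enabled
    bool systemThreadRunning = true;  // assumption: false models the stuck state
    bool workaround = false;          // our colleague's wrapper enabled
    int manageCalls = 0;              // times manage_network_connection ran

    void manage_network_connection() { ++manageCalls; }

    // Idle events include network management plus other work omitted here.
    void Spark_Idle_Events() { manage_network_connection(); }

    // One pass of the system thread; does nothing once it stops servicing.
    void system_thread_tick() {
        if (systemThreadRunning) Spark_Idle_Events();
    }

    // Stand-in for what Particle.process ends up doing on the application side.
    void application_process() {
        // ... application thread work would be processed here ...
        if (!threading) {
            Spark_Idle_Events();          // non-threaded path still runs idle events
        } else if (workaround) {
            manage_network_connection();  // wrapper: only this call, not full idle events
        }
    }
};
```

In this model, with threading on and the system thread no longer servicing idle events, nothing calls manage_network_connection unless the wrapper is enabled, which matches what the breakpoint showed. It says nothing about why the system thread would stop, which is the part we are hoping to get insight on.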
Problem 2
An SOS. This sometimes occurs after multiple repeated disconnects. I have very little information about this right now but we have managed to reproduce it 3 times. Just to note, this was vanilla firmware, not including the ‘fix’ we added in Problem 1. Sometimes, rather than seeing the issue in Problem 1, it would simply SOS.
We have noted a distinct pattern though - on the re-connection immediately before the one that causes the SOS, the Photon will rapidly flash green, rapidly flash red (or orange?) twice, then immediately go to breathing cyan. We saw this in all three cases of the SOS.
While we can’t explain it, it is a tell for when the SOS is about to happen, so we can maybe attach GDB to a re-connection function just before we reconnect again. This might lead to a bit more information - I’m trying to get back to this investigation at the minute but thought I’d add this here to see if it generates any insight.
As always - thanks