We have a real tricky problem with setting WiFi credentials via softAP on about 100 units…
At present we have about 1,000 Photons installed and working great on site. However, a new customer with a rather large market (think millions) wanted a product custom made for their own enclosure, so we simply used the same components rearranged to suit a different shape of board. Simple, right?
After producing 100 of these new products, early tests at our offices were positive so we sent them off (Feb 2020).
Next thing we knew they were getting all sorts of problems with the WiFi settings, so we went down to take a look thinking it would be something trivial; not quite. Here we reproduced the problem on several randomly chosen units:
Power on for first time - blinking blue - connect to Photon via WiFi shows “Connecting” -> “Obtaining IP address” -> “Connected, no internet”.
Entering credentials works fine. Let’s say the WiFi password was entered incorrectly: the Photon enters listening mode again (blinking blue). No change in firmware.
Attempt to connect to the Photon via WiFi again and now the connection switches between “connecting” and blank. It sometimes gets to “checking internet connectivity” for a second before blanking again. We are never able to connect.
Things we have tried:
Swapping the WiFi antenna (several different ones, with firmware that specified the external antenna)
Loading updated firmware and entering safe mode
The weird thing is, the problem cannot be reproduced at our offices, only at the customer!
The project, due to Corona, has been put on hold but we will need to solve this as soon as lockdown restrictions are lifted which will be soon.
This is evidently a hardware problem (because we have the exact same components running fine elsewhere) so I’m not expecting an absolute solution, but please help as I’m at a loss!!
Next time we test, what is the best way to accurately record what is going on to pinpoint the problem?
If it is an environment-dependent problem, you may want to check their WiFi bands. A lot of people never address the difference between 2.4 GHz and 5 GHz because their equipment (phones, computers, printers, TVs, etc.) doesn’t care and can connect to both. If they have a 5 GHz SSID broadcasting, make sure it has a different name than the 2.4 GHz one.
What device OS version are you targeting?
What SYSTEM_MODE mode are you running?
Are you using SYSTEM_THREAD(ENABLED)?
Are you using retained variables?
Have you tried clearing the WiFi credentials and then tried adding them again?
@ScruffR Not sure what the default OS on the out-of-box Photons is, but the OS downloaded (if the WiFi details are correct) is currently 0.6.2, running in automatic mode with no multithreading or retained variables.
I may be wrong but don’t think it’s a firmware issue because we have made 100s of others (slightly different PCBs) using the same batch of photons without these problems.
We did try clearing the WiFi credentials and still unable to connect to the softAP second time.
@Mjones the problem isn’t due to connecting to the customer WiFi but with connecting to the photon WiFi (softAP). But that’s not to rule out the potentially interfering impact of multiple broadcasting frequencies in the environment!
The question is how best to test next time we are there witnessing the issue?
I have got circa 1000 photons in products - we use SoftAP to setup the WiFi credentials plus other things around the setup to make the process a little more understandable and predictable.
I am still not clear from your answer which device OS you are using. If 0.6.2 then I think you need to upgrade to 1.4.4 (avoid 1.5.X as there is a memory issue). 1.4.4 properly supports WPA Enterprise which is a must for us.
Again not clear of your answer to @ScruffR question - I use SYSTEM_MODE(SEMI_AUTOMATIC); and SYSTEM_THREAD(ENABLED);
I don’t want to tell you something if you already know but SoftAP uses a lot of RAM and if your free memory is less than 23K + 15K (minimum heap) when you start the setup process then odd shit happens. What I do is check for free memory before entering setup and if it is too low then tell the user and advise a restart. The other thing I do is only ever store 1 credential and before entering a new credential I clear all credentials. I know up to 5 credentials can be stored but with 1 it is much easier to determine whether the user has entered bad data. From what @Mjones has mentioned it could also be that the customer’s WAPs might be setup in a way that does not help the connection process.
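The memory check described above can be sketched roughly as follows. This is only an illustration: the helper name and constants are hypothetical, the 23K and 15K figures are taken from the post, and reading "K" as 1024 bytes is an assumption. On a Photon the free-heap value would come from System.freeMemory().

```cpp
#include <cstdint>

// Figures quoted in the post: SoftAP needs roughly 23K on top of a 15K minimum heap.
// Treating "K" as 1024 bytes is an assumption.
constexpr uint32_t kSoftApBytes  = 23u * 1024u;
constexpr uint32_t kMinHeapBytes = 15u * 1024u;

// Hypothetical helper: true when there is enough free heap to start the setup process.
// On a Photon, pass in the value returned by System.freeMemory().
bool enoughMemoryForSetup(uint32_t freeBytes) {
    return freeBytes >= kSoftApBytes + kMinHeapBytes;
}
```

In firmware this would be called just before entering listening mode, for example enoughMemoryForSetup(System.freeMemory()), and if it returns false the user is told to restart instead of starting setup. Calling WiFi.clearCredentials() before accepting a new credential gives the "only ever store 1 credential" behaviour.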
You said the new products are in a different box, but you use an external antenna: is that actually working? Do you measure the signal strength and data quality?
I recently had to create a little WiFi’o’meter - Photon + SSD1306 OLED display on a breadboard powered by a powerbank. It just constantly tries to connect to the entered WAP credentials and reads out the signal dB and % quality and whether WiFi or cloud connected (the RGB LED works as well but for a non-technical user not so clear). It highlighted some odd things where “nothing had changed” but a microwave oven was really screwing the signal in one spot. Possibly a different issue - I would try the memory thing first.
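As a rough idea of what the readout logic of such a meter might look like, here is a minimal sketch. Everything in it is hypothetical: the function names, the line format, and especially the dBm-to-percent mapping (a simple linear scale from -100 dBm to -50 dBm), which is an assumption and not necessarily how the original meter computed quality. It assumes the signal reading arrives as a dBm value, as WiFi.RSSI() reports on older device OS versions.

```cpp
#include <cstdio>
#include <string>

// Assumed mapping: -100 dBm or weaker is 0 %, -50 dBm or stronger is 100 %,
// linear in between. This is an illustrative choice only.
int rssiToPercent(int rssiDbm) {
    if (rssiDbm <= -100) return 0;
    if (rssiDbm >= -50) return 100;
    return 2 * (rssiDbm + 100);
}

// Builds one line of text for a small display or a serial log.
std::string meterLine(int rssiDbm, bool wifiReady, bool cloudConnected) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%d dBm %d%% wifi:%s cloud:%s",
                  rssiDbm, rssiToPercent(rssiDbm),
                  wifiReady ? "Y" : "N",
                  cloudConnected ? "Y" : "N");
    return std::string(buf);
}
```

On the device the two flags would come from WiFi.ready() and Particle.connected(), and logging the same line over serial with a timestamp gives a record that can be compared between sites.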
@armor we’re using the default system mode and no multithreading in our current firmware. We kept to 0.6.2 precisely because of the RAM issues with softAP, which we understand all too well; see the thread “Listening mode on the Photon cannot work reliably in current implementation”.
I just checked and the batch of Photons we have come programmed with firmware v0.5.5 by default.
On a separate note, can you point me in the direction of the 1.5.x memory issue details please? We are currently developing on top of this firmware and haven’t noticed any memory issues yet (apart from what has been experienced before).
A WiFi’o’meter is a good way to test. As well as dB and % quality, what other specific details about the environment would be useful?
The problem is not related to connecting to the customer WiFi, we are not getting that far. To be clear:
The problem is connecting to the Photon itself i.e. Photon-ABCD123.
The problem persists across multiple firmware versions, including the factory default firmware.
The problem does not appear the very first time you enter WiFi details but does on all subsequent attempts. I.e. out of the box, you can connect, disconnect and reconnect many times, but as soon as those WiFi details are sent over, regardless of whether they are correct, you will not be able to connect to the Photon softAP again.
The problem only appears at the customer premises (so far). Which is why I’d like to do a wide range of tests while there so we might be able to reproduce the problem back at the lab.
So after PMing with @armor and Particle enterprise support, we went back to the customer with Photon Cloud Debug firmware modified with SerialLogHandler (LOG_LEVEL_ALL).
We tried multiple things but kept getting wiced_join_ap_specific(), result: 1006 on the required WiFi. Afterwards the Photon softAP stopped responding until the WiFi credentials were completely cleared.
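For anyone wanting to summarise a long capture like this, a small desktop-side tally of the join results can be sketched as below. It only assumes that each relevant log line contains the text "wiced_join_ap_specific(), result: " followed by a number, as in the message quoted above; the rest of the log line format is not assumed, and the function name is hypothetical.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Counts how often each result code appears after the join marker.
// Lines without the marker, or without a number after it, are skipped.
std::map<int, int> tallyJoinResults(const std::vector<std::string>& lines) {
    static const std::string marker = "wiced_join_ap_specific(), result: ";
    std::map<int, int> counts;
    for (const auto& line : lines) {
        const std::string::size_type pos = line.find(marker);
        if (pos == std::string::npos) continue;
        const char* start = line.c_str() + pos + marker.size();
        char* end = nullptr;
        const long code = std::strtol(start, &end, 10);
        if (end == start) continue;  // no digits followed the marker
        ++counts[static_cast<int>(code)];
    }
    return counts;
}
```

Feeding it the saved serial output line by line shows at a glance whether the failures are always the same code or a mix.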
With a network analyser, we realised the problem was embarrassingly simple… The required WiFi had both 2.4GHz and 5GHz bands on the same SSID.
As we all know, the Photon’s radio is 2.4GHz only, so 5GHz does not play nice.
This poses a bit of a problem, because even though it is rare for an AP to have both bands with the same SSID, the perception for this customer is that the device does not work. I will request them to change the SSID of one of the bands but I know what the answer will be (as it’s a large company with a few hundred phones/laptops connected at any one time).
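The check that the network analyser made possible can be sketched as a small desktop-side routine: given scan results from some external tool, list every SSID that is seen on both bands. The struct, the function names and the input format are all hypothetical. The band is inferred from the channel number (1 to 14 as 2.4GHz, 36 to 177 as 5GHz), which is a simplification. The data has to come from a dual-band scanner, since the Photon itself only sees 2.4GHz.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One row of scan output from an external analyser (hypothetical format).
struct ScanEntry {
    std::string ssid;
    std::string bssid;
    int channel;
};

enum class Band { GHz2_4, GHz5, Unknown };

// Simplified channel-to-band rule: 1..14 is 2.4GHz, 36..177 is 5GHz.
Band bandForChannel(int channel) {
    if (channel >= 1 && channel <= 14) return Band::GHz2_4;
    if (channel >= 36 && channel <= 177) return Band::GHz5;
    return Band::Unknown;
}

// Returns the SSIDs broadcast on both bands, in alphabetical order.
std::vector<std::string> ssidsOnBothBands(const std::vector<ScanEntry>& scan) {
    std::map<std::string, std::set<Band>> bandsBySsid;
    for (const auto& entry : scan) {
        const Band band = bandForChannel(entry.channel);
        if (band != Band::Unknown) bandsBySsid[entry.ssid].insert(band);
    }
    std::vector<std::string> shared;
    for (const auto& kv : bandsBySsid) {
        if (kv.second.size() == 2) shared.push_back(kv.first);
    }
    return shared;
}
```

Running this on a site survey before installation would flag the shared-SSID situation up front, so it can be raised with the customer’s IT team early.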
Best solution: Is there a way to isolate and connect to the specific AP MAC address?
And that was proposed as a possible cause in the first reply to the opening post.
BTW, the 2.4GHz vs. 5GHz issue seems more to be an AP issue than an issue with the Photon itself. I am running multiple mixed networks with one SSID for both bands without issue.
However, I've also seen some "well meaning" APs that aggressively try to force all devices to 5GHz irrespective of their capabilities and hence causing problems.
So some (potentially superfluous) testing is always better than assuming wrong.
Fair comment but we didn’t rule anything out, all suggestions were welcomed and tested when we finally were able to get to the customer today. My question to the community was how to best record and test for all potential problems. Thank you for helping.
So… Is there a way to isolate and connect to the specific AP MAC address?
We tried multiple things but kept getting wiced_join_ap_specific(), result: 1006 on the required WiFi.
Just interested to know: did you always get a 1006 error, or sometimes other errors? [Edit] A reflection overnight: the hypothesis I am trying to prove is that a Photon (or P1, both use WICED) can sometimes, but not consistently, bomb out when trying to connect to a WAP that uses the same SSID for its 2.4 GHz and 5 GHz WLANs. Frustratingly, with some WAPs it consistently won’t connect at all, with some it consistently doesn’t seem to matter, and with others it is totally inconsistent. As an illustration, I have a DrayTek AP in the office and my customer used to have these in their showroom. When their new IT company enabled the 5 GHz band, we started seeing issues with devices that sleep-wake cycle to reduce power use (the devices are battery powered). They have now replaced the DrayTeks with Ubiquiti UniFi APs and the problems have gone away.
Glad to hear that you have found the cause of the problem. WICED does seem to struggle with certain makes and device OS on certain WAPs and having the 5GHz and 2.4GHz with the same name does appear to be one thing to avoid but often that isn't under our control!
I tried using WICED error codes to determine whether the credentials entered were incorrect but found it too unreliable/inconsistent so moved to only having one SSID stored on the device which of course removes the possibility of a backup WAP. In reality though, most customer sites have many WAPs all with the same SSID but on different channels and WICED seems to handle that OK.
Yes, I have about 20 mins worth of connection logs and it was always a 1006 error.
And yes, the password was definitely correct! Although it did have a few “!” characters but alas this had nothing to do with it.