On new wifi network, Photon briefly connects then turns solid cyan (unreachable)

In an attempt to mimic what a customer receiving my hardware would go through, today I took my Photon to a local customer’s location, put it into listening mode, and then used the Particle iOS app to apply the new wifi credentials (skipping authorization without claiming the device). The process with the app was actually incredibly smooth with no errors; I was very impressed and optimistic.

After the setup was complete, however, my Photon was stuck on solid cyan. I would unplug / replug the power to get it to reconnect and start breathing cyan. However, after 10-20 seconds, it would then get stuck on solid cyan again (I attempted this several times). When I brought the Photon home and plugged it in, it immediately started breathing cyan continuously, no issues.

One thing I will say is my experience of the wifi connection at this customer’s location (on my laptop) was that it was quite slow, and with my other app (Android) they’re using, there’s been at least one weird wifi issue in the past where only resetting the router resolved it.

This is from my console, does anyone know what the issue might be?

{
  "device": {
    "network": {
      "signal": {
        "at": "Wi-Fi",
        "strength": 76,
        "strength_units": "%",
        "strengthv": -62,
        "strengthv_units": "dBm",
        "strengthv_type": "RSSI",
        "quality": 67.74,
        "quality_units": "%",
        "qualityv": 30,
        "qualityv_units": "dB",
        "qualityv_type": "SNR"
      },
      "connection": {
        "status": "connected",
        "error": 0,
        "disconnects": 0,
        "attempts": 1,
        "disconnect_reason": "none"
      }
    },
    "cloud": {
      "connection": {
        "status": "connected",
        "error": 0,
        "attempts": 1,
        "disconnects": 0,
        "disconnect_reason": "none"
      },
      "coap": {
        "transmit": 0,
        "retransmit": 0,
        "unack": 0,
        "round_trip": 0
      },
      "publish": {
        "rate_limited": 0
      }
    },
    "system": {
      "uptime": 15,
      "memory": {
        "used": 39512,
        "total": 82944
      }
    }
  },
  "service": {
    "device": {
      "status": "ok"
    },
    "cloud": {
      "uptime": 0,
      "publish": {
        "sent": 2
      }
    }
  }
}

followed by:

{"service":{"device":{"status":"unreachable"},"cloud":{"uptime":22,"publish":{"sent":2}},"coap":{"round_trip":null}}}

followed by “offline”. I’ve also attached a photo of logs.

There are a few more things you should try

  • How does the device behave on the customer's WiFi when in Safe Mode?
  • What happens when you clear your stored WiFi credentials before connecting it to the new network?

And when it starts breathing cyan but then stops, chances are that your code is doing something funky which hurts the connection there but not on your desk.
However, without seeing what your code does it’s hard to tell :wink:


Hi @policenauts -

I have had this issue before, but it was not limited to a specific WiFi network. The problem was in my code.

I think a quick way to determine whether it is in your code or not might be to flash the Tinker App (or use another Photon on the same network but with Tinker flashed) and see whether it behaves the same. If it behaves the same, the error is on the network. If it functions normally, well, then there is a glitch in the code somewhere.

Good luck!
Friedl.


@ScruffR @friedl_1977 thanks both! I will try Safe Mode and removing previously stored wifi credentials. I’ll also try to flash Tinker.

My code is quite simple as I’m just streaming weight values from a scale to Particle Cloud and using readStringUntil() with a known terminator character (in this case ETX). I’m a beginner so here it is in its ugly, raw form (I know I should migrate away from String to a char array, I just haven’t yet quite figured out how to do that).

I have experienced this same issue before where if I don’t include the terminator character, the data streams in and the String size quickly becomes overwhelmed, resulting in solid cyan and locked up Particle device. However, in this case at the customer site, it was happening even with no cable plugged in, so I was confused as to how the code was impacting things.

bool isAvailable = false;
char c;
String string;
String string1;

void setup(){
    Serial1.begin(9600);
    Particle.variable("isAvailable", isAvailable);
    Particle.variable("c",c);
    Particle.variable("string",string);

    Particle.publishVitals(5);
}

void loop(){
    isAvailable = Serial1.available();
    c = Serial1.read(); // i know this doesn't do anything, this was me beginning to try and play with char
    string = Serial1.readStringUntil('\x03'); // etx terminator character
}

Hi @policenauts

Hmm, OK. I am testing your code on a Photon, trying to obtain even weaker vitals than the ones you posted as a worst-case scenario.

It has been running smoothly for just short of 30 minutes, so much longer than the 10-20 seconds you described. Of course this is also without anything connected, but if I understand correctly you encountered the problem regardless of whether the sensor is connected or not?

From here I would guess the problem then lies with the device or the network. Another easy test to eliminate the WiFi is to use another network, maybe your phone's hotspot? If the problem occurs even on another network, chances are it lies with the device.

Of course I would also follow the advice given by @ScruffR and clear the device and do the other 'usuals', i.e. update the firmware, re-flash the firmware/bootloader files, clear all stored WiFi credentials and so on.

Let me know how it goes!

Regards,
Friedl.

Thank you @friedl_1977! I took your advice and repeated the same process where I set up a new wifi connection, but to the hotspot on my mobile device - it connected without issue and breathed cyan and I was able to successfully query the variables via Particle Cloud.

I’ll try next week to do another test where I clear the credentials and try again. I’ll also try and do the same thing with my Argon (though I won’t be able to bypass authentication in that case). If that doesn’t do the trick, is there any known issue with respect to Photon and certain routers / configurations? Maybe the 20 vs. 40 MHz issue previously identified by @mterrill?

Edit: I’d forgotten that the Photon doesn’t work on 5 GHz and I didn’t think to check with my customer regarding their wifi. But would it have successfully connected via the Particle app in the first place?

Edit #2: After more research, I think my issue is similar to this. I’m confirming with my customer, but I did only see one wifi SSID and I’m guessing / hoping it’s the issue where they have both 2.4 and 5 GHz bands on the same SSID (per other threads).

Hi @policenauts -

Apologies for the delay, I was writing exams :slight_smile:

You will have no problem connecting to the Photon from the Particle app, as that is a connection between your phone and the device and does not use the WiFi network you are connecting to. After connecting to the device though, I would assume 5 GHz might present some problems, but my guess would be that it would be immediate and not give you the 20 s you have been experiencing. I will set up my router and confirm.

EDIT:// As per my suspicion, if you connected to a network at your client's premises, it would have been to the one broadcasting on 2.4 GHz. The Photon is not detecting the 5 GHz networks, obviously making it impossible to connect to them altogether.

From the image you can clearly see two SSIDs. Both are 2.4 GHz. The one broadcasting at 5 GHz is not being detected.

Usually the SSID will differ slightly between the 2.4 GHz and 5 GHz bands, i.e.

  • My_home_WiFi 2.4GHz
  • My_home_WiFi 5.0GHz

These of course can be edited but, in my experience, not to the exact same SSID. Again, I will test this and report back.

EDIT:// Also as per my suspicion, I was not able to rename both to exactly the same SSID. Well, at least not on one of the two routers I have (Netgear and TP-Link). A little beside the point anyway, as the Photon is unable to detect the 5 GHz one :slight_smile:

Regards,
Friedl.

Thank you Friedl! I did see in other threads in this community (here and here) that there are definitely cases of SSIDs having the same name, which allows for the initial connection, but then the router will force the Photon to the 5 GHz band, which will cause the disconnect. I am still waiting to hear back from my customer if that’s the case with their setup, but I strongly suspect this applies to my situation as well. Will update when I find out!

@policenauts I’ll bet you a case of beer that it’s very simply the customer’s wifi router setting for channel width / bandwidth on the 2.4 GHz network. It needs to be set to 20 MHz, otherwise the router will try to increase bandwidth by using two wifi channels (40 MHz or ‘auto’). When the router wanders past the 20 MHz that the old Particle understands, the conversation is completely missed.

It’s a known hardware limitation, official response years ago was that they didn’t see it as an issue (which is crazy talk considering devices will just fall quietly offline but keep breathing cyan every 20-60 mins on most modern wifi routers) and it’s not easily found anywhere in documentation or common issues. I seem to be the only person who ever talks about it, perhaps it’s my background as a network engineer, but mostly because of the continual and frequent support headache that it is.

The other company of note that gets the issue is LIFX, and their support forums are littered with folk recommending that people buy an old BGN router and attach all their lights and IoT devices to that. What they haven’t quite figured out is that the real cause is the channel width / bandwidth setting. The older BGN routers could only do 20 MHz, so they quietly inherit the setting as there was no other possible setting.

Our recent blog article walks folk through some wifi optimisation tips and how to find / change their 20mhz setting: https://smartfirebbq.com/blogs/news/improving-your-home-wi-fi-for-smartfire-and-everything-else


Can I get in on this case of beer?? Would come in handy, especially as alcohol sales in SA have been prohibited due to lockdown measures :rofl:

I have my home router set to Auto and have never had a Photon or Argon presenting this behaviour. Very curious now, I will force it to 40 MHz and see what happens :see_no_evil:

Alcohol sales are prohibited in SA? Surely not! Here in Melb we’re now in Stage 4 and I can assure you all our bottleshops etc are fully functional!

If you’re absolutely sure, I’m happy to accept your wager as I’m a big fan of Prancing Pony! I do feel however it’s a bit unfair as I know the answer after having spent a decade as a network engineer.

To explain a little bit:

  • Router manufacturers pushed the spec to bond two adjoining 20 MHz channels so they could have double the speed.
  • From the early days the BGN specs clearly defined the behaviour that routers were meant to follow, i.e. if a 20 MHz constrained client showed up, the router would shuffle comms back to 20 MHz for all clients. They called it the fat bit!
  • Your screenshot looks like a TP-Link (great routers). Auto means that if the router feels like it, sees higher activity in one 20 MHz channel, or simply has a client that wants to push 40 MHz, it’ll push comms across a 40 MHz range.

Very simply, this is all a horror show for little old Particles. They get left there holding the tin can and string expecting to hear the full bidirectional conversation over the single channel width, whereas the router is happily broadcasting across a full 40 MHz spectrum as it sees fit.

This means the behaviour is erratic and dependent on what is happening in the network with the mix of clients, and also on whether you have neighbours who are broadcasting in overlapping channels! The router will try to optimise within the 40 MHz, but the Particle is stubbornly constrained to the stated channel width of its allotted channel.
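To put rough numbers on the channel-width point, here is a small sketch of the arithmetic for the 2.4 GHz band. It relies only on the standard channel plan (channels 1-13 are centred at 2407 + 5 × channel MHz) and on the nominal 20 MHz width; the function names are made up for illustration, and the code makes no claim about how the Photon's radio actually reacts to a bonded channel.

```cpp
// Frequency range in MHz occupied by a transmission.
struct Span {
    int low_mhz;
    int high_mhz;
};

// Centre frequency of 2.4 GHz channels 1-13 in MHz.
constexpr int centerMhz(int channel) {
    return 2407 + 5 * channel;
}

// A plain 20 MHz channel occupies centre +/- 10 MHz.
constexpr Span span20(int channel) {
    return Span{centerMhz(channel) - 10, centerMhz(channel) + 10};
}

// A bonded 40 MHz channel adds a second 20 MHz block directly above
// (secondaryAbove == true) or directly below the primary channel.
constexpr Span span40(int primaryChannel, bool secondaryAbove) {
    return secondaryAbove
        ? Span{centerMhz(primaryChannel) - 10, centerMhz(primaryChannel) + 30}
        : Span{centerMhz(primaryChannel) - 30, centerMhz(primaryChannel) + 10};
}

// True when a 20 MHz-only listener tuned to `channel` covers the whole span.
constexpr bool coveredBy20(Span s, int channel) {
    return s.low_mhz >= span20(channel).low_mhz &&
           s.high_mhz <= span20(channel).high_mhz;
}
```

For primary channel 6 with the secondary block above, `span20(6)` is 2427-2447 MHz while `span40(6, true)` is 2427-2467 MHz, so `coveredBy20(span40(6, true), 6)` is false: half of the bonded transmission lies outside what a 20 MHz-only client on channel 6 is listening to.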

So, expect to see a particle sitting there glowing cyan like a grinning idiot who has no idea they’re offline. You’ll be able to watch a particle subscribe log trail of the events going quiet, and possibly a green light flash sequence after about 5 minutes when the particle finally realises it’s not connected. Generally this will happen once every 50 minutes, sometimes as often as every 5 minutes. It’s up to the network traffic.
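One way to avoid trusting the LED is to track confirmed round trips in the application itself. The sketch below is plain C++ and only does the bookkeeping; the class name `LinkWatchdog` and the timeout are assumptions for illustration, not part of the Particle API, and whether this suits a given product is a judgment call.

```cpp
#include <cstdint>

// Tracks the last confirmed round trip to the cloud and reports when it is stale.
// Hypothetical helper for illustration; not part of the Particle API.
class LinkWatchdog {
public:
    explicit LinkWatchdog(std::uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Call whenever a round trip is confirmed, e.g. after an acknowledged publish.
    void noteSuccess(std::uint32_t nowMs) { lastSuccessMs_ = nowMs; }

    // True when no confirmed round trip has happened within the timeout.
    // Unsigned subtraction keeps the comparison correct across timer wrap-around.
    bool isStale(std::uint32_t nowMs) const {
        return static_cast<std::uint32_t>(nowMs - lastSuccessMs_) > timeoutMs_;
    }

private:
    std::uint32_t timeoutMs_;
    std::uint32_t lastSuccessMs_ = 0;
};
```

In firmware, one possible wiring (again an assumption, not something confirmed in this thread) would be to call `noteSuccess(millis())` when a `Particle.publish()` sent with the `WITH_ACK` flag reports success, and to call `Particle.disconnect()` followed by `Particle.connect()`, or `System.reset()`, once `isStale(millis())` returns true.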

By the way, the newer Gen 3 platform (Argon etc.) has a 40 MHz capable wifi chip. Don’t use the P1 for new production projects. Just like the major chipset companies like Texas Instruments and Espressif, Particle would ideally be listing the P1/P0/Photon as ‘not recommended for new designs’.

Lastly, what really makes this into a train smash is the latest generation of home routers that are doing channel and band steering of clients! A lot of the major ISPs are rolling out routers that have taken the enterprise wifi approach of band steering and baked it into consumer grade stuff. We’ve had a lot of support dramas with them. Basically need to tell the customer they have to turn off the fancy new features on their shiny new router and explain why they can’t have the nice things because of the product you sold them.

Hi @mterrill -

Haha yes indeed... for a couple of months now. So are cigarettes. The latter does not bother me at all, but it is a severe tax loss of billions to the country, and the wine industry is hanging by a thread. Anyway, not the place for politics I suppose :joy:

I am sure you are correct. I just love solving puzzles, so I am now intrigued by the fact that I have not encountered this on my (or one of my clients') networks in the last three years. I have several devices posting data at 1 s intervals, so I should have noticed this. I also have products using Blynk, and Blynk informs my client if the product goes offline (when the Blynk server loses its connection to the device), so I am not relying on the Photon to "think" it is online.

You are correct, the screengrab is that of TP-Link firmware. I managed to destroy my Netgear Nighthawk thanks to load shedding :rage: I also have a smaller Netgear router I used before, so I will try that as well and will force my TP-Link to 40 MHz to be sure it is broadcasting at that. We have 15+ devices connected, so I am sure at least one of them should be "asking" for 40 MHz, but let's make sure :slight_smile:

After being in IT (wireless network security) myself previously for just shy of 16 years, I am also aware that anomalies can happen more than you would like. I once "went to war" with a stupid Nashua printer for an entire day to get it to simply allow incoming connections :exploding_head:

If this is indeed the case, I agree, it should be made more obvious! I did however decide some time ago that my MCUs of choice would be the Argon/Boron due to their similar footprint and pinout. This should allow a single PCB design to serve as either a WiFi or a cellular product. Having said that, I have quite a couple of Photons in the field, so this is good info to have, thanks!!

I will post my findings... :nerd_face:

@mterrill thanks for chiming in - still no word from my customer (I’m still prototyping with Particle so not super urgent), but will try and confirm the next time I’m there.

I’d be happy to use Argon over Photon, but my understanding is there’s no way for a customer to add wifi credentials via the app (without claiming the device) nor can you add a Soft AP page - so absent getting their wifi credentials in advance, I think I’m stuck with Photon for the time being!

You probably do not have a touch screen on your project. One of ours that I am working on will have a nextion and I made a page on there to set wifi info. Still do not have a good way of doing it on a device without a screen.

Thank you. You’re right, no touchscreen unfortunately.

Don’t worry, you’ll find the symptoms. Surprised none of your clients have called yet going ‘why isn’t it online and responding?’

We use 15s intervals on posting. That’s something different. Maybe yours are keeping an active connection?

ps, sorry, only just realised you meant South Africa! I incorrectly jumped to thinking you meant South Australia.

I will check this out today, Calculus is keeping me busy, have a test coming up this Friday :slight_smile:

Well, here it is front page news as well, the guys in charge just don't give a damn :see_no_evil: