BRN404X with 3rd Party SIM DNS problem

We are trying to use a 3rd party SIM with the BRN404X in North America. The SIM provided is for this region.

We have switched to the external SIM and set the APN to the one provided, but we are not getting a connection. It looks like a DNS issue; see the debug output below.

We have successfully used 3rd party SIMs in the European version of the Boron, so we are familiar with the process of switching, but there seems to be something else happening here.

Are there some other settings relating to DNS we need to change for this setup?

0000004041 [ncp.client] INFO: Using external Nano SIM card
0000007748 [mux] INFO: Starting GSM07.10 muxer
0000007749 [mux] INFO: Openning mux channel 0
0000007750 [mux] INFO: GSM07.10 muxer thread started
0000007753 [mux] INFO: Openning mux channel 1
SIM ICCID = 89882390000580768771

Module Serial: P044AD309TLBPD8
Module BRN404X
LTE Modem
Vodafone SIM supported
Active SIM : 1
UMNOPROF = ((0,1,2,3,4,5,20,21,28,32,38,39,41,43,47,90,100,102,199,201,206))

UBANDMASK = ((0-1),185481375,1048642)

0000012743 [system.nm] INFO: State changed: IFACE_UP -> IFACE_LINK_UP
0000012744 [system.nm] INFO: State changed: IFACE_LINK_UP -> IP_CONFIGURED
0000012745 [system] INFO: Cloud: connecting
0000012748 [system] WARN: Failed to load session data from persistent storage
0000013718 [system] ERROR: Failed to determine server address
0000013719 [system] WARN: Cloud socket connection failed: -230
0000014201 [system] WARN: Internet test failed: DNS
0000014201 [system] WARN: Handling cloud error: 2
0000014801 [system] INFO: Cloud: connecting
0000014803 [system] WARN: Failed to load session data from persistent storage
0000015806 [system] ERROR: Failed to determine server address
0000015807 [system] WARN: Cloud socket connection failed: -230
0000016273 [system] WARN: Internet test failed: DNS
0000016273 [system] WARN: Handling cloud error: 2
0000016873 [system] INFO: Cloud: connecting
0000016875 [system] WARN: Failed to load session data from persistent storage

Is the message Active SIM : 1 in your firmware? If the result is from Cellular.getActiveSim(), 1 is the internal SIM and 2 is the external SIM.

Yes, the Active SIM : 1 is from our firmware; 0 = internal, 1 = external.

There are no settings for DNS.

It's not completely clear from the log, but typically that occurs when the SIM is not fully configured for IP. Though the odd part is that it does appear to have successfully set up the PDP session (TCP/IP layer), and usually at that point you can use UDP and DNS. In rare cases, you might be able to establish the PDP session but the operator won't allow IP traffic on your SIM.

In any case, since it's a 3rd-party SIM it can't be debugged from the Particle side.

One thing I forgot about: make sure the APN is set correctly. An incorrect APN would cause the symptoms you are seeing. The PDP session will come up, but with an invalid APN all data will be silently dropped in both directions.

We have checked the APN.
The carrier is asking for a list of all the destinations the device needs to communicate with. Would you be able to supply this?

The list of IP addresses is available here; however, it is subject to change.

We are still talking with our SIM provider about the connectivity problem.
On the BRN404X we are seeing the LED flashing cyan, and then about every 16 to 17 seconds we get a single flash of green, followed by two flashes of yellow, and then back to flashing cyan.
What does this LED sequence mean?

I think that might be 2 orange blinks, which look a lot like yellow. That's "Could not reach the internet." which is consistent with the behavior you are seeing. The device is not able to send (or maybe receive) UDP packets on the PDP session.

I'm having what appears to be the same issue, using an external SIM on the BRN404X with OS 4.2.0. Most of the day it connects to T-Mobile's tower and then proceeds to connect to the Particle cloud (I'm running SYSTEM AUTOMATIC mode). All runs well. I have Particle.keepAlive(120); set. Then, all of a sudden, I get flashing cyan. I can see in my Log.info stream that it's trying to reconnect to Particle. Eventually it flashes orange and prints:
[system] WARN: Internet test failed: DNS
[system] WARN: Handling cloud error: 2
[system] INFO: Cloud: connecting
[system] WARN: Failed to load session data from persistent storage

This can go on for some time all the while staying connected to Cellular, and sadly eating 3rd party data allowance.

BTW, I have code that runs in loop() that will reset the Boron via System.reset(RESET_NO_WAIT) after 20 minutes of this.
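
A reset guard like the one described could be sketched as follows. This is only a minimal sketch, not the poster's actual code: the `DisconnectWatchdog` name and its interface are hypothetical, and the timing logic is written as plain C++ so it can be checked off-device. In firmware, `nowMs` would come from `millis()` and `connected` from `Particle.connected()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports when the cloud connection has been down
// continuously for at least a given limit (20 minutes in the post above).
class DisconnectWatchdog {
public:
    explicit DisconnectWatchdog(uint32_t limitMs) : limitMs_(limitMs) {
        assert(limitMs > 0);
    }

    // Call once per loop() pass. Returns true when a reset is due.
    bool update(bool connected, uint32_t nowMs) {
        if (connected) {
            tracking_ = false;      // connection is fine, stop timing
            return false;
        }
        if (!tracking_) {
            tracking_ = true;       // first pass since the connection dropped
            downSinceMs_ = nowMs;
            return false;
        }
        // Unsigned subtraction stays correct across the 32-bit millis() rollover.
        return static_cast<uint32_t>(nowMs - downSinceMs_) >= limitMs_;
    }

private:
    uint32_t limitMs_;
    uint32_t downSinceMs_ = 0;
    bool tracking_ = false;
};
```

With a `DisconnectWatchdog watchdog(20 * 60 * 1000);` at file scope, `loop()` might then contain `if (watchdog.update(Particle.connected(), millis())) System.reset(RESET_NO_WAIT);`.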

All this can go on for 30-60 minutes or so before it finally connects to Particle cloud and runs fine sometimes for hours.

Any insight? Do you need me to run some tests/logs for you?

The logs of the period while it's working, through the transition to it not working would be helpful. Also a log of it working normally with your SIM.

I changed the keepalive interval from 120 to 60 seconds and it’s been running for over 12 hours with zero disconnects. I’ll be experimenting more today with this, but suspect this was my issue.

I know with 3rd party SIMs this data activity counts against my monthly allowance (roughly 5 MB at a 60 s interval). I’m trying to work with my SIM card provider to somehow minimize this.
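
As a rough cross-check of that figure, the arithmetic can be sketched like this. The function names are hypothetical, 1 MB is taken as 10^6 bytes, a month as 30 days, and all of the observed usage is assumed to be keep-alive traffic, which is a simplification. Under those assumptions a 60 s interval means 43,200 pings a month, so 5 MB implies about 116 billed bytes per ping, and usage scales inversely with the interval (about 2.5 MB at 120 s, about 10 MB at 30 s).

```cpp
#include <cassert>
#include <cstdint>

// Number of keep-alive pings sent over `days` days at the given interval.
constexpr uint64_t pingsPerPeriod(uint32_t intervalSeconds, uint32_t days = 30) {
    return (static_cast<uint64_t>(days) * 24u * 60u * 60u) / intervalSeconds;
}

// Billed bytes per ping implied by an observed monthly total, assuming
// every billed byte is keep-alive traffic.
double impliedBytesPerPing(double observedMegabytes, uint32_t intervalSeconds) {
    assert(intervalSeconds > 0);
    return observedMegabytes * 1e6 /
           static_cast<double>(pingsPerPeriod(intervalSeconds));
}

// Projected monthly usage in MB at a different keep-alive interval.
double projectedMegabytes(double bytesPerPing, uint32_t intervalSeconds) {
    assert(intervalSeconds > 0);
    return bytesPerPing *
           static_cast<double>(pingsPerPeriod(intervalSeconds)) / 1e6;
}
```

The real per-ping cost depends on how the carrier rounds and bills each packet, so these numbers are only a ballpark.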

My guess is I won’t get very far.

I suppose the cellular providers have to make money, even if it’s a couple of bucks extra a month off IoT hobbyists like me.

Kudos to Particle for not counting connection overhead against allowance. Maybe one day the BRN will work outside of North America…

It's not surprising that you need a keep-alive of 60 seconds. Some carriers require a 30-second keep-alive!

All mobile devices use carrier-grade NAT. Your Particle device and even your phone are not assigned an IP address on the Internet. When your Particle device sends to the cloud, it uses UDP and the carrier NAT creates a port forwarding pathway back to the device transparently.

Since ports are a finite resource (fewer than 64,000 per IPv4 public IP address), carriers like to reclaim unused ones as quickly as possible. Typically, 30 seconds to a couple of minutes with no data transmission (sending or receiving) will release the port forwarding.

The Particle SIM is unusual in that we negotiated a period of 23 minutes to reduce data usage.

Thanks for that feedback. Interesting. Since the u-blox modem reports an IPv4 address, I thought that was being used. Not really sure how cellular companies will handle millions of these IoT devices should they start getting installed in, say, water meters.

This morning I changed the keepalive to 360 seconds. All has been running with no disconnects for hours. I’m wondering if they dynamically adjust their timeouts based on load. I’ll leave it running and see what happens.

Another thought I had was that yesterday I was reprogramming the Boron (via USB/DFU) a lot during some firmware dev, and as a result had dozens of sudden disconnects from T-Mobile. Maybe their algorithm sees “misbehaving” devices and acts a bit stricter. We shall see since I’m letting it run today.

Any chance you know the cost in terms of data usage per reconnection? I’m assuming it’s based on how many cloud variables I expose, but maybe just a ballpark? I might consider a once or twice a day 1 hour operational window for my device.


So I've let it run with keepAlive set to 4 minutes. I haven't lost connection, but I just noticed these on the Console. What does this mean? Am I close to losing connection?

You may or may not be losing the connection. It looks like it's happening every 4 or 8 minutes, which does not seem coincidental. When you get an offline event at the same time as an online event, it's because the Particle cloud did not realize the device was offline when it got an online indication. This is not surprising because it requires two missed default keep-alives (46 minutes) for a device with no contact to be marked as offline.

One less-bad possibility is that your carrier is changing your IP address or port every 4 or 8 minutes, which would briefly make the device unreachable from the cloud. Because your keep-alive is short, that would be a pretty small window of time, so it probably wouldn't affect operation that much.

Thanks for your reply. I appreciate very much you looking at this.

I've run more tests with varying keepAlives. During the tests I made a point of not interacting with the Boron. What happens then is the Offline/Online pairs as seen above are exactly the keepAlive period apart. My conclusion is that they appear as the result of the Boron sending out a keepAlive ping.

At no time did the cell drop the connection, nor did the Boron think the Particle connection was gone (I had consistent breathing cyan).

Next, I tried requesting a variable via the Console with keepAlive at 10 minutes. I would not get a response unless I asked for the data right after one of those pairs showed up.

I have contacted my SIM provider. The support person was very knowledgeable and helpful, and told me a PDP (packet data protocol) cellular session typically doesn't time out for several hours. She thought the timeout had to be in a different layer, perhaps on a router, firewall, or server.

To me that appears to be correct, based on the observed behavior. Any suggestions for tests?

I plan now to find the largest keepAlive that doesn't produce these pairs.

It's not a PDP session disconnection issue; that would cause the status LED to change.

It's the UDP return port forwarding that's being removed, and the symptoms match exactly what you are seeing. With almost all SIM cards other than the Particle SIM, the keep-alive must be between 30 seconds and 2 minutes. With anything longer, the device will still breathe cyan, but you won't be able to communicate from the cloud side until the keep-alive occurs again.

If you set the keep-alive too long you will also get an offline and online at every keep-alive interval.
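
The effect can be illustrated with a simplified model of the carrier's port forwarding. This is only a sketch under stated assumptions: the function names are hypothetical, the timeout is treated as a single fixed value, and the keep-alive is assumed to be the only traffic. Real carrier NAT behavior may vary.

```cpp
#include <cassert>
#include <cstdint>

// True if a cloud-initiated packet would still find a port forwarding,
// i.e. the last packet in either direction is younger than the timeout.
bool cloudCanReachDevice(uint32_t secondsSinceLastPacket,
                         uint32_t natTimeoutSeconds) {
    return secondsSinceLastPacket < natTimeoutSeconds;
}

// Fraction of each keep-alive cycle during which the device is reachable
// from the cloud, assuming no traffic other than the keep-alive itself.
double reachableFraction(uint32_t keepAliveSeconds, uint32_t natTimeoutSeconds) {
    assert(keepAliveSeconds > 0);
    if (keepAliveSeconds <= natTimeoutSeconds) {
        return 1.0;  // the forwarding is refreshed before it can expire
    }
    return static_cast<double>(natTimeoutSeconds) /
           static_cast<double>(keepAliveSeconds);
}
```

For example, with a 10-minute keep-alive and a 2-minute timeout, `reachableFraction(600, 120)` gives 0.2: cloud-side requests would only get through during the first fifth of each cycle.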

Ok thanks, I think I get it. The timeout appears to be 150 seconds btw. After that, you are very correct, I get no response via the Console until next keepAlive comes along. (Of course, each request I send from the Console restarts that 150s timeout).

If I understand you correctly, these UDP packet return ports are managed by the Cellular co.

Again, thanks so much for your help. I have a better understanding. I might be OK with setting the keepAlive fast during "business hours", and slow it way down at off-peak hours.

Yes, you are correct, and adjusting it would work. The only catch is that when the keep-alive is too long you won't be able to do any cloud-initiated operations like functions, variables, and OTA, except in the 150-second interval after the device pings the cloud.
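
A time-of-day schedule along those lines might look like the sketch below. The function name, the business-hours window, and the off-peak value are all illustrative assumptions; the 60-second peak value is the interval reported earlier in the thread as running without disconnects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical schedule: short keep-alive during business hours so the
// device stays reachable, long keep-alive off-peak to save data.
uint32_t keepAliveSecondsForHour(int hour) {
    assert(hour >= 0 && hour <= 23);
    const uint32_t kPeakSeconds = 60;       // interval that ran cleanly above
    const uint32_t kOffPeakSeconds = 1200;  // illustrative value only
    const bool businessHours = (hour >= 8 && hour < 18);  // assumed window
    return businessHours ? kPeakSeconds : kOffPeakSeconds;
}
```

In firmware this could be applied periodically with something like `Particle.keepAlive(keepAliveSecondsForHour(Time.hour()));`, assuming the time zone has been configured so that `Time.hour()` reflects local time. During the off-peak window the device would only be reachable from the cloud shortly after each ping.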