Boron 1.5.2 long reconnect interval issues

If you search my history, you will find that a purely software issue with an earlier Particle Device OS update once caused me extreme trouble with my Borons (relating to PMIC charging).

Now, with 1.5.2, I am witnessing a partial return of the “never reconnects”, “flashes green ad infinitum” software issue that rendered the Boron totally useless until 1.3.1-rc1.

I have a Boron, just updated to 1.5.2, that never reconnects after I flash it. I must power cycle it manually. This differs from the old issue: during program execution, if it loses the cell signal, it will reconnect eventually. Even this works very unsatisfactorily, though, sometimes taking a long time to reconnect when brought back into an area of perfect signal where a power cycle would establish the connection in a mere 20 seconds.

The never-reconnects failure is confined to the reconnect that follows flashing an update.

I understand it’s my responsibility to post my code. But before I become an unpaid Particle debugger by flashing an empty program, testing, flashing the old OS, testing my current code, and then testing a blank program (the logically proper control tests), can I ask if anyone else is experiencing this?

The Boron will be such a great product once it works reliably. In the meantime, it is extremely frustrating to see the potential utility of the product plagued by issues like this.

Thank you.

This is the first time I’ve heard of this.
Is this an OTA update or via USB?

Honestly, the best way to help you is if you create a support ticket. We’ll need to see what the device is doing on the backend, and therefore need the DeviceID/signal levels etc.
Without logs we are blind.

Thank you, I appreciate this. Perhaps I used the wrong terminology when I said “flash”. I am referring to using the Web IDE to deploy code changes over the internet to my Boron. I’ve never compiled C code and programmed a Boron over USB the way one would an Arduino.

I am lacking motivation and time, but if downgrading the Device OS doesn’t fix this I will post more details. I have great feedback about how excellent Particle support is. Respectfully though I am not a beta tester and would love to have a more stable product. But I am sticking with Particle for as much as I can.

There are many reasons for a device to misbehave. As I said earlier, we would love to assist, but we need to see what the device is doing to narrow down the cause.


Thank you. After about a half-hour, it finally reconnected after the new flash. Therefore I will amend the title of this thread. The issue is not never reconnecting after a flash; the issue is taking exorbitantly long to reconnect under circumstances where the cellular signal is so good that, with a power cycle in the same place and time, the connection would be established in less than 30 seconds.

I will see if I can collect more data and I invite others to comment on this.

Is there a way to control the aggressivity of reconnect attempts by Boron?

Can a low-level Particle firmware engineer please explain to me what is actually going on under the hood with the modem when it loses connection and tries to reconnect?

Is there some sort of attempt interval?

Is it entirely controlled by the UBLOX firmware and not under the control of Particle?

Is there a way to use more power and bandwidth in order to restore the connection faster?

Again, I am observing 15 to 30 minutes for a reconnect at a time and place where a 1-second manual, in-person power cycle would reboot the Boron and have it connected within 30 seconds.

@Paul_M Just to add to Paul’s comments, I have been observing similar issues and it’s very frustrating. Any recommendation?


Hi Paul,

I’ll answer what I can. Again, we need more information to properly debug this.

“Is there a way to control the aggressivity of reconnect attempts by Boron?”

Not without modifying DeviceOS, and this will create issues with our MVNO.

“Is there some sort of attempt interval?”

Roughly 10mins - again, to keep local towers from blacklisting your sim card.

“Is it entirely controlled by the UBLOX firmware and not under the control of Particle?”

As far as I know, DeviceOS does most of the connectivity management.

Do you see the same behaviour in 1.4.2?

Thank you Chris @no1089.

I am writing at a time when my remote station has been offline for three hours, though it would almost certainly reconnect in an instant if power cycled.

While I do not have all the information you are looking for, I wanted to share that I have reflected heavily on the quotation “Roughly 10mins - again, to keep local towers from blacklisting your sim card.”

I am upset to find myself looking for alternate cellular IoT solutions because of this.

  1. I am certain that something more is going on in the Boron than a retry every 10 minutes that is equivalent to a power-cycle reset. This is from a LOT of anecdotal experience with programming Borons over the last 1.5 years. The station I have which has currently been offline for 3 hours (it will eventually reconnect) has a high-gain cellular antenna, at a spot where it was possible for the Boron to connect with one of those flimsy paper antennae.

  2. I am certain that it is possible for the Boron to attempt reconnection more than once every 10 minutes and not get “blacklisted” because there are hundreds of thousands of cell-phones which go in and out of service (such as when I walk into Walmart and then go outside), and never does my Android phone get “blacklisted”, nor does it take more than a few seconds for it to regain LTE connection.

  3. I am certain that it is possible for the Boron to attempt reconnection more than once every 10 minutes because there are instances of it losing the connection (e.g., antenna removed) and reconnecting (antenna back on) almost instantly, within 10 seconds. This happens sometimes, and sometimes it goes on for hours without reconnecting. It does always eventually reconnect, which is excellent, but the delay is the problem as compared to power cycling.

Hi Paul,

The Boron can certainly cycle a connection faster, but in order to keep the local carriers and MVNO happy, 10 minutes is the recommended timeout. More than 6 connections per hour is considered “aggressive” by the carriers, which is why 10 minutes is a good ballpark.
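
As a rough illustration of the arithmetic behind that guidance (more than 6 attempts per hour counted as aggressive, hence roughly 10 minutes between attempts), the sketch below derives a minimum retry spacing from an hourly budget. The names `minRetryIntervalMs` and `retryAllowed` are hypothetical and not part of Device OS. This only models the stated limit; it is not a description of how Device OS schedules reconnects internally.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of this is Device OS API.
constexpr std::uint32_t kMsPerHour = 60u * 60u * 1000u;

// Minimum spacing between connection attempts for a given hourly budget.
// A budget of zero means "never retry", modelled here as the largest interval.
constexpr std::uint32_t minRetryIntervalMs(std::uint32_t maxAttemptsPerHour) {
    return maxAttemptsPerHour == 0 ? UINT32_MAX
                                   : kMsPerHour / maxAttemptsPerHour;
}

// True once enough time has passed since the last attempt.
// Unsigned subtraction keeps the comparison valid across a 32-bit
// millisecond counter rolling over.
constexpr bool retryAllowed(std::uint32_t nowMs,
                            std::uint32_t lastAttemptMs,
                            std::uint32_t maxAttemptsPerHour) {
    return (nowMs - lastAttemptMs) >= minRetryIntervalMs(maxAttemptsPerHour);
}
```

With a budget of 6 attempts per hour, `minRetryIntervalMs(6)` works out to 600000 ms, i.e. the 10 minutes quoted above. Application code that wanted to respect the same budget could pass its millisecond tick count as `nowMs`.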

If you repeatedly power cycle your device (<5mins usually), you can end up in a state where the local tower blacklists your sim - I’ve done it accidentally before. It’s a protracted process to resolve, as the local carrier must release the block.

This is an LTE Boron? R410M modem?


@no1089 For what it’s worth, I can confirm these issues absolutely surpass the presumptively valid 10-minute minimum retry interval.

I have Borons going hours without reconnecting that, if power cycled, would reconnect instantly.

I have Borons that I manually power cycle 10 times in a 10 minute sitting to debug something, and they never get blacklisted.

Of course, I have Borons that never reconnect even using “stable” 1.3.1-rc1, and then I have two lucky, random Borons on 1.3.1-rc1 that have been reconnecting and behaving perfectly in the field for months. A 20% success rate is extremely frustrating.

What testing does Particle actually do with the Boron LTE connection stability? Does it not do any testing? Has Particle ever booted up a Boron without the antenna, added it 10 minutes later, and made sure it eventually connects?

I just did this with a 1.3.1-rc1 Boron. It is STILL flashing green, hours later, and will likely continue for eternity.

Hi Paul,

Without deeper context into what your code is doing, commenting on why your devices do not reconnect is tricky. I don’t doubt your experience with the frequent connection cycles, but that is the guidance we have received from our MVNOs, and as mentioned before, getting a SIM off of the blacklist takes time.

We have test fleets that run through various connectivity and publishing tests; any issues identified are addressed. Our cellular team does more detailed tests with RF equipment to characterise behaviour in various RF conditions.

Physically removing and replacing the antenna is not recommended; you can easily damage the TX circuitry. But yes, tests along these lines are done.

DeviceOS 1.3.1-rc.1 is not a recommended version to be on. I would recommend nothing below 1.4.4, and highly encourage you to test on 2.0-rc.1 as it contains various improvements.

If you have a specific example on 2.0-rc.1 that you can reliably replicate and provide instructions for, please create a ticket and I’ll personally take it to engineering to fix.

@no1089 Chris, I very much appreciate the reply containing this helpful information.

I very strongly disagree with you about 2.0rc1 vs 1.3.1rc1 and have performed the exact kind of controlled test you mentioned to prove my theory in the last post here:

Hi Paul,

Thank you for linking that post with your thoughts on 2.0. Without seeing your code, and how it handles reconnects, I cannot replicate, nor comment on the behaviour and failure to reconnect. Without understanding where the issue lies, we cannot address it in 2.0.

Did you observe this same behaviour while running Tinker on DeviceOS 2.0-rc.1? LTE-M is not available locally, so I cannot test this myself. My 2G/3G devices behave as expected on 2.0-rc.1. Same with the B523 on LTE CAT 1.

@no1089 Thanks again for following up. You make quite a reasonable point about needing it to be on Tinker. It was not on Tinker, but I am 100% sure that is non-dispositive. It was a SEMI_AUTOMATIC program where loop() checks if (!Cellular.ready()) { Cellular.on(); } else if (!Particle.connected()) { Particle.connect(); }
My understanding is that this is no different from what Tinker does.
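
For readers following along, the decision that loop makes on each pass can be modelled as a small pure function. This is a minimal sketch with hypothetical names (`Action`, `nextAction`); it does not call Device OS. In real firmware, the first branch would correspond to `Cellular.on()` and the second to `Particle.connect()`, as described in the post.

```cpp
// Hypothetical model of the loop() check described in the post.
// It only decides what to do; it performs no modem or cloud calls.
enum class Action {
    TurnModemOn,   // corresponds to Cellular.on() in the post
    ConnectCloud,  // corresponds to Particle.connect() in the post
    Nothing        // cellular is ready and the cloud is connected
};

// cellularReady  : what Cellular.ready() would report
// cloudConnected : what Particle.connected() would report
constexpr Action nextAction(bool cellularReady, bool cloudConnected) {
    return !cellularReady  ? Action::TurnModemOn
         : !cloudConnected ? Action::ConnectCloud
                           : Action::Nothing;
}
```

One thing this makes visible: while cellular is not ready, only the first branch ever runs. As far as the Device OS documentation describes it, `Cellular.on()` powers the modem but joining the network is a separate step (`Cellular.connect()` or `Particle.connect()`), so whether the loop alone ever reaches the second branch depends on code not shown in this thread.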

Unfortunately I’m unwilling to put more effort into testing this at the moment because I am a decently happy camper with 1.3.1-rc1 right now, despite the disappointment of a few Borons being totally destroyed physically and unrecoverable for inexplicable reasons (see my last thread), and despite the frustration that your Boron product gets locked up in a no-LED state if VUSB is power switched by a relay while RX/TX is connected to a powered microcontroller (other thread I made).

Hi Paul,

Well if you can send me some code to try it on 2G/3G I will take a look.
Tinker uses Automatic mode, and relies on that solely to maintain the connection.

While I understand your position - we are at a point where we want to fix all of the connectivity issues so that DeviceOS 2.0 can be relied on to maintain connectivity as much as physically possible, so it is worth everyone sending in reproducible bugs so that our team can address it, and that everyone has access to the latest features.

I can’t comment on the Borons being destroyed right now, I’ll have to look at the post.

I am aware of the other issue. Cycling power via VUSB is not recommended: because the NRF is a low-power microcontroller, the leakage current on the RX/TX pins is enough to sustain operation, so when you power the PMIC again, the device does not “boot” as it normally would.
All watchdog implementations should use the EN pin to power down the PMIC.

With low-power microcontrollers, leakage current is always an issue; I’ve seen large customers struggle with incorrectly placed pull-ups that keep their devices powered.
This type of operation is beyond Particle’s control, and it is incumbent on the circuit designer to ensure they do not introduce errant sources of current to the device.


Chris @no1089, thank you for also addressing the critically important reset issue: the hardware watchdog on the Boron causes it to boot into DFU mode permanently (see my thread), and an external hardware watchdog locks up the Boron (I can email you a video I took of the Boron in such a dead state at my site when I travelled there, and of it instantly booting up again on power cycle).

Are you aware of the sentiment expressed on this forum that cycling EN is not always sufficient to fully reset the device out of certain very bad conditions? Do you agree with that sentiment? Are you aware of it, but disagree with it? If I thought the EN pin were a sufficient means of resetting the device, it would have been much easier for me to use that instead of the full power switch anyway. But, after reading a lot of other experience on this forum, EN toggling is not a panacea.

This is a known problem due to the ESD diode on all GPIO pins. This is worth a read and suggests a current-limiting resistor to prevent the MCU from powering off:

https://devzone.nordicsemi.com/f/nordic-q-a/18964/nrf52832-power-off-reset-with-uart-connected
