Boron LTE Fails to Connect

This is in response to @mstanley’s post in the thread “Over 20 Devices just went down”. I’m creating this thread to capture findings.

The symptom is a device that refuses to make a cellular connection (continuously flashing green). As such, initial setup may fail (if the issue occurs out of the box).

Cloud debug output looks like the following:

clouddebug: press letter corresponding to the command
a - enter APN for 3rd-party SIM card
k - set keep-alive value
c - show carriers at this location
t - run normal tests (occurs automatically after 10 seconds)
or tap the MODE button once to show carriers
starting tests...
turning cellular on...
deviceID=<redacted>
manufacturer=u-blox
model=SARA-R410M-02B
firmware version=L0.0.00.00.05.06 [Feb 03 2018 13:00:41]
ordering code=SARA-R410M-02B
IMEI=<redacted>
IMSI=u-blox
ICCID=<redacted>
0000020797 [app] INFO: enabling trace logging
attempting to connect to the cellular network...
0000118048 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000118048 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000129703 [hal] ERROR: Failed to power off modem
0000129703 [hal] ERROR: Failed to power off modem
0000150805 [hal] ERROR: No response from NCP
0000150805 [hal] ERROR: No response from NCP
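Several lines in this output appear twice with identical timestamps, which makes longer captures hard to read. As a rough aid, here is a minimal Python sketch that parses lines of this shape, drops exact consecutive repeats, and counts the distinct errors. The function names and the regular expression are my own and are based only on the log format visible above, not on any Particle tooling.

```python
import re
from collections import Counter

# Matches lines such as:
#   0000129703 [hal] ERROR: Failed to power off modem
# The 10-digit field is assumed to be a timestamp; no unit is assumed here.
LOG_LINE = re.compile(r"(\d{10}) \[([\w.]+)\] ([A-Z]+): (.*)$")


def parse_log(text):
    """Return (timestamp, category, level, message) tuples.

    Lines that are not log entries (menu text, modem info) are skipped,
    and an entry identical to the one just before it is dropped.
    """
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.search(line.strip())
        if not match:
            continue
        entry = (int(match.group(1)), match.group(2), match.group(3), match.group(4))
        if entries and entries[-1] == entry:
            continue  # exact repeat of the previous entry
        entries.append(entry)
    return entries


def error_summary(entries):
    """Count ERROR entries by (category, message)."""
    return Counter((cat, msg) for _, cat, level, msg in entries if level == "ERROR")


SAMPLE = """\
attempting to connect to the cellular network...
0000129703 [hal] ERROR: Failed to power off modem
0000129703 [hal] ERROR: Failed to power off modem
0000150805 [hal] ERROR: No response from NCP
0000150805 [hal] ERROR: No response from NCP
"""

if __name__ == "__main__":
    for (category, message), count in error_summary(parse_log(SAMPLE)).items():
        print(f"{count}x [{category}] {message}")
```

Saving the serial output to a file and passing its contents to parse_log gives a shorter list that is easier to compare between firmware versions.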

Hey @trpropst

Thanks for posting. This particular issue sounds like it may be related to the initialization issue resolved in v0.9.1 (and in the later Device OS releases that include that fix).

The first thing I’d like to check on is the device OS version your unit is running on. Would you be able to provide that here?

This device is running v1.1.0-rc.2.

Hmm. That should have been resolved in that revision. It’s odd you’re still seeing it. And this is on setup in this case, correct?

I might have you grab some mobile setup logs after a failed setup to see whether they give us any additional information.

You can grab setup logs on the “Your Devices” screen in the upper right hand corner. Back out to this screen after you run into the issue in setup and feel free to provide them here or in a direct message.

In my case, with this device, the initial setup went fine and the device ran for about a week. Then it stopped connecting to the cloud. I flashed it with the cloud debug tool, and it has failed to connect to cellular ever since. That is the debug output you see in the initial post here. Of course, now that cellular fails, the setup process always fails at the point where the cellular connection is attempted.

I tried to go back to 0.9.0 this week and the device would not operate (I suspect because of a bootloader version mismatch). I am now unable to get it to accept any firmware version and it is stuck in DFU mode. If you have recovery processes to try, I can possibly fix it and capture more information. Otherwise this device is a brick. I’m waiting for more to become available.

Are you able to flash anything to the device over CLI while it is in DFU mode? That would be my next go-to recommendation here. It’s possible there’s a bootloader mismatch or that the device OS or user app somehow got corrupted. It’s hard to say for certain.

I was only able to flash (and have a functioning device) if I used system part 1.1.0-rc.2. Today I attempted to walk back one version at a time, starting with the bootloader and then the matching system part. I got back to 0.9.0 so that I could run the pre-built cloud debug tool.

The only thing that changed was that the modem gets hard reset and then power cycled after failing to respond (also no modem information is returned where it was previously). Then it seems to respond to AT commands but does not connect and later fails to respond to a power cycle. The hard reset is repeated and on it goes. I’ve attached the output in case you are interested.

Serial monitor opened successfully:
clouddebug: press letter corresponding to the command
a - enter APN for 3rd-party SIM card
k - set keep-alive value
c - show carriers at this location
t - run normal tests (occurs automatically after 10 seconds)
or tap the MODE button once to show carriers
c0000011679 [hal] ERROR: Failed to power off modem
starting tests...
turning cellular on...
0000036005 [hal] ERROR: No response from NCP
0000036006 [system.nm] INFO: State changed: DISABLED -> IFACE_DOWN
deviceID=<redacted>
manufacturer=
model=
firmware version=
ordering code=
IMEI=
IMSI=
ICCID=
0000117155 [app] INFO: enabling trace logging
attempting to connect to the cellular network...
0000117158 [system.nm] INFO: State changed: IFACE_DOWN -> IFACE_REQUEST_UP
0000117158 [system.nm] INFO: State changed: IFACE_DOWN -> IFACE_REQUEST_UP
0000117161 [hal] TRACE: PPP netif -> 8
0000117161 [net.ifapi] INFO: Netif pp3 state UP
0000117161 [net.ifapi] INFO: Netif pp3 state UP
0000117163 [system.nm] INFO: State changed: IFACE_REQUEST_UP -> IFACE_UP
0000117163 [system.nm] INFO: State changed: IFACE_REQUEST_UP -> IFACE_UP
0000117163 [hal] TRACE: Modem already on
0000117167 [hal] TRACE: Setting UART voltage translator state 1
0000118168 [ncp.at] TRACE: > AT
<repeated many times...>
0000137174 [ncp.at] TRACE: > AT
0000138175 [hal] ERROR: No response from NCP
0000138175 [hal] ERROR: No response from NCP
0000138176 [hal] TRACE: Setting UART voltage translator state 0
0000138177 [hal] TRACE: Hard resetting the modem
0000149177 [hal] TRACE: Powering modem on
0000149327 [hal] TRACE: Modem powered on
0000149328 [net.pppncp] TRACE: Failed to initialize ublox NCP client: -210
0000149329 [hal] TRACE: Setting UART voltage translator state 0
0000149330 [hal] TRACE: Powering modem off
0000149331 [hal] TRACE: Setting UART voltage translator state 0
0000160932 [hal] ERROR: Failed to power off modem
0000160932 [hal] ERROR: Failed to power off modem
0000161033 [hal] TRACE: Modem already on
0000161035 [hal] TRACE: Setting UART voltage translator state 1
0000162036 [ncp.at] TRACE: > AT
0000162039 [ncp.at] TRACE: < OK
0000163040 [hal] TRACE: NCP ready to accept AT commands
0000163041 [ncp.at] TRACE: > AT+UGPIOC?
0000163046 [ncp.at] TRACE: < +UGPIOC:
0000163047 [ncp.at] TRACE: < 16,255
0000163047 [ncp.at] TRACE: < 19,255
0000163048 [ncp.at] TRACE: < 23,0
0000163049 [ncp.at] TRACE: < 24,255
0000163050 [ncp.at] TRACE: < 25,255
0000163051 [ncp.at] TRACE: < 42,255
0000163052 [ncp.at] TRACE: < OK
0000163053 [ncp.at] TRACE: > AT+UGPIOR=23
0000163059 [ncp.at] TRACE: < +UGPIOR: 23,1
0000163059 [ncp.at] TRACE: < OK
0000163060 [hal] INFO: Using internal SIM card
0000163060 [hal] INFO: Using internal SIM card
0000163062 [ncp.at] TRACE: > AT+CPIN?
0000163067 [ncp.at] TRACE: < +CPIN: READY
0000163068 [ncp.at] TRACE: < OK
0000163069 [ncp.at] TRACE: > AT+CCID
0000163075 [ncp.at] TRACE: < +CCID: 89014103271226607248
0000163076 [ncp.at] TRACE: < OK
0000163077 [ncp.at] TRACE: > AT+COPS=2
0000163090 [ncp.at] TRACE: < OK
0000163091 [ncp.at] TRACE: > AT+CEDRXS=0
0000163096 [ncp.at] TRACE: < OK
0000163097 [ncp.at] TRACE: > AT+CPSMS=0
0000163101 [ncp.at] TRACE: < OK
0000163102 [ncp.at] TRACE: > AT+CEDRXS?
0000163107 [ncp.at] TRACE: < +CEDRXS:
0000163108 [ncp.at] TRACE: < OK
0000163109 [ncp.at] TRACE: > AT+CPSMS?
0000163126 [ncp.at] TRACE: < +CPSMS:0,,,"01100000","00000000"
0000163127 [ncp.at] TRACE: < OK
0000163128 [ncp.at] TRACE: > AT+CMUX=0,0,,1509,,,,,
0000163137 [ncp.at] TRACE: < OK
0000163138 [gsm0710muxer] INFO: Starting GSM07.10 muxer
0000163138 [gsm0710muxer] INFO: Starting GSM07.10 muxer
0000163140 [gsm0710muxer] INFO: Openning mux channel 0
0000163140 [gsm0710muxer] INFO: Openning mux channel 0
0000163140 [gsm0710muxer] INFO: GSM07.10 muxer thread started
0000163140 [gsm0710muxer] INFO: GSM07.10 muxer thread started
0000163195 [gsm0710muxer] INFO: Resuming channel 0
0000163195 [gsm0710muxer] INFO: Resuming channel 0
0000163196 [gsm0710muxer] INFO: Openning mux channel 1
0000163196 [gsm0710muxer] INFO: Openning mux channel 1
0000163297 [gsm0710muxer] INFO: Resuming channel 1
0000163297 [gsm0710muxer] INFO: Resuming channel 1
0000163299 [gsm0710muxer] INFO: Resuming channel 1
0000163299 [gsm0710muxer] INFO: Resuming channel 1
0000163301 [ncp.at] TRACE: > AT
0000163351 [ncp.at] TRACE: < OK
0000163352 [hal] TRACE: NCP state changed: 1
0000163353 [net.pppncp] TRACE: NCP event 1
0000163354 [hal] TRACE: Muxer AT channel live
0000163355 [hal] TRACE: PPP thread event LOWER_DOWN
0000163356 [hal] TRACE: PPP thread event ADM_DOWN
0000163358 [hal] TRACE: PPP thread event ADM_UP
0000163360 [hal] TRACE: State NONE -> READY
0000163360 [ncp.at] TRACE: > AT+CIMI
0000163401 [ncp.at] TRACE: < 310410122660724
0000163402 [ncp.at] TRACE: < OK
0000163403 [ncp.at] TRACE: > AT+CGDCONT=1,"IP","10569.mcs"
0000163451 [ncp.at] TRACE: < OK
0000163451 [ncp.at] TRACE: > AT+CEREG=2
0000163501 [ncp.at] TRACE: < OK
0000163501 [hal] TRACE: NCP connection state changed: 1
0000163502 [net.pppncp] TRACE: NCP event 2
0000163503 [net.pppncp] TRACE: State changed event: 1
0000163504 [ncp.at] TRACE: > AT+COPS=0
0000163504 [hal] TRACE: PPP thread event LOWER_DOWN
0000163551 [ncp.at] TRACE: < OK
0000163551 [ncp.at] TRACE: > AT+CEREG?
0000163601 [ncp.at] TRACE: < +CEREG: 2,3
0000163602 [ncp.at] TRACE: < OK
0000178703 [ncp.at] TRACE: > AT+CEREG?
0000178751 [ncp.at] TRACE: < +CEREG: 2,3
0000178752 [ncp.at] TRACE: < OK
0000193852 [ncp.at] TRACE: > AT+CEREG?
0000193901 [ncp.at] TRACE: < +CEREG: 2,3
0000193901 [ncp.at] TRACE: < OK
0000209002 [ncp.at] TRACE: > AT+CEREG?
0000209051 [ncp.at] TRACE: < +CEREG: 2,3
0000209051 [ncp.at] TRACE: < OK
0000224153 [ncp.at] TRACE: > AT+CEREG?
0000260850 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000260850 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000260855 [gsm0710muxer] INFO: Stopping GSM07.10 muxer
0000260855 [gsm0710muxer] INFO: Stopping GSM07.10 muxer
0000260905 [gsm0710muxer] INFO: GSM07.10 muxer thread exiting
0000260905 [gsm0710muxer] INFO: GSM07.10 muxer thread exiting
0000260908 [gsm0710muxer] INFO: GSM07.10 muxer stopped
0000260908 [gsm0710muxer] INFO: GSM07.10 muxer stopped
0000260910 [hal] TRACE: Setting UART voltage translator state 0
0000260911 [hal] TRACE: Powering modem off
0000260912 [hal] TRACE: Setting UART voltage translator state 0
0000272512 [hal] ERROR: Failed to power off modem
0000272512 [hal] ERROR: Failed to power off modem
0000272514 [hal] TRACE: NCP state changed: 0
0000272515 [net.pppncp] TRACE: NCP event 1
0000272615 [hal] TRACE: Modem already on
0000272617 [hal] TRACE: Setting UART voltage translator state 1
0000273618 [ncp.at] TRACE: > AT
<this is repeated many times...>
0000292622 [ncp.at] TRACE: > AT
0000293623 [hal] ERROR: No response from NCP
0000293623 [hal] ERROR: No response from NCP
0000293624 [hal] TRACE: Setting UART voltage translator state 0
0000293625 [hal] TRACE: Hard resetting the modem
^CSerial connection closed.
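One detail in the log above is the repeated +CEREG: 2,3 response. In the read form of this command the second number is the registration status, and 3GPP TS 27.007 defines status 3 as "registration denied". A small Python sketch that pulls these responses out of a saved log and names the status follows. The function name and regular expression are illustrative only, and it handles just the two-field read response seen here, not unsolicited +CEREG reports.

```python
import re

# <stat> values for +CEREG as defined in 3GPP TS 27.007.
CEREG_STAT = {
    0: "not registered, not searching",
    1: "registered, home network",
    2: "not registered, searching",
    3: "registration denied",
    4: "unknown",
    5: "registered, roaming",
}

# Read response: +CEREG: <n>,<stat>[,...]
CEREG_RESPONSE = re.compile(r"\+CEREG:\s*(\d+),(\d+)")


def registration_states(log_text):
    """Return the decoded <stat> of every +CEREG read response in log_text."""
    states = []
    for match in CEREG_RESPONSE.finditer(log_text):
        stat = int(match.group(2))
        states.append(CEREG_STAT.get(stat, f"other ({stat})"))
    return states


if __name__ == "__main__":
    line = "0000163601 [ncp.at] TRACE: < +CEREG: 2,3"
    print(registration_states(line))
```

A denied registration does not by itself say why the network refused the device, so this only narrows down where the connection attempt stops.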

At this point, I’m going to wait for the replacement device and assume this one has a true hardware failure unless you think there are other things worth trying.

In regard to the setup log, I did capture this and can see that everything succeeds until it starts waiting for the device to connect to the internet. As expected, it just cycles in this wait state because the device is not able to make a cellular connection. It finally times out.

Hmm,

I’d like to get @BDub’s input on this one to see what he has to say.

If nothing is immediately obvious to him, we can have you file a support ticket and swap this unit out, so that we can do a bit more hands-on testing ourselves.

This definitely looks like some kind of hardware failure, and it slightly resembles an unactivated SIM. Can you put the device in listening mode and run particle serial inspect? Let’s just make sure you have bootloader 301 and system 1101.

Here is the inspect output. I noticed when running this another time a few days ago that the bootloader was 201. I don’t recall what system reported. Now it looks quite different.

Platform: 13 
Modules
  Bootloader module #0 - version 300, main location, 49152 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  Monolithic module #0 - version 402, main location, 802816 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS

Thanks @trpropst, it looks like your Boron has v0.9.0-rc.3 monolithic firmware on it. Could you try upgrading to v1.1.0-rc.2 locally with the CLI and see if you get the versions I mentioned? You should be able to just flash this modular file, v1.1.0-rc.2/boron-system-part1@1.1.0-rc.2.bin, via the CLI command

particle flash --usb boron-system-part1@1.1.0-rc.2.bin

along with Tinker, boron-tinker@1.1.0-rc.2.bin, via

particle flash --usb boron-tinker@1.1.0-rc.2.bin

Unfortunately this will not let you see the logs, since it’s not the monolithic version, but you should be able to run particle serial inspect and try to let the device connect to the cloud.

@BDub, I can do this today. I have not used the “som” builds as I thought they were for the B Series SoM parts specifically and I am using the Boron prototyping device. Do I have a fundamental misunderstanding about the firmware builds? Once you confirm the build to use I will do the update.

I have used the 1.1.0-rc.2 builds during this troubleshooting exercise, I just neglected to update prior to posting my inspect results. Here is the inspect output with the Boron 1.1.0-rc.2 firmware (not SoM).

Platform: 13 
Modules
  Bootloader module #0 - version 301, main location, 49152 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
  System module #1 - version 1101, main location, 671744 bytes max size
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      Bootloader module #0 - version 201
  User module #1 - version 6, main location, 131072 bytes max size
    UUID: <long string redacted>
    Integrity: PASS
    Address Range: PASS
    Platform: PASS
    Dependencies: PASS
      System module #1 - version 1101

Oops, yeah I added the -som by mistake :slight_smile: edited above

Ok that last particle serial inspect looks good. Does that still give you logs like before where the modem won’t turn off and claims it’s already on? If so I’d say we should proceed with a replacement device for you. cc: @mstanley
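For anyone repeating the version check discussed above (bootloader 301, system 1101), here is a minimal Python sketch that reads pasted particle serial inspect output and compares the installed module versions. The parsing is based only on the two outputs shown in this thread. It assumes installed modules are the lines with a comma after the version number, while dependency lines have none. The function names are hypothetical.

```python
import re

# Installed module lines look like:
#   "Bootloader module #0 - version 301, main location, 49152 bytes max size"
# Dependency lines ("Bootloader module #0 - version 201") have no trailing comma
# and are deliberately not matched.
MODULE_LINE = re.compile(r"^\s*(\w+) module #(\d+) - version (\d+),")


def installed_versions(inspect_output):
    """Map (module type, module index) to the installed version number."""
    versions = {}
    for line in inspect_output.splitlines():
        match = MODULE_LINE.match(line)
        if match:
            key = (match.group(1), int(match.group(2)))
            versions[key] = int(match.group(3))
    return versions


def matches_expected(versions, bootloader=301, system=1101):
    """True if bootloader #0 and system module #1 have the expected versions."""
    return (
        versions.get(("Bootloader", 0)) == bootloader
        and versions.get(("System", 1)) == system
    )


if __name__ == "__main__":
    pasted = """\
  Bootloader module #0 - version 301, main location, 49152 bytes max size
  System module #1 - version 1101, main location, 671744 bytes max size
      Bootloader module #0 - version 201
"""
    print(matches_expected(installed_versions(pasted)))
```

With the earlier output in this thread (bootloader 300 and a monolithic module, version 402) the same check returns False, because no system module is reported at all.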

If I flash the pre-built cloud debug tool at this point, the device goes into an error state, flashing red. I’m unable to get it in listen mode in this condition so I can no longer run the serial inspect. In order to recover, I have to flash the system part again for 1.1.0-rc.2. I assume the pre-built tool overwrites the system code with something that is incompatible with the bootloader. The pre-built tool is significantly larger than when I build it.

After recovering and flashing a version of the cloud debug tool built for 1.1.0-rc.2, I see the same log output.

deviceID=<redacted>
manufacturer=u-blox
model=SARA-R410M-02B
firmware version=L0.0.00.00.05.06 [Feb 03 2018 13:00:41]
ordering code=SARA-R410M-02B
IMEI=<redacted>
IMSI=u-blox
ICCID=<redacted>
0000021722 [app] INFO: enabling trace logging
attempting to connect to the cellular network...
0000138924 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000138924 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000150579 [hal] ERROR: Failed to power off modem
0000150579 [hal] ERROR: Failed to power off modem
0000171682 [hal] ERROR: No response from NCP
0000171682 [hal] ERROR: No response from NCP
0000305800 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000305800 [gsm0710muxer] ERROR: The other end has not replied to keep alives (TESTs) 5 times, considering muxed connection dead
0000317405 [hal] ERROR: Failed to power off modem
0000317405 [hal] ERROR: Failed to power off modem

With this configuration, I have the following additional issues:

  • The device can be put in listening mode but serial inspect usually times out.
  • The device will not go into safe mode but will go into DFU mode. When I attempt to go into safe mode the device just resets and begins running user code.

To be honest I'm not sure why, but maybe that is compiled as monolithic and not compatible with the newer bootloader (which would be unusual). Maybe @rickkas7 can straighten us out.

When you flash modular system 1.1.0-rc.2 and tinker, I know you can't see the logs but will the device connect at all?

This is likely due to debug logging interfering with the serial commands. This bug will be fixed in a future Device OS update.

Hmm, that is strange. When serial inspect doesn't time out, what does it look like in this mode? You might have to disable logging and recompile with that firmware to get serial inspect to work.

No. This device has not connected under any circumstances since this began.

Are you asking if it will go into safe mode? I have not seen anything that allows it to go into safe mode.

I’d say we’ve done our due diligence here. Thank you for attempting some soft fixes. I’ll let you and @mstanley take it from here on that replacement device.

Thanks for checking into this @BDub :slight_smile:

And thank you Tom for taking the time to go through these steps.

If you’ve not already, please go ahead and file a ticket for a replacement, referencing this topic. We’ll be sending you a return label as well, as we’d like to get this unit into the hands of engineering to dig into this further and see what went wrong here.

Thanks everyone!

EDIT: As an added note, I’ll be updating the topic title to name the Boron specifically, as I cannot say for certain that this issue exists in our other R410M-02B devices. This will be helpful for others searching on this issue.

@mstanley, I recently became aware of an issue with the R410M-02B that can render the device unusable. There is a FW update available from u-blox. Has Particle looked at the relevance of this issue to the Boron devices?