Locked up Boron, ignores ~EN pin: how can this happen?

I’m having intermittent issues with a Boron that’s being controlled through the ~EN pin.

It works fine for hours, then at some point will simply stop responding/won’t boot up, even if I toggle ~EN manually.

At that point, it is completely unresponsive, no LED, no reaction to manually hitting the reset button… nothing but physically removing the battery and re-inserting will bring it back to life.

I’ve verified that, while it is in this state:

  • ~RESET is at 3v3
  • ~EN is (left floating) high and
  • the Boron’s 3v3 output is actually at 3v3

Which would seem to indicate that the XC8107 is in fact turned on and feeding VSYS to the on-board switching regulator, and the MCU should be running.

Manually shorting ~EN to ground and releasing has no effect.

I am stumped as to what could cause this behaviour and how bringing ~EN to ground can be different from removing the battery, from the system’s perspective.

Any ideas/input appreciated.
Thanks,
PatD

Is there any chance that external wiring could be providing a back feed to the power rail (even capacitance) while you pull the EN pin to ground?

Well, the Boron’s 3v3 out pin isn’t actually connected to anything and the only thing still held high (other than the actual power inputs) are 2 DIO pins–and they’re both configured as inputs prior to ~EN going low.

This issue is intermittent and happens on a couple of my test devices, whereas others never seem to experience it (I have a Boron that’s been running for 10 days without issue, and another that can’t seem to get through the night).

The external ~EN manager now has a watchdog that toggles the pin low for 5 seconds and back on every minute when it gets frozen like this… still doesn’t break me out of these occasional lock-ups.

I’ve caught a module in this odd state a few more times and did some more tests.

The situation is still the same: power applied, ~RESET (floating) high and ~EN floating at something like 3.8V, Boron’s 3v3 out is on but no activity from the module.

So I tried manually shorting out each I/O in turn. No effect. Then I took a little risk and quickly shorted out the 3v3 output. That actually worked.

Momentarily shorting the Boron’s 3v3 output seems to have finally forced the nRF52840 to actually reset. Not certain what this means in terms of figuring out how this is happening…

Does it rule out the XC8107 somehow being latched up? Does it rule out everything that’s on the VSYS rail (like the SARA)?

Kind of at a loss in terms of isolating this, and it’s killing the possibility of putting these in production. Any ideas welcome.

Ping @rickkas7

Additional info about the setup…

I’ve got an ATTiny841 running on an external 3v3 supply, acting as the power/sleep manager (bringing ~EN down) and an I2C slave to the Boron.

At the end of a wake/sense/publish cycle, the Boron requests a shutdown for X seconds from the power manager slave.

Should this call somehow fail, the main loop would just repeat – so if the Boron was staying on, I’d get a bunch of publishes and see the LED do its thing.

The '841 sleeps the Boron by configuring its connection to the ~EN pin as an output and taking it low. It wakes the Boron after time X simply by releasing the pin (it goes into high-Z as an input). When ~EN is released, the line sits at some high voltage (~3.8V). When everything is frozen, manually shorting ~EN to ground has no effect.

This '841 power manager also has two watchdog functions.

Post power-up, it expects an interaction from the Boron within 60 seconds. If this fails to happen, power is cycled through the ~EN pin (5 seconds pulled low) and we try again.

Assuming the Boron did power up and interact with the slave within the allotted time, the power manager has a second timer and expects to be asked to shutdown within 10 minutes. Should this fail to happen, this second watchdog is triggered and power is cycled.

The power manager keeps track of how often the watchdog is tripped and reports this to the Boron on request.
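The power manager behaviour described above can be sketched as a small state machine. This is only an illustration of the logic as described in this post, not the actual '841 firmware: every type and function name is made up for the example, the code is written as host-testable C that gets one tick per second, and the real pin handling (output-low versus high-Z input) is reduced to a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Timings are the ones quoted in the post; all identifiers are hypothetical. */
enum { BOOT_TIMEOUT_S = 60, RUN_TIMEOUT_S = 600, CYCLE_LOW_S = 5 };

typedef enum {
    PM_WAIT_BOOT,     /* ~EN released, waiting for first contact from the Boron */
    PM_WAIT_SHUTDOWN, /* Boron checked in, waiting for its shutdown request */
    PM_SLEEPING,      /* ~EN held low for the requested sleep time */
    PM_CYCLING        /* a watchdog tripped, ~EN held low for CYCLE_LOW_S */
} pm_state_t;

typedef struct {
    pm_state_t state;
    uint32_t   remaining_s;   /* seconds left in the current state */
    uint16_t   trip_count;    /* watchdog trips, reported to the Boron on request */
    bool       en_driven_low; /* true: pin is output-low, false: pin is high-Z */
} pm_t;

static void pm_enter(pm_t *pm, pm_state_t s, uint32_t duration_s, bool en_low)
{
    pm->state = s;
    pm->remaining_s = duration_s;
    pm->en_driven_low = en_low;
}

void pm_init(pm_t *pm)
{
    assert(pm != NULL);
    pm->trip_count = 0;
    pm_enter(pm, PM_WAIT_BOOT, BOOT_TIMEOUT_S, false);
}

/* Any I2C interaction from the Boron after power-up. */
void pm_on_contact(pm_t *pm)
{
    if (pm->state == PM_WAIT_BOOT)
        pm_enter(pm, PM_WAIT_SHUTDOWN, RUN_TIMEOUT_S, false);
}

/* The Boron asks to be shut down for sleep_s seconds. */
void pm_on_shutdown_request(pm_t *pm, uint32_t sleep_s)
{
    if (pm->state == PM_WAIT_BOOT || pm->state == PM_WAIT_SHUTDOWN)
        pm_enter(pm, PM_SLEEPING, sleep_s, true);
}

/* Call once per second. */
void pm_tick(pm_t *pm)
{
    if (pm->remaining_s > 0)
        pm->remaining_s--;
    if (pm->remaining_s > 0)
        return;

    switch (pm->state) {
    case PM_WAIT_BOOT:      /* first watchdog: no contact within 60 s */
    case PM_WAIT_SHUTDOWN:  /* second watchdog: no shutdown within 10 min */
        pm->trip_count++;
        pm_enter(pm, PM_CYCLING, CYCLE_LOW_S, true);
        break;
    case PM_SLEEPING:
    case PM_CYCLING:        /* release ~EN and expect the Boron to boot */
        pm_enter(pm, PM_WAIT_BOOT, BOOT_TIMEOUT_S, false);
        break;
    }
}
```

With this logic, a Boron that never boots produces one trip every 65 seconds (60 seconds of waiting plus 5 seconds low), which matches the observation that the trip count grows in proportion to the time spent locked up.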

When it’s locked up, and I short the Boron 3v3 out to finally get it going again, the normal loop goes through and I get a report through AWS/MQTT.

I can then see the watchdog trip count, and it is indeed proportional to the amount of time the Boron has been locked up.

Put together, these would indicate that:

  • the Boron isn’t simply running the whole time: it goes down, or at a minimum halts normal code processing, as the LED and connect/publish activity stops
  • the power manager is running correctly and continuously, keeping track of watchdog triggers, and actually bringing the ~EN pin low for seconds at a time without managing to reboot the nRF52840

Hopefully, this info can rule out the obvious solutions and help isolate what’s actually going on, or at least help you hint me on where to investigate next.

Gracias,
Pat D

I did not see whether the Borons are LTE or 2G/3G, or how they are powered. Could it be that the enable pin is working, but start-up sometimes fails due to peaks in power demand when the Boron and external devices power up?

Any news on this? I'm pretty sure I'm having the same problem...

I’ve got this same issue. Most of the time the EN pin can reset the Boron and the Boron will come back to life, but other times it does not work and I end up in a scenario where EN is pulled high, there is 3v3 power, but the device’s LED and nRF do not come on.

@psychogenic – we’ve got a very similar setup to yours.

We are running v1.5.0 device-os.

Is this related to I2C power backfeeding to the Boron? We use I2C in this application with traditional 4k7 pullup resistors. I’m wondering if it makes sense to use our watchdog uC to pull the I2C lines low for the period before and during the time it pulls the EN pin low.

Any other ideas on what might be the cause of this? Could it be related to the newish power_manager features?

Just wanted to follow up on my last post.

We found the root cause on our PCB: a digital input circuit with a strong pullup was backfeeding power to the Boron during EN power cycles, which caused it to get stuck in this state. Our fix for this was to disable the pullup before power cycling using EN.
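For anyone wanting to see the ordering spelled out, here is a rough sketch of that fix in C. It is not our actual firmware: the board is simulated by a struct that records each action so the sequence can be checked on a PC, all names are invented for the example, and re-enabling the pullup right after EN is released is an assumption (it may be safer to wait until the Boron has booted). The low time is a parameter because the right value depends on the board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action log so the ordering can be verified off-target. */
typedef enum {
    ACT_PULLUP_OFF, ACT_EN_LOW, ACT_WAIT, ACT_EN_RELEASE, ACT_PULLUP_ON
} action_t;

typedef struct {
    bool     pullup_enabled; /* strong pullup on the digital input circuit */
    bool     en_driven_low;  /* watchdog uC is holding EN low */
    unsigned waited_ms;      /* total simulated delay */
    action_t log[8];
    size_t   log_len;
} board_t;

static void record(board_t *b, action_t a)
{
    assert(b->log_len < sizeof b->log / sizeof b->log[0]);
    b->log[b->log_len++] = a;
}

void board_init(board_t *b)
{
    b->pullup_enabled = true;
    b->en_driven_low = false;
    b->waited_ms = 0;
    b->log_len = 0;
}

/* On real hardware these four would touch GPIO registers and a timer. */
void set_pullup(board_t *b, bool on)
{
    b->pullup_enabled = on;
    record(b, on ? ACT_PULLUP_ON : ACT_PULLUP_OFF);
}

void drive_en_low(board_t *b) { b->en_driven_low = true;  record(b, ACT_EN_LOW); }
void release_en(board_t *b)   { b->en_driven_low = false; record(b, ACT_EN_RELEASE); }
void wait_ms(board_t *b, unsigned ms) { b->waited_ms += ms; record(b, ACT_WAIT); }

/* Remove the backfeed path first, then cycle EN, then restore the pullup. */
void power_cycle_via_en(board_t *b, unsigned low_ms)
{
    set_pullup(b, false);
    drive_en_low(b);
    wait_ms(b, low_ms);
    release_en(b);
    set_pullup(b, true);
}
```

The point is simply that nothing external should be able to source current into the Boron while EN is low, so the pullup goes off before EN is touched and comes back only after EN is released.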

In my opinion, Particle should make it clear in the Boron datasheet that the EN pin has this problem and provide some simple suggestions on fixing it. Would have been much nicer to know about this before designing a PCB for the Boron.

Would you be able to provide more details around this? Are your pull-up resistors (your 4.7K I2C pull-ups) enabled via a GPIO pin that you simply turn off before any type of EN reset occurs? After reading a different forum thread (Deep Reset Tutorial), I am tempted to add the hardware for this in my current PCB design.

For me this would be a proactive move, as I do have remote Borons but personally haven't had issues with them disconnecting. My Borons do a sleep/wake/sleep cycle and routinely connect/disconnect themselves from the cloud due to being battery powered, but I was tempted to add this type of hardware. I'd like to make sure I'm preventing a possible future problem rather than injecting a new problem like the one you experienced. In my design, all sensors are turned on for 50-250 ms or so to take sensor readings and then turned off before falling back asleep. Given the info here, I assume I shouldn't have any risk of back feeding power to the Boron, but I'm not sure I understand it fully yet. Any guidance or further details would be appreciated!

Our backfeed problem was not from I2C pullups. It was from a digital input that allowed for significantly more current.

When we redo this circuit, I will change it to a circuit whereby we can cut power to both the VIN and LI pins of the boron. We won’t use the EN pin as a watchdog because I’m not sure you can get to 100% reliability on it.

@hwestbrook could you or someone share some more thoughts on how this could be accomplished with an external watchdog and cutting the power to the VIN and Li pins? Would this require a separate power manager for the watchdog? Any guidance would be appreciated.

For one of our devices, we do the following:

  • ATMEGA coprocessor acting as a watchdog controls two P-channel FETs
  • One FET for LI and one FET for 5V VIN (see schematic excerpt below)
    • To do the LI pin and still use a JST connector, you have to implement a JST connector on your board
  • Electron can reset the timer on the ATMEGA watchdog. In our case, they communicate via SPI
  • Our ATMEGA does a few other things as well, such as count pulses, which the Particle platform is not ideal for
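The watchdog side of the list above boils down to something like the following sketch. It is not our production code: the names, the one-tick-per-second timing, and the timeout and off-time values are all placeholders, and how the two "rail on" flags map to gate voltages depends on how the P-channel FETs are driven on your board, so that part is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watchdog that cuts both supply rails when it is not kicked. */
typedef struct {
    uint32_t timeout_s;    /* max silence before power is cut */
    uint32_t off_s;        /* how long both rails stay off */
    uint32_t since_kick_s; /* seconds since the last kick */
    uint32_t off_left_s;   /* > 0 while the rails are held off */
    bool     li_on;        /* state of the FET feeding LI */
    bool     vin_on;       /* state of the FET feeding VIN */
} fet_wdt_t;

void fet_wdt_init(fet_wdt_t *w, uint32_t timeout_s, uint32_t off_s)
{
    assert(w != NULL && timeout_s > 0 && off_s > 0);
    w->timeout_s = timeout_s;
    w->off_s = off_s;
    w->since_kick_s = 0;
    w->off_left_s = 0;
    w->li_on = true;
    w->vin_on = true;
}

/* Called when the Particle device resets the timer (over SPI in our case). */
void fet_wdt_kick(fet_wdt_t *w)
{
    w->since_kick_s = 0;
}

/* Call once per second. */
void fet_wdt_tick(fet_wdt_t *w)
{
    if (w->off_left_s > 0) {
        /* Rails are off: count down, then restore both together. */
        if (--w->off_left_s == 0) {
            w->li_on = true;
            w->vin_on = true;
            w->since_kick_s = 0;
        }
        return;
    }
    if (++w->since_kick_s >= w->timeout_s) {
        /* No kick in time: cut LI and VIN at once so nothing stays powered. */
        w->li_on = false;
        w->vin_on = false;
        w->off_left_s = w->off_s;
    }
}
```

Both rails are switched together on purpose: if only one is cut, the other source keeps the device powered and the reset does not happen.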

We have two other devices we sell that use Borons in different configurations. On one of those devices we implement an EN watchdog, which works, but it’s not 100% effective.

Good stuff… Thank you very much for sharing! I assume there is a way to do this with a simple external real-time clock, but you used an ATMEGA coprocessor for this functionality as well as for the pulse counting?

Yes, we needed the coprocessor anyway, so we chose an ATMEGA with enough IO to have it act as a watchdog as well.

There are a couple of threads going that mention different ways to reset a Boron via an external watchdog: one via the RST pin, another via EN, and the last via cutting power to the VUSB. Each seems to have its own set of issues. Which is the recommended “best practice?”

It’s a good question. I wish Particle would weigh in on the issue with an official support document detailing the different approaches and their shortcomings, then allow community comment.

There will be an application note on watchdog timers and resetting devices in a month or so.

@rickkas7 Thank you. I am moving away from the Boron because of these issues. I am currently evaluating the E-Series LTE evaluation kit as an alternative. It seems way more stable than Boron. It’s still difficult to justify staying with Particle given the 4-times-cheaper cellular $/mb that the competitor offers, but I am very happy to have E-Series LTE and potentially a future way of resetting the Boron as alternatives.