Boron 404X i2c issue with DS3231

I have a simple program I’m running on a Boron 404X (OS 4.0.0) to try and hunt down an error in my data logger code. I was bringing the full data logger code over from an Arduino so I could leverage the Boron’s cellular capabilities.

The code I am running sets up the DS3231 to trigger Alarm 1 at 1Hz. All the ISR does is set a trigger flag. In the loop, if the trigger flag is set, the program resets the trigger flag, turns off the alarm, prints the current time out via Log.info, and flips the LED state (so the LED blinks at the alarm rate). I have put the code below (I hope it is formatted correctly).

(I don’t actually need the alarm at 1Hz, but it speeds up catching the error)

#include <Particle.h>
#include "DS3231.h"
#include <Wire.h>

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);

// Define Hardware Pins
#define ClockINT D4
#define LEDPin D7

// Set Up the Objects
SerialLogHandler logHandler(9600); // Serial log output
DS3231 cdRTC; // Clock Object for the chrono dot RTC

uint32_t sampleCounter = 0; // Used to track how many samples have been counted
bool blink = 0;

// DateTime variables
DateTime currentTime;

// Alarm 1 Variables: Used for triggering the start of a sample.
byte alarmBits1Hz = 0b00001111; // Alarm will trigger every second.
bool alarmDayIsDay = false; // alarmDay is the day of the month NOT day of the week
bool alarmH12 = false; // Clock hours in 00-23.  (true) if hour is 1 - 12, (false) if hour is 00 - 23
bool alarmPM = false; // Not used in 24 hour mode.  (true) if 12-hour time is PM, (false) if AM

// State Variables 
volatile byte tick = 0; // Interrupt signaling byte.  When an alarm is triggered, tick is switched to 1.

void setup() {
    
    // Wait for a USB serial connection for up to 30 seconds
    waitFor(Serial.isConnected, 30000);
    
    // Start the i2c
    Wire.begin(); // Begin I2C communication

    // Set up RTC and turn off Alarm 2
    cdRTC.setClockMode(false); // Set the clock mode to 24-hour.
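    // Set minutes to 255 so there can never be a match.
    // NOTE: alarmBitsBurstStart is not declared in this listing (presumably left over from the full data logger code).
    cdRTC.setA2Time( 255, 255, 255, alarmBitsBurstStart, alarmDayIsDay, alarmH12, alarmPM);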
    cdRTC.turnOffAlarm(2); // disable Alarm 2 interrupt
    cdRTC.checkIfAlarm(2); // clear Alarm 2 flag
    
    // Connect to the particle cloud
    Particle.connect();
    
    // Set the alarm to trigger at 1Hz
    cdRTC.setA1Time(0, 0, 0, 0, alarmBits1Hz, alarmDayIsDay, alarmH12, alarmPM);
    // The alarm time is set to all zeros because the alarmBits should trigger the alarm
    // at 1Hz and essentially ignore the alarm time.
    cdRTC.checkIfAlarm(1); // Clear the Alarm 1 flag:  This brings the interrupt pin back high        
    cdRTC.turnOnAlarm(1); // Enable Alarm 1 interrupt
    
    // Set up the interrupts
    // RTC interrupt
    pinMode(ClockINT, INPUT_PULLUP); // Interrupt pin brought high so it can be brought low by the RTC alarm.
    attachInterrupt(digitalPinToInterrupt(ClockINT), isr_TickTock, FALLING);
    
    // Set up the LED and turn it off
    pinMode(LEDPin, OUTPUT);
    digitalWrite(LEDPin, 0);

}

void loop() {
    
    if (tick){
        
        // Set tick back to zero
        tick = 0;
        
        // Clear the alarm
        cdRTC.checkIfAlarm(1);
        
        // Get the time
        currentTime = RTClib::now();
        
        // Increment the sample counter
        sampleCounter++;
                  
        // Print the time
        Log.info("%06d %04d/%02d/%02d %02d:%02d:%02d", sampleCounter, currentTime.year(), currentTime.month(), currentTime.day(), currentTime.hour(), currentTime.minute(), currentTime.second());
        
        // blink the led
        blink = !blink; // Cycle the LED state
        digitalWrite(LEDPin, blink);
        
    }
    
}

void isr_TickTock() {
  tick = 1;
  return;
}

After something like 12 hours the Boron will stop. It’s like it either doesn’t catch the trigger or fails to flip the alarm flag to turn the alarm off. When it has stopped I can usually just trigger the alarm pin manually to get everything going again.

I have tried a couple different DS3231 breakout boards and a couple different libraries for the DS3231.

I don’t have a ton of experience in embedded systems and I’m at a loss for how to debug this.

@Levi_Gorrell ,

Welcome to the community!

First, let me say that simplifying the code and adding the comments makes it much easier for folks to help you - great job!

I have run into issues like this and they can be frustrating. I don’t see anything wrong with your code (there are folks who are better at this than I am so please don’t take too much comfort in this) but there are a couple things you might try to debug.

Since it only fails after 12 hours or 43,200 tries, this is likely an edge case.

  1. Sometimes FALLING can fail if your pin does not fall fast enough. Likely a long shot but, you may want to see if your breakout board might have debouncing on this pin.

  2. You could add a test like this in your main loop:

if (tick == 0 and digitalRead(ClockINT) == LOW) Log.info("Something is wrong"); // alarm line is held low but no tick was flagged

and then see if you can tie it to something. Particularly if you also add the next item.

  3. The Boron only has one i2c bus (unlike the good old days with the Electron) so there could be something else tying up the bus and causing you to miss your 1 second window. Again this is a long shot but what if you added a little to your code (inside the tick conditional):

// Once you enter the conditional
unsigned long timestamp = millis();
// Then, add the print statement at the end
Log.info("Completed in %lu ms with tick = %d and alarm = %d", (unsigned long)(millis() - timestamp), tick, (int)digitalRead(ClockINT));

Perhaps this would give you a little more insight into what is going on.

Good luck!

Chip

Thanks for the reply and the suggestions!

At least most of the time the falling edge of the interrupt is nice and clean. My oscilloscope is showing a transition time on the falling edge of ~50 ns. I have used two different DS3231 breakout boards (Right now I am using the Adafruit DS3231 Precision RTC Breakout Board) and neither show debouncing on the INT pin. The breakout boards are pretty basic: a couple 10K pull up resistors for SCL and SDA, and a decoupling capacitor.

I have also probed the SCL and SDA lines. Both signals I think look good. Nice and square at least.

I have implemented your code suggestions and I'll run until it fails.

My setup quit working over the weekend, but with the advice from Chip I have found that the error is caused by the DS3231 alarm flag not being reset. Why still remains a mystery. I’m not sure if it’s a momentary failure in the i2c or something blocking in the DS3231 library (though I have looked over that library and I don’t see anything crazy). I have pulled the relevant section of code that checks the DS3231 alarm flags out of the library and into my loop. I am monitoring that output to see what might be going wrong. I’m also monitoring the i2c clock and data line with my oscilloscope to see if I can capture anything there.

I'll keep posting my progress here just in case this proves helpful to some other poor, confused, and frustrated soul.

It's definitely a failure in resetting the DS3231 alarm. After catching several failures I implemented a simple do/while loop. The loop would first try to reset the alarm. Then a digital read of the alarm pin would check to see if the pin was actually back high. If not, I would just keep sending the reset byte (in actuality just one bit of a status byte) to the RTC and check to see if the reset took. I had a counter count how many times it would take to actually reset the clock's alarm bit. The vast majority of the time the RTC alarm trigger is reset on the first try. Every so often (about 12 hours or so) the reset byte needs to be sent again (a total of two attempts). Once in the past week the reset byte needed to be sent three times.
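For anyone who wants to try the same workaround, here is a minimal sketch of that retry loop. It is not the code from the post: `clearAlarmWithRetry`, `FakeRtc`, and `demoAttempts` are hypothetical names, and the two callbacks stand in for whatever I2C write and `digitalRead` call you actually use. The fake RTC only exists so the logic can be exercised off the device.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical helper (not part of any DS3231 library): keep clearing the
// alarm flag until the interrupt line reads high again, or give up.
//   clearFlag  - performs one I2C write that clears the alarm flag
//   lineIsHigh - returns true once the alarm/INT pin is back high
// Returns the number of attempts used, or 0 if maxAttempts was exhausted.
inline uint8_t clearAlarmWithRetry(const std::function<void()>& clearFlag,
                                   const std::function<bool()>& lineIsHigh,
                                   uint8_t maxAttempts) {
    uint8_t attempts = 0;
    do {
        clearFlag();
        ++attempts;
        if (lineIsHigh()) {
            return attempts;
        }
    } while (attempts < maxAttempts);
    return 0;
}

// Stand-in for the real bus: the first `failures` clear attempts are lost,
// which mimics the occasional failed reset described in the post.
struct FakeRtc {
    int failures;
    bool pinHigh;
    explicit FakeRtc(int f) : failures(f), pinHigh(false) {}
    void clear() {
        if (failures > 0) {
            --failures;      // this write "did not take"
        } else {
            pinHigh = true;  // flag cleared, INT line released
        }
    }
};

// Runs the retry loop against the fake RTC and reports the attempt count.
inline uint8_t demoAttempts(int failures, uint8_t maxAttempts) {
    FakeRtc rtc(failures);
    return clearAlarmWithRetry([&rtc] { rtc.clear(); },
                               [&rtc] { return rtc.pinHigh; },
                               maxAttempts);
}
```

On the Boron the two callbacks would wrap the library's flag-clearing call and a read of the interrupt pin. Logging the returned count gives the same "how many tries did it take" statistic, and the attempt cap keeps a dead bus from hanging the loop forever.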

As I've been digging into this more I've also noticed that the time from the RTC gets messed up every now and then (again something like once or twice every 12 hours). I'll get a time stamp of 2165/156/165 165:165:165 (so hex 0xff 0xff 0xff 0xff 0xff 0xff). Digging some more I found that the alarm fails to reset when I get 0xff off a read of the alarm status register. I think there is a temporary failure of the i2c every now and then. I believe it's with the Boron. I have successfully used this exact same RTC breakout board with an Arduino Mega.
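The 165s in that timestamp are what you get when an all-ones byte is run through BCD decoding. When nothing drives SDA during a read, the pull-ups hold the line high and every byte comes back as 0xFF. A small sketch of the arithmetic (the helper names are made up here, and the `2000 +` year offset is an assumption about how the library builds the year):

```cpp
#include <cstdint>

// Standard packed-BCD decode: high nibble is tens, low nibble is ones.
constexpr uint8_t bcdToDec(uint8_t bcd) {
    return static_cast<uint8_t>((bcd >> 4) * 10 + (bcd & 0x0F));
}

// A genuine BCD byte never has a nibble above 9, so 0xFF cannot be real data.
constexpr bool isValidBcd(uint8_t bcd) {
    return (bcd >> 4) <= 9 && (bcd & 0x0F) <= 9;
}

static_assert(bcdToDec(0x59) == 59, "a normal seconds or minutes value");
static_assert(bcdToDec(0xFF) == 165, "an all-ones read decodes to 165");
static_assert(2000 + bcdToDec(0xFF) == 2165, "matches the bogus year in the log");
static_assert(!isValidBcd(0xFF), "0xFF is rejected as BCD");
```

A check like `isValidBcd` on the raw register bytes is one cheap way to spot a bad read and retry before the bogus time ends up in the log.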

I'm trying to capture a failure of the i2c on my oscilloscope. I'll see what that shows.

I've finally caught the problem with the i2c on my oscilloscope. What I ended up doing was looking for a bad read (read 0xff) of the alarm status (0x0f) register on the DS3231 right after the alarm triggers. When I got the bad read, I shut down the ISR freezing my oscilloscope on that last trigger.

I saved the bad waveforms, captured an example good waveform, and plotted them up.

The first plot shows the signals from the SDA and SQW (Alarm signal) lines from the DS3231. I triggered off of the falling edge of the SQW line (Orange line in plots). The blue line is the SDA line.

Looking at the plot you see that I'm getting some pretty intense noise on both channels. The noise lasted for about 4 ms (I'm only showing part of the captured wave form here. I'm able to just see the full noisy section in the full waveform).

To figure out what I should be seeing I captured a good SDA waveform and matched it up in time using the falling edge of the alarm. Here orange is the good waveform and blue the noisy waveform. I have annotated the plot to help me see what's happening. Things don't line up after the first data byte (Send 0x0f), but I think the DS3231 is throwing a No Ack after the address. In a nutshell, the noise is killing the communication.

(Apologies if I got something wrong in the i2c analysis. I just learned all (err...really some) about the i2c protocol so I could decode this)

So......what's causing the noise? I'm powering the Boron with a 3.7 V, 1200mAH battery and 5V from a USB cable. The USB cable has, at various times during this adventure, been plugged into my desktop and two different laptops. Could I be getting noise from the USB? Hmmmm...Something new to monitor.

Thoughts or suggestions?

The voltage on your bad waveform SDA lines looks to be too low, but it's not obvious to me why.

In I2C, the bus is never driven high, it's only pulled high by the pull-up resistors. On the Boron, there's an internal pull-up of 13K that's enabled on SDA and SCL in the MCU, but this is actually too large of a resistance. If you're using a DS3231 breakout board, it may have its own pull-up resistors, 10K is common. That's an equivalent resistance of 5.65K which is close to reasonable. There are actual calculations of what it should be based on the capacitance of the wire, but it's usually between 2.2K and 4.7K.

But even if the pull-up resistance is too high, it mostly affects the shape of the waveform, not the voltage. I'd look to see if there's something that's pulling down on SDA under some condition, because the voltage shouldn't be 2-ish volts ever.
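The 5.65K figure is just the two pull-ups in parallel. A quick check, taking the 13K and 10K values from this post as given (the names are illustrative):

```cpp
// Equivalent resistance of two resistors in parallel, in ohms.
constexpr double parallel(double r1, double r2) {
    return (r1 * r2) / (r1 + r2);
}

constexpr double kInternalPullup = 13000.0;  // Boron MCU pull-up, per this post
constexpr double kBreakoutPullup = 10000.0;  // common breakout-board value

// 13K || 10K
constexpr double kCombinedPullup = parallel(kInternalPullup, kBreakoutPullup);

static_assert(kCombinedPullup > 5600.0 && kCombinedPullup < 5700.0,
              "combined pull-up is about 5.65K");
```

Swapping in your own board's resistor values shows quickly whether you land in the 2.2K to 4.7K range mentioned above.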

Thanks for the advice! You are correct. The breakout board I'm using has 10K pull-up resistors on SDA and SCL.

My setup while I have been hunting down this issue is very simple. A Boron404X and DS3231 on a bread board with a few jumper wires for power, SDA, SCL, and SQW. I have tried a few different DS3231 breakout boards with the same results. I have not tried a second Boron, but I have one on order.

Those 13K pull-up resistors are internal pull-up resistors on the Nordic chip, correct? I wonder more and more if I have a bad chip on my Boron. The new Boron should be here soon.

For now I'm going to run on battery alone and see how far that gets me.

Since figuring out what's actually causing that will be tricky, here's my suggestion for what to try:

It's believed that the pull-up is 5.65K. The voltage looks to be around 2.1V in the bad state, and should be around 3.3V. That implies a pull-down of roughly 9.9K ohms, which is interesting because it's so close to 10K and I just eyeballed the 2.1V from looking at the graph.

The current pull-ups are 13K (MCU) and 10K (breakout) making 5.65K. Try adding an additional 4.7K pull-up to 3V3 in parallel on SDA, which will cause an equivalent pull-up resistance of 2.56K, within the acceptable range for I2C.

The reason this is useful is that even if there is some random 10K pull-down occurring under some circumstance, you will have 2.56K on the high side and 10K on the low side, which puts the high signal at about 2.6V. That is above the usual 0.7 x VDD input-high threshold (about 2.3V at 3.3V), so it should be detected as high, and it is much better than 2.1V.

Of course I could be completely wrong about the random 10K pulldown occurring, but I can't think of any other reason you'd have 2.1V on SDA.
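To make the divider reasoning easy to re-run with your own scope readings, here is a small sketch. The resistor values and the 2.1V reading come from this thread; the 10K pull-down is only a hypothesis, and the function names are made up for illustration.

```cpp
// Equivalent resistance of two resistors in parallel, in ohms.
constexpr double parallel(double r1, double r2) {
    return (r1 * r2) / (r1 + r2);
}

// Voltage on the line when rUp pulls to vdd and rDown pulls to ground.
constexpr double dividerVolts(double vdd, double rUp, double rDown) {
    return vdd * rDown / (rUp + rDown);
}

// Pull-down resistance that would drag the line down to vObserved.
constexpr double impliedPulldown(double vdd, double rUp, double vObserved) {
    return rUp * vObserved / (vdd - vObserved);
}

constexpr double kVdd = 3.3;
constexpr double kExistingPullup = parallel(13000.0, 10000.0);    // MCU || breakout
constexpr double kWithExtra4k7 = parallel(kExistingPullup, 4700.0);

// What pull-down explains a 2.1V "high" with the existing pull-ups?
constexpr double kImplied = impliedPulldown(kVdd, kExistingPullup, 2.1);

// Worst-case high level after adding the 4.7K, if a 10K pull-down appears.
constexpr double kHighWithFix = dividerVolts(kVdd, kWithExtra4k7, 10000.0);
```

With these inputs the implied pull-down comes out a little under 10K, and the extra 4.7K pull-up lifts the worst-case high level to roughly 2.6V.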

Two new developments. My new Boron 404x arrived and I tried yet another test, paring down the setup and code even more.

First off, my new Boron is showing the same I2C issue as my original one: Periodic, noisy, low voltage on SDA.

My new test got rid of the DS3231 entirely. All I'm doing is sending an address and data packet to nothing and monitoring the SDA signal on my oscilloscope. I have code that looks for lower than expected voltage on the SDA line. When low voltage is seen, the code stops the pulse that my oscilloscope is triggering off of. This has allowed me to pretty reliably catch the bad SDA events (see my plots in a previous post for details).

In my mind, I think this is all coming to show that the Boron 404x has unreliable I2C.

I'm going to try one more test where I run without the cloud connection.

Hi @Levi_Gorrell

I would try putting a second scope probe on your +3.3V supply and then try to detect the fault again. The waveforms look like you are in a brown-out situation with low supply voltage, but in the first waveform (Waveforms_Zoom) above, things magically get better at about 1.2 ms and SDA snaps back up to 3.3V. What else is happening in your system at that time? Is something pulling a lot of current out of the 3.3V regulator at that time?

Maybe you could post a picture of your setup and wiring for us to double check?

Thanks for the reply! I have a picture attached of my most recent setup. This is where I am testing the I2C with nothing attached. I'm running an overnight test and will post a picture of my setup with the DS3231 tomorrow.

The blue micro USB cable goes to a laptop that is showing the serial log. Battery is a 3.7V, 1200mAh Lipo battery from Adafruit. The yellow jumper wire goes to D4 and carries the signal that I use to trigger the oscilloscope. Blue wire up top is SDA. Blue wire at the bottom jumps SDA to A0 so I can monitor the voltage on SDA.

Below is a plot from the most recent bad SDA waveform I was able to capture from the setup shown above.

Finally, the code I am running to capture the plotted waveform is below.

// This #include statement was automatically added by the Particle IDE.
#include <Particle.h>
#include <Wire.h>

SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(SEMI_AUTOMATIC);

// Define Hardware Pins
#define LEDPin D7
#define MonitorPin A0 // monitoring the SDA pin
#define TriggerPin D4

// Set Up the Objects
SerialLogHandler logHandler(9600); // Serial log output

// variables
int val = 0;
bool blink = 0;
bool noError = 1;

void setup() {
    
    // Wait for a USB serial connection for up to 30 seconds
    waitFor(Serial.isConnected, 30000);
    
    // Start the i2c
    Wire.begin(); // Begin I2C communication
    
    // Connect to the particle cloud
    Particle.connect();
    
    // Set up the LED and turn it off
    pinMode(LEDPin, OUTPUT);
    digitalWrite(LEDPin, 0);
    
    // Set up the trigger pin and turn it on.  I'm triggering on  a falling edge, like the DS3231 Alarm
    pinMode(TriggerPin, OUTPUT);
    digitalWrite(TriggerPin, 1);

}

void loop() {
    
    // read the voltage on the SDA pin.  Under normal circumstances this should be pretty close to 3.3V
    val = analogRead(MonitorPin);
    
    if (noError){
        
        // bring trigger pin low to trigger the oscilloscope
        digitalWrite(TriggerPin, 0);
        
        // Send I2C address and data: Nothing connected
        Wire.beginTransmission(0x68); // begin I2C to the DS3231 which has the address 0x68
        Wire.write(0x0f); // Status Register that contains the Alarm flags
        Wire.endTransmission(); // end the I2C
        
        // blink the led
        blink = !blink; // Cycle the LED state
        digitalWrite(LEDPin, blink);
        digitalWrite(TriggerPin, 1);
        
        // If the voltage on the SDA pin is too low, flip noError to stop.
        Log.info("%d", val);
        if (val < 3500){ // I think about 2.8 V
            noError = 0;
        }
        
        delay(500);
    }

}
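
The 3500 threshold in the listing is in raw ADC counts. Assuming the Boron's default 12-bit `analogRead` range (0 to 4095) and a 3.3V reference, the conversion in both directions looks roughly like this (constant and function names are illustrative, not from the original code):

```cpp
constexpr double kVref = 3.3;   // assumed ADC reference voltage, in volts
constexpr int kAdcMax = 4095;   // assumed full-scale count for a 12-bit ADC

// Convert a raw analogRead() count to volts.
constexpr double countsToVolts(int counts) {
    return counts * kVref / kAdcMax;
}

// Convert a voltage to the nearest raw ADC count.
constexpr int voltsToCounts(double volts) {
    return static_cast<int>(volts * kAdcMax / kVref + 0.5);
}

static_assert(countsToVolts(3500) > 2.81 && countsToVolts(3500) < 2.83,
              "3500 counts is about 2.82V");
```

So 3500 counts is about 2.82V, in line with the comment in the code. `voltsToCounts` is handy if you would rather pick the trip point in volts and let the compiler work out the count.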

OK that's helpful. I would try removing the jumper from A0 to SDA just in case. Does your error case (val<3500) ever happen? The times at which you are doing the analogRead and the i2c transmission are not really aligned here.

I would also set the scope to trigger on SDA falling at about 1V. Then I would put the second channel on 3.3V and see what's happening to the 3.3V power rail at that time.

If you have a good bench power supply for +5V at 500mA, I would try disconnecting the USB cable, driving Vusb with +5V, and connecting a ground. The device is not meant to be driven from the 3.3V pin, but Vusb is fine. Alternatively a cell phone charger can work well here.

I might also try the test without the triple expansion board--just the Boron. This is a just in case thing but worth doing if all else doesn't work.

Also, in your first waveforms, the SDA line looks fine after 1.2ms, but in your subsequent waveforms you don't show it out that far from the trigger. Does it still recover at that time? If so, what is happening then?

Yes, the error case does happen. That is what shuts down the trigger and i2c so I can see the failure on my scope.

I'll try the power supply and plugging the boron right into my bread board. I'm not sure how to catch the bad SDA in this setup without the SDA to A0. This error is hard to catch since it happens only about once every 12 hours. I don't think my oscilloscope is fancy enough to pause acquisition under specific conditions. Hence the analog read workaround.

If I capture another bad waveform I'll make sure to capture a longer time. The bad SDA seems to persist for 1-2 ms and then recover. I haven't been able to identify any reason for the recovery either.
