I2C - two slaves - slave 1 traffic kicks slave 2 to repeat

Have found a curious issue which the community might be interested in. Hopefully am not repeating prior findings.

Photon running system firmware 0.4.4. Application communicates with two I2C slave devices:

  • “A” is OLED display being communicated with via Adafruit_SSD1306 library
  • “B” is a home grown receiver of data

Devices “A” and “B” each work A-okay within the application when only one of them is used between resets.

Problems occur when both devices are enabled for use: Sending data “BBBBB” to device “B”, “BBBBB” is correctly received by the device. Next, sending data “AAAAA” to device “A”, “AAAAA” is correctly received by the device (which in this case, is displayed).

PROBLEM is, data “BBBBB” is transferred again to device “B”.

The workaround was to first empty (?) the I2C buffer (selecting device “B”) prior to transferring data to device “A” by doing this:

    Wire.beginTransmission(DEVICE_B_I2C_BASE_ADDRESS);    
    Wire.endTransmission(true);
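
For anyone wanting to reuse the workaround, here is a minimal sketch of it wrapped in a helper. It is templated on the bus type so it builds off-device as well; `emptyWrite` and `FakeBus` are hypothetical names introduced for illustration and are not part of the Wire library. Nothing here explains why the workaround helps; it only packages the two calls shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sends an "empty" write (address only, no data bytes) to `address`.
// `Bus` is any type exposing beginTransmission()/endTransmission() in the
// style of the Wire object. Returns whatever endTransmission() reports.
template <typename Bus>
uint8_t emptyWrite(Bus& bus, uint8_t address) {
    bus.beginTransmission(address);
    return bus.endTransmission(true);  // true = finish with a STOP
}

// Hypothetical stand-in for Wire, so the sketch can be built on a desktop.
struct FakeBus {
    std::vector<uint8_t> addressed;  // addresses seen, in call order
    bool lastStop = false;           // stop flag of the most recent end call
    void beginTransmission(uint8_t address) { addressed.push_back(address); }
    uint8_t endTransmission(bool stop) { lastStop = stop; return 0; }
};
```

On the Photon this would be called as `emptyWrite(Wire, DEVICE_B_I2C_BASE_ADDRESS);` just before talking to device “A”.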

Would anyone like to comment on this unexpected behaviour?

Thanks - @UMD

Do you actually consider this [Solved] by the workaround?

If not, remove this part from the title. Maybe others (who know - unlike me :blush:) may answer here :wink:

@ScruffR, good point. Have removed [Solved] from the title. Let’s see what comes back!

One thing that would help us help you is seeing your code.
Could you provide a minimal (copy-pasteable) runnable sketch that shows the erroneous behaviour you see?

@ScruffR, just whipped up this code which reproduces the fault (it doesn’t display count for some reason…):

https://drive.google.com/file/d/0BzdAC5rcyb9kRXM1WEdQR1JSMXc/view?usp=sharing

The output from the I2C device is “ONCEONCEONCEONCE…” but it should be just “ONCE”.

Could you maybe add some serial debug statements to see whether your count does not get incremented (or overwritten) somehow or your SendData() performs multiple iterations at once?

Does your display.print(count); show the desired increment of count?

Could it be that the display library doesn’t end its transmission correctly?

Just a stupid one: The slave addresses of both devices are definitely different?

Unfortunately I haven’t got any means of testing any code directly (due to lack of hardware ;-)), so I'm just trying to make more or less educated guesses.

It sounds to me like device B needs a STOP (not all devices do, and repeated STARTs are usually fine).

If you can change the library accessing device B to do

   Wire.endTransmission(true);

instead of just endTransmission() with no arguments, that would likely fix the problem.

Can you say what device B is so we can look at the specs?

@bko, not sure if you have reviewed the supplied source code that reproduced the fault:

https://drive.google.com/file/d/0BzdAC5rcyb9kRXM1WEdQR1JSMXc/view?usp=sharing

this has device B ending as you suggested, i.e.

Wire.endTransmission(true);

Looking at the Adafruit OLED library:

void Adafruit_SSD1306::ssd1306_command(uint8_t c) {
    if (sid != -1)
    {
        // SPI
        digitalWrite(cs, HIGH);
        digitalWrite(dc, LOW);
        digitalWrite(cs, LOW);
        fastSPIwrite(c);
        digitalWrite(cs, HIGH);
    }
    else
    {
        // I2C
        uint8_t control = 0x00; // Co = 0, D/C = 0
        Wire.beginTransmission(_i2caddr);
        Wire.write(control);
        Wire.write(c);
        Wire.endTransmission();
    }
}

we see that it ends I2C blocks with Wire.endTransmission(). The Particle docs indicate that when no argument is supplied, the stop parameter defaults to true (agreed, the docs could be incorrect).

Device B is a home grown one, so there are no specs. Anyhow, the issue is not in device B because it is a slave which can only receive, i.e. it can only see repeated data streams because that is what is being sent. The source code provided shows that it is only being sent once (which @ScruffR is questioning).

@ScruffR, in answer to your questions:

The display library looks good to the eye, see above, and you can view yourself via Particle WebIDE.

My supplied source code answers your question "The slave addresses of both devices are definitely different?" (answer: yes).

The display did not show count at all for some strange reason, but what I can say is that the "ONCE" was repeated in clumps of two or three repeats, i.e.

"ONCEONCEONCE" [delay] "ONCEONCEONCE",

where the delay matches the delay between calls to the display. So count is not needed for the test in any case: the pattern supports the assertion "slave A traffic kicks off transfer to slave B" even if, in some weird way, count remained at zero.

As you can see, there is only one call to Send in between delays. If count had weirdly remained at zero, then we should have seen:

"ONCE" [delay] "ONCE" [delay] "ONCE"

So, my assertion remains.

I just checked the default for Wire.endTransmission(); and it is as you noted, true. Bummer.

I am sorry to hear there are no specs for Device B. It is not correct that a receive-only slave (only writeable) cannot screw things up since it still must ACK every byte by pulling SDA low. If a receive-only device gets a clock that is too fast for it and it gets confused, it can ACK in the middle of someone else’s transaction. It seems rare but it does happen.

Maybe you should try a really slow clock like Wire.setSpeed(20000);
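
One thing to watch when trying this: as far as the Particle docs describe it, Wire.setSpeed() needs to be called before Wire.begin() to take effect. A minimal sketch of that ordering follows; `beginAtSpeed` and `LatchingBus` are hypothetical names for illustration, and the stand-in bus simply models the assumption that begin() picks up whatever speed was set beforehand.

```cpp
#include <cassert>
#include <cstdint>

// Sets the clock first, then starts the bus. `Bus` is any type with
// setSpeed() and begin() in the style of the Wire object.
template <typename Bus>
void beginAtSpeed(Bus& bus, uint32_t hz) {
    bus.setSpeed(hz);  // assumed to matter only if called before begin()
    bus.begin();
}

// Hypothetical stand-in for Wire: begin() latches the speed set so far.
struct LatchingBus {
    uint32_t requestedHz = 100000;  // 100 kHz default, per the Particle docs
    uint32_t activeHz = 0;          // 0 means the bus has not been started
    void setSpeed(uint32_t hz) { requestedHz = hz; }
    void begin() { activeHz = requestedHz; }
};
```

On the device, `beginAtSpeed(Wire, 20000);` in setup() would replace a bare `Wire.begin();`.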

If you have an oscilloscope or bus pirate or other monitor, now is the time to break it out and use it.

@bko, I get what you are saying, but the evidence is not there.

If I define WORK_AROUND in the following code:

void loop() {

#ifdef WORK_AROUND
    // "Empty" the I2C transfer buffer
    Wire.beginTransmission(DEVICE_I2C_BASE_ADDRESS);   
    Wire.endTransmission(true);             
#endif

    display.clearDisplay();        
    display.setTextColor(WHITE);
    display.setTextSize(2);   
    display.setCursor(0, 0);
    display.print(count);
    display.display();

    delay(2000L);

    if (count++ == 0)
    {
        SendData("ONCE");   // Only expect to see this data once in a blue moon....
    }
}

I see the following output from device B (ie it works as expected):

ONCE

I then UNDEFINE WORK_AROUND, recompile, and see device B outputting:

ONCEONCEONCE [delay] ONCEONCEONCE [delay] ONCEONCEONCE [delay] ONCEONCEONCE

Happy to be convinced that it is my code, but given the tests and the results, am 90% sure it is not the issue...

@ScruffR, count is now being displayed, it is incrementing!!

Just to be clear–I don’t think the problem is in your code. I think the problem is in mysterious device B.

@bko, unfortunately, one and the same coder, moi!!

That’s another reason why I am being so pedantic about the test regime - does the testing actually prove that there is a defect in the system firmware? I think that there is a high degree of “proof” that it is, but of course happy to be shown that it is not.

@UMD, I see @bko, who has way more I2C insight than me, has chimed in, so I'll let him take over.

But for this

Granted, the obvious to the eye behaviour of your code does suggest that it only should run once.
But - despite not seeing any evidence for it - variables and buffers can get messed up (e.g. due to "wild" pointers, buffer over-/underruns, ...) hence checking if what you expect is actually happening is never wrong.
Especially if the behaviour is somewhat odd.

And even if you have an explanation/thesis, it might be good to check for alternative explanations too.


Another absolutely non-educated question/suggestion: Would a Wire.endTransmission(true) inside the display lib maybe help something?

@ScruffR, I did not mention in the last round of testing: “count is now being displayed, it is incrementing!!”

Anyhow, I took your point on board, I could well be assuming too much with asserting that this works:

    if (count++ == 0)

So, as you suggested, I added serial debug to SendData(), and can confirm that it was only called once.

You said “Wire.endTransmission(true) inside the display lib” - the display library uses Wire.endTransmission() which is equivalent (according to docs). I admit to not having researched the library source much.

As everything adds up (from my perspective), let’s see if any others come across similar issues in the future. The main thing is that there is a workaround that others might benefit from.

Will report back if there is anything further to report. Cheers @UMD.

I was not thinking of this not working, but of some other code messing with the memory location of count.

As this would do:

char s[1];
int count;
...
{
  if (!count++)  // this definitely works
  {
    ...
  }
  memset(s, 0, 5); // but this writes 5 bytes into a 1-byte array and can zero count again
}

Sorry for the terse response last night (my timezone)–here are my more complete thoughts:

  1. There is a known problem described in the errata for the STM32F205 hardware i2c with end of transmission. Various work-arounds have been attempted and the 0.4.4 firmware is better but others are still having trouble with certain devices, like the battery gauge that fails after some number of hours.
  2. i2c is a cooperative bus and a slave can mess things up for everybody else if the timing is off. Trying slower timing is easy and could help or not, depending on where the issue is.
  3. Using a scope or other bus monitor, it is usually easy to tell who is screwing things up in i2c. There are only two wires to monitor and the protocol is simple.
  4. Since Device B is a custom device, no one can reproduce your exact problem to help debug it, so if you want to be sure what is going on, probing the devices is probably the only way. If you check the battery gauge threads, that is really what helped the most.

@UMD Your work-around is interesting. I am not sure what it is doing to the hardware, if anything. @mdma and @BDub should probably be aware that doing Wire.beginTransmission(addr); Wire.endTransmission(true); helps in your particular case.

@bko, thanks for both "terse" and complete responses! They all help and very much appreciated!

Was originally assuming that it would not matter what Device A or B were, and therefore pointing to a potential system firmware issue. Now see that I should be more open to Device A or B being possible issues.

Q1 I wonder, has anyone else who has shared two or more I2C devices seen the same problem?

Have taken on your suggestion:

Trying slower timing is easy and could help or not, depending on where the issue is.

by trying both slow and fast I2C clock speeds:

Wire.setSpeed(20000); // i.e. 20 kHz
Wire.setSpeed(CLOCK_SPEED_400KHZ);

The slow clock output "ONCEONCEONCEONCE [delay] ONCEONCEONCEONCE" whereas the fast clock output "ONCEONCE [delay] ONCEONCE", i.e. the problem remained.

I suppose we just have to wait to see what other findings come back from the community.

Re the workaround, to me it looks like there is a 32 byte buffer set up for EACH addressed slave (which makes sense). Flipping from one slave to the other causes the other buffer to be erroneously output, but not cleared, leaving the data in place for repeated output. The workaround clears the buffer, which effectively circumvents the problem.

@ScruffR, yep, understood what you were getting at - weird things are 90% buffer overruns, pointer issues, etc. (and may well be the actual issue deep down).

All good.

The buffer index and length are cleared on Wire.beginTransmission(addr):

void HAL_I2C_Begin_Transmission(HAL_I2C_Interface i2c, uint8_t address, void* reserved)
{
    // indicate that we are transmitting
    i2cMap[i2c]->transmitting = 1;
    // set address of targeted slave
    i2cMap[i2c]->txAddress = address << 1;
    // reset tx buffer iterator vars
    i2cMap[i2c]->txBufferIndex = 0;
    i2cMap[i2c]->txBufferLength = 0;
}

They are also cleared IF Wire.endTransmission() reaches the very end of the function (you’ll know it does if it returns 0):

    // reset tx buffer iterator vars
    i2cMap[i2c]->txBufferIndex = 0;
    i2cMap[i2c]->txBufferLength = 0;

    // indicate that we are done transmitting
    i2cMap[i2c]->transmitting = 0;

    return 0;
}

If we hit a timeout in the various routines, we should probably be clearing the buffer index and length in the HAL_I2C_SoftwareReset() routine, and currently we are not.
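
To make the timeout scenario concrete, here is a simplified desktop model of the transmit state. It is a sketch under stated assumptions, not the firmware source: the field names follow the HAL snippets quoted above, but the early return on timeout, the `clearOnReset` switch and the helper names are hypothetical. It only illustrates how an early exit could leave the index and length set; it does not show that this is what happens in the case reported here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kTxBufferSize = 32;

// Hypothetical, simplified transmit state for one I2C interface.
struct I2CState {
    uint8_t txAddress = 0;
    uint8_t txBuffer[kTxBufferSize] = {};
    size_t txBufferIndex = 0;
    size_t txBufferLength = 0;
    bool transmitting = false;
};

void beginTransmission(I2CState& s, uint8_t address) {
    s.transmitting = true;
    s.txAddress = static_cast<uint8_t>(address << 1);
    s.txBufferIndex = 0;   // reset tx buffer iterator vars
    s.txBufferLength = 0;
}

// Queues the characters of `text`; returns how many fitted in the buffer.
size_t queueBytes(I2CState& s, const char* text) {
    size_t n = 0;
    while (text[n] != '\0' && s.txBufferLength < kTxBufferSize) {
        s.txBuffer[s.txBufferLength++] = static_cast<uint8_t>(text[n++]);
    }
    return n;
}

// Assumed reset routine; `clearTxState` models the change proposed above.
void softwareReset(I2CState& s, bool clearTxState) {
    if (clearTxState) {
        s.txBufferIndex = 0;
        s.txBufferLength = 0;
        s.transmitting = false;
    }
}

// Returns 0 when the end of the function is reached. `timedOut` simulates a
// bus timeout, which in this model resets the peripheral and returns early.
uint8_t endTransmission(I2CState& s, bool timedOut, bool clearOnReset) {
    if (timedOut) {
        softwareReset(s, clearOnReset);
        return 1;  // arbitrary nonzero code chosen for this model
    }
    s.txBufferIndex = 0;
    s.txBufferLength = 0;
    s.transmitting = false;
    return 0;
}
```

In this model, a timed-out endTransmission() with `clearOnReset` false leaves txBufferLength at the queued byte count, while the same call with `clearOnReset` true leaves it at zero.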

Can you instrument your code to see if timeouts are occurring which might be keeping the buffers “full”?
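
A minimal sketch of that kind of instrumentation: a wrapper that tallies every endTransmission() result and treats anything other than 0 as a failure. `TxStats`, `endAndRecord` and `ScriptedBus` are hypothetical names for illustration; the sketch deliberately does not interpret individual nonzero codes, since their meaning depends on the firmware version.

```cpp
#include <cassert>
#include <cstdint>

// Running tally of endTransmission() results.
struct TxStats {
    unsigned attempts = 0;
    unsigned failures = 0;
    uint8_t lastError = 0;  // most recent nonzero return code, 0 if none yet
};

// Ends the current transmission with a STOP and records the outcome.
// `Bus` is any type with endTransmission() in the style of the Wire object.
template <typename Bus>
uint8_t endAndRecord(Bus& bus, TxStats& stats) {
    const uint8_t rc = bus.endTransmission(true);
    ++stats.attempts;
    if (rc != 0) {  // 0 means the end of the function was reached
        ++stats.failures;
        stats.lastError = rc;
    }
    return rc;
}

// Hypothetical stand-in for Wire that replays a scripted return code.
struct ScriptedBus {
    uint8_t nextResult = 0;
    uint8_t endTransmission(bool /*stop*/) { return nextResult; }
};
```

On the device, `endAndRecord(Wire, stats)` would replace the bare `Wire.endTransmission(true)` in the sending code, with the counters printed over Serial every so often.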

Are you able to compile locally? If so it would be pretty easy to get you a version of this to try that does force clear the buffers on error.

@BDub, excellent, thanks.

The code supplied answers my question in the positive:

it looks like there is a 32 byte buffer set up for EACH addressed slave

Have not added your suggested check of Wire.endTransmission() to the Slave A comms (OLED display) because this is in the Adafruit display library; but no matter, because it is the Slave B buffer that is being pushed out erroneously, not Slave A.

I have checked for the return code of the Wire.endTransmission() for my Slave B communications, and it is returning 0.

But, we know for a fact that the buffer is not being "emptied" because its contents are being continually pushed out.

The other issue is, why is it being pushed out at all?

Am not compiling locally, but can update firmware via DFU as need be.

All good. @UMD