[SOLVED] Potential UART/Serial Conflict?

Sure @ScruffR @G65434_2, any chance you could sum up all of your expectations for Serial4? A minimal code example would be very helpful; it looks like that might be above? FYI, just scanning above, I don’t believe full duplex will work via the Arduino-style APIs since the read and write calls are blocking in the HAL. Also, Serial4 and Serial5 do not support HW flow control.

Huh? I was convinced this was how things worked (at least pre HAL)

I could swear when setting a slow baud rate and filling the TX buffer to the brim, I can still receive data while the TX buffer slowly gets pushed out over the wire.

Thanks @BDub, I’m after full duplex async operation (so HW flow control isn’t required - I understand this isn’t available on this peripheral). Interesting to hear these APIs don’t support full duplex. Why in this case can you ‘disable’ full duplex via the halfduplex(true) command?

Oh yes that will work, so in effect full duplex is happening behind the scenes via the ISRs.

However, when you are TX’ing via Serial.write() or Serial.print(), that call blocks until it’s complete, so you can’t TX and then immediately start a while loop calling Serial.read(); … at least I don’t believe so. Maybe if you created a separate Thread for the writes, but that doesn’t sound like the best solution. It sounds like you want a non-blocking call to Serial.write().
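
A minimal sketch of what such a non-blocking write could look like: queue only as many bytes as availableForWrite() reports free and hand the remainder back to the caller to retry later. FakeTxPort and writeWithoutBlocking are hypothetical names made up for illustration, not part of the Wiring API; the template only assumes the port offers write() and availableForWrite().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a serial port with a bounded TX buffer,
// so the helper below can be tried without hardware.
struct FakeTxPort {
    size_t capacity;
    std::vector<uint8_t> queued;

    explicit FakeTxPort(size_t cap) : capacity(cap) {}
    int availableForWrite() const { return static_cast<int>(capacity - queued.size()); }
    size_t write(uint8_t b) { queued.push_back(b); return 1; }
};

// Queue as many bytes as currently fit, never waiting for space.
// Returns the number of bytes accepted; the caller retries the rest later.
template <typename Port>
size_t writeWithoutBlocking(Port& port, const uint8_t* data, size_t len) {
    int room = port.availableForWrite();
    size_t n = room > 0 ? static_cast<size_t>(room) : 0;
    if (n > len) n = len;
    for (size_t i = 0; i < n; ++i) {
        port.write(data[i]);
    }
    return n;
}
```

On a device the same template could presumably be called with Serial4 in place of the fake port, with loop() calling it again for the unsent tail while servicing Serial4.read() in between.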

Ok thanks BDub, am I best to set up the UART myself using the STM32F2xx standard peripheral library? Or can I make use of the existing usart_hal.cpp?

If you are sending and receiving just one byte at a time I think you should be able to keep up with full-duplex comms with the Wiring API. If you send a large buffer though you won't be able to keep up on the RX side. Is this not working?

The above doesn’t seem to be working, no. When calling Serial4.availableForWrite() from the parent function Metro_HAL_UsartTxStart, I see my buffer decrease steadily from 64 bytes down to 2. I’m not sure if this is part of the problem. I’ve also tried using Log.info to display ‘data’, called from Metro_HAL_UsartTxStart, but again this data is not what I see coming from the logic analyser.

There was a bug fixed in availableForWrite() in 0.5.3. Make sure you’re at least using 0.5.3, if not 0.6.0. If you are using Log though, it sounds like you are using 0.6.0 already.

How about a simple loopback test, shorting Serial4 TX to RX…

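#include "Serial4/Serial4.h" // needed for Serial4 on the Electron
void setup() {
  Serial4.begin(9600); // baud rate is an assumption; the port has to be started before use
  while(Serial4.available()) Serial4.read(); // drain stale RX bytes (can't remember if flush is implemented)
}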
void loop() {
  static uint8_t tx=0, rx=0;
  Serial4.write(tx);
  rx = Serial4.read();
  if (tx == rx) {
    Log.info("TX: %02x == RX: %02x, A4W: %d", tx, rx, Serial4.availableForWrite());
  } else {
    Log.info("TX: %02x != RX: %02x, A4W: %d", tx, rx, Serial4.availableForWrite());
  }
  tx++;
  delay(100);
}

// Should print 
// TX: 00 == RX: 00, A4W: 64
//        to
// TX: ff == RX: ff, A4W: 64

I called this from the main loop, not from within UARTWrp_SendAndReceiveByte, which is what I assume you meant?

Ok, so after demonstrating the above (buffers self-clearing as they should), I went back to my code and added a delay statement to see if it was a timing issue (not enough time for the buffers to clear before the next call). The following change now allows the buffers to clear (I have 64 bytes available after every call, as expected). Furthermore, I think the data I am receiving is now correct, or at least very close to being correct.
Any ideas why this would help? What is a more elegant way of solving this?

static uint8_t UARTWrp_SendAndReceiveByte(uint8_t data)
{ 
  Serial4.write(data);
  data = Serial4.read();

  delayMicroseconds(2000);
  return(data);
}

To add to my above comment: despite allowing the buffers to clear, I think the delay statement may be introducing timing issues. A lot of the data I’m reading back is only half correct (an improvement, as before the data I was reading back from registers was not even close).
Is there a way I can manually clear the buffer, or increase the size? Although probably a bit of a hack, I think I would get away with a >200 byte buffer.

Just for the record, it's to be expected that these readings are not the same, due to the asynchronicity between the HW interface and the SW.
The SW is so much quicker than the serial HW communication that the RX buffer won't contain the just-sent byte by the time the first read happens. So you'd expect the RX to lag behind by at least one byte.

This may also (partly) answer this

But for a definitive answer we'd need to see how you're calling that function. If you call it in a tight loop, the SW will still outrun the HW despite the extra cycles needed for function calls.

About this

The possibility of a user-controlled buffer (at least in size) is also a long-standing proposal of mine; it got added for USB virtual COMs, but not for HW interfaces (yet).
https://docs.particle.io/reference/firmware/photon/#acquireserialbuffer-

One way to ensure all your outgoing data has been sent already before proceeding is to call flush()
And in order to drain the RX buffer, I usually do this

  while(Serial.read() >= 0);  // read all bytes from RX buffer till it signals EMPTY (-1)

I was incrementing tx before the comparison actually… and the Serial4 buffer could start up with a glitch or something and require being flushed first to ensure things are in sync. Code example edited in my original post. If the Wiring calls are truly blocking then there shouldn’t need to be any delay added. Looks like the data sent, is being received properly though… so I’d say there is something wrong with your slave device @G65434_2.

I'm convinced the calls are blocking, but only on the SW side of things (placing all the data, e.g. a whole string, in the TX buffer). I doubt they block until all the bytes have actually been clocked out onto the wire, including the stop bit(s).

But this could easily be checked by taking the execution time of a Serial4.write() command for 2400 and 115200 baud.
If the blocking time is in fact affected by the baud rate, I stand corrected.

I'll go and do that, the clumsy way because I can't be bothered to follow the paper trail through the sources to find the actual code that does the "bit banging" :blush:


Update:
I've now done that test with this code

#include "Serial2/Serial2.h"

#if (PLATFORM_ID == 10)
SYSTEM_MODE(MANUAL)
#include "Serial4/Serial4.h"
#include "Serial5/Serial5.h"
#endif

#define COM Serial1

char dmy[63];
int baudrate[2] = { 115200, 2400 };
int br = 0;

void setup() {
}


void loop() {
    COM.begin(baudrate[br]);
    delay(100);
    
    uint32_t us = micros();
    COM.write((const uint8_t*)dmy, sizeof(dmy));
    us = micros() - us;
    Serial.printlnf("%6uµs for 63 byte @ %6d", us, baudrate[br]);
    
    while(COM.read() >= 0);
    
    COM.end();

    br++;
    br %= 2;
    
    delay(1000);
}

and got these times on a Photon Serial1 SYSTEM_MODE(AUTOMATIC)

   152µs for 63 byte @   2400
   157µs for 63 byte @ 115200
   152µs for 63 byte @   2400
   157µs for 63 byte @ 115200
   152µs for 63 byte @   2400
   158µs for 63 byte @ 115200

I guess the reason for the higher baudrate being slower is the fact that the RX interrupt is already hammering the µC more vigorously while still putting data into the TX buffer.
But the otherwise expected significantly longer execution time with lower baudrates is nowhere to be seen.
The expected time required if the full TX was in fact blocking would be
630 x 416.6µs = 262.5ms @ 2400 vs. 630 x 8.7µs = 5.5ms @ 115200
(630 bits because each of the 63 bytes is framed by a start and a stop bit, i.e. 10 bits per byte with 8N1)
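
The arithmetic above can be reproduced with a small helper. This is only a back-of-the-envelope sketch assuming 8N1 framing (10 bits per byte); wireTimeMicros is a hypothetical name, not something from the firmware.

```cpp
#include <cassert>
#include <cstdint>

// Time in microseconds needed to clock `bytes` bytes out over the wire.
// Assumes 8N1 framing by default: 1 start + 8 data + 1 stop = 10 bits per byte.
uint64_t wireTimeMicros(uint32_t bytes, uint32_t baud, uint32_t bitsPerFrame = 10) {
    assert(baud > 0);  // a zero baud rate would divide by zero
    return (static_cast<uint64_t>(bytes) * bitsPerFrame * 1000000ULL) / baud;
}
```

wireTimeMicros(63, 2400) gives 262500 (262.5ms) and wireTimeMicros(63, 115200) gives 5468 (about 5.5ms), both far above the measured write times, which fits a write that returns as soon as the data sits in the TX buffer.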


Update:
These are the times on an Electron Serial1 SYSTEM_MODE(MANUAL)

   163µs for 63 byte @   2400
   168µs for 63 byte @ 115200
   163µs for 63 byte @   2400
   172µs for 63 byte @ 115200
   163µs for 63 byte @   2400
   166µs for 63 byte @ 115200

and Electron Serial4 SYSTEM_MODE(MANUAL)

   177µs for 63 byte @   2400
   186µs for 63 byte @ 115200
   173µs for 63 byte @   2400
   187µs for 63 byte @ 115200
   174µs for 63 byte @   2400
   186µs for 63 byte @ 115200

Nice quick and dirty test @ScruffR! I was pretty sure about the blocking behavior though, so I added another quick and dirty test to compare. I used a cool library @jvanier wrote called Benchmark to test these kinds of things and I got different results despite the fact that they both use micros() for timing. I was kind of hoping that it would have just validated your results, but it seems to match your calculations for baud rates instead. Might be some kind of compiler optimization going on there, or some other bug with micros(). I also changed things a bit to see something on the scope and FTDI tool, looks ok there.

micros() said:     166µs for 63 bytes @ 115200
Benchmark said:   4798µs for 63 bytes @ 115200
micros() said:     162µs for 63 bytes @   2400
Benchmark said: 254144µs for 63 bytes @   2400
micros() said:     166µs for 63 bytes @ 115200
Benchmark said:   4850µs for 63 bytes @ 115200
micros() said:     162µs for 63 bytes @   2400
Benchmark said: 254144µs for 63 bytes @   2400

// This #include statement was automatically added by the Particle IDE.
#include <benchmark.h>

#if (PLATFORM_ID == 10)
SYSTEM_MODE(MANUAL)
#include "Serial4/Serial4.h"
#include "Serial5/Serial5.h"
#endif

#define COM Serial1

int baudrate[2] = { 115200, 2400 };
int br = 0;

void setup() {
}

void loop() {
    COM.begin(baudrate[br]);
    delay(100);
    
    uint32_t us1 = micros();
    COM.write("1UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU"); // '1',62 x 0x55, char 'U'
    uint32_t us2 = micros();
    Serial.printlnf("micros() said:  %6uµs for 63 bytes @ %6d", us2-us1, baudrate[br]);
    
    uint32_t duration = Benchmark.measure([&] {
        COM.write("2UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU"); // '2',62 x 0x55, char 'U'
    });
    Serial.printlnf("Benchmark said: %6uµs for 63 bytes @ %6d", duration, baudrate[br]);
    
    while(COM.read() >= 0);
    COM.end();
    br++;
    br %= 2;
    delay(1000);
}

Hmm, interesting and disconcerting too :pensive:
(Update: Further investigation has revealed a flaw in that last test, the TX seems indeed to be non-blocking - as expected :wink: )

Otherwise my ParticleSoftSerial would behave better than the HW interface, since it’s fully interrupt driven (gloat) :innocent:
It’s just not suitable for baudrates > 35000 due to the interrupt latency :pensive:

If the internal buffer is 64 bytes and you’re comparing two sequential 63-byte writes, maybe the second write is slower because the first string is still in the buffer? Might be interesting to repeat the benchmark with a flush in between COM writes. My guess is that both micros and Benchmark are accurate but the two writes (empty buffer vs full buffer) are not identical.

So the function UARTWrp_SendAndReceiveByte is called twice within parent function Metro_HAL_UsartTxStart as shown above. It continues to be called while STPM_com.txOngoing is true by the looks of the while statement. Does that answer your question?

while(Serial.read() >= 0) didn't seem to make any difference. Replacing my hack of a delay statement with Serial4.flush() does allow the buffers to empty, but I'm still left with some registers seemingly containing incorrect data.

Although I'm well prepared to admit I've done something stupid, I don't think the issue is with the slave device. I've verified 19 packets in a row via logic analyser (slave returning freshly written configuration data) and they are all correct. I think there must still be an issue either with the UART RX or with some of the other code (almost all of which is from the ST libraries, so that seems unlikely).

Thank you, that makes sense!
With the first one the buffer gets filled, so the subsequent one has to block, shoving in one byte at a time as the background process trickles the already queued bytes out over the wire.
:+1:
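
To put rough numbers on that explanation, here is a small model of how long a write would have to wait for buffer space. It is a sketch under assumptions: a 64-byte usable TX buffer (the A4W: 64 figure from the loopback test), 8N1 framing, and nothing drained between the two writes. estimatedBlockMicros is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Rough estimate, in microseconds, of how long a write of `toWrite` bytes
// has to wait when `queued` bytes already sit in a TX buffer of `capacity`.
// Every byte that does not fit must wait for one queued byte (10 bits, 8N1)
// to be clocked out first.
uint64_t estimatedBlockMicros(uint32_t queued, uint32_t capacity,
                              uint32_t toWrite, uint32_t baud) {
    assert(baud > 0 && queued <= capacity);
    uint32_t freeSlots = capacity - queued;
    uint32_t mustDrain = toWrite > freeSlots ? toWrite - freeSlots : 0;
    return (static_cast<uint64_t>(mustDrain) * 10 * 1000000ULL) / baud;
}
```

With an empty buffer the estimate is 0 at either baud rate. With 63 bytes still queued it comes to 258333 at 2400 baud and 5381 at 115200 baud, the same ballpark as the Benchmark figures; the remaining gap is plausibly the handful of bytes that drained while the first result was being printed.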

@BDub, would you care to repeat the test with a delay(500) or COM.flush() between the write blocks?

When you say you still see registers with incorrect data - could the data be “shifted” and valid for a different register? I didn’t see anything in the code snippet that synchronized the tx and rx, so you might be receiving byte1 when you’re expecting byte2.

You may want to wait until you have something to read (while(Serial.available() == 0);) before you call Serial.read() in your tx/rx wrapper function. You also want to clear the TX and RX buffers before you start the tx/rx block; ScruffR mentioned using Serial.flush() for the TX and while(Serial.read() >= 0); for the RX.

I would not depend on Serial.flush() in your wrapper function, that won’t guarantee that you’ve waited until the next byte is received. It won’t hurt, but it’s not enough.
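
A minimal sketch of such a wrapper, written as a template so it can be tried off-device: it drains stale RX bytes, sends, then waits for a reply with an upper bound so a silent slave cannot hang the caller. FakeLoopbackPort, sendAndReceiveByte and the poll-count timeout are all assumptions for illustration; on the device the bound would more likely be based on millis().

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical loopback port (TX wired to RX) for trying the wrapper off-device.
// A written byte only becomes readable after a few available() polls,
// imitating the time the byte spends on the wire.
struct FakeLoopbackPort {
    std::deque<uint8_t> onWire;  // written, not yet received
    std::deque<uint8_t> rx;      // received, ready to read
    int pollsLeft = 0;

    size_t write(uint8_t b) {
        if (onWire.empty()) pollsLeft = 3;
        onWire.push_back(b);
        return 1;
    }
    int available() {
        if (!onWire.empty() && --pollsLeft <= 0) {
            rx.push_back(onWire.front());
            onWire.pop_front();
            pollsLeft = 3;
        }
        return static_cast<int>(rx.size());
    }
    int read() {
        if (rx.empty()) return -1;  // same "empty" convention as Serial.read()
        uint8_t b = rx.front();
        rx.pop_front();
        return b;
    }
};

// Send one byte, then wait (bounded by maxPolls) until a reply is available.
// Returns the received byte (0..255), or -1 if nothing arrived in time.
template <typename Port>
int sendAndReceiveByte(Port& port, uint8_t data, int maxPolls) {
    while (port.read() >= 0) {}  // drop stale RX bytes so replies stay aligned
    port.write(data);
    for (int i = 0; i < maxPolls; ++i) {
        if (port.available() > 0) return port.read();
    }
    return -1;
}
```

Reading immediately after write() on the fake port returns -1, which mirrors the lag described earlier in the thread, while the waiting wrapper returns the byte that was sent.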