Serial1 connection weirdness

I have a setup where a Photon and a Boron communicate with one another over Serial1, and it mostly works fine… except when it unexpectedly doesn’t, for seemingly no reason at all. They are sitting on my desk and things will be fine for a while, then the Photon will send a message to the Boron and it will be corrupted and arrive back at the Photon, as if the Boron is somehow reflecting the message back to it. It’s definitely not something in code. Is it possible that it’s something electrical in nature? Something low level?

The code I wrote sends a message from the Photon to the Boron, in this case, “Serial test 5” for example, along with a hyphen and a simple checksum to make sure I am receiving the message at the Boron unmolested. I have attached a picture where it works fine and then, 5 minutes later, borks out and all sorts of spurious information is sensed and flagged as an error. There is no logical reason for this to be happening in code as it should always do either one thing or the other, but not suddenly do something it was not written to do, in this case, the Photon reading back its own data into itself from the serial port, as if reflected by a mirror.

Ignore the temperature errors. I don’t have that physically hooked up at the moment.

Thanks!

Imgur

I just watched what was happening visually with the devices. As soon as a serial string was “reflected” electrically back from the Boron to the Photon, the Boron went into a brief period of green light flashing, meaning it lost cellular connection. But, I am not doing anything in code with the cellular connection. Why would anything happening on TX/RX have any impact on cellular? This makes no sense.

How quickly are you sending your data?
Have you got boundary checks in place?
Showing your code may provide more clues than a purely verbal explanation.

Seeing the code would be good for a start. If you haven’t already, use SYSTEM_THREAD(ENABLED); at the start of your code. I am also using Serial1 for a very similar project (a bunch of serial bytes into the port with a checksum), and I had all sorts of similar issues without SYSTEM_THREAD(ENABLED).

Those are good questions. The data is being sent at the standard 9600 baud: Serial1.begin(9600) on both devices. It works most of the time… just sometimes doesn’t. I don’t think I am doing anything which exceeds any legal bounds of the strings and am not working with typical arrays at the moment, but they are not called user error for no reason. The message-receiving code on the Boron is this:

void serialEvent1() {
    String buffer = "";
    while( Serial1.available() ) {
        delay( 3 );
        char c = Serial1.read();
        buffer += c;
    }
    buffer.trim();
    int len = buffer.length();
    if( len == 0 )  return;
    if( len <= 4 ) {
        Particle.publish( "serial", "ERR: garbage", PRIVATE | WITH_ACK );
        return;
    }
    String message  = buffer.substring( 0, len - 4 );
    String checksum = buffer.substring( len - 4 );
    if( checksum != calcChecksum(message) ) {
        String error = "ERR: ";
        error.concat( buffer );
        Particle.publish( "serial", error, PRIVATE | WITH_ACK );
        //  TODO: Request a resend
        return;
    }
    Particle.publish( "serialEvent", message, PRIVATE | WITH_ACK );
    //  TODO: Do something useful
}

String calcChecksum( String inString ) {
    unsigned int checksum = 0;
    int len = inString.length();
    for( int i = 0; i < len; i++ ) {
        int c = inString.charAt( i );   //  get ascii value
        checksum += c;
    }
    //  Convert checksum to hex
    String result = "-";
    char a = hexDigitFor( (checksum & 0x0F00) >>  8 );
    char b = hexDigitFor( (checksum & 0x00F0) >>  4 );
    char c = hexDigitFor( (checksum & 0x000F) >>  0 );
    result.concat( a );
    result.concat( b );
    result.concat( c );
    return result;
}

char hexDigitFor( int value ) {
    if( value > 9 )  return char( value - 10 + 'A' );
    return char( value + '0' );
}

See anything unusual, aside from the particular way I like to format my code? Thanks!

I will look up SYSTEM_THREAD(ENABLED) to understand more about it. Thanks.

OK, one thing I have told virtually everybody in this community so far :blush: is to avoid String and adopt char arrays instead. That way you exclude heap fragmentation from the list of possible causes.
Next question: which Device OS version are you running on your Boron? Prior to 1.2.0-rc.1 there were some incompatibilities with Serial1.available() and Serial1.read() when there was no data in the RX buffer.
If you know the length of your expected data you can try Serial1.readBytes(buffer, len) to rid yourself of the need to check your message boundaries (I wasn’t referring to buffer boundaries before, sorry for the confusion).
It’s also good to add start/stop marks to your messages when you intend to send multiple messages back to back.
Without such markers, if you ever get out of sync with your sender (e.g. due to connection loss), your current approach may not be able to sync again.

BTW, I know it’s overkill and your approach is probably quicker, but for printing HEX you could use snprintf(buf, sizeof(buf), "%04X", checksum);

If you aren’t using SYSTEM_THREAD(ENABLED) and your device loses the cellular connection (aka the blinking green), then the next time the device finishes an iteration of calling loop() it will defer to the system thread to reconnect. This reconnection process will block for up to several minutes while it attempts to connect before calling loop() again. Thus, as mentioned, SYSTEM_THREAD(ENABLED); is almost certainly needed in your application. Doing so enables your loop() to run independently of the system thread that handles the connection, which is what you need.

Also keep in mind what ScruffR said :slight_smile:

All good questions

Both devices are running 1.1.0 at the moment. I was putting off upgrading both to 1.2.0 until release. Is it safe to do so now?

I could rewrite to use char arrays. It’s annoying, but doable if the String class does not play well with the heap.

Sadly, the receiving device does not know how long the message should be, as it will change depending upon which part of the program is trying to tell the other device what to do.

I have not even begun to think about start/stop marks just yet or if I need them. I am just starting to work out the basic details of this serial transport layer.

The snprintf might work. My C is rusty (been years since I’ve done anything noteworthy with it), but I know bitwise manipulation well, so after not finding anything useful online, I resorted to it.

Thanks. I will try doing that.

That's where the end-marks come in handy.
With an end-mark you could use Serial1.readBytesUntil()

I at first thought about bracketing the message with start and stop markers of some kind, but then thought “That will just make the message larger and more likely to stall or something”, so I stopped with that line of thinking, figuring that a checksum would give me everything I need to know about the message health. If the checksum is not whole, then the trailing end failed. If the checksum does not compute, then something happened to the beginning. Maybe there’s some obvious thing I am not seeing.

From what I was reading online, folks were using while( Serial1.available() ) to consume all available bytes and go from there.

Weird that I cannot find readBytesUntil in the official documentation under the Serial heading, or at least a suggestion to read about the Stream class to find out more. I suppose they think you’ll read all the documentation at some point.

Serial communication is asynchronous, and your production code may well be busy doing other things between two calls to serialEvent1() (which is, despite the name, not an event-driven function but gets executed between iterations of loop()). So how would your code know where the message ends and the checksum starts?

e.g. imagine a RX buffer filled with this

9-4FFsome back2back-456arbitrary length-DEFstring containging a combo 111-011=100-CBA...

in the last string you have something that looks like a checksum but isn't :wink:
How would you break that up?
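To make the ambiguity concrete, here is a small sketch in plain C++ (off-device) that counts every position in a buffer that merely looks like a checksum, i.e. a hyphen followed by three hex digits. countChecksumCandidates is a hypothetical helper, not part of any library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts positions where a '-' is followed by three
// hexadecimal digits, which is all a "-XXX" checksum looks like on the wire.
size_t countChecksumCandidates(const std::string &rx) {
    size_t count = 0;
    for (size_t i = 0; i + 3 < rx.size(); ++i) {
        if (rx[i] == '-' &&
            std::isxdigit(static_cast<unsigned char>(rx[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(rx[i + 2])) &&
            std::isxdigit(static_cast<unsigned char>(rx[i + 3]))) {
            ++count;
        }
    }
    return count;
}
```

For the sample buffer above this finds five candidates, although only four of them actually close a message. The "-011" is just payload, and nothing in the byte stream itself tells the two cases apart.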

The USARTSerial class (among plenty of others) is derived from Stream and hence inherits all functions therein.
But instead of documenting each of them multiple times (with each and every derived class) they are only documented centrally in the respective parent classes.

I wish the documentation were more precise. For SerialEvent it states “A family of application-defined functions that are called whenever there is data to be read from a serial peripheral.” The description makes it seem it is not called UNTIL something is being received on Serial1, at which point it is called and I can process what is coming in. In that paradigm, it would make sense that I should have the beginning of the message and can keep reading until the end of the message as long as I keep processing incoming bytes. Ugh.

I wish they would note that in the documentation for Serial, that it derives from Stream and that anything Stream can do Serial can do as well. And elsewhere for other derived/inherited classes. Perhaps a notation on Serial that says, “inherits from Stream” or something. Oftentimes, documentation will be hyperlinked so you can jump to the parent class to see what other functions are accessible. But thanks for pointing it out all the same. : )

That is a long-standing sore point I (and others) brought forward ages ago, but since this is how the legacy behaviour introduced by the Arduino "foundation" works, Particle feels reluctant to change it.

e.g. see here
https://github.com/particle-iot/device-os/issues/1351

That's a suggestion we can bring to @rickkas7's attention.

Well, if nothing else, I am getting a more in-depth understanding of the strengths and weaknesses of their Serial/Stream implementation.

I wonder if @rickkas7’s code here is still viable and can work on a Boron: https://github.com/rickkas7/SerialBufferRK.

I almost resorted to this library too. My current working code reads a serial stream made up as follows:
(control codes from ASCII table)

0x01 - SOH, start of message
message (convert everything to text so you can use the control bytes)
0x03 - ETX - end of text
checksum (convert everything to text so you can use the control bytes)
0x04 - EOT - end of transmission

This way it is easy to check the incoming stream for a new message (in case something corrupted a previous transmission). When you see SOH, set a flag to mark the start of a message, then Serial1.read() the text into your char buffer until you get ETX (up to a maximum length; discard the message if you hit the maximum), or use Serial1.readBytesUntil() with ETX as the terminator. Set another flag, then read the following bytes into a char array for the checksum until you get to EOT. Then calculate the checksum of the message and compare it with the one you received.

Don’t use any delays like your delay(3) above - it’s not needed.
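The SOH / ETX / EOT scheme described above can be sketched as a small byte-at-a-time state machine. This is plain C++ so it can be tested off-device. FrameParser, MAX_MSG and the three-hex-digit checksum text are assumptions made for illustration (the checksum rule is borrowed from the calcChecksum() earlier in the thread), not the poster's actual code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

const char SOH = 0x01, ETX = 0x03, EOT = 0x04;  // ASCII control codes
const size_t MAX_MSG = 64;                      // assumed payload limit
const size_t MAX_SUM = 3;                       // checksum sent as 3 hex digits

class FrameParser {
public:
    // Feed one received byte. Returns true only when a complete frame whose
    // checksum matches has just ended; message() then holds the payload.
    bool feed(char c) {
        if (c == SOH) {                 // SOH always restarts: this is the resync
            state_ = IN_MESSAGE;
            msgLen_ = 0;
            sumLen_ = 0;
            return false;
        }
        switch (state_) {
        case IDLE:
            return false;               // ignore noise between frames
        case IN_MESSAGE:
            if (c == ETX) { msg_[msgLen_] = '\0'; state_ = IN_CHECKSUM; }
            else if (msgLen_ < MAX_MSG) { msg_[msgLen_++] = c; }
            else { state_ = IDLE; }     // too long: discard the frame
            return false;
        case IN_CHECKSUM:
            if (c == EOT) {
                sum_[sumLen_] = '\0';
                state_ = IDLE;
                return checksumMatches();
            }
            if (sumLen_ < MAX_SUM) { sum_[sumLen_++] = c; }
            else { state_ = IDLE; }     // checksum field too long: discard
            return false;
        }
        return false;
    }
    const char *message() const { return msg_; }

private:
    enum State { IDLE, IN_MESSAGE, IN_CHECKSUM };

    // Recompute the byte sum of the payload and compare it, as text,
    // with the checksum characters that arrived between ETX and EOT.
    bool checksumMatches() const {
        unsigned int sum = 0;
        for (size_t i = 0; i < msgLen_; ++i) {
            sum += static_cast<unsigned char>(msg_[i]);
        }
        char expected[MAX_SUM + 1];
        std::snprintf(expected, sizeof(expected), "%03X", sum & 0x0FFFu);
        return std::strcmp(expected, sum_) == 0;
    }

    State state_ = IDLE;
    char msg_[MAX_MSG + 1] = {0};
    char sum_[MAX_SUM + 1] = {0};
    size_t msgLen_ = 0;
    size_t sumLen_ = 0;
};
```

On the device, the idea would be to call feed() from loop() for each byte while Serial1.available() reports data, with no delay() involved. Because the parser keeps its state between calls, it does not matter if a message arrives split across several passes of loop().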
