The Photon 2 seems to have issues that other devices do not have.
We currently run two apps on a range of Particle devices, none of which have ever had any communication problems:
Electron
Boron
Argon
For some reason, however, the Photon 2 seems to really struggle with Serial and Serial1 communications.
Anecdotally, debug data streaming over a USB cable to a serial terminal on my PC is way slower on the Photon 2 than on those other platforms.
Programmatically, we consistently miss bytes when receiving large packets at 230400 baud on the Photon 2's Serial1 peripheral.
Other evidence that something is fishy on Photon2:
Things we have tried so far
We have tried:
allocating a 16 kB rx buffer using the acquireSerial1Buffer() Particle OS function
The biggest packet we receive is around 14 kB, so we thought this might work. Unfortunately, it does not. I think this indicates that Particle OS is not able to receive all the bytes from the MCU's UART peripheral, or perhaps that the MCU's UART peripheral is erroring out on some of the bytes? The byte count of what we receive in a Serial1.read() loop is consistently a few hundred bytes short of what we expect on a 14 kB packet...
Incidentally, the hardware USART on all platforms does not directly write into your buffer using DMA. The USART peripheral stores the data in a hardware FIFO. An interrupt is generated when the FIFO should be emptied, and if the system is blocked with interrupts disabled, or is busy servicing a higher priority interrupt, the data could be lost. This has always been the case.
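To make the failure mode concrete, here is a toy model of a fixed-size receive buffer that is filled one byte at a time by an interrupt handler. It is only a sketch in portable C++: the class name RxRing and its methods are made up for illustration, and it is not how Device OS actually implements Serial1. It just shows that once a fixed-size buffer is full and nobody has drained it, further bytes have nowhere to go.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a fixed-size receive buffer filled from an interrupt handler.
// Illustration only; this is NOT Device OS's actual implementation.
class RxRing {
public:
    explicit RxRing(size_t capacity) : buf_(capacity) {
        assert(capacity > 0);
    }

    // Producer side: called once per incoming byte (the simulated ISR).
    // If the buffer is full, the byte is discarded and counted as dropped.
    void onByteReceived(uint8_t b) {
        if (count_ == buf_.size()) {
            ++dropped_;
            return;
        }
        buf_[head_] = b;
        head_ = (head_ + 1) % buf_.size();
        ++count_;
    }

    // Consumer side, mirroring an Arduino-style available()/read() pair.
    size_t available() const { return count_; }

    int read() {
        if (count_ == 0) return -1;  // nothing buffered
        uint8_t b = buf_[tail_];
        tail_ = (tail_ + 1) % buf_.size();
        --count_;
        return b;
    }

    size_t dropped() const { return dropped_; }

private:
    std::vector<uint8_t> buf_;
    size_t head_ = 0;     // next write position
    size_t tail_ = 0;     // next read position
    size_t count_ = 0;    // bytes currently buffered
    size_t dropped_ = 0;  // bytes lost because the buffer was full
};
```

In this model, pushing more bytes than the capacity without any read() calls in between leaves available() equal to the capacity and dropped() equal to the excess, which resembles a received byte count that comes up short.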
OK, but Serial.available() always returns the number of bytes in said FIFO buffer that are waiting to be read, regardless of the hardware USART interrupts... right? Or is Particle OS just copying that hardware FIFO into its own software FIFO when the "FIFO full" interrupt is triggered?
At a baud rate of 230400 (about 23,040 bytes per second, assuming 8N1 framing at 10 bits on the wire per byte), I would need to average one Serial.read() roughly every 43 µs to keep the USART FIFO from overflowing, and that is before accounting for any time Particle OS spends blocked with interrupts disabled, right?
Does the acquireSerial1Buffer() function allocate a bigger hardware FIFO for the USART peripheral?
Also, incidentally, when I try using half the baud rate, the problem doesn't seem to get any better...
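As a cross-check on the per-byte time budget estimated above: with 8N1 framing (1 start bit, 8 data bits, 1 stop bit), each byte occupies 10 bit times on the wire, so the byte rate is the baud rate divided by 10 rather than by 8. The sketch below assumes that framing; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bits on the wire per byte: 1 start bit + data bits + parity bits + stop bits.
inline uint32_t bitsPerFrame(uint32_t dataBits, uint32_t parityBits, uint32_t stopBits) {
    return 1 + dataBits + parityBits + stopBits;
}

// Maximum payload bytes per second the line can carry at a given baud rate.
inline uint32_t bytesPerSecond(uint32_t baud, uint32_t frameBits) {
    assert(frameBits > 0);
    return baud / frameBits;
}

// Time one byte occupies on the line, in microseconds.
inline double microsPerByte(uint32_t baud, uint32_t frameBits) {
    assert(baud > 0);
    return 1e6 * static_cast<double>(frameBits) / static_cast<double>(baud);
}
```

For 230400 baud with 8N1, bytesPerSecond(230400, bitsPerFrame(8, 0, 1)) gives 23040, and microsPerByte comes out at a little over 43 µs per byte.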
It's not clear to me why it's not working for you, but the cellular modem on the M-SoM is connected by UART serial and runs at 921600 baud without losing bytes. However, it also runs with hardware flow control, so that might not be a good comparison. SPI uses the same type of interrupts as UART serial and that appears to work correctly at even higher speeds on RTL872x.
You should always read all available bytes from loop without returning. Since loop (or any thread) is only scheduled at 1 millisecond intervals, that can limit the amount of data read. If you have other lengthy operations in loop you should either allocate a large input buffer, or read from a worker thread, or both.
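The "read everything that is available before returning" advice can be sketched as a small helper. It is written as a template so it compiles off-device against any object with Arduino-style available() and read() methods, which Serial1 provides. The helper name drainAvailable is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies every byte currently buffered by `port` into dst, up to `capacity`.
// Returns the number of bytes copied. Works with any type that exposes
// Arduino-style available() and read() (read() returning -1 when empty).
template <typename Stream>
size_t drainAvailable(Stream& port, uint8_t* dst, size_t capacity) {
    assert(dst != nullptr);
    size_t n = 0;
    while (n < capacity && port.available() > 0) {
        int c = port.read();
        if (c < 0) {
            break;  // port reported data but returned none; stop rather than spin
        }
        dst[n++] = static_cast<uint8_t>(c);
    }
    return n;
}
```

In firmware, this would be called near the top of loop(), for example as drainAvailable(Serial1, packet + received, sizeof(packet) - received), where packet and received are hypothetical application variables. Reading a single byte per pass through loop() would cap throughput at roughly one byte per scheduling interval, which is far below the line rate.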
An isolated way to reproduce data loss would be ideal.
I did a quick test of UART serial receiving on the Photon 2. The code is in the repository below.
It uses a FT232 USB to Serial adapter and a node.js program that sends a constant stream of 8-byte packets from the computer to the Photon 2 at 230400 baud.
With the test program sending 2 packets every millisecond, around 16000 bytes/sec, loss is minimal, with 99% of packets successfully transferred. At slightly slower rates, it's 100% successful, tested for hours.
There are two important caveats:
The serial port buffer size must be increased. I used 512 bytes for the test. It fails at much lower rates with the default buffer size.
If you are using 6.3.0 or earlier on the P2, Photon 2, or M-SoM, and have USB serial logging messages, the slowness of that will completely throw off serial receiving. On those versions, logging slows down the UART reading so much that the log message for one missed packet sets off a cascade of errors, which eventually causes all packets to fail. Once I upgraded to the latest develop (which will be in 6.3.1), that cascade no longer happens.
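Both caveats come down to how long the receive buffer can absorb data while nothing is reading it. A rough way to estimate that headroom, under the simplifying assumption of a steady incoming byte rate and no reads at all during the stall (function names are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Milliseconds a buffer can absorb incoming data while nothing reads it.
inline double millisUntilFull(size_t bufferBytes, double bytesPerSecond) {
    assert(bytesPerSecond > 0.0);
    return 1000.0 * static_cast<double>(bufferBytes) / bytesPerSecond;
}

// Smallest buffer (in bytes, rounded up) that survives a stall of stallMs.
inline size_t bytesNeededForStall(double stallMs, double bytesPerSecond) {
    assert(stallMs >= 0.0 && bytesPerSecond > 0.0);
    double bytes = stallMs * bytesPerSecond / 1000.0;
    size_t whole = static_cast<size_t>(bytes);
    return (static_cast<double>(whole) < bytes) ? whole + 1 : whole;
}
```

With the test's numbers, a 512-byte buffer at about 16,000 bytes/sec fills in 32 ms, so under this simple model any single blocking operation longer than that (a slow log write, for instance) would be expected to lose data.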
We can confirm that in our testing, reducing logging on USB Serial does improve performance and reduce the number of failed packet transfers on Serial1. I think this confirms your theory that slow USB serial is causing a cascade failure back to Serial1.
The problem is, we still want to do both (have lots of logging on USB serial, while still receiving all packets on Serial1).
You said that the problem is worse on 6.3.0.
We can confirm that the problem is present on 5.9.0 but does seem to be worse on 6.3.0.
You say that 6.3.1 fixes the slow USB Serial problem, but it appears that 6.3.1 is not available yet? See Particle Workbench screenshot:
While we wait for 6.3.1 to become available, it sounds like the only viable alternative for us is to greatly reduce how much logging we are doing via USB serial, right?
The fix will be in 6.3.1, but there is no date for the release yet. However, I tested with the develop branch, which is what 6.3.1 will be made from, and it does indeed make Serial1 work significantly more reliably.
So for now, reducing logging helps, but the only real solution is to wait for 6.3.1.
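One possible way to cut log volume in the meantime without removing the log statements is to rate-limit them. The sketch below is an assumption about how that could be done, not something from Particle: LogThrottle is a made-up helper that allows at most one message per interval and counts how many it suppressed. The caller passes in the current time, which in firmware would come from millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: permits at most one log message per minIntervalMs.
class LogThrottle {
public:
    explicit LogThrottle(uint32_t minIntervalMs) : interval_(minIntervalMs) {
        assert(minIntervalMs > 0);
    }

    // Returns true if a message may be emitted at time nowMs.
    // Unsigned subtraction keeps the comparison correct when the
    // millisecond counter wraps around.
    bool allow(uint32_t nowMs) {
        if (!started_ || static_cast<uint32_t>(nowMs - lastMs_) >= interval_) {
            started_ = true;
            lastMs_ = nowMs;
            return true;
        }
        ++suppressed_;
        return false;
    }

    // Number of messages rejected so far.
    uint32_t suppressed() const { return suppressed_; }

private:
    uint32_t interval_;
    uint32_t lastMs_ = 0;
    bool started_ = false;
    uint32_t suppressed_ = 0;
};
```

In firmware, this might be used as: if (throttle.allow(millis())) { Log.info("missed packet"); }, with throttle being a LogThrottle instance. That keeps a burst of missed-packet messages from turning into a burst of slow USB writes.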