My Particle Argon is running as a BLE central (Device OS 6.2.0). It loses bytes when it asks a 3rd-party peripheral to send a packet greater than 244 bytes. To be more precise, it drops exactly every 245th byte while receiving a packet that was sent using fragmentation.
Background: I have a 3rd-party BLE peripheral by a company named Shelly which I poll for status. I initially used a Python script on a Linux PC to poll it, and I receive the exact number of bytes on each poll (message length = 774).
When I use my Argon to poll for the same data and read the response, I get the following:
uint8_t rxb[1024];
int r, n = 0, len = 774;
while ((r = pRx.getValue(&rxb[n], len - n)) > 0) {
    Serial.println(r);
    n += r;
}
Serial.println(n);
tty output:
244
244
244
39
771
i.e. after these reads, n=771 every time.
When I compare the Argon's received bytes to the Python PC script's, every 245th byte is missing. This looks like a simple bug in the portion of the Particle BLE stack that deals with fragmentation.
I decided to re-create this problem by making my Photon2 a BLE peripheral that responds just like the Shelly device; however, that failed because the maximum packet length to send (setValue()) is limited to 244 bytes. I tried..
As it happens, this is not a critical problem for me since the bytes lost don't have any impact on my setup's operation (the lost bytes are only cosmetic).
I'm more than happy to help further with anything on this issue.
FYI I just tried pre-release 6.3.0. This BLE stack fragmentation issue still exists: Every 245th byte received in a large fragmented message is lost. Willing to test any and every attempt at fixing this.
Also, if I had source code for the stack I'd find and report a fix.
BLE fragmentation is most likely part of the nRF52 code, not part of Device OS. At least I did not see anything about fragmentation in the code in the BLE HAL for nRF52 in Device OS.
If it's part of Nordic SoftDevice, that is not open source.
I can see the spark BLE stack source below the 6.3.0 toolchain directory. I'll have a peek at that first. After that, I'm thinking I might report this problem to Nordic. Any suggestions?
Are calls in spark_wiring_ble.cpp like hal_ble_gatt_client_read() calls into the SoftDevice stack? (I'm guessing so.)
As far as losing bytes, here's what my testing found:
Upon the arrival of a "large packet" (i.e. greater than 244 bytes), each call to BleCharacteristic::getValue() asking for 244 bytes loses 1 byte if more than 244 bytes of the fragmented packet remain.
for example a message coming in that's 800 bytes long:
1st read 244 - lose 1 byte
2nd read 244 - lose 1 byte
3rd read 244 - lose 1 byte
4th read I ask for 68 and get 65
total read - 797 (lost 3 x (245-244) = 3)
For fun, I changed my loop to ask for only 200 bytes each call to BleCharacteristic::getValue()
1st read 200 - lose 45 bytes (I can tell by looking at the message)
2nd read 200 - lose 45 bytes
3rd read 200 - lose 45 bytes
4th read I ask for 200 and get 65
total read - 665 (lost 3 x (245-200) = 135)
So when calling BleCharacteristic::getValue() to retrieve a fragmented packet: if you read n bytes per call, you lose 245-n bytes on each call that still has a full fragment remaining.
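The pattern above can be captured in a short model (my own sketch, not Device OS code; it assumes SoftDevice delivers the message in 245-byte fragments and discards whatever a getValue() call doesn't consume from the current fragment):

```cpp
#include <algorithm>

// Hypothetical model of the observed behavior (an assumption, not the
// actual stack logic): the incoming message arrives as fragments of
// fragSize bytes; each getValue() call returns at most readSize bytes
// of the current fragment, and the rest of that fragment is discarded.
int bytesReceived(int msgLen, int fragSize, int readSize) {
    int received = 0;
    for (int off = 0; off < msgLen; off += fragSize) {
        int frag = std::min(fragSize, msgLen - off); // last fragment may be short
        received += std::min(readSize, frag);
    }
    return received;
}
```

This toy model reproduces all three observations: bytesReceived(774, 245, 244) gives 771, bytesReceived(800, 245, 244) gives 797, and bytesReceived(800, 245, 200) gives 665.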
I think I'll try changing the spark_wiring_ble.cpp code to allow it to ask for 245 bytes on a call to getValue to see what happens....
Anyway, I wrote a public post on the Nordic support site. I'll see if I get a response.
BTW I did modify spark_wiring_ble.cpp to allow asking the hal for more than 244 bytes - no change.
While I'm waiting for Nordic to get back to me, any chance this loss is simply related to the fact that Particle has not implemented BLE packet fragmentation in their OS?
I changed toolchain 6.2.1 files spark_wiring_ble.cpp and ble_hal.cpp to not check passed getValue buffer lengths (i.e. no 244 limit). Now I receive the full message.
Calls to getValue now return 245-byte packets each time for the correct (in this test) 798-byte re-assembled message (245+245+245+63).
I'll add this observation to my support ticket with Nordic and see what they say.
I did some more digging. It looks like the bug is in the Particle OS part that sets the BLE MTU size and how it relates it to a message/packet length. (SoftDevice always thinks the BLE MTU is 1 byte greater than what the Particle OS thinks).
I first restored the toolchain to the 244 byte limit. Then I changed the MTU size in my code via BLE.setDesiredAttMtu(). I set it to 240.
First pass, with the MTU set at 240: each call to getValue returns 237 bytes, and each full 237-byte read loses 1 byte.
I modified ble_hal.cpp to allow reading 1 more byte than the set MTU (Particle MTU), but still maximum 244. With the MTU set to 240, it returned 238 bytes each time, and did not lose any bytes. So full message was received.
NOTE: to be clear - I'm not offering the below as a fix. It's only to shine a light on the MTU discrepancy.
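For what it's worth, here is the arithmetic I believe my experiments show (my own summary, not anything from the SoftDevice documentation): for a desired ATT MTU m, the Particle layer caps each read at m - 3, while the fragments I observe are m - 2 bytes, so each full capped read loses exactly 1 byte.

```cpp
// Assumptions drawn from my experiments above, not from any documentation:
// the Particle read limit is attMtu - 3 (ATT opcode + handle overhead),
// while the fragments observed on the wire are attMtu - 2 bytes.
int particleLimit(int attMtu)    { return attMtu - 3; }
int observedFragment(int attMtu) { return attMtu - 2; }
int lostPerFullRead(int attMtu)  { return observedFragment(attMtu) - particleLimit(attMtu); }
```

With the default MTU of 247 this gives a 244-byte limit against 245-byte fragments; with my test MTU of 240 it gives 237 against 238. Either way, lostPerFullRead() is 1.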
I'm done for the weekend. Next week I'll see if I can find the exact location of the error.
The Particle OS BLE connections (I've tried 4.2.0, 6.2.1, and 6.3.0) operate with an MTU of 247 in my implementation.
That yields a data packet of 244 bytes (247-3 overhead).
If I set the MTU in my code to 246 I receive data packets of 243 (246-3).
The above is all correct behavior.
However, this always causes a loss of 1 byte per full-MTU packet read via getValue(). It appears that the SoftDevice BLE stack's MTU is operating with 1 extra byte, which is lost on each getValue() read (of full MTU+1 packets). It's clear that any read of the SoftDevice stack for packets must read a full MTU's worth each time; otherwise the remaining bytes are discarded by SoftDevice.
I've looked at the Particle OS BLE code (6.3.0), but I see nothing obvious that would cause this 1 byte discrepancy. Admittedly, I don't have a very good debugging environment for the Particle OS, and might have even been looking at the wrong OS code. (Not to mention I don't have access to the SoftDevice API documentation nor am I a BLE expert).
For my purposes (as a hobbyist) I can live with setting the MTU to 246 in my code (making SoftDevice operate at the optimal 247 size) and then modifying ble_hal.cpp to allow reading one byte more than the current Particle-based MTU limit. This eliminates the data loss and still safely adheres to the 244-byte Particle OS limit, which I dare not violate. I'll leave the actual fix to the experts.
Good news: this is not a bug in the Particle OS.
Bad news: fragmentation is not supported by the Particle OS, as indicated in the BLE portion of the Particle online documentation (i.e. it supports maximum 244-byte packets, which avoids fragmentation).
Turns out the negotiated MTU is not the problem. It's now clear to me that both the Particle BLE layer and the SoftDevice engine are operating with the exact same MTU (in my case 247).
(Based on information given to me by Nordic) During BLE fragmentation it is possible to receive payload packets as large as MTU less 1 byte. So what I have been observing (245-byte packets) is proper BLE behavior. Unfortunately, the current Particle OS layer does not allow packets greater than MTU-3 in size, regardless of the MTU.
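Putting the final numbers together (my own sketch of the mismatch; the fragment-sizing rule comes from Nordic's reply, not from code I've read):

```cpp
// Per Nordic's reply: during fragmentation a received payload packet can be
// up to (ATT MTU - 1) bytes. The Particle layer, however, never returns more
// than (ATT MTU - 3) bytes per getValue() call, and the surplus left in the
// current packet is discarded.
int maxFragment(int attMtu)   { return attMtu - 1; }
int particleCap(int attMtu)   { return attMtu - 3; }
int worstCaseLoss(int attMtu) { return maxFragment(attMtu) - particleCap(attMtu); }
```

With MTU 247 that's packets of up to 246 bytes against a 244-byte cap; the 245-byte packets I actually saw fall within that allowance.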
I'm still happy with my workaround. Adding fragmentation support will clearly require some thought and plenty of testing.