BLE Peripheral UART Streaming Assertion Failure

I am currently modifying the BLE UART peripheral tutorial example (https://docs.particle.io/tutorials/device-os/bluetooth-le/#uart-peripheral) to send a data stream to the NRF UART v2.0 Android app. My current firmware sends 10 bytes every 100 ms via the Timer class and a callback. It always connects and works the first time the Argon and the app pair after power-on; however, I get an assertion failure (SOS + 10 blinks) after successive disconnects and re-pairs between the two. The number of successive pairs it takes to trigger the assertion failure varies each time.

What causes an assertion failure…? If needed I can post code, but I am really not doing much more than adding a timer to the built-in example Particle provided.

Can you post your code? Software Timers have limited stack size and may be part of the problem.

#include "Particle.h"

// This example does not require the cloud so you can run it in manual mode or
// normal cloud-connected mode
// SYSTEM_MODE(MANUAL);

const size_t UART_TX_BUF_SIZE = 20;

void onDataReceived(const uint8_t* data, size_t len, const BlePeerDevice& peer, void* context);

// These UUIDs were defined by Nordic Semiconductor and are now the de facto standard for
// UART-like services over BLE. Many apps support the UUIDs now, like the Adafruit Bluefruit app.
const BleUuid serviceUuid("6E400001-B5A3-F393-E0A9-E50E24DCCA9E");
const BleUuid rxUuid("6E400002-B5A3-F393-E0A9-E50E24DCCA9E");
const BleUuid txUuid("6E400003-B5A3-F393-E0A9-E50E24DCCA9E");

BleCharacteristic txCharacteristic("tx", BleCharacteristicProperty::NOTIFY, txUuid, serviceUuid);
BleCharacteristic rxCharacteristic("rx", BleCharacteristicProperty::WRITE_WO_RSP, rxUuid, serviceUuid, onDataReceived, NULL);

//My Timer Addition to the Particle Example App
void Timer100msCallback();
Timer Timer100ms(100, Timer100msCallback);
bool Timer100msSend = false;

//Resume Particle Example App
void onDataReceived(const uint8_t* data, size_t len, const BlePeerDevice& peer, void* context) {
    // Log.trace("Received data from: %02X:%02X:%02X:%02X:%02X:%02X:", peer.address()[0], peer.address()[1], peer.address()[2], peer.address()[3], peer.address()[4], peer.address()[5]);

    for (size_t ii = 0; ii < len; ii++) {
        Serial.write(data[ii]);
    }
}

void Timer100msCallback(){
    Timer100msSend = true;
}

void setup() {
    Serial.begin();

    BLE.addCharacteristic(txCharacteristic);
    BLE.addCharacteristic(rxCharacteristic);

    BleAdvertisingData data;
    data.appendServiceUUID(serviceUuid);
    BLE.advertise(&data);
    
    BLE.setTxPower(8);
    
    Timer100ms.start();
}

void loop() {
    if (BLE.connected()) 
    {
        uint8_t txBuf[UART_TX_BUF_SIZE];
        size_t txLen = 0;

        /*
        while(Serial.available() && txLen < UART_TX_BUF_SIZE) {
            txBuf[txLen++] = Serial.read();
            Serial.write(txBuf[txLen - 1]);
        }
        */
        
        for(int i = 0; i < 10; i++) txBuf[i] = 65+i;
        txLen = 10;
        
        if(Timer100msSend)
        {
           if (txLen > 0) txCharacteristic.setValue(txBuf, txLen);
           
           Timer100msSend = false;    
        }
    }
}
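A side note on the flag handoff in the sketch above: the timer callback and loop() run in different threads, so a plain bool shared between them is not guaranteed to be seen consistently, and a tick that fires between the send and the `Timer100msSend = false;` line is silently dropped. Below is a minimal sketch of one alternative, assuming std::atomic is usable in the firmware build. `SendFlag`, `raise()` and `consume()` are hypothetical names for illustration, not part of the Particle API, and this is not claimed to be the cause of the SOS.

```cpp
#include <atomic>

// Hypothetical helper (not a Particle API): a one-shot flag that is
// raised from the timer callback and consumed from loop().
class SendFlag {
public:
    // Call from the timer callback: cheap, never blocks.
    void raise() { pending_.store(true, std::memory_order_release); }

    // Call from loop(): returns true once per raised flag and clears it
    // in the same atomic step, so a tick that arrives while the data is
    // being sent stays pending for the next pass instead of being lost.
    bool consume() { return pending_.exchange(false, std::memory_order_acq_rel); }

private:
    std::atomic<bool> pending_{false};
};
```

In the sketch above, `Timer100msCallback()` would call `raise()` and loop() would send only when `consume()` returns true.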

@peekay123 Any idea what’s going on here? Perhaps this is more of a question for @rickkas7?

I’m testing @rickkas7’s latest BLE Serial UART library and it works great until the connected BLE device loses its connection by going out of range. That disconnection usually sends the Argon into SOS (red flash), and the code restarts with the counter reset to zero.

If you are in range and hit the disconnect button on the Serial Terminal app then the Argon does not crash. If you disconnect by going out of range then you will usually see the Argon SOS and restart from scratch.

I’m sending data every second. I’m connecting via the Adafruit Bluefruit app and a Chrome Web Browser BLE Serial Terminal.

We need to figure out a way to prevent the BLE disconnect due to going out of range from crashing the Gen3 device.

@RWB We may be having separate issues here. My problem is definitely disconnect related; however, it’s not caused by the Argon going out of range. All the firmware I’ve tested has been with the Argon within arm’s reach. I’ve tested on 2 separate Argons running the same firmware, both with the same results. No issues on the first connection; however, the Argon will crash after a varying number of successive disconnects and pairs.

Actually, if I’m in good range and I connect and disconnect enough times, the Argon will SOS, as shown in the screenshot below where the counter resets.

Seems like disconnecting from BLE can cause this SOS failure for some reason.

@peekay123 @rickkas7 @ScruffR @RWB Has anyone made any progress on this issue?

Not that I have seen.

I’m trying to do this test against this branch with many fixes: https://github.com/particle-iot/device-os/tree/fix/ble/v1.3.1-rc.1.

Will post the result here.


Hi there,

There is a fix for this issue: https://github.com/particle-iot/device-os/commit/f65ea91f630eaaab7aa12d92a5c8f169f741ffb0. I’ve tested it using an Argon on my side. Could you also help verify if it works?

Thanks!
Guohui

So you were also seeing the Gen3 Devices SOS after the connected BLE Device went out of range? I’m hoping your fix works if so. I’ll test it.

Actually, both of the SOS behaviors you mentioned above are caused by the same bug. A disconnection caused by going out of range just reproduces it more easily. Deliberately disconnecting invalidates the connection handle immediately, so the next data transmission attempts are prevented. There is still a small chance that the disconnection happens during an ongoing data transmission, which results in an SOS assertion. When the disconnection is caused by going out of range, the connection handle remains valid even though the connection is actually lost, until the delayed disconnection event is raised. That’s why going out of range reproduces the issue more easily.
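To make the two cases easier to compare, here is a toy model of the explanation above. It is only an illustration: `Link`, `trySend()` and the other names are hypothetical and are not Device OS internals, and `Fault` merely stands in for the SOS assertion seen on the unpatched firmware.

```cpp
// Toy model only; every name here is hypothetical.
struct Link {
    bool handleValid = false;  // what the BLE stack still believes
    bool radioAlive  = false;  // whether the peer can actually be reached
};

enum class SendResult { Sent, Rejected, Fault };

// A send is refused safely when the handle is already invalid. The
// problem window is a handle that is still valid on a dead link.
SendResult trySend(const Link& link) {
    if (!link.handleValid) return SendResult::Rejected;
    if (!link.radioAlive)  return SendResult::Fault;
    return SendResult::Sent;
}

void connect(Link& l) { l.handleValid = true; l.radioAlive = true; }

// Deliberate disconnect: the handle is invalidated immediately.
void explicitDisconnect(Link& l) { l.handleValid = false; l.radioAlive = false; }

// Out of range: the link is gone now, but the handle stays valid...
void goOutOfRange(Link& l) { l.radioAlive = false; }

// ...until the delayed disconnection event is finally raised.
void deliverDisconnectEvent(Link& l) { l.handleValid = false; }
```

In this model, every send between `goOutOfRange()` and `deliverDisconnectEvent()` lands in the fault window, while after `explicitDisconnect()` sends are simply rejected. The small remaining risk in the deliberate case (a disconnect arriving in the middle of a transmission) is not modeled here.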

Nice to see you have this issue figured out and fixed!

I use Workbench but have no idea how to compile that altered code so I’ll have to wait until it’s merged with the next firmware release.

Bumping an old thread as I have the same issue on an Argon and OS 3.3.0. It is a project in development, so the code is a bit too messy to post at the moment.

It seems to be the same thing, only rather than going out of range my peripherals are intentionally disconnecting themselves prior to going into deep sleep. It only happens sometimes, and the peripherals are two Xenons (not mesh connected) a few cm away. The failure seems to occur at some point towards the end of the ‘onDataReceived’ callback.

The callback function checks to see if the message contains telemetry, does a couple of other things and then does a Particle.publish. I have a nagging feeling that I read somewhere you should not do a publish within a callback function. Can anyone confirm this is the case?

From some Log.info statements, it looks like the next BLE scan loop might start executing before the callback has completed. Is this possible if the device is waiting for a publish to complete? Would it be better practice if I maintained a message stack? When BLE UART data comes in, I could set a lock flag, dump the string onto the stack and complete quickly. In the main loop, I could then pop any stack items if the stack was not locked.

If I did that, there is the danger of the main loop locking the stack to pop the top message, and the callback then being activated and hanging while it waits for the lock. Is there a standard way to deal with locking resources and callback functions?

I found the answer to my own question:

The callback is called from the BLE thread. It has a smaller stack than the normal loop stack, and you should avoid doing any lengthy operations that block from the callback. For example, you should not try to use functions like Particle.publish() and you should not use delay().

I'm guessing that could be one possible cause of the crash.
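On the locking question above, one common way to avoid a lock altogether is a single-producer, single-consumer ring buffer: the callback only ever pushes and loop() only ever pops, so neither side can hang waiting for the other. The following is a minimal sketch under stated assumptions, not a tested fix for this crash. `Message`, `MessageQueue` and the 20-byte slot size are made up for illustration, and it assumes exactly one thread calls `push()` and exactly one calls `pop()`.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-size slot; 20 bytes is an arbitrary choice and
// longer payloads are truncated.
struct Message {
    uint8_t data[20];
    size_t  len;
};

// Lock-free single-producer / single-consumer ring buffer.
// One slot is kept empty to tell "full" from "empty", so it holds N-1 items.
template <size_t N>
class MessageQueue {
public:
    // Producer side (e.g. the BLE callback): copies the bytes and returns
    // at once. Returns false instead of blocking when the queue is full.
    bool push(const uint8_t* data, size_t len) {
        if (data == nullptr) return false;
        const size_t head = head_.load(std::memory_order_relaxed);
        const size_t next = (head + 1) % N;
        if (next == tail_.load(std::memory_order_acquire)) return false;
        Message& m = slots_[head];
        m.len = len < sizeof(m.data) ? len : sizeof(m.data);
        std::memcpy(m.data, data, m.len);
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side (e.g. loop()): returns false when nothing is queued.
    bool pop(Message& out) {
        const size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        out = slots_[tail];
        tail_.store((tail + 1) % N, std::memory_order_release);
        return true;
    }

private:
    std::array<Message, N> slots_{};
    std::atomic<size_t> head_{0};  // written only by the producer
    std::atomic<size_t> tail_{0};  // written only by the consumer
};
```

With something like this, onDataReceived would just call `push()` and return, and loop() would `pop()` each message and do the Particle.publish() there, outside the BLE thread. If the queue fills up, `push()` drops the message rather than waiting, which is a design choice you may or may not want.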