Intermittent dropped BLE packets with AndroidBLEExample

,

Hi,

I had previously tried getting the rn-ble-example to work, but when I saw what the exported source looked like, it was too unwieldy to follow that path. So I turned to the Android and iOS BLE Examples. I implemented JoinNewNetworkRequest/ScanNetworksReply for the Android example, but BLE on Android still seems odd.
Sometimes, I'll never get my onRequestResponse() callback, even though the P2 always enqueues a reply. The log from the P2 will look something like this,


You can see that after the echo request (type 1) is replied, the P2 gets a SCAN_NETWORKS_TYPE (506) and replies. The phone never responds, and since I implemented a basic retry strategy, the phone asks again and the P2 again replies.

I do however see that read() inside of BleRequestChannel.java does get called when the P2 replies, but it seems like the first two bytes that become the payloadLed are random data, and since a random uint16 is usually a lot larger than the intended payload it never makes it to this.readResponse(packet);. I'm trying to dive into the Device OS with BLE_CHANNEL_DEBUG_ENABLED and compare the actual buffers, but I'm still working on that. Has anyone seen this or know where I should be looking? I may try to turn off encryption next and see if I still get issues.

Also if the P2 is struggling to connect to a network, it will never return the ScanNetworksReply with networks, I'm assuming this is due to the wifi radio being in use but it doesn't look like an error is returned but I may be looking for the wrong thing.

Thanks for sharing these logs! Can you share a bit of the cod needed to reproduce (feel free to DM if you don't want to post it publicly)? I'd like to dig into this a bit.

1 Like

Hey Colleen,

I'm actually having trouble recreating the first issue, its possible that simply power cycling the P2 fixed it. If I run into it again I'll try to post more info on it but in the meantime I'll send you the ble-provisioning.ino for now, thanks!

With the second issue of the P2 refusing to find wifi networks when it's retrying a connection, that happens consistently. In fact, even after I configure the wifi credentials the first time over BLE or through the CLI, the P2 will connect to the cloud but disconnect within 2 minutes and get stuck in a loop trying to connect. Once it's doing that I can't get scan for wifi networks.

0000012368 [comm.protocol] INFO: Received DESCRIBE request; flags: 0x04
0000125963 [ncp.rltk.client] INFO: disconnected
0000125965 [net.ifapi] INFO: Netif wl3 link DOWN, profile=NONE
0000125968 [net.rltkncp] ERROR: linkOutput up=1 link_up=0
0000125972 [system.nm] INFO: State changed: IP_CONFIGURED -> IFACE_LINK_UP
0000125975 [hal] INFO: DNS server list changed
0000125977 [hal] INFO: DNS server list changed
0000125980 [system.nm] INFO: State changed: IFACE_LINK_UP -> IFACE_UP
0000125999 [ncp.rltk.client] INFO: Try to connect to ssid: MoCAccino 5 GHz
0000126002 [net.lwip_rltk] INFO: is_promisc_enabled
0000126003 [net.lwip_rltk] INFO: get_eap_phase
0000127737 [system] ERROR: sock_send returned -1 118
0000127741 [comm.dtls] ERROR: mbedtls_ssl_write() failed: -0x4e
0000127748 [system] ERROR: sock_send returned -1 118
0000127753 [comm.dtls] ERROR: mbedtls_ssl_read() failed: -0x4e
0000127759 [comm.protocol] ERROR: Event loop error 34
0000127765 [system] WARN: Communication loop error, closing cloud socket
0000127771 [system] INFO: Cloud: disconnecting
0000127775 [system] INFO: Cloud: disconnected
0000130296 [ncp.rltk.client] INFO: disconnected
0000130797 [ncp.rltk.client] INFO: Try to connect to ssid: MoCAccino 5 GHz
0000130800 [net.lwip_rltk] INFO: is_promisc_enabled
0000130801 [net.lwip_rltk] INFO: get_eap_phase
0000135006 [ncp.rltk.client] INFO: disconnected

Edit: I wasn't seeing the first issue because I was compiling the Device OS with BLE_CHANNEL_DEBUG_ENABLED. As soon as I put another device on 5.7.0 without it I was seeing the issue again. Unfortunately that means I cant quite see what the device thinks it's sending though.
For the second issue, I think part of it may be because SYSTEM_THREAD(ENABLE); wasn't present in the demo code, so it was never my code as soon as the P2 was doing network tasks. I don't know why not having that line would also cause the device to disconnect from a network after 2 minutes, but it seems to help. Now it only has issues on the network on my office building.

Ok so I believe I've solved the initial issue.

This was the original code for onCharacteristicChanged()

@Override
public void onCharacteristicChanged(@NonNull BluetoothGatt gatt, @NonNull BluetoothGattCharacteristic characteristic) {
     super.onCharacteristicChanged(gatt, characteristic);

     runOnUiThread(() -> self.requestChannel.read(characteristic.getValue()));
}

I think the problem is that when runOnUiThread() get called, if we're not already in the UI thread it gets put onto an event queue. That can be an issue if the parameter characteristic.getValue() is passed because that will evaluate to the latest cached value, and not the value we received when onCharacteristicChanged() got called.

To test this, I changed the read() function to take two byte array parameters, the characteristic value now and the one that gets called by characteristic.getValue().

This is what the call looked like,

@Override
public void onCharacteristicChanged(@NonNull BluetoothGatt gatt, @NonNull BluetoothGattCharacteristic characteristic) {
     super.onCharacteristicChanged(gatt, characteristic);

     byte[] characteristicValue = characteristic.getValue();
     runOnUiThread(() -> requestChannel.read(characteristicValue, characteristic.getValue()));
}

Then after a while, I'd hit the state where those two values don't match. The characteristic.getValue() evaluates to the latest packet received and not the one that was received at the time the event was queued.

onCharacteristicChanged: 6e400023-b5a3-f393-e0a9-e50e24dcca9e
> 4a0141040fa9d0f933cbfadee6ca389376b084e299ccefa74c91b3d8229eb81aa4f9b03982fea4448981e0f3272cdbf79d029125d4834977a3802643367e1dc4138f388d410483046374f8bc9ce1e9bb7716318cbaf704db92749cbca77b5548678308ff2536662ce83cb391f2ae262e88a00bf8ebcbaf727c1691b598f0a572610d68a6ab1120117e4ea9a771aba0fc83e97e1232d0760b3b7c4326b941ea383133eef475c63341049c5d06bb81440e1cdca41214fff05e050cd1f586a9b0320ed0c2f85f8707c82d33e0c7312db03884f31d8e00925d84ce702e193dee6dcc60ffdee06a21ab4e8541048343b298496d433115
> read
onCharacteristicChanged: 6e400023-b5a3-f393-e0a9-e50e24dcca9e
> 8ab6e1606cf6907d611ff4102b0eb243c58c489080cb142a445c9fa317b14a3c4ca43ec80c41f5b6e545039dcc7718268ea5fcc56f93ac205cdd938327e6eaecccd2e8bbb06eb1b29a74787aa8e2d1bdce084f6f48c1fcd9a8000300174104074b3612008592f66d5e019d0977abb092af331e757eb1601b7ea56445c8dee9e1f357e5e9f6839a7ddaf70b79e14588ea9e0cf4aae8bf0fce2ab155d60004c04104717e35452e752b5f576866fa808fe890da7591352952953bc8a12a734927f27527ce9a5bbbe73ab7bbc4272c43a2caa62631202ee10eae2a7d2d9829a34fcff1209955f8adf29a4b082e4ea5c549d7db117f28
onCharacteristicChanged: 6e400023-b5a3-f393-e0a9-e50e24dcca9e
> c5b1a93dd1d331c5a48ea81bba19
CVALUE DOES NOT MATCH
> 8ab6e1606cf6907d611ff4102b0eb243c58c489080cb142a445c9fa317b14a3c4ca43ec80c41f5b6e545039dcc7718268ea5fcc56f93ac205cdd938327e6eaecccd2e8bbb06eb1b29a74787aa8e2d1bdce084f6f48c1fcd9a8000300174104074b3612008592f66d5e019d0977abb092af331e757eb1601b7ea56445c8dee9e1f357e5e9f6839a7ddaf70b79e14588ea9e0cf4aae8bf0fce2ab155d60004c04104717e35452e752b5f576866fa808fe890da7591352952953bc8a12a734927f27527ce9a5bbbe73ab7bbc4272c43a2caa62631202ee10eae2a7d2d9829a34fcff1209955f8adf29a4b082e4ea5c549d7db117f28
< c5b1a93dd1d331c5a48ea81bba19