BLE. discoverAllCharacteristics() hangs

ble
argon
Tags: #<Tag:0x00007fe21fdcbb68> #<Tag:0x00007fe21fdcb8e8>

#1

Hi all,

I’m trying to get my Argon speaking to a Lego BLE Hub. I can discover the device and call connect(), however I found this would hang and then cause an assertion error (10 red flashes). After investigating the automatic flag I tried setting it to false and then the call to connect() succeeded, but I can’t get the characteristics by calling getCharacteristicByUUID(). Trying to call discoverAllCharacteristics() directly also hangs. Here is what I got from the trace logs:

0000016440 [system] ERROR: Failed to load session data from persistent storage
0000053237 [app] INFO: Pre connect
0000053345 [wiring.ble] TRACE: New peripheral is connected.
0000053346 [wiring.ble] TRACE: Start discovering services.
0000053885 [wiring.ble] TRACE: Start discovering characteristics.

I’ll be looking through the device-os source code to see if I can spot the issue, but if anyone has any advice it would be much appreciated!

Just to add - this is on firmware 1.4.2.

Thanks

Mark


#2

Hi,

I’m still looking in to this issue and now have a particle debugger so I managed to get trace logs out of the hal.ble logger:

Serial monitor opened successfully:
0002026836 [hal.ble] TRACE: hal_ble_gap_start_scan().
0002026837 [hal.ble] TRACE: | interval(ms)   window(ms)   timeout(ms) |
0002026838 [hal.ble] TRACE:   160*0.625        80*0.625      5000
0002031792 [hal.ble] TRACE: hal_ble_gap_start_scan().
0002031793 [hal.ble] TRACE: | interval(ms)   window(ms)   timeout(ms) |
0002031794 [hal.ble] TRACE:   160*0.625        80*0.625      5000
0002036749 [hal.ble] TRACE: hal_ble_gap_start_scan().
0002036750 [hal.ble] TRACE: | interval(ms)   window(ms)   timeout(ms) |
0002036751 [hal.ble] TRACE:   160*0.625        80*0.625      5000
0002041610 [ot] INFO: -MLE-----: Send Advertisement (ff02:0:0:0:0:0:0:1)
0002041706 [app] INFO: Pre connect
0002041707 [hal.ble] TRACE: hal_ble_gap_connect().
0002041836 [hal.ble] TRACE: BLE role: 2, connection handle: 0
0002041837 [hal.ble] TRACE: | interval(ms)  latency  timeout(ms) |
0002041838 [hal.ble] TRACE:   36*1.25          0       500*10
0002041839 [hal.ble] TRACE: Attempts to exchange ATT_MTU if needed.
0002041840 [wiring.ble] TRACE: New peripheral is connected.
0002041841 [app] INFO: Post connect
0002041842 [app] INFO: Getting services
0002041843 [wiring.ble] TRACE: Start discovering services.
0002041844 [hal.ble] TRACE: hal_ble_gatt_client_discover_all_services().
0002042376 [app] INFO: Found 3 services
0002042377 [wiring.ble] TRACE: Start discovering characteristics.
0002042378 [hal.ble] TRACE: hal_ble_gatt_client_discover_characteristics().
0002042641 [hal.ble] TRACE: Request to change ATT_MTU from 23 to 247
0002042642 [hal.ble] TRACE: sd_ble_gattc_exchange_mtu_request() failed: 17
0002042646 [hal.ble] TRACE: hal_ble_gatt_client_discover_characteristics().
0002043141 [hal.ble] TRACE: hal_ble_gatt_client_discover_characteristics().
0002073346 [hal.ble] PANIC: AssertionFailure false^CSerial connection closed.

I’ll be looking in to the source code again now I have more details, but if anyone has any advice or is seeing a similar issue then please let me know.

Thanks

Mark


#3

Are you by any chance calling that from a BLE.scan() callback function?
I had a similar issue when doing that. Moving BLE.connect() and anything thereafter out of the callback and running that from loop() did solve the issue for me.

Can you post your code?
What are the characteristics UUIDs you are intending to use?


#4

Hi @ScruffR,

Thanks for your reply. No, I’m using the non-callback version of scan. Here is my code:

const size_t SCAN_RESULT_MAX = 30;
BleScanResult scanResults[SCAN_RESULT_MAX];
BleUuid foundServiceUuids[SCAN_RESULT_MAX];
BleUuid serviceUuid("00001623-1212-efde-1623-785feabcd123");
BleUuid characteristicUuid("00001624-1212-efde-1623-785feabcd123");
const size_t CHAR_RESULT_MAX = 1;
BleCharacteristic characteristic[CHAR_RESULT_MAX];
BlePeerDevice peer;

SerialLogHandler logHandler(LOG_LEVEL_INFO, {
  { "app", LOG_LEVEL_INFO },
  { "wiring.ble", LOG_LEVEL_TRACE },
  { "hal.ble", LOG_LEVEL_TRACE }
});

void onDataReceived(const uint8_t* data, size_t len, const BlePeerDevice& peer, void* context) {
  Log.info("Received: %d", len);
}

void setup() {
}

void loop() {
  if (!BLE.connected()) {
    int count = BLE.scan(scanResults, SCAN_RESULT_MAX);

    for (int idx = 0; idx < count; idx++) {
      int count2 = scanResults[idx].advertisingData.serviceUUID(foundServiceUuids, SCAN_RESULT_MAX);
      
      for (int idx2 = 0; idx2 < count2; idx2++) {
        if (serviceUuid == foundServiceUuids[idx2]) {
          Log.info("Pre connect");

          peer = BLE.connect(scanResults[idx].address, false);
          
          Log.info("Post connect");
          
          if (peer.connected()) {
            const Vector<BleService> services = peer.discoverAllServices();

            Log.info("Found %d services", services.size());
            
            const ssize_t count = peer.discoverAllCharacteristics(characteristic, CHAR_RESULT_MAX);

            Log.info("Found %d characteristics", count);
          }
        }
      }
    }
  } else {
    delay(5000);
  }
}

I’m trying to connect to a lego Control+ hub. I can successfully connect to this characteristic using command line tools on a raspberry pi. (UUIDs are included in the code above).

The panic occurs either if I set automatic to true or if I try to discover characteristics manually (service discovery appears to complete successfully and returns good data).

Thanks for your help

Mark


#5

Have you tried the vector version of discoverAllCharacteristics()?
Or have you tried to increase CHAR_RESULT_MAX?

In my code I also use the array version of discoverAllCharacteristics() but with a slightly oversized array (for “superstition” :blush:).


#6

Thanks @ScruffR - I tried the vector version and with CHAR_RESULT_MAX set to 100 and neither solved the issue.

If it gives any further clues the device panics almost exactly 30 seconds after calling discoverAllCharacteristics().


#7

Just a quick update - I’ve got further with my debugger and it appears that the Argon is panicking because it can’t acquire the BLE discovery semaphore (line 2721 in hal/src/nRF52840/ble_hal.cpp):

    CHECK_NRF_RETURN(ret, nrf_system_error(ret));
    isDiscovering_ = true;
    if (os_semaphore_take(discoverySemaphore_, BLE_OPERATION_TIMEOUT_MS, false)) {
        SPARK_ASSERT(false);
        return SYSTEM_ERROR_TIMEOUT;
    }
    return SYSTEM_ERROR_NONE;

I’ll continue trying to figure out why this call to os_semaphore_take is timing out.

Mark


#8

Hi,

So I’m not certain, but what appears to be happening is that the discoverySemaphore_ is never “given” after being taken. As the semaphore is initialised with a count of 1 then os_semaphore_take can only be called once before any further calls to os_semaphore_take will block.

Is there a call to os_semaphore_give missing in resetDiscoveryState? (I’m following the pattern that we take when setting discovering_ to true and give when setting discovering_ to false.)

Thanks

Mark


#9

That’s an interesting finde. Maybe @avtolstoy can comment on that.

But it seems to be something specific to the targeted device since I cannot reproduce that behaviour with a mock-up device exposing the UUIDs you got there in your code.


#10

Hi @ScruffR and @avtolstoy,

I added the os_semaphore_give call in to the resetDiscoveryState() method and my Argon no longer hangs, so it certainly seems like this might be a valid fix? I can’t see anywhere in the code where that discovery semaphore is released, so I’m almost certain something isn’t right …

Thanks

Mark