I am working on a project where two Photon 2's connect via BLE.
I have removed the calls to the BLE APIs from BLE handlers per Particle's recommendation.
I have also created threads that keep up the BLE conectivity alive.
I am using message queues (os_queue) to send messages from BLE handlers to the loop() thread safely.
Using DeviceOS 5.9.0 on both devices.
Few things that are puzzling me:
A- Messages from the peripheral device seem to be duplicated ON the central device
Example:
The BLE connection is already established
the peripheral sends one message
the central receives it
the BLE handler on the central adds this message to an internal queue and prints so to the serial console
around 120msec after, the BLE handler on the central fires up again and adds THE EXACT SAME message to the internal queue
the loop() thread os_queue_take the message and prints so to the serial console
the loop() thread os_queue_take the dup message and prints so to the serial console
NOTE: I do not see in any logs the peripheral device sending the message twice
B- os_queue messages on the central device get delayed from time to time
Example:
The BLE connection is already established
the peripheral sends one message
the central receives it
the BLE handler on the central adds this message to an internal queue and prints so to the serial console
the loop() thread os_queue_take the message and prints so to the serial console BUT IT DOES THIS 2 SECONDS AFTER
NOTE: in normal situations I observe that the os_queue_take takes at most 150msec.
C- These problems go away when I COMPLETELY suppress the logs on the firmware
I noticed some improvements on the behaviour (eg: errors will occur more sporadically) when I was temporarily disconnecting the serial console from the central or the peripheral device.
In the case of B, can you tell how long it takes between application loops?
If you had something like
static unsigned long loopTimerStart2 = 0;
static unsigned long loopTimerEnd2 = 0;
unsigned long timeNow = millis();
if (timeNow - loopTimerStart2 > 100 || timeNow - loopTimerEnd2 > 100) {
Log.info("LOOP TOOK A WHILE: top->top[%ld] end->top[%ld]", timeNow - loopTimerStart2, timeNow - loopTimerEnd2);
}
loopTimerStart2 = timeNow;
near the top of your loop, and put
loopTimerEnd2 = millis();
at the bottom.
You might see that it actually took 2 seconds to get back to your loop. I've seen the system take a long time when doing things like connecting to the cloud, but maybe you're seeing that with some BLE calls too.
For A all I can think of is enabling a BLE debug build to see when it's actually transmitting BLE data, if that's not what you had already been logging. I had been using this but only with provisioning mode related data but I would assume it would print in your case too but I'm haven't looked.
Another possibility is that logging from the BLE receive handler thread is blocking. Basically, if another thread (user, system, or timer) is writing to the log at the time the BLE thread wakes, and the BLE thread logs, the logging will block until the other thread releases the logging lock. But at that point, it may block long enough that the BLE operation times out and retries. You could test by turning logging back on but removing the log messages from BLE handlers. This is just a vague guess, not a fully formed hypotheses.
Hey James, I went ahead with your suggestion and measured loop(). It sometimes takes up to 3 seconds, so you were into something.
What I cannot fit in my mind yet is the fact that completely deactivating the serial logs makes this issue go away.
Seems to be exactly what happens. It makes all the problems go away when I remove all logging from BLE handlers.