UDP receivePacket() appears to be throttled to 1/second

Hello there,

First a bit of context: I am trying to port a lightweight version of a Lab Streaming Layer (LSL docs) outlet to run on an Argon. For time synchronization, the LSL protocol sends a rapid stream of 8 short packets, to each of which my device appends its own local time of packet arrival and time of return packet submission before returning the packet to the sender. (The Argon’s behavior is encapsulated in a function called handleTimeSync() which I have verified is being called approx. every ~1ms. The device OS is 6.2.1). I am monitoring the UDP traffic on this port externally using Wireshark as well as using Log.info() messages.

The results show:

  • The 8 short packets are transmitted to my UDP port in a burst of duration 454 ms

  • It takes 1028 ms before my line of code: len = serviceUDP.receivePacket(buffer, sizeof(buffer) - 1); results in len!=0 thereby detecting the arrival of the first of these packets. This is remarkably slow!

  • It takes a further 1007 ms before we detect the arrival of the second packet

  • This pattern continues, with the udp receivePacket() function appearing to have its response throttled to an approx. 1/second rate.

  • The packets are not being dropped since udp receivePacket() eventually discovers them and we can respond, but the delay means that the intended timesync has timed out and the protocol fails.

I have read other posts about UDP problems, but nothing that seems to explain this behavior. Any help in discovering why udp receivePacket() function appears to have its response throttled to an approx. 1/second rate and how we can fix this would be much appreciated.

Argon UDP is slow, but it should not be that slow. I would look to something else blocking your loop and preventing it from running more frequently. This could include things that disable interrupts or using SINGLE_THREADED_BLOCK which will affect the system thread. It should be around 80 ms normally but occasionally as high as 151 ms.

That being said, the Argon is a poor choice if you need high speed or low latency UDP or TCP. The Wi-Fi is implemented on the ESP32, which is running esp-at firmware. The main Argon MCU (nRF52840) communicates over serial to esp-at. That adds a lot of latency.

The Argon is the only device implemented that way; the Photon 2, M-SoM, and Photon 1 all implement the Wi-Fi TCP/IP stack on the main MCU.

0000055391 [app] INFO: latency min=3 mean=59 max=115 samples=10 lost=0 maxPerLoop=1
0000065391 [app] INFO: latency min=30 mean=82 max=120 samples=10 lost=0 maxPerLoop=1
0000075391 [app] INFO: latency min=21 mean=69 max=117 samples=10 lost=0 maxPerLoop=1
0000085391 [app] INFO: latency min=48 mean=78 max=120 samples=10 lost=0 maxPerLoop=1
0000095391 [app] INFO: latency min=25 mean=77 max=119 samples=10 lost=0 maxPerLoop=1
0000105391 [app] INFO: latency min=35 mean=89 max=131 samples=10 lost=0 maxPerLoop=1
0000115391 [app] INFO: latency min=35 mean=73 max=118 samples=10 lost=0 maxPerLoop=1
0000125391 [app] INFO: latency min=48 mean=87 max=154 samples=10 lost=0 maxPerLoop=1
0000135391 [app] INFO: latency min=38 mean=77 max=121 samples=10 lost=0 maxPerLoop=1
0000145391 [app] INFO: latency min=34 mean=80 max=151 samples=10 lost=0 maxPerLoop=1

Actually, the Photon 2 isn't that much lower latency. I assume this has something to do with the need to swap between the system and user threads. There is probably also some effect from the slow GPIO on the Photon 2 which affects the numbers.

0000030810 [app] INFO: latency min=2 mean=61 max=120 samples=10 lost=0 maxPerLoop=1
0000040810 [app] INFO: latency min=30 mean=69 max=117 samples=10 lost=0 maxPerLoop=1
0000050810 [app] INFO: latency min=3 mean=67 max=119 samples=10 lost=0 maxPerLoop=1
0000060810 [app] INFO: latency min=55 mean=99 max=138 samples=10 lost=0 maxPerLoop=1
0000070810 [app] INFO: latency min=3 mean=54 max=121 samples=10 lost=0 maxPerLoop=1
0000080810 [app] INFO: latency min=21 mean=67 max=107 samples=10 lost=0 maxPerLoop=1
0000090810 [app] INFO: latency min=26 mean=72 max=117 samples=10 lost=0 maxPerLoop=1
0000100810 [app] INFO: latency min=2 mean=55 max=115 samples=10 lost=0 maxPerLoop=1
0000110810 [app] INFO: latency min=25 mean=71 max=109 samples=10 lost=0 maxPerLoop=1

How the test works

The test use a Photon 1 (which has the fastest Wi-Fi of any device) as the UDP sender, and either the Argon or Photon 2 as the receiver. The receiver ran Device OS 6.3.4.

In addition to being on the same Wi-Fi network, the device's D2 pins and GND were connected.

When the transmitter was about to UDP.sendPacket, it would asset D2 high, transmit, then drop D2.

The receiver monitors the time D2 was asserted, and the latency until the time the packet was received.

In my test, the packet was always 64 bytes, with the first 4 bytes being a sequence number, so I know if packets were lost.

Packets were sent every 1 second.

The client uses publish and subscribe to find the server address.

The server aggregates the data over 10 second periods and reports the minimum, mean, maximum latency, and also the count of packets and lost packets.

Receiver Code

This is run on the Argon or Photon 2.

#include "Particle.h"

SYSTEM_THREAD(ENABLED);

SerialLogHandler logHandler(LOG_LEVEL_INFO, {
    { "comm", LOG_LEVEL_ERROR } // Don't log comm INFO or TRACE messages
});

const int UDP_PORT = 5123;
const int SIGNAL_PIN = D2;
const unsigned long REPORT_PERIOD_MS = 10000;

void interruptHandler(void);
void subscribeHandler(const char *event, const char *data);

UDP udp;
bool wifiReady = false;
unsigned long signalTime = 0;
uint8_t buffer[512];
bool publishAddr = false;
uint32_t lastCounter = 0;
unsigned long minLatency = ULONG_MAX;
unsigned long maxLatency = 0;
unsigned long totalLatency = 0;
size_t samples = 0;
size_t lost = 0;
size_t maxPerLoop = 0;
unsigned long lastReport = 0;

void setup() {
	pinMode(SIGNAL_PIN, INPUT_PULLDOWN);
	attachInterrupt(SIGNAL_PIN, interruptHandler, RISING);

	Particle.subscribe("udpLatencyReq", subscribeHandler, MY_DEVICES);
}

void loop() {

	if (WiFi.ready()) {
		if (!wifiReady) {
			wifiReady = true;
			udp.begin(UDP_PORT);
		}

		for(size_t numPerLoop = 1; ; numPerLoop++) {
			int count = udp.receivePacket(buffer, sizeof(buffer));
			if (count <= 0) {
				break;
			}

			if (signalTime) {
				unsigned long latency = millis() - signalTime;

				uint32_t counter = *(uint32_t *)buffer;

				// Log.info("received packet latency=%u ms, size=%d, counter=%lu", latency, count, counter);

				if (lastCounter != 0) {
					lost += (counter - lastCounter - 1);
				}
				lastCounter = counter;

				if (latency < minLatency) {
					minLatency = latency;
				}
				if (latency > maxLatency) {
					maxLatency = latency;
				}

				totalLatency += latency;
				samples++;

				if (numPerLoop > maxPerLoop) {
					maxPerLoop = numPerLoop;
				}
			}
			else {
				Log.info("received packet, no timing information available");
			}
		}

		if (millis() - lastReport >= REPORT_PERIOD_MS) {
			lastReport = millis();

			if (samples) {
				Log.info("latency min=%lu mean=%lu max=%lu samples=%u lost=%u maxPerLoop=%u", minLatency, (totalLatency / samples), maxLatency, samples, lost, maxPerLoop);
			}

			// Optionally comment this block out to make the results cumulative instead of bucketing for each report period
			minLatency = ULONG_MAX;
			maxLatency = 0;
			totalLatency = 0;
			samples = 0;
			lost = 0;
			maxPerLoop = 0;
		}

		if (publishAddr) {
			publishAddr = false;

			char buf[128];
			snprintf(buf, sizeof(buf), "%s:%d", WiFi.localIP().toString().c_str(), UDP_PORT);

			Particle.publish("udpLatencyResp", buf, PRIVATE);
			Log.info("sending address %s", buf);
		}
	}
	else {
		if (wifiReady) {
			wifiReady = false;
		}

	}
}


void interruptHandler(void) {
	signalTime = millis();
}

void subscribeHandler(const char *event, const char *data) {
	publishAddr = true;
}

Sender Code

#include "Particle.h"

SYSTEM_THREAD(ENABLED);

SerialLogHandler logHandler(LOG_LEVEL_INFO, {
    { "comm", LOG_LEVEL_ERROR } // Don't log comm INFO or TRACE messages
});


const int SIGNAL_PIN = D2;
const unsigned long SEND_PERIOD_MS = 100;
const size_t SEND_SIZE = 64; // Must be at least 4

void subscribeHandler(const char *event, const char *data);

UDP udp;
bool wifiReady = false;
uint32_t counter = 0;
uint8_t buffer[512];
unsigned long lastSend = 0;
IPAddress serverAddr;
int serverPort = 0;
unsigned lastServerRequest = 0;

void setup() {
	pinMode(SIGNAL_PIN, OUTPUT);
	digitalWrite(SIGNAL_PIN, LOW);

	Particle.subscribe("udpLatencyResp", subscribeHandler, MY_DEVICES);
}

void loop() {

	if (WiFi.ready()) {
		if (!wifiReady) {
			wifiReady = true;
			udp.begin(0);
		}

		if (!serverAddr) {
			// Request server address by publish/subscribe
			if (millis() - lastServerRequest > 2000) {
				lastServerRequest = millis();
				Particle.publish("udpLatencyReq", "", PRIVATE);
			}
			return;
		}

		if (millis() - lastSend >= SEND_PERIOD_MS) {
			lastSend = millis();

			counter++;

			Log.info("sending counter=%lu", counter);

			digitalWrite(SIGNAL_PIN, HIGH);

			memcpy(buffer, (uint8_t *)&counter, sizeof(counter));

			udp.sendPacket(buffer, SEND_SIZE, serverAddr, serverPort);

			digitalWrite(SIGNAL_PIN, LOW);
		}
	}
	else {
		if (wifiReady) {
			wifiReady = false;
		}

	}
}

void subscribeHandler(const char *event, const char *data) {
	int a[4];

	if (sscanf(data, "%d.%d.%d.%d:%d", &a[0], &a[1], &a[2], &a[3], &serverPort) == 5) {
		serverAddr = IPAddress(a[0], a[1], a[2], a[3]);
		Log.info("server address set to %s:%d", serverAddr.toString().c_str(), serverPort);
	}
	else {
		Log.info("unknown server request %s", data);
	}
}

2 Likes

Thankyou so much for the quick and helpful reply. It proves conclusively that the problem is in my code, and yes, after further investigation I discovered my handleTimeSync() was indeed being called every ~1ms initially, but at a later point another section of code entirely was blocking till it timed out. Sorry for wasting your time, but thanks for the tips re photon2 vs argon and latency data, all of which will be useful as this project evolves. Anyway, the good news is I now have lightweight LSL running on a Particle device, and happy to share if anyone else needs to use that protocol.

1 Like