UDP Delays on Argon

Hi all, I’m working on a platform that depends on one central computer sending near-realtime commands to Particle boards to control small robots. We’re using UDP on a network shared by the computer and the Particles for this purpose. However, the Argons are having strange delays in receiving UDP packets that are not replicable when using the same code on Photons or similar code on other computers.

The Argons receive datagrams in “bursts” somehow. They’ll report that nothing is available on the socket for three or four datagrams in a row, then suddenly report that all are available at once.

In our tests, the Particle board would initiate a connection by sending a predefined “register” datagram so that the computer would know what address to send to. Then, the computer would begin sending 32-byte datagrams to the board at 100ms intervals. The Argon only became aware of new data to receive about every 300-500ms, reporting between 3x32 and 5x32 bytes available respectively. With the exact same code, the Photon however regularly received packets every 100ms at 32 bytes each. Our tests each only had one client at a time to minimize the effects of congestion.

Is there some way to cause the Argon to behave more like the Photon? We’re willing to accept the occasionally lost data in UDP for the sake of speed, but constant delays of three or more packets is unacceptable.

The code run on the Particles is below.

UDP udp;

#include <stdlib.h>
#include <string.h>

// Variables set by server
float pos, theta;

// Datatype for raw data from server
struct command {
  double r;
  double theta;
};

byte server[] = { 192, 168, 10, 187 };
unsigned int udpPort = 4321;

void serverListen(void) {
    // Create place to store data
    struct command c;
    
    // Check for data
    int count, timeNow = millis();
    Serial.println("Looking for packet");
    
    // Try to receive packet
    do {
        delay(1);
        count = udp.receivePacket((byte*)&c, sizeof(c));
    } while(count <= 0); // Loop until data is received
    
    Serial.printf("\nPacket received! %d\n\r", millis() - timeNow);
    
    // Parse info
    pos = c.r;
    theta = c.theta;
}

void setup()
{
    pinMode(D7, OUTPUT);
    
    while(!WiFi.ready()) delay(1);
    
    Serial.println("Beginning udp...");
    
    udp.begin(udpPort);
    
    char sendReg[12] = "register DE"; // this literal is 11 chars + /0
    char receiveBuff[32];
    
    // Try to register until acknowledged
    pinMode(D7, OUTPUT);
    do {
        // Flash led, slow the loop to 10Hz
        digitalWrite(D7, HIGH);
        delay(50);
        digitalWrite(D7, LOW);
        delay(50);
        
        udp.sendPacket(sendReg, 12, server, udpPort);
        
        Particle.process();
    } while(udp.receivePacket(receiveBuff, 32) < 0); // Loop until any reply is given
    
    Serial.println("Register acknowledged!");
}



void loop() {
    // While true to rule out possibility of lag time from delay between loop() calls
    while(true){
        serverListen();
        Serial.printf("R = %3.3f, T = %3.3f\n\r", pos, theta);
        Particle.process();
    }
}

Thanks!

I haven’t attempted anything like what you’re doing, but are you using SYSTEM_THREAD(ENABLED) ? Maybe that might help.

Thanks for the reply-- I just tried that (and added waitUntil(WiFi.ready) to the setup), but get the same behavior.

One other thing I noticed, is that you’re delaying one millisecond before every attempt to read an incoming packet, called successively in loop(). Refactoring your code into a Finite State Machine and using System.millis() to set states would remove the need for delay() and unblock other operations.

Thanks for the additional tip! It would make sense that accidentally blocking system functions would cause the different behavior on Argons and Photons given all their differences. I’ve done the following changes to switch to a threaded version and remove the delay(1). However, the same behavior still occurs.

SYSTEM_THREAD(ENABLED)

UDP udp;

#include <stdlib.h>
#include <string.h>

// Variables set by server
float pos, theta;

Thread thrd;

// Datatype for raw data from server
struct command {
  double r;
  double theta;
};

byte server[] = { 192, 168, 10, 187 };
unsigned int udpPort = 4321;

void serverListen(void) {
    while(true){
        // Create place to store data
        struct command c;
        
        // Check for data
        int count = 0, timeNow = millis();
        Serial.println("Looking for packet");
        
        // Try to receive packet
        do {
            if(micros() % 100 == 0) // Check about every millisecond
                count = udp.receivePacket((byte*)&c, sizeof(c));
        } while(count <= 0); // Loop until data is received
        
        Serial.printf("\nPacket received! %d\n\r", millis() - timeNow);
        
        // Parse info
        pos = c.r;
        theta = c.theta;
        Serial.printf("R = %3.3f, T = %3.3f\n\r", pos, theta);
    }
}

void setup()
{
    pinMode(D7, OUTPUT);
    
    waitUntil(WiFi.ready);
    
    Serial.println("Beginning udp...");
    
    udp.begin(udpPort);
    
    char sendReg[12] = "register DE"; // this literal is 11 chars + /0
    char receiveBuff[32];
    
    // Try to register until acknowledged
    pinMode(D7, OUTPUT);
    do {
        // Flash led, slow the loop to 10Hz
        digitalWrite(D7, HIGH);
        delay(50);
        digitalWrite(D7, LOW);
        delay(50);
        
        udp.sendPacket(sendReg, 12, server, udpPort);
    } while(udp.receivePacket(receiveBuff, 32) < 0); // Loop until any reply is given
    
    Serial.println("Register acknowledged!");
    
    // Start the listener thread
    thrd = Thread("udp", serverListen);
}

void loop() {
}

You never exit serverListen(). Why does it run as its own thread when loop() is doing nothing?

Just an oversight on my part. I do appreciate the constructive feedback, but this code is just for testing this issue-- we’ll be sure to put more thought in the final project when all parts are working together. I’m mostly just curious why this code works fine on a Photon but not the Argon. Even with that threaded code, the Photon works as expected.
At this point, could it still be likely that important Argon system calls are being blocked and causing the issue? As far as I understand it, they’ll occur as needed when system thread is enabled.

My question about why you had a separate thread was because (again) I don’t have direct experience with this, but might it be causing system tasks to be delayed? Just trying to be helpful offering my suggestions for things to consider. Grasping at straws to try to help you get your code working. :grinning:

@nhewitt99, I don’t have an answer for the latency but the Photon is fundamentally different from the Argon. The Photon uses a Broadcom WiFi system with a WICED stack running on the same processor as the DeviceOS. The Argon uses an ESP32 as the WiFi NPC connecting to the nRF52840 running the DeviceOS. The connection is done over serial with hardware flow control. So all the UDP work is being done on the ESP32. So there is bound to be some latency. It could be that the ESP32 is queueing data until it reaches a high mark and then sends the data to the nRF52840. Only the Particle folks can answer that one.

1 Like

@picsil Sorry if I came across as brief-- I appreciate the help :slight_smile: I tried removing the while loop inside the thread and instead calling it from loop() using Particle’s os scheduler to limit the rate, but had the same results.

Right now it seems like this is pointing to more of a hardware issue considering the multiple different approaches to the software that I’ve tried (I’ve also attempted using the Particle’s “buffered” UDP operations but got the same results). Thanks for the insight @peekay123, it sounds to me like that could be the root of this issue. Hopefully a Particle dev will clarify whether this behavior can be changed.

I ran two different tests, one using the Photon (running Device OS 1.1.1) as the UDP server (receives packets) and an the Argon (running Device OS 1.3.0-rc.1) as the UDP client (transmitting packets), and another in the reverse direction.

In addition to being on the same Wi-Fi network, the device’s D2 pins and GND were connected.

When the transmitter was about to UDP.sendPacket, it would asset D2 high, transmit, the drop D2.

The receiver monitors the time D2 was asserted, and the latency until the time the packet was received.

In my test, the packet was always 64 bytes, with the first 4 bytes being a sequence number, so I know if packets were lost.

Packets were sent every 100 milliseconds (10 per second, or 100 per 10 second bucket).

The client uses publish and subscribe to find the server address.

The server aggregates the data over 10 second periods and reports the minimum, mean, maximum latency, and also the count of packets and lost packets.

Server Code

#include "Particle.h"

SYSTEM_THREAD(ENABLED);

SerialLogHandler logHandler(LOG_LEVEL_INFO, {
    { "comm", LOG_LEVEL_ERROR } // Don't log comm INFO or TRACE messages
});

const int UDP_PORT = 5123;
const int SIGNAL_PIN = D2;
const unsigned long REPORT_PERIOD_MS = 10000;

void interruptHandler(void);
void subscribeHandler(const char *event, const char *data);

UDP udp;
bool wifiReady = false;
unsigned long signalTime = 0;
uint8_t buffer[512];
bool publishAddr = false;
uint32_t lastCounter = 0;
unsigned long minLatency = ULONG_MAX;
unsigned long maxLatency = 0;
unsigned long totalLatency = 0;
size_t samples = 0;
size_t lost = 0;
size_t maxPerLoop = 0;
unsigned long lastReport = 0;

void setup() {
	pinMode(SIGNAL_PIN, INPUT_PULLDOWN);
	attachInterrupt(SIGNAL_PIN, interruptHandler, RISING);

	Particle.subscribe("udpLatencyReq", subscribeHandler, MY_DEVICES);
}

void loop() {

	if (WiFi.ready()) {
		if (!wifiReady) {
			wifiReady = true;
			udp.begin(UDP_PORT);
		}

		for(size_t numPerLoop = 1; ; numPerLoop++) {
			int count = udp.receivePacket(buffer, sizeof(buffer));
			if (count <= 0) {
				break;
			}

			if (signalTime) {
				unsigned long latency = millis() - signalTime;

				uint32_t counter = *(uint32_t *)buffer;

				// Log.info("received packet latency=%u ms, size=%d, counter=%lu", latency, count, counter);

				if (lastCounter != 0) {
					lost += (counter - lastCounter - 1);
				}
				lastCounter = counter;

				if (latency < minLatency) {
					minLatency = latency;
				}
				if (latency > maxLatency) {
					maxLatency = latency;
				}

				totalLatency += latency;
				samples++;

				if (numPerLoop > maxPerLoop) {
					maxPerLoop = numPerLoop;
				}
			}
			else {
				Log.info("received packet, no timing information available");
			}
		}

		if (millis() - lastReport >= REPORT_PERIOD_MS) {
			lastReport = millis();

			if (samples) {
				Log.info("latency min=%lu mean=%lu max=%lu samples=%u lost=%u maxPerLoop=%u", minLatency, (totalLatency / samples), maxLatency, samples, lost, maxPerLoop);
			}

			// Optionally comment this block out to make the results cumulative instead of bucketing for each report period
			minLatency = ULONG_MAX;
			maxLatency = 0;
			totalLatency = 0;
			samples = 0;
			lost = 0;
			maxPerLoop = 0;
		}

		if (publishAddr) {
			publishAddr = false;

			char buf[128];
			snprintf(buf, sizeof(buf), "%s:%d", WiFi.localIP().toString().c_str(), UDP_PORT);

			Particle.publish("udpLatencyResp", buf, PRIVATE);
			Log.info("sending address %s", buf);
		}
	}
	else {
		if (wifiReady) {
			wifiReady = false;
		}

	}
}


void interruptHandler(void) {
	signalTime = millis();
}

void subscribeHandler(const char *event, const char *data) {
	publishAddr = true;
}

Client Code

#include "Particle.h"

SYSTEM_THREAD(ENABLED);

SerialLogHandler logHandler(LOG_LEVEL_INFO, {
    { "comm", LOG_LEVEL_ERROR } // Don't log comm INFO or TRACE messages
});


const int SIGNAL_PIN = D2;
const unsigned long SEND_PERIOD_MS = 100;
const size_t SEND_SIZE = 64; // Must be at least 4

void subscribeHandler(const char *event, const char *data);

UDP udp;
bool wifiReady = false;
uint32_t counter = 0;
uint8_t buffer[512];
unsigned long lastSend = 0;
IPAddress serverAddr;
int serverPort = 0;
unsigned lastServerRequest = 0;

void setup() {
	pinMode(SIGNAL_PIN, OUTPUT);
	digitalWrite(SIGNAL_PIN, LOW);

	Particle.subscribe("udpLatencyResp", subscribeHandler, MY_DEVICES);
}

void loop() {

	if (WiFi.ready()) {
		if (!wifiReady) {
			wifiReady = true;
			udp.begin(0);
		}

		if (!serverAddr) {
			// Request server address by publish/subscribe
			if (millis() - lastServerRequest > 2000) {
				lastServerRequest = millis();
				Particle.publish("udpLatencyReq", "", PRIVATE);
			}
			return;
		}

		if (millis() - lastSend >= SEND_PERIOD_MS) {
			lastSend = millis();

			counter++;

			Log.info("sending counter=%lu", counter);

			digitalWrite(SIGNAL_PIN, HIGH);

			memcpy(buffer, (uint8_t *)&counter, sizeof(counter));

			udp.sendPacket(buffer, SEND_SIZE, serverAddr, serverPort);

			digitalWrite(SIGNAL_PIN, LOW);
		}
	}
	else {
		if (wifiReady) {
			wifiReady = false;
		}

	}
}

void subscribeHandler(const char *event, const char *data) {
	int a[4];

	if (sscanf(data, "%d.%d.%d.%d:%d", &a[0], &a[1], &a[2], &a[3], &serverPort) == 5) {
		serverAddr = IPAddress(a[0], a[1], a[2], a[3]);
		Log.info("server address set to %s:%d", serverAddr.toString().c_str(), serverPort);
	}
	else {
		Log.info("unknown server request %s", data);
	}
}

Argon Client (transmitting), Photon Server (receiving)

This looks good.

0000140000 [app] INFO: latency min=6 mean=10 max=23 samples=100 lost=0 maxPerLoop=1
0000150000 [app] INFO: latency min=6 mean=11 max=89 samples=100 lost=0 maxPerLoop=1
0000160000 [app] INFO: latency min=7 mean=10 max=86 samples=100 lost=0 maxPerLoop=1
0000170000 [app] INFO: latency min=6 mean=11 max=49 samples=97 lost=3 maxPerLoop=1
0000180000 [app] INFO: latency min=6 mean=7 max=17 samples=99 lost=1 maxPerLoop=1
0000190000 [app] INFO: latency min=5 mean=8 max=74 samples=100 lost=0 maxPerLoop=1
0000200000 [app] INFO: latency min=6 mean=8 max=18 samples=100 lost=0 maxPerLoop=1
0000210000 [app] INFO: latency min=7 mean=8 max=34 samples=100 lost=0 maxPerLoop=1
0000220000 [app] INFO: latency min=6 mean=9 max=73 samples=100 lost=0 maxPerLoop=1
0000230000 [app] INFO: latency min=4 mean=7 max=38 samples=100 lost=0 maxPerLoop=1
0000240000 [app] INFO: latency min=6 mean=8 max=39 samples=100 lost=0 maxPerLoop=1
0000250000 [app] INFO: latency min=4 mean=8 max=18 samples=100 lost=0 maxPerLoop=1
0000260000 [app] INFO: latency min=5 mean=7 max=14 samples=100 lost=0 maxPerLoop=1
0000270000 [app] INFO: latency min=6 mean=8 max=40 samples=100 lost=0 maxPerLoop=1
0000280000 [app] INFO: latency min=4 mean=7 max=15 samples=100 lost=0 maxPerLoop=1
0000290000 [app] INFO: latency min=4 mean=9 max=66 samples=98 lost=0 maxPerLoop=1
0000300000 [app] INFO: latency min=4 mean=10 max=96 samples=102 lost=0 maxPerLoop=1
0000310000 [app] INFO: latency min=7 mean=8 max=18 samples=97 lost=3 maxPerLoop=1
0000320000 [app] INFO: latency min=5 mean=9 max=59 samples=100 lost=0 maxPerLoop=1
0000330000 [app] INFO: latency min=5 mean=7 max=12 samples=100 lost=0 maxPerLoop=1
0000340000 [app] INFO: latency min=5 mean=7 max=16 samples=100 lost=0 maxPerLoop=1
0000350000 [app] INFO: latency min=7 mean=8 max=37 samples=100 lost=0 maxPerLoop=1

Photon client (transmitting), Argon Server (receiving)

In this direction, the latency is much higher. Also, it may be higher than the log would indicate.

The maxPerLoop is the maximum number of packets we were able to read at once. Since we only transmit packets every 100 milliseconds, in theory this should always be 1, unless loop is delayed, or packets are queued before delivery.

In this direction, maxPerLoop is no longer 1, which means packets are getting bunched up.

Since the toggling of D2 can’t account for which packet time is being recorded, it could actually be greater than 100 milliseconds but it just can’t be measured this way.

0000020000 [app] INFO: latency min=26 mean=46 max=77 samples=100 lost=0 maxPerLoop=2
0000030000 [app] INFO: latency min=27 mean=45 max=81 samples=100 lost=0 maxPerLoop=2
0000040000 [app] INFO: latency min=31 mean=44 max=82 samples=98 lost=2 maxPerLoop=2
0000050000 [app] INFO: latency min=32 mean=52 max=86 samples=99 lost=0 maxPerLoop=2
0000060000 [app] INFO: latency min=38 mean=54 max=89 samples=101 lost=0 maxPerLoop=2
0000070000 [app] INFO: latency min=39 mean=58 max=100 samples=99 lost=0 maxPerLoop=4
0000080000 [app] INFO: latency min=1 mean=51 max=100 samples=100 lost=0 maxPerLoop=2
0000090000 [app] INFO: latency min=1 mean=37 max=54 samples=100 lost=0 maxPerLoop=2
0000100000 [app] INFO: latency min=4 mean=31 max=57 samples=97 lost=3 maxPerLoop=3
0000110000 [app] INFO: latency min=7 mean=32 max=68 samples=99 lost=1 maxPerLoop=3
0000120000 [app] INFO: latency min=20 mean=46 max=76 samples=100 lost=0 maxPerLoop=3
0000130000 [app] INFO: latency min=26 mean=48 max=78 samples=101 lost=0 maxPerLoop=4
0000140000 [app] INFO: latency min=28 mean=45 max=78 samples=99 lost=0 maxPerLoop=4
0000150000 [app] INFO: latency min=30 mean=48 max=86 samples=101 lost=0 maxPerLoop=2
0000160000 [app] INFO: latency min=36 mean=56 max=88 samples=99 lost=0 maxPerLoop=2
0000170000 [app] INFO: latency min=38 mean=57 max=91 samples=101 lost=0 maxPerLoop=2
0000180000 [app] INFO: latency min=41 mean=54 max=91 samples=100 lost=0 maxPerLoop=2
0000190000 [app] INFO: latency min=41 mean=54 max=92 samples=99 lost=0 maxPerLoop=2
0000200000 [app] INFO: latency min=42 mean=61 max=98 samples=100 lost=0 maxPerLoop=2
0000210000 [app] INFO: latency min=48 mean=64 max=100 samples=100 lost=0 maxPerLoop=4
0000220000 [app] INFO: latency min=50 mean=62 max=100 samples=99 lost=1 maxPerLoop=3
0000230000 [app] INFO: latency min=1 mean=37 max=53 samples=100 lost=0 maxPerLoop=2
0000240000 [app] INFO: latency min=3 mean=38 max=58 samples=101 lost=0 maxPerLoop=5
0000250000 [app] INFO: latency min=8 mean=31 max=62 samples=99 lost=0 maxPerLoop=3

Photon client (transmitting), Argon Server (receiving) (1 second)

To find the actual latency I decreased the transmission speed in the client to 1 per second. From this, I can now see that the mean latency is sometimes > 100 milliseconds, and reached as high as 252 milliseconds.

This would explain the bunching up of the packets where maxPerLoop was > 1 in the previous test.

0000047480 [app] INFO: latency min=43 mean=94 max=145 samples=10 lost=0 maxPerLoop=1
0000057480 [app] INFO: latency min=47 mean=98 max=149 samples=10 lost=0 maxPerLoop=1
0000067480 [app] INFO: latency min=48 mean=88 max=148 samples=10 lost=0 maxPerLoop=1
0000077480 [app] INFO: latency min=50 mean=90 max=151 samples=10 lost=0 maxPerLoop=1
0000087480 [app] INFO: latency min=51 mean=111 max=252 samples=10 lost=0 maxPerLoop=1
0000097480 [app] INFO: latency min=53 mean=83 max=153 samples=10 lost=0 maxPerLoop=1
0000107480 [app] INFO: latency min=53 mean=84 max=106 samples=10 lost=0 maxPerLoop=1
0000117480 [app] INFO: latency min=56 mean=91 max=156 samples=10 lost=0 maxPerLoop=1
0000127480 [app] INFO: latency min=7 mean=82 max=156 samples=10 lost=0 maxPerLoop=1
0000137480 [app] INFO: latency min=59 mean=89 max=159 samples=10 lost=0 maxPerLoop=1
0000147480 [app] INFO: latency min=59 mean=109 max=160 samples=10 lost=0 maxPerLoop=1
0000157480 [app] INFO: latency min=60 mean=105 max=210 samples=10 lost=0 maxPerLoop=1
0000167480 [app] INFO: latency min=60 mean=86 max=113 samples=10 lost=0 maxPerLoop=1
0000177480 [app] INFO: latency min=64 mean=103 max=214 samples=10 lost=0 maxPerLoop=1
0000187480 [app] INFO: latency min=64 mean=94 max=116 samples=10 lost=0 maxPerLoop=1
0000197480 [app] INFO: latency min=22 mean=82 max=121 samples=10 lost=0 maxPerLoop=1
0000207480 [app] INFO: latency min=72 mean=93 max=123 samples=10 lost=0 maxPerLoop=1
0000217480 [app] INFO: latency min=28 mean=91 max=127 samples=10 lost=0 maxPerLoop=1
0000227480 [app] INFO: latency min=27 mean=93 max=132 samples=10 lost=0 maxPerLoop=1
0000237480 [app] INFO: latency min=32 mean=92 max=133 samples=10 lost=0 maxPerLoop=1
0000247480 [app] INFO: latency min=34 mean=93 max=133 samples=10 lost=0 maxPerLoop=1
0000257480 [app] INFO: latency min=35 mean=85 max=135 samples=10 lost=0 maxPerLoop=1

Summary

It does appear that when receiving UDP the Argon does have much higher latency that the Photon does. I do not know why at this time.

5 Likes

Thanks for looking into this and confirming the issue-- I’m glad that the issue is being investigated and that I’m not just crazy :wink:

@rickkas7 sorry to bother you, just wondering if there are any more updates on this. Should I submit an issue on Github?

I’m wondering why you are using UDP (UNRELIABLE? Data Protocol) instead of TCP which offers a guaranteed connection.

Dave

UDP is excellent for low-latency, especially over wifi. If latency is the goal and UDP is already too high, I wouldn’t expect TCP to be any better.

I tested UDP for my application a few months ago and it was unusable. Had to switch back to the photon. I’d love to make use of BLE for low-energy robot triangulation but it’s hard to move a robot around with bunching of commands or latency > 500ms.

3 Likes

I agree with the time lag problem for robotics. For the photon over TCP I had to make a web socket, unfortunately my code using node "ws" is now deprecated github here. (The github site is not very user friendly, and has some machine learning as well, sorry about that)

UDP seems very fast without threading, a time lag does seem to randomly show up with SYSTEM_THREAD(ENABLED). Unfortunately for me, for other reasons I need SYSTEM_THREAD(ENABLED)

Presently I am just trying to understand what mesh.publish() actually sends, what port it works on, how it sends the data. I can't seem to get netcat to work, although I think everything is encrypted so netcat might not be very helpful.