TCPServer misses 40-50% of packets from client

I’m trying to use the Photon as a control device for a serial board. The serial communication works great, but I’d also like the Photon to act as a TCP server that I can send bytes to. I’ve simplified my code significantly to track down the issue, but when I send data over TCP, 40-50% of the messages never show up. I am sending data from the command line on a networked computer (a MacBook) with netcat:

echo -e "\xaa\x05\xfe\x93\x00\x00\x01\x41" | nc 192.168.1.17 6000

My application on the Photon is very short. I have spent a lot of time reading other posts on the community hoping to find something I’m missing, but to no avail.

//Variables for TCP communication
int port = 6000;
TCPServer server = TCPServer(port);
TCPClient client;

SYSTEM_MODE(SEMI_AUTOMATIC);

void setup() {
    WiFi.connect();
    while(!WiFi.ready());
    delay(1000);
    server.begin();
}

void loop() {
    //check to see if TCP client is connected
    if (client.connected()) {
        Serial.println("Client Connected");
        //Client has sent data
        if(client.available() != 0){
            Serial.println("Data received from client");
            Serial.print("Received Data: ");
            while(client.available() != 0){
                Serial.printf("%i, ", client.read());
            }
            Serial.println();
        }
        client.stop();

    } else {
        // if no client is yet connected, check for a new connection
        client = server.available();
    }
}

I tried turning on the system thread, but it didn’t improve anything. In my full app I also make a UDP broadcast every 15 seconds, which interestingly enough stopped working with the system thread on. I initially left the system mode at automatic and called Particle.disconnect() after setup; changing this to semi-automatic didn’t help either. Out of ideas, I also tried moving client.stop(); and client = server.available(); out of the logic blocks and calling them at the end of each loop, but that didn’t work either.

I’ve also tested this on a P1 board we made with the same results. Am I missing something obvious? Any help would be greatly appreciated!
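
One thing that may contribute, independent of any firmware issue: the loop above calls client.stop() on the first pass after a connection is accepted, whether or not any bytes have arrived yet. The following is a minimal sketch of waiting briefly for data before closing. It is written as plain C++ so it can be compiled off-device; FakeClient, readThenClose and maxPolls are hypothetical names introduced here, and FakeClient only imitates the connected()/available()/read()/stop() calls used in the post. On a Photon the poll count would more naturally be a millis()-based timeout.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for TCPClient so the sketch compiles off-device.
// available() reports 0 for the first few polls to imitate network latency.
struct FakeClient {
    std::deque<uint8_t> pending;  // bytes the peer has sent
    int pollsUntilArrival = 0;    // number of polls before the bytes "arrive"
    bool open = true;

    bool connected() const { return open; }
    int available() {
        if (pollsUntilArrival > 0) {
            --pollsUntilArrival;
            return 0;
        }
        return static_cast<int>(pending.size());
    }
    int read() {
        if (pending.empty()) return -1;
        int b = pending.front();
        pending.pop_front();
        return b;
    }
    void stop() { open = false; }
};

// Poll up to maxPolls times for the first bytes, read everything that is
// buffered at that point, and only then close the connection.
template <typename Client>
std::vector<uint8_t> readThenClose(Client& client, int maxPolls) {
    std::vector<uint8_t> received;
    for (int poll = 0; poll < maxPolls && received.empty(); ++poll) {
        while (client.available() > 0) {
            received.push_back(static_cast<uint8_t>(client.read()));
        }
    }
    client.stop();
    return received;
}
```

With maxPolls set to 1 this behaves like the original loop: if the bytes are not there on the first check, the connection is closed and nothing is read.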

This is the test program I use for TCPServer receiving data.

I just re-ran the test and successfully received 10485760 bytes in 82 seconds, no lost bytes or errors. Repeated the test 3 times. System firmware 0.5.3.

#include "Particle.h"

const int MAX_CLIENTS = 5;
const int LISTEN_PORT = 7123;
const int CLIENT_BUF_SIZE = 1024;
const unsigned long INACTIVITY_TIMEOUT_MS = 30000;
const int CLOSE_AFTER_SIZE = 1024 * 1024 * 10; // 10 MB, set to -1 for unlimited
const unsigned long LOG_EVERY_BYTES = 512 * 1024; // 512K

class ClientConnection {
public:
	ClientConnection();
	virtual ~ClientConnection();

	void loop();
	bool accept();

protected:
	void clear();
	void readRequest();
	void writeData();

private:
	unsigned char clientBuf[CLIENT_BUF_SIZE];
	bool inUse;
	int clientId;
	TCPClient client;
	int readOffset;
	int writeOffset;
	unsigned long lastUse;
	unsigned char expectedChar;
	unsigned long bytesRead;
	time_t startTime;
	unsigned long lastLogBytes;
};


String localIP;
TCPServer server(LISTEN_PORT);
ClientConnection clients[MAX_CLIENTS];
int nextClientId = 1;

void setup() {
	Serial.begin(9600);

	// From CLI, use something like:
	// particle get test5 localip
	// to get the IP address of the Photon (replace "test5" with your device name)
	localIP = WiFi.localIP(); // localIP must be a global variable
	Particle.variable("localip", localIP);
	Serial.printlnf("server=%s:%d", localIP.c_str(), LISTEN_PORT);

	server.begin();
}

void loop() {
	// Handle any existing connections
	for(int ii = 0; ii < MAX_CLIENTS; ii++) {
		clients[ii].loop();
	}

	// Accept a new one if there is one waiting (and we have a free client)
	for(int ii = 0; ii < MAX_CLIENTS; ii++) {
		if (clients[ii].accept()) {
			break;
		}
	}
}


ClientConnection::ClientConnection() : inUse(false) {
	clear();
}

ClientConnection::~ClientConnection() {
}

void ClientConnection::loop() {
	if (!inUse) {
		return;
	}

	if (client.connected()) {
		readRequest();

		if (millis() - lastUse > INACTIVITY_TIMEOUT_MS) {
			Serial.printlnf("%d: inactivity timeout", clientId);
			client.stop();
			clear();
		}
	}
	else {
		Serial.printlnf("%d: client disconnected", clientId);
		client.stop();
		clear();
	}
}

bool ClientConnection::accept() {
	if (inUse) {
		return false;
	}

	client = server.available();
	if (client.connected()) {
		lastUse = millis();
		inUse = true;
		clientId = nextClientId++;
		startTime = Time.now();
		Serial.printlnf("%d: connection accepted", clientId);
	}
	return true;
}

void ClientConnection::clear() {
	lastUse = 0;
	readOffset = 0;
	writeOffset = 0;
	inUse = false;
	expectedChar = 0;
	bytesRead = 0;
	lastLogBytes = 0;
}

void ClientConnection::readRequest() {
	// Note: client.read returns -1 if there is no data; there is no need to call available(),
	// which basically does the same check as the one inside read().

	int count = client.read(clientBuf, CLIENT_BUF_SIZE);
	if (count > 0) {
		for(int ii = 0; ii < count; ii++) {
			if (clientBuf[ii] != expectedChar) {
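				// Five format specifiers need five arguments; bytesRead already counts the bytes consumed so far.
				Serial.printlnf("%d: mismatch expected %02x got %02x index %d bytesRead %lu, closing",
						clientId, expectedChar, clientBuf[ii], ii, bytesRead);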
				client.stop();
				clear();
				break;
			}
			expectedChar++;
			bytesRead++;

			if (bytesRead - lastLogBytes > LOG_EVERY_BYTES) {
				lastLogBytes = bytesRead;
				time_t now = Time.now();

				Serial.printlnf("%d: received %d bytes in %d sec",
						clientId, bytesRead, now - startTime);
			}

			if (CLOSE_AFTER_SIZE > 0 && bytesRead >= CLOSE_AFTER_SIZE) {
				time_t now = Time.now();

				Serial.printlnf("%d: received %d bytes in %d sec, closing connection",
						clientId, bytesRead, now - startTime);
				client.stop();
				clear();
				break;
			}
		}
		lastUse = millis();
	}
}

Thanks! I’m going to check this out now. I should clarify that I’m not losing bytes (I think I explained that poorly); I’m missing whole packets.

Assuming the nc command is in some sort of loop without a delay, you’ll most certainly lose some connections, because the Mac will be making them faster than the Photon can receive them and they’ll just be dropped. However, my test code can handle multiple connections at the same time, so it’s much less likely to lose connections, especially if they arrive in bursts of fewer than MAX_CLIENTS at a time.

I actually send the nc command manually about once a second to avoid that problem as much as possible. But your code seems to have a similar issue:

1: connection accepted
1: mismatch expected 00 got 00 index 0 bytesRead 134878781, closing
2: connection accepted
2: client disconnected
3: connection accepted
3: client disconnected
4: connection accepted
4: client disconnected
5: connection accepted
5: client disconnected
6: connection accepted
6: mismatch expected 08 got 08 index 16 bytesRead 134878781, closing
7: connection accepted
7: mismatch expected 08 got 08 index 16 bytesRead 134878781, closing
8: connection accepted
8: mismatch expected 08 got 08 index 16 bytesRead 134878781, closing
9: connection accepted
9: mismatch expected 08 got 08 index 16 bytesRead 134878781, closing

It actually missed about 5 packets. I may just be asking too much of the module, or I need to figure out how to actually close a client and destroy it so I can reuse it properly.

I changed the inactivity timeout to 1 second and slowed down my sends to 1 every ~5 seconds… no change. This seems very strange, I can listen to Serial in the same loop and never miss a single packet.

You’re referring to connections 2 through 5, where the connection is accepted but then disconnected with no data received, right? That’s almost certainly caused by what I think is a bug, possibly in WICED. It happens when the sending side closes the connection (sends a FIN packet) while there’s data that hasn’t yet been received and processed by the user code (client.available() > 0). This only happens when sending to the Photon, and when the sender indicates the end of data by closing the connection. As soon as the FIN packet is received, client.connected() returns false, and there’s no way to get the data that was buffered.
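
For illustration, here is a sketch of the ordering that user code would want if the stack kept buffered bytes readable after the FIN: read first, check connected() second. This is an assumption about stack behavior, not a description of the Photon; the post above reports that on the firmware tested the buffered data could not be retrieved, so this pattern alone may not help there. ClosedPeerClient and serviceClient are hypothetical names, and the stand-in class only imitates the TCPClient calls used in this thread so the block compiles off-device.

```cpp
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for TCPClient: the peer has already sent FIN, so
// connected() reports false, but some bytes are still buffered locally.
struct ClosedPeerClient {
    std::deque<uint8_t> buffered;
    bool stopped = false;

    bool connected() const { return false; }
    int available() const { return static_cast<int>(buffered.size()); }
    int read() {
        if (buffered.empty()) return -1;
        int b = buffered.front();
        buffered.pop_front();
        return b;
    }
    void stop() {
        stopped = true;
        buffered.clear();  // anything unread is gone once the socket is closed
    }
};

// Read whatever is buffered before consulting connected(), so a closed
// connection never causes already-received bytes to be thrown away here.
// Returns true while the connection should be kept open.
template <typename Client>
bool serviceClient(Client& client, std::vector<uint8_t>& out) {
    while (client.available() > 0) {
        out.push_back(static_cast<uint8_t>(client.read()));
    }
    if (!client.connected()) {
        client.stop();
        return false;
    }
    return true;
}
```

Checking connected() first and calling stop() in the else branch, as both programs in this thread do, would discard those bytes even on a stack that still had them.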

No, I mean there are 5 times where nothing is printed at all, as though the Photon never received anything from the client. I did set up a simple test using a Pocket Chip and I am receiving every packet with the exact same nc command sent from my Mac, so this isn’t a network issue or a netcat issue.

Verbose output of netcat also shows:

Connection to 192.168.1.17 port 6000 [tcp/*] succeeded!

even when the packet is missed. Hmmm, I really wish I could return more than 4 bytes from a Particle.function; that would make this whole thing pretty much moot.

It seems to be a problem with that use case. I can reproduce your problem as is, but by making a few tweaks it goes away. It’s a bug somewhere, but I’m not sure where.

The mode where you send data to the Photon and indicate its completion by closing the connection causes not only the problem I mentioned earlier where the data is lost, but also seems to cause a problem with accepting connections.

I switched the test around so that the Photon closes the connection after it has received 10 bytes. The sending side waited 1 second after sending the data before closing its end, allowing time for the data to be handled by the Photon.
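
A minimal sketch of the receiving half of that arrangement, assuming a fixed message length known to both sides: collect bytes until the message is complete, then let the receiver close the connection. FixedLengthReceiver is a hypothetical helper written in plain C++ so it can be compiled off-device; it is not the test program used above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Collects bytes until a fixed-length message is complete. The length is
// whatever the protocol defines (10 bytes in the test described above).
class FixedLengthReceiver {
public:
    explicit FixedLengthReceiver(std::size_t expected) : expected_(expected) {}

    // Feed one received byte. Returns true once the message is complete,
    // which is the point where the receiver should close the connection.
    // Bytes arriving after completion are ignored.
    bool feed(uint8_t byte) {
        if (data_.size() < expected_) {
            data_.push_back(byte);
        }
        return complete();
    }

    bool complete() const { return data_.size() >= expected_; }
    const std::vector<uint8_t>& data() const { return data_; }

private:
    std::size_t expected_;
    std::vector<uint8_t> data_;
};
```

On the Photon, loop() would pass each byte returned by client.read() to feed() and call client.stop() once it returns true. The sender then has to keep its end open long enough for that to happen; with netcat, something like (echo ...; sleep 1) | nc host port is one way to delay the close, though how nc behaves at end of input varies between versions.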

With it set up that way, I can make 10 connections per second to the Photon, with no data loss, and no connections lost. I ran the test for a few minutes and it worked perfectly.

In other words, the Photon can accept connections rapidly with no loss, except when the client closes the connection quickly. In that case something gets messed up and things go bad. So, yes, there is probably a bug somewhere.

Thanks for helping verify this, I’ll see what I can do about adjusting my use case to work around it… @mdma could you maybe shed any light on if this is a bug, or just not the intended use case?