TCPClient on photon firmware (0.4.3, 0.4.4) not working correctly, read() and connected() behaving badly

Hi there,
I've spent many hours now trying to get a simple TCP socket read to work on the current Photon firmware. I've tested against the current 0.4.3 release and also against 0.4.4.rc2. I would like to share what I found out. I think there is something seriously wrong with TCPClient or its underlying HAL implementation. I've tried everything to work around this, but to no avail. Maybe someone has an idea what's wrong and can help.

Okay, here's the problem:

I had the following very simple and straightforward code to read data from a socket until end-of-stream. It ran flawlessly on the Core when compiled with the old Core-only firmware repositories; compiling with one of the current branches above leads to the same problems on the Core:

    int bytesRead = 0;
    while (client.connected() && bytesRead < bufferLen) {
        int readNow = client.read((uint8_t*)(readBuffer + bytesRead), bufferLen - bytesRead);
        if (readNow < 0) break;
        bytesRead += readNow;
    }

Now, when compiled against the current firmware (with the new HAL layer), this code only works as long as a single TCP/IP packet is involved (at least it seems so). Once the first packet is consumed (usually about 1K to 1.5K of data), the read() method returns -1. I've tracked it down to spark_wiring_tcpclient.cpp calling socket_receive() with a timeout of 0, which results in an effective timeout of only 5ms on the select() call in socket_hal.c. As a result, FD_ISSET is almost always false when the next IP packet hasn't been received yet.

Then I thought, okay, well, I could do something like this (I saw similar code in some HttpClient implementations) before calling read():

while (client.connected() && !client.available()) Spark.process();

Sounded like a great idea. Guess what, you end up in an endless loop because client.connected() will always return true, no matter what, even if the socket has already been closed from the remote end.

Next thing I did is I added a timeout condition to the while loop like this:

while (client.connected() && !client.available() && ![timeoutcondition]) Spark.process();

Doing this makes it work, but you always have to wait until the timeout has elapsed once the end of the stream has really been reached.
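To make the timeout idea concrete, here is a minimal sketch of a read loop that gives up after an idle period instead of trusting connected(). The names `readUntilIdle`, `idleTimeoutMs` and `FakeClient` are made up for illustration and are not part of the firmware. `FakeClient` is only a desktop stand-in that mimics the behavior described above (read() returning -1 between packets, connected() always true). The sketch uses std::chrono so it can run off-device; on the Photon you would presumably use millis() and call Spark.process() inside the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads until the buffer is full, the client reports a disconnect, or no data
// has arrived for idleTimeoutMs. Client is any type that offers connected(),
// available() and read(uint8_t*, size_t).
template <typename Client>
size_t readUntilIdle(Client& client, uint8_t* buffer, size_t bufferLen,
                     unsigned long idleTimeoutMs) {
    assert(buffer != nullptr);
    using Clock = std::chrono::steady_clock;
    const std::chrono::milliseconds idleLimit(idleTimeoutMs);

    size_t bytesRead = 0;
    auto lastData = Clock::now();
    while (bytesRead < bufferLen) {
        // Drain pending data first, so bytes that arrived just before a
        // disconnect are not lost.
        if (client.available() > 0) {
            int readNow = client.read(buffer + bytesRead, bufferLen - bytesRead);
            if (readNow > 0) {
                bytesRead += static_cast<size_t>(readNow);
                lastData = Clock::now();  // restart the idle timer
                continue;
            }
        }
        if (!client.connected()) break;                  // honest disconnect
        if (Clock::now() - lastData >= idleLimit) break;  // assume end of stream
        // On the device: call Spark.process() here while waiting.
    }
    return bytesRead;
}

// Hypothetical stand-in for TCPClient, for desktop testing only.
struct FakeClient {
    std::vector<std::vector<uint8_t>> packets;  // packets still "in flight"
    std::vector<uint8_t> pending;               // data already received
    int pollsUntilNextPacket = 0;

    bool connected() { return true; }  // mimics the behavior reported above

    int available() {
        if (pending.empty() && !packets.empty()) {
            if (pollsUntilNextPacket > 0) {
                --pollsUntilNextPacket;  // next packet has not arrived yet
            } else {
                pending = packets.front();
                packets.erase(packets.begin());
                pollsUntilNextPacket = 3;  // gap before the following packet
            }
        }
        return static_cast<int>(pending.size());
    }

    int read(uint8_t* dst, size_t len) {
        if (pending.empty()) return -1;  // nothing received yet
        size_t n = std::min(len, pending.size());
        std::copy(pending.begin(), pending.begin() + n, dst);
        pending.erase(pending.begin(), pending.begin() + n);
        return static_cast<int>(n);
    }
};
```

The drawback is the same as with the one-liner: once the stream has really ended, the call still blocks for the full idle timeout before returning.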

The whole behavior of read() and connected() is very strange, and it makes TCPClient kind of unusable. Any ideas? Thanks. :slight_smile:

From many interactions with the Particle team over the last few weeks, I can say that they are aware of these issues (connected() returning true on closed sockets, and other issues with sockets on the Photon).

The firmware team and several community members are hunting down those issues. :hammer: :bug:

The GitHub issues page is a good place to keep up with those developments.


Yes, welcome to the TCPClient fan club :smile:

In cooperation with Particle engineers, a few of us are working through the TCPClient issues. We’ve got rid of most of the worst panics for now, but we are still tracking down resource leaks, and you are correct: TCPClient::connected() is a pathological liar at this point.

As @jvanier points out, the discussion/action is split between here and GitHub; examples include #539, #516, #538, #536, etc.

Feel free to join our happy band, or just watch those issues for promising developments, as your time/skill/inclination allows.


All our recent fixes are available in 0.4.4.rc.3 - please flash that and let us know if you’re still seeing problems. Thanks! :smile: