Electron TCPClient.write issue

wmcelderry · April 6, 2016, 11:33am

Hi All,

Long story short:

Sending a buffer via TCPClient.write(buffer, len) with len == 1500 to my computer delivers corrupted data with missing characters (and reports -1 bytes sent, even though it does send data - a clue?).
Sending the same buffer via Serial.write(buffer, len) sends all the data correctly (even after the failed attempt, so the local buffer is correct).
Sending a buffer via TCPClient.write(buffer, len) with len == 500 to my computer delivers the data correctly (and reports 500 bytes sent, as expected).

Odd that reducing the buffer size from 1500 to 500 made it all work better!

This is particularly an issue for the Electron, as you pay for data. Sitting in a while loop on a -1 result, believing you are waiting for the network to catch up while really sending more and more corrupt data, is a bit frustrating!

Can anyone shed any light on this, or explain how I can get some better debug info if I get a similar issue in future?

I've downloaded latest FW from github and flashed via DFU (make PLATFORM=electron ...), so it could be work in progress/bug in new version and I'm not ruling out plain old user error on my part (though weird fix if that's the case...).

I don't have time to investigate in more detail at the moment, but I wanted to raise it in case others can confirm if they experience a similar issue and avoid others wasting time.

(The code is commercially sensitive, so I can't post it here without hacking out the guts of it, and that's back to the time issue...)

Thanks,

Will.

P.S. Could be related to:

I was transmitting a fair chunk of data, but thought send would block until the cellular network was ready for more data, or send nothing at all?

rickkas7 · April 6, 2016, 11:42am

I haven’t tested it on the Electron, and the network stack is different on the Electron, but on the Photon sending more than 1024 bytes at a time resulted in corrupted data. Here was my post on it:

wmcelderry · April 6, 2016, 1:52pm

Thanks Rick, that’s interesting.

I don’t see where you mention that exceeding 1024 bytes causes corruption in the post you linked.

Did you report it somewhere else?

Thanks,

Will.

rickkas7 · April 6, 2016, 2:58pm

It’s in a comment in the code. I saw data corruption with 2048-byte writes and not with 1024-byte writes. I didn’t determine the exact threshold, but there was really no advantage to going above 1024, because at that point data was supplied faster than it could go out, so pushing in larger amounts of data wouldn’t increase speed. Presumably the minimum write size needed to keep the buffers full is even smaller for 3G than for Wi-Fi, so 500 seems reasonable to me on the Electron.

davidgatti · April 6, 2016, 4:37pm

Hi @wmcelderry, the issue that you are describing is a bit strange. I’m working right now on this project to dissect everything about sockets, https://github.com/davidgatti/IoT-Raw-Sockets-Examples, and I didn’t run into a similar issue. But I’m using a Photon, and your device and mine have different network hardware.

For example, I started to see corrupted data after sending more than 10KB in one go. I was also sending chars and not a buffer.

In any case, sending more than 1146 bytes in one go is not a good idea, since the standard right now for network hardware is 1500 bytes for one message (MTU). Everything above that will be chopped up and sent in more than one message.

I would recommend slicing your data, and sending it one chunk after the other, so you have more control over it.

Also, sending data too fast is not a good idea; these are tiny devices with very limited resources.

rickkas7 · April 6, 2016, 4:47pm

Not true about the speed - if you look at the example I linked to above the Photon is perfectly capable of sending out about 800 Kbytes/sec… I transmitted 4 GB of data in about 86 minutes, 1 MB per connection, each taking about 1.5 seconds, without a single error.

davidgatti · April 6, 2016, 4:51pm

Right, well, interesting. Can you elaborate on what you mean by connection? Did you actually disconnect, immediately reconnect to the server, and keep sending data?

rickkas7 · April 6, 2016, 4:54pm

I opened up a TCP connection, sent 1 MB of data, and closed the connection. Then I immediately opened a new connection and repeated the process. I probably should test sending gigabytes of data over a single connection and see how that works. Maybe I’ll give that a try tonight.

peekay123 · April 6, 2016, 5:05pm

@rickkas7, did you send the 1MB in 1024 byte chunks?

davidgatti · April 6, 2016, 5:06pm

If you can, please do. Because in my case I opened the connection and was just streaming.

rickkas7 · April 6, 2016, 5:11pm

Yes, 1024 bytes per write call seemed to be the optimal size in my tests. 2048 and larger, the data would sometimes be corrupted. Smaller worked fine, but the data rate dropped, which seems to indicate that I wasn’t able to keep the buffers completely full and the connection was occasionally starved for data and not running at full capacity.

davidgatti · April 7, 2016, 11:55am

I found the issue on my side. Instead of using .write I was using .print, which was causing the problem for me: I was getting random ASCII characters. Now that I have switched to .write, everything works.

@wmcelderry since you also use .write, I’m not sure what is going on. Could it be that the .write method on the Electron uses code from the .print method?

Could you tell us more how your data gets corrupted? Do you get extra characters, or are you losing data?

Below you can check my code that I’m working on.

https://github.com/davidgatti/IoT-Raw-Sockets-Examples/blob/master/Examples/BigMessageParticle2NodeJS/tcpParticleClient.cpp

davidgatti · April 7, 2016, 4:40pm

After thinking more about the issue, I realized that we are overthinking this. I just tested this code, and it just works. Particle will chop the data in a way that makes sense at that particular moment and will handle everything correctly.

TCPClient client;

int port = 1337;
byte server[] = { 192, 168, 1, 100 };

const char* txt = "1. Particle is a prototype-to-production platform for developing an Internet of Things\n2. Particle is a prototype-to-production platform for developing an Internet of Things\n3. Particle is a prototype-to-production platform for developing an Internet of Things\n4. Particle is a prototype-to-production platform for developing an Internet of Things\n5. Particle is a prototype-to-production platform for developing an Internet of Things\n6. Particle is a prototype-to-production platform for developing an Internet of Things\n7. Particle is a prototype-to-production platform for developing an Internet of Things\n8. Particle is a prototype-to-production platform for developing an Internet of Things\n9. Particle is a prototype-to-production platform for developing an Internet of Things\n10. Particle is a prototype-to-production platform for developing an Internet of Things\n11. Particle is a prototype-to-production platform for developing an Internet of Things\n12. Particle is a prototype-to-production platform for developing an Internet of Things.\n13. Particle is a prototype-to-production platform for developing an Internet of Things.\n14. Particle is a prototype-to-production platform for developing an Internet of Things.\n15. Particle is a prototype-to-production platform for developing an Internet of Things.\n16. Particle is a prototype-to-production platform for developing an Internet of Things.\n17. Particle is a prototype-to-production platform for developing an Internet of Things.\n18. Particle is a prototype-to-production platform for developing an Internet of Things.\n19. Particle is a prototype-to-production platform for developing an Internet of Things.\n20. Particle is a prototype-to-production platform for developing an Internet of Things.\n21. Particle is a prototype-to-production platform for developing an Internet of Things.\n22. Particle is a prototype-to-production platform for developing an Internet of Things.\n23. Particle is a prototype-to-production platform for developing an Internet of Things.\n24. Particle is a prototype-to-production platform for developing an Internet of Things.\n25. Particle is a prototype-to-production platform for developing an Internet of Things.\n26. Particle is a prototype-to-production platform for developing an Internet of Things.\n27. Particle is a prototype-to-production platform for developing an Internet of Things.\n28. Particle is a prototype-to-production platform for developing an Internet of Things.\n29. Particle is a prototype-to-production platform for developing an Internet of Things.\n30. Particle is a prototype-to-production platform for developing an Internet of Things.\n31. Particle is a prototype-to-production platform for developing an Internet of Things.\n32. Particle is a prototype-to-production platform for developing an Internet of Things.\n33. Particle is a prototype-to-production platform for developing an Internet of Things.\n34. Particle is a prototype-to-production platform for developing an Internet of Things.\n35. Particle is a prototype-to-production platform for developing an Internet of Things.\n36. Particle is a prototype-to-production platform for developing an Internet of Things.\n37. Particle is a prototype-to-production platform for developing an Internet of Things.\n38. Particle is a prototype-to-production platform for developing an Internet of Things.\n39. Particle is a prototype-to-production platform for developing an Internet of Things.\n40. Particle is a prototype-to-production platform for developing an Internet of Things.\n41. Particle is a prototype-to-production platform for developing an Internet of Things.\n42. Particle is a prototype-to-production platform for developing an Internet of Things.\n43. Particle is a prototype-to-production platform for developing an Internet of Things.\n44. Particle is a prototype-to-production platform for developing an Internet of Things.";

void setup() {

    Serial.begin(9600);

    Serial.println("UP");

    // Connect to the remote server
    if(client.connect(server, port)) {

        Serial.println("Connected");


    } else {

        // if we can't connect, then we display an error.
        Serial.println("error");

    }

}

void loop() {

    int txtSize = strlen(txt);  // Get the size of our data (excluding the terminating '\0')

    // Create a byte buffer to hold a copy of the text
    unsigned char buf[txtSize];

    // Copy the text into the buffer; c < txtSize keeps the copy within bounds
    for(int c = 0; c < txtSize; c++) {

        buf[c] = txt[c];

    }

    // Send our data to the remote server
    client.write(buf, txtSize);

    // Send the separator
    client.print(',');

    Serial.println("Done");

    delay(1000);

}

This code just works with no issues on my Photon. The delay at the end is just for me, so I have time to check what my server is getting.

wmcelderry · April 7, 2016, 4:54pm

Hi David,

I chose 1500 as it is the default MTU for many networks. I realised it would be too big once headers are added, but I thought I’d worry about that later, especially as I don’t know the media headers of the cellular network…

The symptoms on the Electron were:

Data is delivered, but it has characters missing.
I was sending a PUT request; it sometimes said PUT, sometimes PT and sometimes UT at the start, and there were probably more issues throughout.
The return code is -1, indicating some form of failure has been identified…

I would hope either there is no data sent, or it is correct - particularly as it’s TCP (and because I’ve been billed for it sitting in a wait and retry loop… only ~1MB, but still frustrating).

I’m glad it works on the photon - my experience indicates it won’t work on the electron, and that’s why I started the thread.

I still admit it could be user error on my part, until someone else can confirm with an Electron one way or the other.

I’m sorry not to help more, but I’m busy working on a client project and am focussing on other aspects until they are resolved.
Once that’s done I’ll come back and test further.

Thanks all for your input.

W.

davidgatti · April 7, 2016, 5:07pm

Right, I re-read what you wrote in the first post and I think I’m starting to understand this.

The second parameter of the .write method tells it how much data should be sent from the buffer that you are passing. In your case you are saying: send characters 0 to 500.

Put the whole size of the buffer that you want to send, and let the WiFi module (I guess) do the right thing. See how it goes. If you check the code that I pasted in the previous post, you’ll see what I mean.

It’s a quick test.

Fingers crossed.

P.S. In my tests, even if you chop the data yourself and put everything nicely in a loop, the system will still combine the data before sending, if there is enough free memory in the WiFi buffer, low traffic, and who knows what else. So you can write, for example, 500 bytes at a time, but you might get 2000 bytes in one message on the other end, because they were combined together before sending.

What I want to say, is that you can’t control how the data will be sent.

PeteP · April 7, 2016, 10:37pm

Correct. Somewhat confusingly, TCP connections transfer a boundaryless stream of bytes, not messages. The network(s) between the two communicating applications is free to repackage the stream of bytes as it sees fit and to deliver the bytes to the destination in chunks of any size. You have to use either prior knowledge or a higher-level protocol to establish any boundaries in the data stream.
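As a minimal sketch of the "higher-level protocol" idea, here is one way a receiver could rebuild delimiter-terminated messages regardless of how the bytes were chunked in transit. The class name `MessageSplitter` and its methods are hypothetical, invented for this illustration; the `,` delimiter mirrors the separator used in the example code earlier in the thread. It is plain C++ with no Particle dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reassembles delimiter-terminated messages from a TCP
// byte stream, no matter how the network sliced the bytes into chunks.
class MessageSplitter {
public:
    explicit MessageSplitter(char delimiter) : delimiter_(delimiter) {}

    // Feed one received chunk (any size, including zero) and return every
    // message completed by it. A trailing partial message stays buffered.
    std::vector<std::string> feed(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        std::vector<std::string> complete;
        for (std::size_t i = 0; i < len; ++i) {
            if (data[i] == delimiter_) {
                complete.push_back(pending_);
                pending_.clear();
            } else {
                pending_.push_back(data[i]);
            }
        }
        return complete;
    }

    // Bytes received so far that do not yet form a complete message.
    const std::string& pending() const { return pending_; }

private:
    char delimiter_;
    std::string pending_;
};
```

Calling `feed()` with whatever each read returns yields the same messages whether the data arrives as one 2000-byte chunk or four 500-byte chunks. A delimiter only works if it can never appear inside a message; a length prefix is the usual alternative when it can.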

That makes programming network communication a bit odd. In many networking APIs it's necessary to code loops that repeatedly write (on the sending side) and read (on the receiving end) until all intended data is sent or received. It appears that the Particle TCPClient API works this way for reading. For writing, however, the documentation says:

Returns: byte: write() returns the number of bytes written. It is not necessary to read this value.

That last sentence implies that it isn't possible for a write() call to accept only part of the data. Presumably, a write() call will not return until all the data can be sent (or until an error occurs.)

Here is a concise description of what's going on with links to some code. Googling "example of properly reading a tcp stream" will find lots of other example code. For the full details I recommend the (massive) books by W. Richard Stevens (Amazon).

I hope this is helpful.

-- Pete

rickkas7 · April 7, 2016, 10:43pm

I took a very quick peek under the hood of the Electron TCP code. There may be errors in my analysis. I had some wine beforehand. You can find the code here:

The reason you can get some data transmitted and still get a -1 response (MDM_SOCKET_ERROR) is in MDMParser::socketSend(). Writes larger than USO_MAX_WRITE bytes (1024) are broken up into multiple writes, non-atomically. So it’s quite possible to send the first 1024 bytes then fail on the second block, resulting in a partial send.

I’m less positive about this, but I don’t think the Electron will ever recover from a -1 result on a write. The writes look synchronous, so unless there’s some buffering in a lower layer I can’t see, the Electron doesn’t have an equivalent to the “buffer full try again later” result code (-16 on the Photon).

In any case, there is definitely no advantage of writing more than 1024 bytes at a time, as they’re internally broken into smaller writes. Based on the network speed difference, I can’t imagine anything larger than 512 bytes would be useful.
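A rough sketch of what that suggests for application code: split the buffer into small writes yourself, stop at the first error, and keep track of how many bytes were reported as sent so a retry does not blindly resend everything. `WriteFn`, `SendResult` and `sendInChunks` are hypothetical names for this illustration, and the writer signature is an assumption, not the firmware API; on a device the callback would wrap something like `client.write(...)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Stand-in for a socket write call: returns the number of bytes accepted,
// or a negative value on error. This signature is an assumption made for
// the sketch; it is not copied from the Particle firmware.
using WriteFn = std::function<int(const std::uint8_t*, std::size_t)>;

struct SendResult {
    std::size_t bytesSent;  // bytes the writer reported as accepted
    bool ok;                // true only if the whole buffer was accepted
};

// Send `len` bytes in pieces of at most `maxChunk` bytes. Stops at the first
// error and reports how much went out, so the caller can resume from
// data + bytesSent (or drop the connection) rather than resend everything.
SendResult sendInChunks(const WriteFn& write, const std::uint8_t* data,
                        std::size_t len, std::size_t maxChunk) {
    assert(maxChunk > 0);
    std::size_t sent = 0;
    while (sent < len) {
        const std::size_t want = std::min(maxChunk, len - sent);
        const int n = write(data + sent, want);
        if (n <= 0) {
            return {sent, false};  // error or no progress: do not spin
        }
        sent += static_cast<std::size_t>(n);  // also handles short writes
    }
    return {sent, true};
}
```

With `maxChunk` at or below 1024, each call maps to a single internal write under the analysis above, so a failure leaves a known number of bytes sent. Returning on the first failure, instead of looping on -1, also avoids paying for repeated partial sends.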

wmcelderry · April 7, 2016, 10:47pm

Hi David,

Thanks for the replies.

I can’t put all my data in to one ‘write’ call because it is generated on the fly and I don’t know how big it will be, so I have to prepare enough data to fill a buffer (what I’d term a chunk) or allocate a very big buffer and ‘hope’ it’s big enough. (I don’t like taking the ‘hope it works’ approach!)

When I chunked it in 1500 bytes and sent it to the TCPClient.write, the call never sent successfully - always returned -1 (in my experience), but it did send data similar to the source buffer, but corrupted.

I reckon that is equivalent to your suggestion - put in a write call for 1500 bytes (being all the data for this chunk) and let the network deal with it how it may. The problem is the way it dealt with it - it should either not send anything or send the correct data. TCP has fields to know if the data has been corrupted, so I believe it left the electron in a corrupt state.

To explain the rest of the test system: I’m using ‘netcat’ as my server (a very well established tool in *nix; I will eat my hat if someone proves it is at fault). TCP’s checksum and end-to-end ACKs should ensure the data is delivered completely, in order and without corruption, and I trust netcat.

I am totally happy that the network will split and combine the packets automatically and I’m not trying to force the situation, just give the networking system a helping hand by passing data in MTU sized chunks as I’ll be chunking anyway. It will only make a difference in edge cases.

Just got Pete, then Rick’s updates - busy night!

Thanks for your help - good to know 1024 is “the number to use”. I hope the hard work helps others and either makes its way to the docs or causes a slight modification in the firmware to return the number of bytes successfully sent…

All the best,

W.

davidgatti · April 8, 2016, 9:02am

For sure netcat is not the issue. But you write about helping the protocol. The point of my previous message was that TCP doesn’t need your help. It is a grown man that can handle it.

Basically, even if you don’t know the size of the data, just send what you have. It can be one byte, or 10. The WiFi module will buffer the data on its own and send it when it feels like it (it will wait for more, but after a certain amount of time it will just send what it has). The only thing you need to do is put a marker character somewhere in the stream so the receiver knows when it has got the whole message. @PeteP did put a useful link in his message that explains this.

In my repo on GitHub I’m trying to put all this network knowledge in one place: https://github.com/davidgatti/IoT-Raw-Sockets-Examples. In this example I’ll have the whole story of what I learned thanks to you three, and for sure I learned not to overcomplicate.

wmcelderry · April 8, 2016, 10:09am

Hi David,

We are almost in agreement on that… almost

I’m saying that when the library in the Electron (or whatever MCU you use) is asked to send a packet the networking code has to ‘choose’ how long to hold on to the data before sending it anyway as a small network packet (only detectable under the levels of TCP). The algorithm that is used for this ‘choice’ can affect the throughput of the resulting program (and even how much you are billed on an electron!).

An example to illustrate my meaning (as the above is very terse):
TCP client sending 80 bytes every 0.05 seconds = 20 user calls per second to TCPClient.write(array, 80).

If the TCP library sends one network packet for every request, there will be 20 network packets a second that include user data + media layer header + IP layer header + TCP layer header (for argument sake I’ll use 30 bytes), so your 80 bytes becomes maybe 110 bytes * 20 = 2200 bytes sent over the network per sec.

If your TCP library decides to ‘hang on to the data for up to 0.2 seconds’ in the hopes it can get more data in a single network packet - closer to the MTU - it improves the efficiency, as it combines what could have been 4 user packets:
80*4=320 data bytes + IP + TCP + media headers (30bytes) = 350 bytes per network packet *5 packets per second = 1750bytes per second.

Both deliver the complete and correct user data of 1600 bytes, the first uses 600 bytes of overhead, the second 150, and that is without worrying about the ack packets that may be sent coming the other way.
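The arithmetic above can be checked with a few lines of C++. This is only a sketch of the same calculation: the function name `wireBytesPerSecond` is made up for the illustration, the 30-byte header figure is the round number used in this post rather than a measured value (minimum IPv4 and TCP headers alone come to 40 bytes), and ACK traffic is ignored.

```cpp
#include <cassert>
#include <cstddef>

// Bytes on the wire per second when `writesPerPacket` user writes are
// coalesced into each network packet. headerBytes is the assumed combined
// media + IP + TCP overhead per packet. ACK traffic is ignored.
std::size_t wireBytesPerSecond(std::size_t payloadPerWrite,
                               std::size_t writesPerSecond,
                               std::size_t writesPerPacket,
                               std::size_t headerBytes) {
    // Keep the sketch exact: writes must divide evenly into packets.
    assert(writesPerPacket > 0 && writesPerSecond % writesPerPacket == 0);
    const std::size_t packetsPerSecond = writesPerSecond / writesPerPacket;
    const std::size_t bytesPerPacket =
        payloadPerWrite * writesPerPacket + headerBytes;
    return packetsPerSecond * bytesPerPacket;
}
```

`wireBytesPerSecond(80, 20, 1, 30)` gives the 2200 bytes per second of the one-packet-per-write case, and `wireBytesPerSecond(80, 20, 4, 30)` gives the 1750 bytes per second of the coalesced case.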

The down side of this efficiency gain is that the latency increases, which can be acceptable or can be a problem for the end user. The TCPClient has no way of knowing ‘what is acceptable’ in the electron.

If you ask the library to send data at the same size as the MTU, there is no ‘choice’ to make - it should send the data immediately as it can’t get any more efficient than that.

I agree that this isn’t too important for correct data transmission - it shouldn’t make a difference if I ask the electron to send 2 bytes or 200000 in a single write call to the correct operation, but may affect the lower level efficiency and latency.

So you’re right, TCP doesn’t need my help - if you are sending a known quantity of data, sending it in the biggest chunks you can is a good approach 99% of the time.

My data is bursty and doesn’t care about latency (within the 10 mins mark), so if I send it more efficiently that makes me happier (billed less, less battery use), hence trying to help the network algorithm and avoid smaller packets than the MTU being sent when not necessary.

In practice my choice of 1500 was terrible - it almost guarantees small packets:

Assuming an MTU of 1500 for the cellular network:
1500 user bytes broken in to MTU sized network packets including 30 bytes of headers on each packet would yield 2 packets:

1470 user bytes + 30 bytes of headers,
30 user bytes + 30 bytes of headers

Yep, the TCP library may help fix my silly problem, but probably not, because my data is very bursty: it may be up to 2 or 3 seconds between calls to write, which is likely to cause a timeout and send the second (tiny) packet. As I don’t care about the data latency, I should hang on to the 30 bytes until the next lot of data comes in and add it all together.
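The "hang on to the 30 bytes" idea could look roughly like the sketch below: an application-side buffer that releases data only in full payload-sized pieces and keeps the remainder until more arrives or the caller flushes it. `CoalescingBuffer` is a hypothetical helper written for this illustration, and the 1470-byte payload size is just the MTU-minus-headers figure assumed above, not a value taken from the Electron firmware.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects bursty data and releases it only in
// payload-sized pieces, holding any remainder until more data arrives
// or flush() is called.
class CoalescingBuffer {
public:
    explicit CoalescingBuffer(std::size_t payloadSize)
        : payloadSize_(payloadSize) {
        assert(payloadSize > 0);
    }

    // Append data; return zero or more full-size pieces ready to send.
    std::vector<std::string> append(const std::string& data) {
        pending_ += data;
        std::vector<std::string> ready;
        while (pending_.size() >= payloadSize_) {
            ready.push_back(pending_.substr(0, payloadSize_));
            pending_.erase(0, payloadSize_);
        }
        return ready;
    }

    // Return whatever is left (possibly empty), e.g. when a latency
    // deadline expires or the connection is about to close.
    std::string flush() {
        std::string rest;
        rest.swap(pending_);
        return rest;
    }

    // Number of bytes currently held back.
    std::size_t pendingBytes() const { return pending_.size(); }

private:
    std::size_t payloadSize_;
    std::string pending_;
};
```

Appending 1500 bytes to a buffer built with a payload size of 1470 releases one 1470-byte piece and holds 30 bytes; those 30 bytes go out with the next burst instead of in a tiny packet of their own. Since writes above 1024 bytes are reportedly split internally on the Electron, a smaller payload size may suit that device better.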

Sorry, that’s a long and verbose reply to an aside. I hope it illustrates why I mention ‘helping’ the network. It’s not hugely important, as it doesn’t change that we both agree - if data is sent on a TCP connection:
A) it shouldn’t be corrupt
B) the user needs to know so they don’t send the same data again.

Whatever efficiency optimisations are put in place, TCP should not deliver unexpected or corrupt data, which is what seems to be happening in my experience - and @rickkas7’s analysis seems to indicate that the problem with (B) is not unexpected when some kind of failure happens in the network code.

W.
