I’ve been having lots of issues with TCPServer-related code killing my Spark Cores, and it was suggested in #spark that I try the example presented in the documentation to see what happened. Using that example and a very simple Python client, I can cause a Spark Core to hard fault in seconds.
Besides spitting out some information, how is that test different? I think the issue here is speed. If I slow down sending data, it works normally for longer.
@bko The exception we receive is that the socket is closed by the Spark, which happens when the Spark panics. Basically, the remote computer sends a message, the Spark responds, then the remote computer attempts to send another message and the Spark is lost; eventually the Spark panics (SOS - 1, a hard fault) and restarts. When the Spark panics, the remote client gets a “connection reset by peer” exception on the socket.
send/sendall won’t matter here: the Spark Core is hard faulting, so I don’t need the Python library to keep trying to send (which is what sendall() would do), because by then the socket is already dead.
Confirmed: sendall() does nothing different. As mentioned above, there is no need to catch the exceptions; if an exception happens it gets printed, and by that point the Spark Core is already dead, so there is nothing to recover from.
OK, I will lay out what I think is going on here. As I have noted before, the simple Spark example is not production-quality code; in particular, I think it is a “packet amplifier”. Some of this is speculation, but I think it is well founded. Let me explain.
When you send to the Spark from Python, you are almost certainly sending one TCP packet for the whole string, but the example code then reads that string one byte at a time and calls client.write() on each byte, which today sends a one-byte packet. So one packet comes in and many (11, in this case) go out.
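From memory, the echoing loop in the doc example follows roughly this pattern (a sketch, not the verbatim example):

while (client.available()) {
    client.write(client.read());   // each byte goes out as its own one-byte packet
}

Eleven bytes arriving in one incoming packet therefore turn into eleven outgoing packets.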
There are only so many packet buffers (used for TX and RX) on the TI CC3000 and when you run out, it causes the core to panic.
I think a much better way of handling this is to read all the bytes and then call client.write() once. I have not tested this sketched-out code, but you will get the idea:
unsigned long lastTime = millis();
const int BUF_SIZE = 64;             // pick a size here
uint8_t myBuffer[BUF_SIZE];
int p = 0;
bool overflow = false;

// Drain the incoming bytes, giving up after 10 seconds
while (client.available() && millis() - lastTime < 10000) {
    int Nbytes = client.available();
    for (int i = 0; i < Nbytes; i++) {
        if (p >= BUF_SIZE) {         // buffer full: handle the error here
            overflow = true;
            break;
        }
        myBuffer[p++] = client.read();
    }
    if (overflow) break;
}

if (!overflow && p > 0) {            // ...not timeout etc...
    client.write(myBuffer, p);       // one call, so at most one packet out
}
I currently do only one server.write('y') for every ~11 bytes I receive, and it is dying too (though not hard faulting; I am not sure what it is doing). If I edit that program to not do the server.write('y') at all, it dies in seconds.
On Arduino, the server holds an array of clients, so doing a server.write() writes to all of the clients in a loop.
On the Spark, there is one client for each server (though you can get a new client if one comes and goes), so doing a server.write() is exactly the same as doing a client.write() with the client you get handed back from server.available().
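A minimal sketch of that equivalence, assuming the usual TCPServer/TCPClient API (the port number is arbitrary):

TCPServer server = TCPServer(8080);
TCPClient client;

void setup() {
    server.begin();
}

void loop() {
    client = server.available();   // hands back the one connected client, if any
    if (client.connected()) {
        server.write('y');         // on the Spark this goes to that same client...
        client.write('y');         // ...so it is the same as this
    }
}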
I also have a server app that will eventually fail. It does blocking reads and writes: it waits for a connection, then reads the request from that connection (the Spark supports 128-byte block reads), so I do a few reads, then I write the response to the client (one block write) and close the connection. I get about 20 minutes of this (transactions occur every 5 seconds). What happens then is that the Spark blocks for about 20 seconds, executes the main loop, blocks again for 20 seconds, and will live like that forever; it never accepts socket connections in that mode. A sketch of the transaction pattern is below.
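For reference, the blocking transaction pattern I am describing looks roughly like this (a sketch only; the port, buffer size, and canned response are illustrative, not my actual app):

TCPServer server = TCPServer(8080);

void setup() {
    server.begin();
}

void loop() {
    TCPClient client = server.available();
    if (client.connected()) {
        uint8_t request[384];
        int total = 0;
        // a few block reads; the Spark reads at most 128 bytes per call
        while (total < (int)sizeof(request) && client.available()) {
            int want = (int)sizeof(request) - total;
            if (want > 128) want = 128;
            int n = client.read(request + total, want);
            if (n <= 0) break;
            total += n;
        }
        const uint8_t response[] = "OK\n";
        client.write(response, sizeof(response) - 1);   // one block write
        client.stop();                                  // close the connection
    }
}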
This is not very nice to the client: when I use another Spark as the client, it panics when it can’t connect to the Spark running as a server.
I’ve gotten my TCPServer code (mentioned in the other thread) working pretty well now, but the Spark Core still dies very quickly if I don’t do a server.write().
I tested this out here just to be sure, and I noticed a hard fault when I ran a client write with what I think was an improperly terminated string, for example:
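(A reconstruction of the idea, not the exact snippet; I am assuming the C-string write() overload uses strlen(), as it does on Arduino.)

char reply[5] = { 'h', 'e', 'l', 'l', 'o' };   // note: no terminating '\0'
client.write(reply);   // strlen() walks past the end of reply[] hunting for a 0 byte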