Hard fault caused by the TCPServer example from the Spark docs and a simple Python client

Hey guys,

I’ve been having lots of issues with TCPServer-related code killing my Spark Cores, and it was suggested in #spark that I try the example from the documentation to see what happens. Using that example and a very simple Python client, I can make a Spark Core hard fault within seconds.

Here’s the client: http://pastebin.com/TjqHHr6K

And the example server can be found here: http://docs.spark.io/firmware/#tcpserver-tcpserver

The only modification to that is changing the port from 23 to 8888.

Is the Spark Core really just this bad at networking?

Hi @guppy

Let’s try to stay in this thread from now on; posting all over gets confusing.

Is your core up to date with the “deep update” TI CC3000 patch? A missing patch could cause this behavior. Instructions for applying the patch are in the docs:

http://docs.spark.io/troubleshooting/#deep-update

I have a very similar “telnet” type program that I tested for another forum member that ran all night for me.

@bko, I can confirm that the problem exists with patched cores. I have new and old (kickstarter w/ deep update) and it fails on all.

I’ve tried it on 3 cores, one with the TI patch.

Someone in #spark also tested it out.

Hi

Can you test this:

TCPServer server = TCPServer(23);
TCPClient client;

char addr[16];

void setup()
{
  server.begin();
  IPAddress localIP = WiFi.localIP();
  sprintf(addr, "%u.%u.%u.%u", localIP[0], localIP[1], localIP[2], localIP[3]);
  Spark.variable("Address", addr, STRING);
}

void loop()
{
  client = server.available();

  if (client.connected()) {
    client.println("Connected to Spark!");
    client.println(WiFi.localIP());
    client.println(WiFi.subnetMask());
    client.println(WiFi.gatewayIP());
    client.println(WiFi.SSID());
    client.println(">");

    while (client.connected()) {
      // echo every byte straight back (note: this blocks loop() while the client stays connected)
      while (client.available()) {
        client.write(client.read());
      }
    }
  }
}

Besides printing some extra information, how is that test different? I think the issue here is speed. If I slow down the rate at which I send data, it runs longer before failing.

Sorry, I should have mentioned that all my cores have at least the deep update applied to them and one is running newer than that.

@bko, the core panics right after sending “Connected to Spark!”.

Here is the python code to test

#!/usr/bin/python
 
import socket
 
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.1.105',23))
 
message = "0123456789\n"
 
# send as fast as possible, echoing back whatever the Core returns
while True:
        s.send(message)
        print s.recv(len(message))

Confirmed, that code dies for me also using my Python client.

Does it work better when you change the Python code to

      s.sendall(message)

It would also be good to catch exceptions from trying to send.

@bko The exception we receive is that the socket has been closed by the Spark, and that happens when the Spark panics. Basically, the remote computer sends a message and the Spark responds; the remote computer then attempts to send another message and the Spark is lost. Eventually the Spark panics (SOS, one blink: hard fault) and restarts. When the Spark panics, the remote client gets a “connection reset by peer” exception.

send/sendall won’t matter here; the Spark Core is hard faulting. I don’t need the Python library to keep trying to send (which is what sendall would do), because by then the socket is already dead.

Confirmed, sendall() makes no difference. As mentioned above, there’s no need to catch the exceptions; if one is raised it gets printed, and by then the Spark Core is already dead, so there’s nothing to recover from.

OK, I will lay out what I think is going on here. As I have noted before, the simple Spark example is not production-quality code; in particular, I think it is a “packet amplifier”. Some of this is speculation, but I think it is well founded. Let me explain.

When you send to the Spark from Python, you are almost certainly sending one TCP packet for the whole string. The example code then reads that data one byte at a time and calls client.write() on each byte, which today sends a one-byte packet: one packet in, many (11 in this case) out.

There are only so many packet buffers (used for TX and RX) on the TI CC3000 and when you run out, it causes the core to panic.

I think a much better way of handling this is to read all the available bytes and call client.write() once. I have not tested this sketched-out code, but you will get the idea:

unsigned long lastTime = millis();
uint8_t myBuffer[64];  // pick a size here
int p = 0;
bool overflow = false;

// gather everything the client has sent, with a 10 second overall timeout
while (client.available() && (millis() - lastTime < 10000UL)) {
  int Nbytes = client.available();
  for (int i = 0; i < Nbytes && p < (int)sizeof(myBuffer); i++) {
    myBuffer[p++] = client.read();
  }
  if (p >= (int)sizeof(myBuffer)) {  // buffer full -- handle the error here
    overflow = true;
    break;
  }
}

if (!overflow /* ...and not a timeout, etc... */) {
  client.write(myBuffer, p);  // one block write instead of many one-byte writes
}

That brings up my other thread that I’m still trying to figure out: https://community.spark.io/t/is-using-tcpserver-unstable-or-is-it-my-code/8674/16

I currently do only one server.write(‘y’) for every ~11 bytes I receive, and it is dying as well (not hard faulting, though; I’m not sure what it’s doing). If I edit that program to skip the server.write(‘y’) entirely, it dies in seconds.
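Roughly, the shape of that code is something like this (a minimal sketch, not the exact code from the other thread; the port and the 11-byte message size are just placeholders):

TCPServer server = TCPServer(8888);  // placeholder port
TCPClient client;

uint8_t msg[11];  // assumed fixed-size message
int pos = 0;

void setup()
{
  server.begin();
}

void loop()
{
  client = server.available();

  while (client.connected()) {
    while (client.available()) {
      msg[pos++] = client.read();
      if (pos == (int)sizeof(msg)) {  // got a full ~11-byte message
        server.write('y');            // one ack per message
        pos = 0;
      }
    }
  }
}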

I am not sure what that is doing either.

On Arduino there is an array of clients that the server holds so doing a server.write() writes to all of the clients in a loop.

On Spark, there is one client for each server (but you can get a new client if one comes and goes) so doing a server.write() is exactly the same as doing a client.write() with the client you get handed back from server.available().
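In other words, something like this (a minimal sketch; the port number is just a placeholder), where the last two writes end up going to the very same client:

TCPServer server = TCPServer(8888);  // placeholder port
TCPClient client;

void setup()
{
  server.begin();
}

void loop()
{
  client = server.available();  // the one (and only) connected client, if any

  if (client.connected() && client.available()) {
    uint8_t c = client.read();
    client.write(c);  // writes to that client...
    server.write(c);  // ...and on the Core this goes to the same single client
  }
}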

I also have a server app that will eventually fail. It does block reads and writes: it waits for a connection, reads the request from that connection (the Spark supports 128-byte block reads, so that takes a few reads), then writes the response back to the client in one block write and closes the connection. I get about 20 minutes out of it (transactions occur every 5 seconds). What happens in this case is that the Spark blocks for about 20 seconds, executes the main loop, blocks again for 20 seconds, and will live like that forever; it never accepts socket connections in that mode.
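The pattern is roughly this (a rough sketch, not my actual code; the port, buffer size, and timeout are placeholders, and I use byte reads here for simplicity where the real app uses 128-byte block reads):

TCPServer server = TCPServer(8888);  // placeholder port
uint8_t buf[128];                    // request buffer

void setup()
{
  server.begin();
}

void loop()
{
  TCPClient client = server.available();

  if (client.connected()) {
    // read the request (the real app knows when the request is complete;
    // here we just stop after a short timeout or when the buffer is full)
    unsigned long start = millis();
    int n = 0;
    while (client.connected() && (millis() - start < 2000UL) && n < (int)sizeof(buf)) {
      while (client.available() && n < (int)sizeof(buf)) {
        buf[n++] = client.read();
      }
    }

    if (n > 0) {
      client.write(buf, n);  // one block write for the response
    }
    client.stop();           // close the connection
  }
}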

This is not very nice to the client: when I use another Spark as the client, it panics when it can’t connect to the Spark running as the server.

Any news on this problem? Does anyone have a working implementation of TCPServer?

I’ve gotten my TCPServer code (mentioned in the other thread) working pretty well now; the Spark Core still dies very quickly, though, if I don’t do a server.write().

Heya @mtnscott,

I tested this out here just to be sure, and I noticed a hard fault when I ran a client write with what I think was an improperly terminated string, for example:

//works
#define chunkSize 256
uint8_t testPacket[chunkSize];
client.write(testPacket, chunkSize);

//throws an SOS -- write(char *) expects a null-terminated string,
//and this uninitialized buffer may not contain one
#define chunkSize 256
char testPacket[chunkSize];
client.write(testPacket);

Not sure if that helps or not, but I thought I’d pop in to try! :slight_smile:

Thanks,
David