Is using TCPServer unstable or is it my code?

guppy · December 9, 2014, 2:00am

So I’m attempting to use a number of Spark Cores (and hopefully, down the road, Photons) to control G35 Christmas lights, but I’m running into a stability problem that is killing the dream.

Here’s the code for my TCPServer: https://zaf.ca/tcpserver.txt. Usually within about 4 to 15 minutes, the Spark Core becomes unresponsive and eventually (usually) reboots. I’ve run several variations of the software to rule out issues, including commenting out the call to process_message() so that just the code in loop() is running.

I’ve also ruled out the library as the cause by running https://zaf.ca/test.txt for anywhere from several hours to several days.

Does anyone have any insight?

Thanks.

guppy · December 10, 2014, 3:44am

Well, it seems to be a problem with TCPServer and not my code.

I’m using a Python client to send commands to my Spark Core. After I added a one-second sleep after each send, the Spark Core survived for over 8 hours. I’ll start lowering that delay to find the interval at which it dies. The one-second sleep makes this project completely unusable at this point, but I’m holding out for the Photon.

Also, if I take out the server.write('y') from my sketch, the Spark Core dies within a few minutes. Does anyone know why that might be?

Hootie81 · December 10, 2014, 5:16am

From a quick look at your code, it seems you process the data as you read it? Is it possible to read it all at once very quickly and then process it? That might help take some load off the buffers.

guppy · December 10, 2014, 12:49pm

read() only returns one byte at a time, so I have to read as I go until I encounter a newline and then process it.
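
For readers following along, the read-until-newline pattern described above can be sketched as a small byte-at-a-time accumulator. This is a host-compilable illustration, not the code from tcpserver.txt: the `LineBuffer` class, its `feed()` method, and the length cap are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original sketch): collects bytes one at a
// time and reports when a full newline-terminated command is ready.
class LineBuffer {
public:
    explicit LineBuffer(std::size_t maxLen) : maxLen_(maxLen) {
        assert(maxLen > 0);  // a zero-length buffer could never hold a command
    }

    // Feed one byte, e.g. the value returned by read().
    // Returns true when '\n' arrives; the finished line is then in line().
    bool feed(char c) {
        if (ready_) {  // the previous line was already reported; start over
            line_.clear();
            ready_ = false;
        }
        if (c == '\r') return false;  // ignore CR so CRLF input also works
        if (c == '\n') {
            ready_ = true;
            return true;
        }
        if (line_.size() < maxLen_) line_.push_back(c);  // drop overflow bytes
        return false;
    }

    const std::string& line() const { return line_; }

private:
    std::size_t maxLen_;
    std::string line_;
    bool ready_ = false;
};
```

In a sketch, each byte returned by read() would be passed to feed(), and the command handler would run only when feed() returns true, so processing happens once per line rather than once per byte.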

My Python client will hopefully be connected for long periods of time, continuously sending commands to control my Christmas lights. I’m only sending around 500 bytes every 3 or 4 seconds, and I’m able to kill the Spark Core after about 18 minutes. Surely we can expect better than 500 bytes every 3 or 4 seconds.

Hootie81 · December 10, 2014, 1:08pm

Not sure where my brain was earlier… I see what you’re doing now.

Here is a function I use in another project that reads data from a serial port, up to a maximum (length) or until a carriage return or a line feed. Very similar to what you do.

I remember playing with the delay for days. If I made it any shorter, the transfer would get slower and slower until it was unusable, and the Core would disconnect from the cloud, flash cyan rapidly, and eventually reconnect.

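void readString(char *ptr, int length) {
    int pos = 0;
    char inChar; // declared locally here; the original post did not show its declaration

    while (!Serial.available()) SPARK_WLAN_Loop(); // wait for serial data to come in

    // stop one short of length so the terminator below always fits
    while (Serial.available() && pos < length - 1) {
        inChar = Serial.read();
        if (inChar == 0x0A || inChar == 0x0D) // line feed or carriage return ends the line
            break;
        ptr[pos] = inChar;
        pos++;
        delay(10); // the delay discussed above; shorter values slowed the transfer
    }
    ptr[pos] = '\0';

    while (Serial.available())
        (void)Serial.read(); // throw away anything left over
}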

guppy · December 10, 2014, 1:55pm

Yeah, that’s what is happening here too. If I read too fast, it eventually starts flashing cyan and will sometimes reconnect (after 15+ minutes). I’ll throw a delay into my code and see how it works. The bigger issue for me is that I don’t really want to do a server.write('y'), because I don’t need a response from the Spark Core, but leaving it out causes it to die even faster. I imagine the server.write is adding almost enough of a delay to keep things sort of happy.

The delay I’m using right now is on the client side. I’m not too concerned about rogue applications connecting to the port and breaking things, as these will not be internet-accessible when running in production.

guppy · December 10, 2014, 7:50pm

https://zaf.ca/tcpserver2.txt is an updated version of my code.

I added a delay(2) after the offset++ (and changed some other things not really important to this discussion), and I’m testing it now to see how stable it is. I had a 5ms delay and it ran for a few hours without issue. I dropped it to 1ms and it died, so I’m now working up from 1ms to 5ms.

guppy · December 10, 2014, 8:52pm

Oops, I left a delay(5) in there by accident, so I was really doing a 7ms delay.

Retesting with that extra 5ms taken out.

guppy · December 11, 2014, 1:46am

It runs longer with a 2ms delay but still dies. I’ll reset to a 5ms delay and let it run for a couple of days to see how stable it is.

guppy · December 12, 2014, 1:46pm

Well, 5ms still dies – it just takes longer.

Any chance Photons will ship sooner if the first batch of 10,000 sell out?

bko · December 12, 2014, 3:00pm

Hi @guppy

I don’t use TCPServer much (it has always been OK for me when I did), but lots of folks seem to be having problems with it.

One thing that was reported to help was loading the latest patch from TI onto the TI CC3000, but there are firmware changes that have to go along with the patch, so you would have to build locally from a special branch of the firmware for testing. @kennethlimcp started a thread about testing this here:

mtnscott · December 12, 2014, 8:02pm

@guppy @bko

I have been struggling with TCPServer for days. It appears to be unstable, and I have written a demonstration application that uses one Spark as a client and another as a server. If the client can’t connect to the server, it will panic. The server runs for anywhere from 20 to 90 minutes and then gets into a state where it barely runs the user loop (delays of around 20 seconds between runs) and won’t accept connections, which sends the client into a panic.

@bko, @kennethlimcp I would love to have you take a look at it. I have already given it to @Dave. It currently builds using the CLI. The tar file I have will get you going easily if you already have the CLI up and able to build. Let me know.

Here is the thread we have been working in on GitHub

guppy · December 12, 2014, 8:28pm

I didn’t see these replies until just now but I have a very simple server and Python client that can cause a Spark Core to hard fault in seconds: https://community.spark.io/t/hard-fault-caused-by-the-tcpserver-example-from-the-spark-docs-and-a-simple-python-client/8764

mtnscott · December 12, 2014, 9:03pm

@guppy Now why did I not think of that? How simple can you get with your example! I confirmed that you only get to 13 before it panics the server. I have tried the changes suggested by @gorsat with no benefit. Here is a thread on GitHub that also suggests there are problems with the implementation of TCPServer.

@Dave, @zachary, @bko, @kennethlimcp We are all working various threads on this topic. Here is a very quick demonstration of the failure. @gorsat, can you get this to work?

bko · December 12, 2014, 9:05pm

Let’s all try to stick to one thread please! I just posted on @guppy’s new thread that I have a very similar telnet-type program I tested for someone here and it ran well for me.

guppy · December 14, 2014, 5:29am

Does anyone know why the Spark Core hard faults within a few seconds if I don’t do a server.write('y')? I don’t really care to send data back to the client, but right now I have to unless I want to kill the Spark Core.

guppy · December 14, 2014, 3:27pm

https://zaf.ca/tcpserver3.txt has been running for 10 hours now. I modified the reading a bit based on @bko’s example code and also added a timeout on the connected() loop and the available() loop. Much better, and no delays were needed to get it here.
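
For anyone landing here later, the idea of putting a timeout on the connected() and available() loops might look roughly like the sketch below. It is an illustration under assumptions, not the code from tcpserver3.txt: `readLine`, `ReadStatus`, and the parameter names are hypothetical, `Client` stands for any type with connected(), available() and read() (such as the firmware's TCPClient), and `nowMs` stands for any millisecond tick source (such as millis). It is written to compile on a desktop so the logic can be checked without hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative status codes; not part of the Spark firmware.
enum class ReadStatus { Line, Timeout, Disconnected };

// Reads one newline-terminated line into `out`. Gives up with Timeout if no
// byte arrives for `idleTimeoutMs`, and with Disconnected if the client drops.
template <typename Client, typename Clock>
ReadStatus readLine(Client& client, Clock nowMs, unsigned long idleTimeoutMs,
                    std::size_t maxLen, std::string& out) {
    assert(maxLen > 0);
    out.clear();
    unsigned long lastByte = nowMs();
    while (client.connected()) {
        // Treat "nothing available" and a failed read the same way.
        int c = client.available() > 0 ? client.read() : -1;
        if (c >= 0) {
            lastByte = nowMs();  // any byte resets the idle timer
            if (c == '\n') return ReadStatus::Line;
            if (c != '\r' && out.size() < maxLen)
                out.push_back(static_cast<char>(c));
        } else if (nowMs() - lastByte >= idleTimeoutMs) {
            // Unsigned subtraction stays correct if the tick counter wraps.
            return ReadStatus::Timeout;
        }
    }
    return ReadStatus::Disconnected;
}
```

On a Timeout or Disconnected result the caller would typically close the client and go back to waiting for a new connection, so a stalled sender cannot hold the loop forever.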
