I was inspired by the local communication example, so now I am trying to build a local RESTful API.
I started humbly by creating a web server that echoes back the message that was sent to it.
Similarly to the other problems (mentioned below), I am running into stability issues.
Here is my code:
TCPServer server = TCPServer(8080);
char lastByteOfAddress[4];
bool on = false; // the state of the LED

void setup()
{
    server.begin(); // start listening
    // using api.spark.io to retrieve the last octet of the Core's IP address
    sprintf(lastByteOfAddress, "%d", int(Network.localIP()[3]));
    Spark.variable("ipaddress", lastByteOfAddress, STRING);
    delay(1000);
}

void loop()
{
    TCPClient client = server.available();
    if (client) {
        while (client.available()) {
            client.write(client.read()); // echo each received byte back
        }
        client.flush();
        client.stop();
    }
}
Calling the web server will usually give a result the first time, but subsequent calls become very unstable, either returning an empty result or timing out. Eventually the Core will not let itself be reflashed, and I have to do a factory reset.
Is this a known issue, or am I doing something wrong?
I also looked at these similar issues, but neither of them was using the Spark as a TCPServer.
I'm getting this too! I thought my cores were broken. When I read your post I realised I'm using TCPClient too; when I tested it last night and got it working, it was with a simple test app, not the one with TCPClient. Once I reflashed the project using TCPClient, the problem re-emerged.
Glad to see others are experiencing it too; I thought I was going insane.
Seeing the other posts and your experiences, there might be truth to the idea that the TCP stack is not as stable as we users expect it to be.
I read your post earlier; it says you retracted it. At the time it didn't look like a TCPClient issue to me.
The "requiring a factory reset to flash" issue is more general than the "TCPClient instability" issue. In your case you might have had both.
For example, I have had the "requiring a factory reset to flash" issue in simple firmware without TCPClient.
The only way out was either to:
restart the Core and reflash before the runtime error presents itself (this is only possible when there is enough time between the restart and the occurrence of the error), or
do a factory reset and reconnect the Core to the Wi-Fi network.
The common factors in these programs were that:
there were runtime errors in them (in my case an index that went out of bounds),
flashing did not work anymore after a certain time,
the color LED was still "sighing" the right color (the one for "connected to the cloud"), and
the issue went away when the code was corrected.
Perhaps somebody knows of an easy-to-use try/catch-like construct that can be used to make the built-in LED (D7) blink on a runtime error,
or maybe the multicolor LED could indicate a runtime error as an additional state.
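Exceptions are, as far as I know, disabled in this firmware, so the closest thing I can think of is a manual guard that traps the error and blinks D7; a rough sketch (panicBlink and writeSample are made-up names):

    // hypothetical guard: trap a bad index and blink D7 forever instead of corrupting memory
    void panicBlink()
    {
        pinMode(D7, OUTPUT);
        while (true) {
            digitalWrite(D7, HIGH);
            delay(100);
            digitalWrite(D7, LOW);
            delay(100);
        }
    }

    // usage: check the index before writing instead of catching afterwards
    void writeSample(char *buffer, int bufferSize, int index, char value)
    {
        if (index < 0 || index >= bufferSize) {
            panicBlink();
        }
        buffer[index] = value;
    }

Of course, spinning forever in panicBlink() would also stop the cloud connection from being serviced, so it only gives a visual clue; it does not solve the flashing problem.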
I had a similar problem, but never to the point that I had to do a factory reset; resetting a few times always helped (see my earlier post). From my point of view the TCP stack seems to be unstable. This is pretty weird, because I actually went down to the socket API, which is apparently provided by TI. I hope this can be fixed; without it the Spark Core is a bit useless…
Hey guys - the issue here is that the CC3000 connect() call blocks, so if you try to open a TCP socket and it fails, it blocks for 60 seconds, which basically kills the connection to the Cloud.
We're working on a fix for this, which will decouple the user application and the Spark code so they don't block each other. The fix is coming in a couple of weeks, since it's unfortunately not a quick solution.
Your post is regarding the more general "requiring a factory reset to flash" issue.
For that issue, I fully support the long-term fix you're making. That should make a world of difference.
From your reply, however, it seems that this will not be a fix for the TCPClient instability.
@zach: As the TCP instability is only part of the issue (the title covers both the "TCP instability" and the "cannot flash core" issues), maybe the TCP instability discussion should be split off into its own thread?
Regarding the instability issue, I have some additional information.
I have tried to debug this issue, specifically the long-blocking loop.
I had the blue LED alternate between on and off on each iteration of the loop (with a delay(250) added), and it kept on blinking at around 2x a second (meaning about 4 loop iterations occur per second).
Mind you, this is both before and after the TCP client got blocked.
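The heartbeat itself was nothing fancier than this (reconstructed from memory, so treat it as a sketch):

    bool on = false;

    void setup()
    {
        pinMode(D7, OUTPUT);                // the small blue on-board LED
    }

    void loop()
    {
        on = !on;
        digitalWrite(D7, on ? HIGH : LOW);  // heartbeat: toggles roughly twice a second
        delay(250);
        // ... the TCP handling goes here ...
    }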
I have tried several methods:
opening the TCPClient within the scope of loop(), reading all data until TCPClient.available() <= 0, and closing the client;
making the TCPClient variable global, opening it on the loop's first iteration, reading one character on each subsequent iteration until a final iteration ends the TCPClient (roughly the pattern sketched below);
and any number of variations between the two.
All solutions had the same outcome.
Long story short, the issue does not depend solely on the loop portion taking too long.
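For clarity, the second approach looked roughly like this (simplified from memory):

    TCPServer server = TCPServer(8080);
    TCPClient client;                        // global, so it survives across loop() iterations

    void setup()
    {
        server.begin();
    }

    void loop()
    {
        if (!client.connected()) {
            client = server.available();     // pick up a new connection, if any
        } else if (client.available() > 0) {
            client.write(client.read());     // echo one byte per iteration
        } else {
            client.flush();
            client.stop();                   // nothing left to read: end the client
        }
        delay(250);
    }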
@roderikv I believe this issue and the one I discuss in the other thread might be one and the same, in that server.available() may be blocking and therefore creates the same connectivity issues that long delays would cause.
Any suggestions on how to code around this issue?
The code above is the simplest version I could think of, and I cannot get it to work stably.
Or is there no workaround until the fix has been applied?
Thanks for your answer. I think we are not talking about the same issue here. Maybe I am wrong, but I think this is an instability of the TCP stack itself. I actually disabled the cloud completely by commenting out all the code related to the cloud. I even rewrote the SysTick function to only update the timers. I am not using the TCPClient anymore but went down to pure socket calls, and I still get the same issue.
All of that makes me believe that we are not talking about the same issue. I opened a thread with the title "Problem with TCP socket" where I posted my source code. The issue I am seeing happens after the first socket is closed: it then sometimes gets extremely slow or does not react anymore at all. I can provoke that after 2 seconds.
I think Roderik and I are talking about the same (second) issue here.
I'm encountering strange problems with the TCPClient too. I was originally getting the "couldn't reflash without a factory reset" issue, but now I'm getting strange instability related to TCP.
In my situation, I have the Core connecting to a Node server over TCP. Initially it all works fine, but my system needs to be able to handle network outages and re-establish connections. I periodically write a keep-alive character (that is ignored by the Node server) to the TCPClient. If the write returns -1, I know a reconnect is required.
However, once it reconnects, the Core gets stuck in a reboot cycle: each time it successfully connects to the Node server, it then resets itself, over and over. The only way to stop the cycle is to manually press the reset button (soft reset).
Incidentally, I wrote this on an Arduino first and ported it over. On the Arduino, TCPClient.connected() could reliably tell me whether the connection was active (after a write), but on the Core it tells me it is still connected even after an unsuccessful write, when the connection has clearly dropped.
// centralControl is a TCPClient object
centralControl.write(KEEP_ALIVE);       // keep-alive byte, ignored by the server
if (!centralControl.connected()) {
    centralControl.stop();
    // Serial.println("Not connected.");
    connect();                          // my own reconnect routine
}
Hi dermotos, I am seeing the same issue when reconnecting, though not the rebooting part (yet).
When you say it resets itself, do you mean that the TCP connection is reset, or the program?
If the Core reboots, that could indicate some programming error involving null pointers. I am not altogether convinced that the TCP stack can cause the Spark to reboot. But you never know...
@dermotos, I think we might be in similar positions with our projects.
If you don't mind, let me brain dump here for a minute; maybe something in my experience will help you guys.
I have the Core set up with a TCPClient and a local machine nearby running a Node.js server. A few things I've learned so far about doing this:
Calling client.connected() will return true even if the socket is closed. This is not very helpful in my experience, so I gave up on using it.
Calling client.connect() to a server that cannot be reached will cause the Core to hang for 60 seconds while it attempts to connect. This long delay breaks the connection to the cloud.
If you try to client.connect() to a server that's not available over and over in any sort of loop, the Core will become unreachable and you will need a factory reset (see the sketch after this list).
Left on their own, sockets from the Core only stay open for about 60 seconds.
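If you do need the Core to retry on its own, one thing that should at least soften the second and third points is to gate the attempts with a timer; a rough sketch (the address, port, and interval here are made up):

    IPAddress serverAddr(192, 168, 1, 100);        // placeholder address
    TCPClient client;
    unsigned long lastAttempt = 0;
    const unsigned long RETRY_INTERVAL = 30000;    // ms between connect attempts

    void loop()
    {
        if (!client.connected() && millis() - lastAttempt > RETRY_INTERVAL) {
            lastAttempt = millis();
            client.stop();                         // release the old socket first
            client.connect(serverAddr, 5555);      // still blocks ~60 s if the host is down,
                                                   // but at least it will not retry in a tight loop
        }
        // ... normal read/write handling here ...
    }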
In an attempt to keep the connection open longer, I was also using a keep-alive bit that I transmitted every N seconds from the Core to the server. This seemed to work okay for a while, but I noticed that I was actually opening upwards of 8 different sockets (definitely not my intention) and the stability of the sockets became questionable. Some sockets would stop sending data and would have to time out before allowing other data through. I blame this erratic behavior on my general lack of knowledge and experience with TCP, though.
What I've done instead (using the Local Communication example as a starting point) is never let the Core try to connect to the server on its own; it only connects when told to, through a public "connect" function exposed to the cloud. This means the Core can never brick itself if the server is not available.
Then, on the flip side, in the Node server I use the Sparky library to call the public "connect" method as soon as the server is created, and also on socket.close(). So far this approach has left me with a single open socket at any given time, no need for noisy keep-alive requests, and as soon as the socket drops after 60 seconds, it immediately opens right back up again.
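For what it's worth, the Core side of that pattern is roughly this (a minimal sketch; the server address, port, and function name are placeholders from my setup):

    IPAddress serverAddr(192, 168, 1, 100);        // placeholder address
    TCPClient client;

    // exposed to the cloud; the Node server calls it on startup and on socket.close()
    int connectToServer(String args)
    {
        if (client.connected()) {
            return 0;                              // already connected, nothing to do
        }
        client.stop();                             // release any stale socket
        return client.connect(serverAddr, 5555) ? 1 : -1;
    }

    void setup()
    {
        Spark.function("connect", connectToServer);
    }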
What I'm still missing is an interval timer that retries connecting the Core after N seconds if the public "connect" call fails. This should allow the connection to be re-established regardless of which side disconnects first.
And here's the kicker: I have had almost no firmware flashing issues using this configuration. It seems that if the Core has one stable open socket - without a lot of screwing about - the web IDE can locate the Core and flash it every time.
Sorry for the novel. I have some messy code on GitHub if anyone's interested in digging through it.
Thanks for the detailed report; it looks like there may be some issues here with our implementation of TCP, and possibly some issues due to the way the CC3000 API works. I'll add this to our backlog to look into in more detail.
@Zach: I am interested in getting the TCP connection stable as soon as possible, so I will certainly have a look myself at what's going on. Something I am not clear about is which part of this was written by Texas Instruments and which part was written by you. I am assuming that the TI parts should be stable - after all, it is a product used in many other products - so such an obvious dysfunction of the TCP stack seems unbelievable. My guess is that it is something specific to the Spark Core implementation. Any pointers about where the division between the two parts really lies would be helpful. Thanks!
The core-firmware repository is ours; it references the core-common-lib library, which includes many dependencies, among them the CC3000 host driver from Texas Instruments:
@Zach: So I finally found some time to play around with this. Here are my findings: it seems that the TI socket library has two problems. It cannot cope with read buffers above a certain size (and probably write buffers too?), and closing a socket too quickly after writing is also a problem.
Details: I have a program that uses the raw socket API to accept a connection, reads from the socket, writes out a pre-canned answer, and then closes the socket.
Initially my read buffer was 1024 bytes. That led to problems which manifested as connections taking longer and longer to be accepted, until finally nothing was accepted at all.
Unfortunately, setting the buffer to a smaller value (128) did not help by itself. In addition, a delay(10) between writing and closing the socket was needed. Neither change alone made any difference; together they fixed the issue I was seeing. After that it was possible to connect to the socket hundreds of times without delay. Great relief…
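For anyone running into the same thing, the accept/read/write path after those two changes looks roughly like this (the listening-socket setup is omitted, the response is a placeholder, and I am writing this from memory, so the exact CC3000 signatures may differ slightly):

    // assumes 'serverSocket' was already created with socket(), bound and set to listen()
    void serviceOneConnection(long serverSocket)
    {
        sockaddr clientAddr;
        socklen_t addrLen = sizeof(clientAddr);

        long clientSd = accept(serverSocket, &clientAddr, &addrLen);
        if (clientSd < 0) {
            return;                              // nothing waiting (or an error)
        }

        char buf[128];                           // 128 bytes instead of 1024
        int received = recv(clientSd, buf, sizeof(buf), 0);
        if (received > 0) {
            const char answer[] = "HTTP/1.1 200 OK\r\n\r\nhello";   // pre-canned reply (placeholder)
            send(clientSd, answer, sizeof(answer) - 1, 0);
        }

        delay(10);                               // give the CC3000 time to flush before the close
        closesocket(clientSd);
    }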