WiFi unstable, goes dead temporarily at varying intervals

I’m seeing a problem where the WiFi is pretty unstable.

My app is pretty simple: using WebServer.h from the community library, it flips relays on a relay shield in response to specific HTTP GET requests.

I received my core and relay shield almost a week ago, along with a second core for a friend of mine. Both cores are experiencing the problems that I am going to describe in detail. Also, in order to rule out outside causes, I have tested this on two completely different WiFi networks (different routers from different manufacturers) in different buildings.

Frequency:

Varies. In one day, 1 occurrence in 2 hours was followed by 5 occurrences in one hour, and then by 1 occurrence over the next hour.

Behavior:

The core stops responding to all requests (including HTTP GETs and Spark Core firmware updates from the cloud), but the status light continues to breathe cyan. After anywhere from a few seconds to five minutes, the status light begins flashing cyan for a variable amount of time, and sometimes it turns off completely (for up to 30 seconds). After this the status light begins breathing cyan again and the core begins to respond normally.

Troubleshooting steps and results:

I wrote a small bit of code to ping the local gateway every 60 seconds and flip a relay if it failed. Upon the next occurrences of the problem, the relay did not flip when the core stopped responding, and did not do so until the status light began to breathe cyan.

Code is below. Any help is appreciated.

(I originally posted this in the “Cyan of death” thread as it is similar to a problem that someone else posted in that thread, but Zach asked me to post this in a new thread with my code)

// *** SOURCE INCLUDES ***
#include "HttpClient/HttpClient.h"
#include "WebServer/WebServer.h"


// *** DECLARE / INITIALIZE ***

// Begin: Declare/Initialize for relay shield
    int RELAY1 = D0;
    int RELAY2 = D1;
    int RELAY3 = D2;
    int RELAY4 = D3;
// End:   Declare/Initialize for relay shield


// Begin: Declare/Initialize for ping loop variables

    int lastMinute;
    int tenSecondResetWifi = 0;

// End:   Declare/Initialize for ping loop variables



// Begin: Declare/Initialize for WebServer, require WebServer.h

    /* all URLs on this server will start with / because of how we
     * define the PREFIX value.  We also will listen on port 80, the
     * standard HTTP service port */
    #define PREFIX ""
    WebServer webserver(PREFIX, 80);
    
// End:   Declare/Initialize for WebServer


// Begin: Declare/initialize WebClient, requires WebClient.h

    unsigned int nextTime = 0;    // Next time to contact the server
    HttpClient http;
    
    // Headers currently need to be set at init, useful for API keys etc.
    http_header_t headers[] = {
        //  { "Content-Type", "application/json" },
        //  { "Accept" , "application/json" },
        { "Accept" , "*/*"},
        { NULL, NULL } // NOTE: Always terminate headers with NULL
    };
    
    http_request_t request;
    http_response_t response;

// End:   Declare/initialize WebClient


// *** FUNCTIONS ***

// Begin: Functions for WebServer. require Webserver.h
    // This command is the default, executed when only / is sent
    void defaultCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // Assemble HTML
        P(message) = 
            "<html>"
           "<head><title></title></head>"
           "<style>"
           "  body { background: black; font-family: Ariel; color: #CCCCCC; }"
           "  a { color: #00CCCC; text-decoration: none; }"
           "</style>"
           "<body>"
           "<h2>"
           "<center>"
           "<h1>III-Core-G</h1>"
           "<a href=\"pulse.html\">pulse.html</a><br><br>"
           "<a href=\"enabled.html\">enabled.html</a><br><br>"
           "<a href=\"disabled.html\">disabled.html</a><br><br>"
           "<a href=\"eventGhostStatus.html\">eventGhostStatus.html</a><br><br>"
           "<a href=\"clearStatus.html\">clearStatus.html</a><br><br>"
           "<a href=\"setRebootStatus.html\">setRebootStatus.html</a><br><br>"
           "<a href=\"reportStatus.html\">reportStatus.html</a><br><br>"
           "<br>"
           "</center>"
           "</h2>"
           "</body>"
            "</html>"
            ;
        server.printP(message);
        
        return;
    }

    // This command is executed when /clearStatus.html is sent
    void clearStatusCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        // Go low on RELAY3
        digitalWrite(RELAY3, LOW);
        // Go low on RELAY4
        digitalWrite(RELAY4, LOW);
        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");

        return;
    }   
    
    // This command is executed when /disabled.html is sent
    void disabledCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        // Go high on RELAY2
        digitalWrite(RELAY2, HIGH);
        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");
        
        return;
    }    
    
    // This command is executed when /enabled.html is sent
    void enabledCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {


        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        // Go low on RELAY2
        digitalWrite(RELAY2, LOW);
        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");
        
        return;
    }    

    // This command is executed when /eventGhostStatus.html is sent
    void eventGhostStatusCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        sendEventGhostStatus("index.html?III-Core-G-statusResponse");
        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");
        
        return;
    }
       
    // This command is executed when /pulse.html is sent
    void pulseCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        // Go HIGH on RELAY1 for 500 ms, then back to LOW
        digitalWrite(RELAY1, HIGH);
        delay(500);
        digitalWrite(RELAY1, LOW);

        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");
        
        return;
    }    

    // This command is executed when /reportStatus.html is sent
    void reportStatusCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // Assemble HTML
        String myMessage =
            "<!DOCTYPE html><html>"
            "   <head>"
            "   <title>III-Core-G</title>"
            "   <style>"
            "       body { background: black; font-family: Ariel; color: #CCCCCC; }"
            "       a { color: #00CCCC; text-decoration: none; }"
            "   </style>"
            "   </head>"
            "   <body>"
            "   <center><h1>III-Core-G Status</h1></center>"
            "   <h2>"
            "   Network:        ";
        myMessage += WiFi.SSID();
        myMessage += "<BR>   Strength:       ";
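        // Convert the RSSI to a String first; adding an int to a string
        // literal would do pointer arithmetic rather than concatenation.
        myMessage += String(WiFi.RSSI());
        myMessage += "   "
            "<br>   IP Address:     ";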
        IPAddress myIP = WiFi.localIP();
        myMessage += String(myIP[0]) + "." + String(myIP[1]) + "." + String(myIP[2]) + "." + String(myIP[3]);
        myMessage += "<br>   Subnet:         ";
        myIP = WiFi.subnetMask();
        myMessage += String(myIP[0]) + "." + String(myIP[1]) + "." + String(myIP[2]) + "." + String(myIP[3]);
        myMessage += "<br>   Gateway:        ";
        myIP = WiFi.gatewayIP();
        myMessage += String(myIP[0]) + "." + String(myIP[1]) + "." + String(myIP[2]) + "." + String(myIP[3]);
        myMessage += "   ";
        //    "<br>   MAC:            "
        //WiFi.macAddress())"
        myMessage += "<br><br> lastMinute:       ";
        myMessage += lastMinute;

        myMessage += "   </body>"
            "   </html>";

        server.print(myMessage);
    }

    // This command is executed when /setRebootStatus.html is sent
    void setRebootStatusCmd(WebServer &server, WebServer::ConnectionType type, char *, bool) {

        // If type isn't GET, stop
        if (type == WebServer::HEAD) return;
        else if (type == WebServer::POST) return;

        // Go high on RELAY4
        digitalWrite(RELAY4, HIGH);
        // Go back to previous page
        server.httpSeeOther("javascript:javascript:history.go(-1)");
        
        return;
    }    

// End:   Functions for WebServer. require Webserver.h


// Begin: Functions for Webclient

    void sendEventGhostStatus(String path) {

        request.hostname = "192.168.1.10";
        request.port = 80;
        request.path = path;

        http.get(request, response, headers);
    }
    
// End:   Functions for Webclient
    
void setup() {
    
    // Begin: Setup for relay shield

        // Initialize the relay control pins as output
        pinMode(RELAY1, OUTPUT);
        pinMode(RELAY2, OUTPUT);
        pinMode(RELAY3, OUTPUT);
        pinMode(RELAY4, OUTPUT);
        // Initialize all relays to an OFF state
        digitalWrite(RELAY1, LOW);
        digitalWrite(RELAY2, LOW);
        digitalWrite(RELAY3, LOW);
        digitalWrite(RELAY4, LOW);
    // End:   Setup for relay shield

    // Begin: Setup for WebServer, requires WebServer.h
    
        /* register our default command (activated with the request of
         * http://x.x.x.x/cmd */
        webserver.setDefaultCommand(&defaultCmd);
        webserver.addCommand("clearStatus.html", &clearStatusCmd);
        webserver.addCommand("disabled.html", &disabledCmd);
        webserver.addCommand("enabled.html", &enabledCmd);
        webserver.addCommand("eventGhostStatus.html", &eventGhostStatusCmd);
        webserver.addCommand("pulse.html", &pulseCmd);
        webserver.addCommand("reportStatus.html", &reportStatusCmd);
        webserver.addCommand("setRebootStatus.html", &setRebootStatusCmd);
    
        /* start the server to wait for connections */
        webserver.begin();
    // End:   Setup for WebServer, requires WebServer.h
    
}

void loop() {

    // Begin: Loop for WebServer, requires WebServer.h
        // process incoming connections one at a time forever
        webserver.processConnection();
    // End:   Loop for WebServer, requires WebServer.h

}

I have had similar experiences, where the core continues to breathe cyan but the user loop is not being executed. You get the long pause, CFOD and reset. In the firmware there is some code that resets the core if it detects a CFOD. (You can also disable this reset.)

One thing you can do is compile the code with debug output enabled and post the results here for us to look at. Another thing to try is to disable the cloud functionality and see how long your core can stay alive. That actually seems to help a little, but then you lose remote programming capability.

At the current state of the firmware, it appears that only the baked in cloud functions work with any reliability. Non-cloud network functionality is a point of frustration for many people right now.

Hope those couple of suggestions help. The debug output would be interesting to see.

Hi @iiiman,

Just as a sanity check, have you tried updating your CC3000 module with the CLI, or with deep update?

I read through your code, and I also wonder whether your myMessage concatenation block might be causing lots of RAM to be used. Might be silly, but have you tried creating it in a single large block, or streaming it with lots of server.print calls?

Thanks,
David

@Dave, even though I was sure it wouldn’t help, I went ahead and updated the cc3000 module with the CLI before my first post, just in case. Unfortunately there was no change in behavior.

I guess it is possible that the myMessage concatenation block is using too much RAM, but this problem persists even if I remove that code completely and go with very basic functionality.

Thanks for the suggestions though, any thoughts and suggestions are definitely welcome… keep them coming. I really want this to work and so far I am very concerned that it appears that direct use of the Spark Cloud is required for any app that is to be stable and reliable. I would honestly love it if the problem would turn out to be some dumb mistake in code that I wrote :stuck_out_tongue_winking_eye:

@iiiman - could you prune your example code down to the minimum needed to cause the problem?

Not sure if this is relevant, but perhaps it has to do with some data not fully read? https://community.spark.io/t/sending-emails-from-the-core-locking-up/2545/6?u=misternetwork

Not sure if it is similar to the webserver.processConnection() call quoted from @iiiman’s first post.

@MisterNetwork, that is the same thing I have been thinking on this. I pulled that method from all of the different Webserver.h examples listed in the Community Library, so I have scrapped the code in favor of writing a more basic webserver from scratch, without a library, and testing as I go. That way, if the problem is physical, I will find the minimal code needed to reproduce the error that @mdma suggested I look for… and if the problem is in that library then I shouldn’t have the same problem.

I still have all of the code that was causing the problem, I just started a new app, so I can go back to it if I fail miserably :smile:

My first goal will be to simply have a very simple webserver that echoes any request it gets back to the client. Once I get to that point I can begin to parse the request and have different requests call simple functions to flip the relays or report information as necessary.

In the meantime, thanks for the suggestions… they are still welcome, and when I get some results from this endeavor I will let you guys know.

Not a solution, but a link to more people with a similar issue …
Cyan Breathing & unresponsive

Update:

I realized that I am setting up my Spark Core to respond only to one server anyway, so I moved the actual webpage to the server. Aside from that I left my code intact and added a function to send an HTTP heartbeat to the server every 30 seconds (I am going to increase it to 2 minutes, but for the purposes of testing it’s short now). It ran for 14 hours with only 2 interruptions of about 3 minutes each. After that, I ran a function that disconnected the Spark Core from the cloud, and it is currently on its second day of continuous running without missing a beat. I have been making sure to trigger the functions the same way as I would if it were installed. So far it seems solid.

I’m going to leave it in this state for a few days… fingers crossed.

And we’re back…

I have torn apart my code and rewritten much of it multiple times as an attempted exercise in efficiency. After a good bit of troubleshooting and retesting, here are just a few of the things I did to clean up my code and setup:

  • I removed a function call that caused the core to connect TCPclient to a webserver while there was already a client connected to the core TCPserver.
  • I simplified the TCPclient connection to the webserver so that there are now no loops, if logic, or concatenation while the TCPclient is connected to the webserver.
  • I changed to a different webserver, from a simplistic and slightly buggy EventGhost setup to an Apache server on a Raspberry Pi purpose-built for this application.

I also went ahead and used D7 and the core LED to determine which connection (incoming or outgoing TCPclient) was causing a problem. When a TCPclient is connected to the core TCPserver, the core LED changes color. Likewise, when the core opens a TCPclient connection to the webserver, D7 is lit. This indicator has shown me that there is a problem when the core opens a TCPclient connection to the webserver, and it seems that sending the same message to the webserver fails at a seemingly random frequency.

I have the core opening a TCPclient connection to the webserver and sending a message formatted as an HTTP GET. This function is called from the loop function (roughly) every minute, and whenever the core receives a request to perform an action. The webserver interprets the message and logs it.

Here is the code that opens the TCPclient connection to the server.

// Send http GET
digitalWrite(INDICATORLED, HIGH);
TCPClient client;
client.connect(hostname, port);
client.print("GET /scripts/response.php?type=");
client.print(operationPerformed);
client.print("&message=");
client.print(WiFi.SSID()); // WiFi SSID of connected network
client.print("_");
client.print(WiFi.RSSI()); // WiFi signal strength
client.print("_");
client.print(Spark.connected()); // Status to determine if connected to Spark Cloud
client.print("_");
client.print(digitalRead(RELAY1)); // Status of RELAY1
client.print("_");
client.print(digitalRead(RELAY2)); // Status of RELAY2
client.print("_");
client.print(messageCounter); // logCounter
client.print(" HTTP/1.0\r\n");
client.print("Connection: close\r\n");
client.print("Host: ");
client.print(hostname);
client.print("\r\nContent-Length: 0\r\n\r\n");
delay(10);
client.flush();
client.stop();

// Turn off INDICATORLED
digitalWrite(INDICATORLED, LOW);

Somewhere between the digitalWrites to INDICATORLED (which equals D7), the core is getting hung up and can hang for several minutes, as indicated by the D7 LED remaining lit. After the first time this happens after the core has been reset, the core is very unstable, and usually (but not always) ignores most connection attempts to the core TCPserver.

I’m hoping I’m just making some simple mistake here instead of the TCPclient simply being unstable.

Here is the full code:

SYSTEM_MODE(SEMI_AUTOMATIC);

// *** DECLARE / INITIALIZE

// Declare heartbeatTimer
    int heartbeatTimer;
    
// Declare logCounter
    int messageCounter = 0;
    
// Declare / Initialize bootNotificationSent
    bool bootNotificationSent = false;

// Declare / Initialize disconnectFromCloudOnBoot
    bool disconnectFromCloudOnBoot = false;

// Begin: Declare/Initialize for relay shield
    int RELAY1 = D0;
    int RELAY2 = D1;
    int RELAY3 = D2;
    int RELAY4 = D3;
    int INDICATORLED = D7;
// End:   Declare/Initialize for relay shield

// Begin: Declare / Initialize for TCPServer
    TCPServer server = TCPServer(80);
// End:   Declare / Initialize for TCPServer

// *** FUNCTIONS

    // Function used to return indexed substring from client
    String getValue(String data, char separator, int index) {
        int found = 0;
        int strIndex[] = {
            0, -1      };
        int maxIndex = data.length()-1;
    
        for(int i = 0; i <= maxIndex && found <= index; i++) {
            if(data.charAt(i) == separator || i == maxIndex) {
              found++;
              strIndex[0] = strIndex[1]+1;
              strIndex[1] = (i == maxIndex) ? i+1 : i;
            }
        }
      return found > index ? data.substring(strIndex[0], strIndex[1]) : "";
    }

    // Function to perform operation received from client
    void performOperation(String operation) {
        // Turn on INDICATORLED while sending message
        char hostname[] = "10.3.2.106";
        int port = 80;
        int clientTimeoutCounter = 0;
        String operationPerformed = "";
        // Perform actions based on operation
        if (operation == "statusHeartbeat") {
            // Report that Core has booted before first heartbeat
            if (bootNotificationSent == false) {
                operationPerformed = "SparkCoreBoot";
                bootNotificationSent = true;
                // Disconnect from cloud if disconnectFromCloudOnBoot is true
                if (disconnectFromCloudOnBoot) Spark.disconnect();
            } else {
                // Every 15th status, reset counter
                if (messageCounter < 15) {
                    messageCounter++;
                } else {
                    messageCounter = 0;
                }
                operationPerformed = "status";
            }
        } else if (operation == "pressGarageButton") {
            digitalWrite(RELAY1, HIGH);
            delay(500);
            digitalWrite(RELAY1, LOW);
            operationPerformed = "garageTriggered";
        } else if (operation == "disable") {
            digitalWrite(RELAY2, HIGH);
            operationPerformed = "garageButtonDisabled";
        } else if (operation == "enable") {
            digitalWrite(RELAY2, LOW);
            operationPerformed = "garageButtonEnabled";
        } else if (operation == "sparkCloudConnect") {
            Spark.connect();
            operationPerformed = "SparkCoreConnectedToCloud";
        } else if (operation == "sparkCloudDisconnect") {
            Spark.disconnect();
            operationPerformed = "SparkCoreDisconnectedFromCloud";
        } else if (operation == "resetSparkCore") {
            operationPerformed = "SparkCoreReset";
        } else {
            // Invalid command, abort
            return;
        }
        // Send http GET
        digitalWrite(INDICATORLED, HIGH);
        TCPClient client;
        client.connect(hostname, port);
        client.print("GET /scripts/response.php?type=");
        client.print(operationPerformed);
        client.print("&message=");
        client.print(WiFi.SSID()); // suffix[0]: WiFi SSID of connected network
        client.print("_");
        client.print(WiFi.RSSI()); // suffix[1]: WiFi signal strength
        client.print("_");
        client.print(Spark.connected()); // suffix[2]: Status to determine if connected to Spark Cloud
        client.print("_");
        client.print(digitalRead(RELAY1)); // suffix[3]: Status of RELAY1
        client.print("_");
        client.print(digitalRead(RELAY2)); // suffix[4]: Status of RELAY2
        client.print("_");
        client.print(messageCounter); // suffix[5]: logCounter
        client.print(" HTTP/1.0\r\n");
        client.print("Connection: close\r\n");
        client.print("Host: ");
        client.print(hostname);
        client.print("\r\nContent-Length: 0\r\n\r\n");
        delay(10);
        client.flush();
        client.stop();

        // Turn off INDICATORLED
        digitalWrite(INDICATORLED, LOW);

        // In the case of operation == resetSparkCore, reboot Core after connection is closed
        if (operation == "resetSparkCore") Spark.sleep(SLEEP_MODE_DEEP,5);
    }

void setup() {

    // Connect to Spark Cloud (System mode is set to semi-automatic)
        Spark.connect();

    // Begin: Setup for relay shield

        // Initialize the relay control pins as output
        pinMode(RELAY1, OUTPUT);
        pinMode(RELAY2, OUTPUT);
        pinMode(RELAY3, OUTPUT);
        pinMode(RELAY4, OUTPUT);
        pinMode(INDICATORLED, OUTPUT);
        // Initialize all relays to an OFF state
        digitalWrite(RELAY1, LOW);
        digitalWrite(RELAY2, LOW);
        digitalWrite(RELAY3, LOW);
        digitalWrite(RELAY4, LOW);
        digitalWrite(INDICATORLED, LOW);
    // End:   Setup for relay shield

    // Start the server
    server.begin();

    // Initialize serial connection
//    Serial.begin(9600);

    // Initialize heartbeatTimer
    heartbeatTimer = Time.now();
}

void loop() {

    int index = 0;
    int BUFSIZE = 255;
    char serverClientline[BUFSIZE];
    TCPClient serverClient = server.available();
    String operation = "";
    // IF for heartbeat functionality
    if (Time.now() > heartbeatTimer + 60) {
        heartbeatTimer = Time.now();
        performOperation("statusHeartbeat");
    }
    // IF for HTTP server functionality
    if (serverClient) {
// Serial.println("--- new client");
RGB.control(true);
RGB.color(255, 127, 0);
        bool firstLineNotCompleted = true;
        bool newLine = false;
        // an http request ends with a blank line
        while (serverClient.connected()) {
            if (serverClient.available()) {
                char c = serverClient.read();
                // Store first line from client to clientline
                if (c != '\n' && c != '\r' && firstLineNotCompleted) {
                    serverClientline[index] = c;
                    index++;
                    if(index >= BUFSIZE) {
                        index = BUFSIZE -1;
                    }
                    continue;
                } else {
                    firstLineNotCompleted = false;
                }
                // If a blank line is reached, respond appropriately
                if (c == '\n' && newLine) {
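                    // Terminate the buffer before converting it to a String;
                    // index is capped at BUFSIZE - 1, so this stays in bounds
                    serverClientline[index] = '\0';
                    String urlString = String(serverClientline);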
                    String urlSubString = urlString.substring(urlString.indexOf('/'),urlString.indexOf(' ',urlString.indexOf('/')));
                    operation += getValue(urlSubString, '/', 1);
// Serial.println(operation);
                    break;
                } else if (c == '\n') { // If it's a new line, set newLine = true
                    newLine = true;
                } else if (c != '\r') { // If it's any other character, set newLine = false
                    newLine = false;
                }
           }
        }
        // give the web browser time to receive the data
        delay(10);
        // close the connection:
        serverClient.flush();
        serverClient.stop();
// Serial.println("--- client disconnected");
RGB.control(false);

    }
    // IF for HTTP client functionality
    if (operation != "") performOperation(operation);
}

From a quick flick over the connection code, I can see

TCPClient client;

That should be declared before setup(), not in the loop.

Do a
client.stop();
in its place, just in case it gets left open somehow.

Add an if statement to the connect call: if it succeeds, do the client.prints (and set the LED low); otherwise flush and stop the client (and leave the LED lit so you know it didn’t connect).
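A minimal sketch of that suggestion follows. It is written as a template so it can be compiled off-device: `FakeClient` is a made-up stand-in for illustration only, and `sendIfConnected` is a hypothetical name. On the Core the same function would be called with a TCPClient, assuming its connect() reports success as a non-zero value.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for a TCP client so the sketch compiles off-device.
// It is NOT the Spark TCPClient; it only mimics the calls used below.
struct FakeClient {
    bool reachable;    // whether connect() should succeed
    std::size_t sent;  // number of characters "sent"
    bool stopped;      // whether stop() was called

    bool connect(const char*, int) { return reachable; }
    void print(const char* s) { sent += std::strlen(s); }
    void flush() {}
    void stop() { stopped = true; }
};

// Sends the request only if the connection was established.
// Returns true on success so the caller can decide when to clear the LED.
template <typename Client>
bool sendIfConnected(Client& client, const char* host, int port,
                     const char* request) {
    bool ok = false;
    if (client.connect(host, port)) {
        client.print(request);
        ok = true;
    }
    client.flush();
    client.stop();  // always release the socket, connected or not
    return ok;
}
```

The caller would set the indicator LED high before the call and set it low only when the function returns true, so a lit LED marks a failed connection attempt.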

See how you go with those changes and let us know

Thanks for the suggestions. I made those changes, but unfortunately the core is just as unstable as before, through 5 re-flash and resets.

How does it go if you change the mode to automatic? There are some others with similar issues when running in semi-automatic.

It has the same issues. I originally moved it to semi-automatic in an effort to cut down on the core communications, because the core became unusable when it couldn’t reach the Spark Cloud at all one day. I’ve done frequent tests with auto vs semi, and semi has had a slightly higher percentage of uptime during tests (though that is just what I’ve observed).

The only thing I can suggest is to try a different method of flushing the TCP client. It might not make any difference, but hey, give it a go:

while (client.available()) client.read();

oh and another thing… may help may not??

and similar

I appreciate the suggestions. I have looked into adding a delay after the client prints in the past, and even tried adjusting that time; at one point I even had a timed loop that waited up to 1000 ms for the client to close on its own, all with no improvement in stability.

I’m also curious to see what Spark’s support personnel suggest… hopefully @Dave will forgive me for calling him out here.

Hi @iiiman

I read through your code quickly and have a question: Are you trying to have both a TCPServer and TCPClient connection to different hosts at the same time? I can’t see clearly from the code what the lifetimes of the objects are meant to be.

It seems like some of your code is a web server but there is also a HTTP GET with parameters to a local host. I agree with @Hootie81 that you are swimming upstream when you try to stack allocate a TCPClient for the GET request. Obviously TCPServer gives you a client, so you should use that one.

Also String.substring is a memory intensive operation since it allocates new memory for every substring.
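One allocation-free alternative is to pick the command out of the request line with the C string functions and copy it into a fixed buffer. This is a minimal sketch under stated assumptions: `extractCommand` is a hypothetical helper, and it assumes the request line looks like the ones in this thread ("GET /command HTTP/1.1"), taking only the first path segment as the original getValue call does.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the first path segment of an HTTP request line
// into `out`. Returns false if there is no segment or it does not fit.
bool extractCommand(const char* requestLine, char* out, std::size_t outSize) {
    if (outSize == 0) return false;
    out[0] = '\0';

    const char* start = std::strchr(requestLine, '/');
    if (start == nullptr) return false;
    ++start;  // skip the leading '/'

    // The segment ends at the next space, slash, or line ending.
    std::size_t len = std::strcspn(start, " /\r\n");
    if (len == 0 || len >= outSize) return false;

    std::memcpy(out, start, len);
    out[len] = '\0';
    return true;
}
```

The result can then be compared with strcmp against the known command names, so no String objects are created while the client is connected.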

Are you seeing red SOS LED flashes?

Check out my threads. In my TCP one I solved this through a number of things, one of which was using write instead of print.

My Webserver library based device was so unstable that most often it didn’t run longer than a couple of minutes. The first major stability improvement I achieved by calling Spark.disconnect() somewhere in the setup(). Still the device would crash within a day, often less than 10 hours.

Now I’ve made a minor change that appears to make a major improvement. At the moment my Spark Core has been running for more than 50 hours straight without having to reset. The minor change (if I remember well) is that I’ve changed the order of some calls. Most of the action URLs in my server are called to perform some action.
Previously the command handling callback functions executed the action (e.g. beep ten times) and then at the end called the function server.httpSeeOther("/") to redirect the client back to the index page. Now I changed that by first calling server.httpSeeOther("/") and then executing the action. Somehow this makes a big difference.
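The reordering described above can be sketched as follows. `RecordingServer` is a made-up stand-in that only records the order of events, and `respondThenAct` is a hypothetical name; with the real library the first argument would be the WebServer passed to the command callback. Whether the redirect actually reaches the browser before the action runs depends on how the library buffers output, which this sketch does not show.

```cpp
#include <string>
#include <vector>

// Stand-in that records the order of events; not part of any library.
struct RecordingServer {
    std::vector<std::string> events;
    void httpSeeOther(const char* url) {
        events.push_back(std::string("redirect:") + url);
    }
};

// Queues the redirect first, then runs the (possibly slow) action.
template <typename Server, typename Action>
void respondThenAct(Server& server, Action action) {
    server.httpSeeOther("/");
    action();
}
```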

I assume that the big difference in stability is linked to the delays used in executing the action. Those delays make the server less responsive and I presume that somehow the lack of timely response data caused the CC3000 driver to block. Note that it didn’t misbehave for every call to the action URL, but only every now and then. However, my logfiles showed that often it would stall within half a minute after responding to an action URL and then time-out.

FYI those time-outs were most often for exactly 60 seconds (not 58, not 62, but exactly 60!) before returning control to my code. Less often it was a precise 20-second time-out. Such long time-outs would then be followed by an automatic reset by my own watchdog routine.
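Time-outs like these can be measured by reading a millisecond tick before and after the call that blocks. The following is a minimal sketch and not the watchdog routine mentioned above: `timeBlockedMs` and `isStall` are hypothetical names, and `clockMs` is assumed to be any callable returning a millisecond counter, such as millis() on the Core.

```cpp
#include <cstdint>

// Hypothetical helper: runs `call` and returns how long it blocked, in ms.
// The unsigned subtraction stays correct if the tick counter wraps.
template <typename Clock, typename Call>
uint32_t timeBlockedMs(Clock clockMs, Call call) {
    uint32_t start = static_cast<uint32_t>(clockMs());
    call();
    return static_cast<uint32_t>(clockMs()) - start;
}

// Hypothetical policy: anything at or above the threshold counts as a stall.
inline bool isStall(uint32_t blockedMs, uint32_t thresholdMs) {
    return blockedMs >= thresholdMs;
}
```

Logging the measured duration for each network call would show whether the 20-second and 60-second values cluster around one particular call.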

Edit: as an experiment I just switched the Spark cloud back on to see how long it will run without a reset. In my server I’ve added a handy function that lets me toggle the connection to the cloud by pressing the mode button. Next to that, my server features a heartbeat blink & blip. These are made using an active buzzer and LED on D0, which are switched on for one ms every second. Already I can hear that the one-second blip sounds less regular than before. Looking at the LED I see that every few seconds it blinks less brightly. Could this be timing related? I can imagine that with the Spark cloud connected, the 1 ms delay gets to be less precise…

Edit 2: It just stopped. :frowning:
In the logfile I can see that it stopped close to 17 minutes after connecting to the cloud. It ended again with a 60 second time-out (60.861 sec), which occurred about four minutes after the last page-request, so seemingly spontaneous and unrelated to the request. My watchdog resets the core automatically and now it’s running again with the Spark cloud disconnected. The heartbeat blip sounds more regular and the blinks are of similar brightness.

@bko, I am not having two connections open at once. At one time I did, but in an effort to make things more efficient I made some changes to make sure that the connection to the core’s TCPserver is closed before the core opens a connection out to the webserver that will write the log.

Here is an explanation of how I intend my core to function:

  • The core waits to receive an HTTP GET and functions like a REST server (it parses everything to the right of the “/” in the HTTP GET and accepts it as a single command).
  • Based on that single command, it performs whatever task is assigned to that command and then sends a message in the form of an HTTP GET to a webserver. The message includes the command and a status of the core.
  • In addition to the above, every 60 seconds the core sends a status message in the form of an HTTP GET to the webserver that is keeping the log.
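The 60-second schedule in the last bullet could be driven by a small interval helper. This is a minimal sketch rather than the code from this thread: `IntervalTimer` and `due` are hypothetical names, and `nowMs` is assumed to be a millisecond tick such as the one millis() returns. The unsigned subtraction keeps the comparison correct when the tick counter wraps around.

```cpp
#include <cstdint>

// Hypothetical helper: reports when a fixed interval has elapsed.
struct IntervalTimer {
    uint32_t intervalMs;  // length of the interval
    uint32_t lastMs;      // tick at which the interval last started

    // Returns true (and restarts the interval) once intervalMs has passed.
    bool due(uint32_t nowMs) {
        if (static_cast<uint32_t>(nowMs - lastMs) >= intervalMs) {
            lastMs = nowMs;
            return true;
        }
        return false;
    }
};
```

With a global `IntervalTimer heartbeat{60000u, 0u};`, the loop would call `heartbeat.due(millis())` and send the status message whenever it returns true.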

The messages the core sends are used as both heartbeat and log of core actions. If the heartbeat does not happen over a 5 minute period, an alert is sent that the core is offline. Also, every time the core triggers the garage door to open, an alert is sent by the webserver that is doing the logging, to make sure I know when it is triggered remotely.

To send the spark core commands, an HTTP GET request is sent to it, for example:

curl 'http://10.3.2.80/pressGarageButton'

The HTTP GET that the core sends to the webserver is formatted like this:

    GET /scripts/response.php?type=<commandReceived>&message=<NetworkSSID>_<NetworkRSSI>_<Spark.connected()>_<digitalRead(RELAY1)>_<digitalRead(RELAY2)>_<value of messageCounter variable> HTTP/1.0\r\n

Example:

    GET /scripts/response.php?type=garageTriggered&message=MyNet_-65_1_0_1_12 HTTP/1.0\r\n

The results of my testing indicate that most problems occur as part of the status message that is sent every 60 seconds, which is why I’m focusing there right now. After it has the first problem (which can be anywhere from 1 minute to several hours after a reset), the problem sometimes resolves, sometimes goes back to sending the status message but not accepting commands, but so far has never stopped permanently (though it can take a couple of hours to recover at all). Whether or not the core is connected to the cloud or whether the core is in automatic or semi-automatic, the problems are the same.

As far as String.substring is concerned, I am only using it to parse the incoming message to extract the command. Since I only send one command, it will be used only once per loop. I have not seen any red SOS LED flashes in any of my testing. As the core does not have a string function to parse, this is what I have used. Also, many failures have been recorded without the core receiving a single message since the last reset, which is why I have been focusing on the core sending out its messages.

I am perfectly willing to change from substring to another method and I can take @CloudformDesign’s suggestion to move from client.print to client.write, but I am starting to have doubts. Why are these functions available and documented if they cause problems? Also, the base code and methods that I have been using here are from Arduino code that does work for an Arduino. The Spark Core is advertised as “Arduino-ish” with the same wiring and programming language, yet functions that work for an Arduino and are detailed in the Spark Core-code firmware documentation seem to cause many problems. I apologize if I’m being overly harsh, but I am getting a little frustrated by developing code from documentation that is starting to prove untrustworthy.