[Solved] Udp.write() and tcp.write() 20 second hang

I’ve been facing this issue for about two months now, but it used to happen so rarely that I could never figure out what was blocking the core. Yesterday I finally flooded my code with Serial.prints, and the issue now happens more regularly, so I’ve been able to pinpoint where my code hangs: udp.write() and tcp.write().

General loop() Flow:

  1. Check for UDP Packets
    • Parse packet
    • Reply back with a UDP Packet (20 second hang)
  2. Check if connected to Remote TCP Server
    • If not connected to TCP Server
      • Connect to NodeJs TCP Server
      • Send first time connect message (20 second hang)
    • If connected to TCP Server
      • Check for TCP Packets (Condition never true after a hang)
        • Parse packet
        • Reply back with TCP Packet
      • Send keep-alive TCP Packet (20 second hang)

Some notes and thoughts:

  1. My code runs a UDP-Server, UDP-Client and a TCP-Client
  2. Other network functions except write() seem to work fine: udp.parsePacket(), udp.read(), udp.available(), tcp.connect() and tcp.connected() all keep working even after the code starts to hang.
  3. Haven’t tested tcp.available() or tcp.read() so can’t comment on them yet.
  4. So far I haven’t been able to figure out when this occurs. Sometimes happens within seconds and sometimes takes many hours.
  5. Once either of udp.write() or tcp.write() hangs, both will always hang from there on.
  6. udp.write() and tcp.write() hang for precisely 20 seconds, and the code continues to run at full speed after that until the next write(). The writes are not successful: neither my NodeJs server nor my UDP server receives any data from the Spark, even after the 20 seconds. Serial.prints continue to work though.
  7. I have to an extent ruled out the lack of memory to be a problem because I was able to dynamically allocate a 128 byte buffer and spammed Serial.prints() in my code. I don’t know if that is a good enough test. I won’t be surprised if this is the issue because my code is pretty huge.
  8. According to this post: http://community.spark.io/t/tcpserver-available-blocking-for-20-seconds/6576 … Serial / any other interrupts could be an issue, and I have been able to replicate it more often with a lot more Serial prints. But I recall once completely removing all Serial calls from my code, and it still crashed from time to time.
  9. If tcp.write() hangs, then udp.write() definitely hangs as well and vice versa. I don’t know how they both are related.
  10. I’ve used delay(20) wherever I felt that my code runs into long loops.
  11. My code is designed to not connect to the Spark Cloud. I wonder if something is wrong with my code since I’m quite certain other people who connect to the Spark Cloud use tcpClient.write liberally and they don’t seem to have any issues.
  12. I am still running on the firmware that does not include the “control your connection” update. I’ll move to it tomorrow, since it requires me to change my code a little bit, and see how it performs.

Please help me out here. I have several sparks sitting idle, let me know if you’d like me to run any kind of tests. I’ll also try having a skeletal code that hangs ready by tomorrow. Thanks! :smile:

As I am sure you understand, we need to see your code to help more.

The last few problems I can remember where things “hang” have been the TI CC3000 waiting for ARPs that never come. If you can boil your test case down and capture network traffic in wireshark or similar, it should be pretty obvious what is going on.

Udp.write() and client.write() can fail for a variety of reasons, but the CC3000 being out of buffers or an ARP not being responded to (which leads to an out-of-buffer condition, since you can continue to write()) are common.

@bko Thank you, I will post a bare bones code that has the same problems in a few hours.

Hi @nitred,

If you do have a minimal test case, that would be awesome! After a round of hiring, our firmware team has been cranking again these last few weeks, and I’d love to get them a big list of issues with test cases so they can continue to power through them. :slight_smile:

Thanks!
David

Hi guys, so sorry. I had issues with connecting to the Wifi router on the new firmware, so I got busy with other things. My Spark wouldn’t connect to the Router on the new firmware but on the old firmware it did. I just gave it fresh Credentials over USB and it finally connected. I did not expect that to be a problem so I sort of stalled on this front.

Hopefully I’m not too late. I’ll post the code soon!

I have the semi-minimal code ready and it’s running. I will post it the moment I can get it to replicate the issue. I’ll also try to post a Python script to automatically send UDP packets.

See the linked post to my issue. It also suffers from this 20 sec block. Spark guys/gals, you really need to look into this as a high priority item: a 20 sec block on network functions is unacceptable. I’ve also provided minimal code that reproduces the issue on the latest firmware. I’ve completely given up on Spark, as it’s unstable and has cost me a lot of development time and money, and I have moved on to another platform.

@Dave @bko So I have about 5 Sparks sitting with minimal code running on them, but they haven’t failed in the past week. Unfortunately I had power cuts or router resets quite frequently, so I couldn’t test for more than 8 hours continuously, which is still far longer than my original code lasts.

So I went through my original code to see if there were any loopholes in the flow. Is it possible that things can go wrong if I call tcpClient.write() even if the tcpClient is not connected to any remote server?

I think I have to fix the flow anyway so I’ll test and let you know regardless as to how my original code is doing.

@guru_florida Yep a failing Spark on any level is unacceptable, and like you I’ve invested some time on it. But I’d like to give it a little more time for things to settle down. As far as I know, this is the last issue remaining before Spark becomes stable. Spark has grown big over the last few months to an extent that I’m sure they can’t afford to fail as well, so I’ll be sticking around for a little while longer. Also I heard an even more stable Newer Hardware is coming up soon. But it is frustrating.

Hi @nitred

client.write() calls client.status() to figure out if it can write. client.status() checks that the socket is open, WiFi is ready and the socket is active. So I think if you call client.write() on a closed socket nothing bad happens, and the call should return -1, indicating an error.
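
A rough sketch of acting on that return value, so a failed write is noticed instead of silently ignored. `FakeClient` and `sendChecked` are hypothetical names used only so the example compiles off-device; the -1 return on a closed socket is taken from the description above and is an assumption about the real firmware.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a TCP client, used only so the sketch compiles
// off-device. It mimics the behaviour described above: write() on a closed
// socket returns -1.
struct FakeClient {
    bool open;
    bool connected() const { return open; }
    int write(const uint8_t* /*buf*/, size_t len) {
        return open ? static_cast<int>(len) : -1;
    }
};

// Sends a buffer and reports success only if every byte was accepted.
// Works with any type that offers connected() and write(buf, len).
template <typename Client>
bool sendChecked(Client& client, const uint8_t* buf, size_t len) {
    if (!client.connected()) {
        return false;  // skip the write entirely on a closed socket
    }
    const int written = client.write(buf, len);
    return written >= 0 && static_cast<size_t>(written) == len;
}
```

A caller could use the false result to stop() the client and reconnect on the next pass through loop().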

Great! Thanks for the clarification @bko

Hey Guys!

Sorry about the frustration, I just wanted to pop in and let you know we are watching this thread, but I’m sorry we haven’t had the chance to address this particular issue yet. We’re working on it!

Thanks,
David

Hi @guru_florida,

Did you refer to a post in this thread, or post an example? I don’t see anything linked from this thread as you mentioned.

I also checked your thread here and asked you for code, but I don’t think you posted anything:

https://community.spark.io/t/tcpserver-available-blocking-for-20-seconds/6576/5

Thanks!
David

Hi @nitred,

It sounds like you’re not seeing an issue now, is that correct?

Thanks,
David

@Dave I’m not seeing the issue on my minimal test code as of yet but the code hasn’t been tested for more than 8 hours continuously.

I’m applying some changes to my original code and I will be testing it by tomorrow. Will let you know then :smile:

I made a small program that uploads a file to an FTP server using TCPClient. During my test, a 5 MB file (containing just text) uploaded considerably fast, reading and writing the content as char. Then, when I tried to upload a smaller file, a jpg (still reading and writing as char, as I forgot to change it), I saw it was terribly slow.
This happened very late last night, so I didn’t have much time to test it properly, but it seems to be something similar…
Maybe it is just because I need to switch to byte?

Hi @Clorofilla

It sounds like you might be running into the difference between client.write( byteArray, byteLength) and client.print(someString).

The call to client.write() with the length will send one packet with all the bytes in it. The call to client.print() will send one packet for each character in the string, on both Spark and Arduino.

This is getting fixed but for now, client.write() is best.
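
To make the difference concrete, here is a small off-device model. `CountingClient` and `sendString` are hypothetical: the client only counts send operations and does not touch the network, and its print() is modelled as one send per character purely on the basis of the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical client that only counts how many send operations it is asked
// to perform; it does not touch the network.
struct CountingClient {
    size_t sendCalls = 0;
    size_t bytesSent = 0;

    // Models write(buffer, length): one send for the whole buffer.
    size_t write(const uint8_t* /*buf*/, size_t len) {
        ++sendCalls;
        bytesSent += len;
        return len;
    }

    // Models print(string) as described above: one send per character.
    size_t print(const char* text) {
        const size_t len = std::strlen(text);
        for (size_t i = 0; i < len; ++i) {
            const uint8_t c = static_cast<uint8_t>(text[i]);
            write(&c, 1);
        }
        return len;
    }
};

// Sends a C string with a single write() call instead of print().
size_t sendString(CountingClient& client, const char* text) {
    return client.write(reinterpret_cast<const uint8_t*>(text),
                        std::strlen(text));
}
```

In this model, printing the 9-character string "USER user" costs 9 sends, while sendString() delivers the same 9 bytes in 1.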

Hi @bko, that helped. I’m experiencing another problem related to this; I’m not sure if I should post it here or open a new thread.

In short: write() works at the beginning and passes part of the file to the server, but at a certain point it gets stuck for a few seconds. The application then continues, but write() doesn’t actually send anything anymore. I will post my code here or in the new thread for testing.

Here is the code that I hope reproduces the problem. It requires an SD module to be connected; just fill in this part with the FTP information:

#define FTP_USER "user"
#define FTP_PASS "pass"
#define FTP_ADDRESS "148.251.48.69" 
#define FTP_DATA_ADDRESS "148,251,48,69," //this is the ip address for the data connection (given here  "227 Entering Passive Mode (148,251,48,69,181,150)")
#define FILE_TO_UPLOAD "DSCF0003.jpg"
// **************************************

Then, once flashed, just send any char through serial and it will start to connect, transfer the file and disconnect.

// This #include statement was automatically added by the Spark IDE.
#include "sd-card-library/sd-card-library.h"

#include "application.h"

#define BUF_SIZE 1000

// *** JUST FILL IN THIS TO TEST ***
#define FTP_USER "user"
#define FTP_PASS "pass"
#define FTP_ADDRESS "148.251.48.69" 
#define FTP_DATA_ADDRESS "148,251,48,69," //this is the ip address for the data connection (given here  "227 Entering Passive Mode (148,251,48,69,181,150)")
#define FILE_TO_UPLOAD "DSCF0003.jpg"
// **************************************

// SOFTWARE SPI pin configuration - modify as required
// The default pins are the same as HARDWARE SPI
const uint8_t chipSelect = A2;    // Also used for HARDWARE SPI setup
const uint8_t mosiPin = A5;
const uint8_t misoPin = A4;
const uint8_t clockPin = A3;

TCPClient ClientOperation,ClientData;
File myFile;
String messageToSend,messageReceived;

char LineBreaker[3]; // CR, LF and the terminating '\0' written in setup()
int step=-1;
byte serverAddress[4];
  

int StartingPosition;
int EndingPosition;
int value;
String preString=FTP_DATA_ADDRESS;
String RawValue;
int length;

Sd2Card card;
SdVolume volume;
SdFile root;
SdFile file;

int totalread=0;
int fileSize=0;

void error(const char* str)
{
  Serial.print("error: ");
  Serial.println(str);
  if (card.errorCode()) {
    Serial.print("SD error: ");
    Serial.print(card.errorCode(), HEX);
    Serial.print(',');
    Serial.println(card.errorData(), HEX);
  }
  while(1) {
    SPARK_WLAN_Loop();
  };
}

void InitSD(){
    

  // initialize the SD card at SPI_FULL_SPEED for best performance.
  // try SPI_HALF_SPEED if bus errors occur.
  // Initialize HARDWARE SPI with user defined chipSelect
  if (!card.init(SPI_FULL_SPEED, chipSelect)) error("card.init failed");
  
  // initialize a FAT volume
  if (!volume.init(&card)) error("volume.init failed!");

  Serial.print("Type is FAT");
  Serial.println(volume.fatType(), DEC);
     
  
  if (!root.openRoot(&volume)) error("openRoot failed");
    
    Serial.println("SD initialized");
    
}

void ipArrayFromString(byte ipArray[], String ipString) {
  int dot1 = ipString.indexOf('.');
  ipArray[0] = ipString.substring(0, dot1).toInt();
  int dot2 = ipString.indexOf('.', dot1 + 1);
  ipArray[1] = ipString.substring(dot1 + 1, dot2).toInt();
  dot1 = ipString.indexOf('.', dot2 + 1);
  ipArray[2] = ipString.substring(dot2 + 1, dot1).toInt();
  ipArray[3] = ipString.substring(dot1 + 1).toInt();
}

void connectToMyServer(String ip) {

  ipArrayFromString(serverAddress, ip);
  if (ClientOperation.connect(serverAddress, 21)){// && client1.connect(serverAddress, 9000) && client2.connect(serverAddress, 9000)) {
    Serial.println("connected to the FTP server");
  } else {
    Serial.println("failed ");
  }
}

void setup() {
    Serial.begin(115200);
      
      LineBreaker[0]='\015';
      LineBreaker[1]='\012';
      LineBreaker[2]='\0';

  while (!Serial.available()) Spark.process(); //without Spark.process() here the Spark seems to have problems
  
  //just to consume whatever is passed in the begin to start
  while (Serial.available()) Serial.read();

    InitSD();


 connectToMyServer(FTP_ADDRESS);
}

void SendFileContent(char * Filename){

    double currentTime=0,sentFTPTime=0; //initialized so the first wait loop does not read garbage
    int remainingBytes=0;
    uint32_t t;
    double r;
    uint32_t n;
    byte * buf;
    
    
  totalread=0;
  fileSize=0;

   // open the file to send from the SD
  if (!file.open(&root, Filename, O_RDWR)) {
    error("open failed");
  }
  
  //allocate space to store a piece of data
  buf = (byte *)malloc(sizeof(byte)*BUF_SIZE); 
  
  n = file.fileSize()/BUF_SIZE; //chunks to send
  fileSize=file.fileSize();
  t = millis();
  
  Serial.print("n: ");
  Serial.println(n);
  delay(3000);
  
  //send chunks of data, size is BUF_SIZE
  for (uint32_t i = 0; i < n; i++) {
    if (file.read(buf, BUF_SIZE) != BUF_SIZE) {
      error("read failed");
    }
    else{
        //The time interval between two TCP/IP actions must >1 ms.
        while((currentTime-sentFTPTime)< 10) 
            currentTime=millis();
        //Serial.println(buf);
        ClientData.write(buf,BUF_SIZE);
        progressReport(BUF_SIZE);
        sentFTPTime=currentTime;
        currentTime=millis();
    }
        
  }
  free(buf);
  
  
  //at the end this write the remaining bytes if there are any
  remainingBytes=file.fileSize() % BUF_SIZE;
  
  if (remainingBytes>0){
    //allocate space to store the last, partial chunk
    buf = (byte *)malloc(sizeof(byte)*remainingBytes); 
    if (file.read(buf,remainingBytes) != remainingBytes) {
        error("read failed");
    }
    else{
        ClientData.write(buf,remainingBytes);
        progressReport(remainingBytes);
    }
    free(buf); //free only what was allocated here; buf was already freed after the loop
  }
  
  //speed report
  t = millis() - t;
  r = (double)file.fileSize()/t;
  Serial.print("Read ");
  Serial.print(r);
  Serial.println(" kB/sec");
  Serial.println("Done");
  file.close();
    
    
} 

void progressReport(int Add){
    totalread+=Add;
    Serial.print(totalread);
    Serial.print("/");
    Serial.println(fileSize);
    
}


void loop() {

  switch(step){//Prepare the message to send
      case 1:
            messageToSend="USER ";
            messageToSend.concat(FTP_USER);
            delay(3000);
      break;
      case 2:
            messageToSend="PASS ";
            messageToSend.concat(FTP_PASS);
            delay(3000);
      break;      
      case 3:
            messageToSend="PASV";
            delay(3000);
      break;      
      case 4:
            messageToSend="STOR ";
            messageToSend.concat(FILE_TO_UPLOAD);
            delay(3000);
      break; 
      case 5:
           ClientOperation.stop();
           Serial.println("ClientOperation stopped");
           step+=1;
      break; 
  } 
  
  if (messageToSend != ""){//Send the message to the server

    length=messageToSend.length();

    ClientOperation.write((uint8_t *)messageToSend.c_str(),length);
    ClientOperation.write((uint8_t *)LineBreaker,2);
      
    Serial.println(messageToSend);
    messageToSend="";
    delay(3000);
  }
  
  //Client operation   
  if (ClientOperation.connected()) {
   
    while(ClientOperation.available()) {//shows reponses messages from the FTP server
        char charac1 = ClientOperation.read();
        Serial.print(charac1);
        messageReceived+=charac1;
    }
    
    if (step==3){//find the last two number that will determine the port to connect to (from the response message "227 Entering Passive Mode (xx,xx,xx,xx,142,73)")
        length=preString.length();
        StartingPosition=messageReceived.indexOf(preString)+length;
        EndingPosition=messageReceived.indexOf(",",StartingPosition);
        value=messageReceived.substring(StartingPosition,EndingPosition).toInt();
        value*=256;
       
        StartingPosition=EndingPosition+1;
        EndingPosition=messageReceived.indexOf(")",StartingPosition);
        value+=messageReceived.substring(StartingPosition,EndingPosition).toInt();
        ClientData.connect(serverAddress, value); //open a connection and wait to transfer the file
        delay(1000);
    }
    
    if (ClientData.connected() && step==4) { 
      Serial.println("ClientData IS CONNECTED: ");
      SendFileContent(FILE_TO_UPLOAD);
      Serial.println("COPYING DONE");
      ClientData.stop();
      Serial.println("CONNECTION CLOSED");
    }
    
    step+=1;
    messageReceived="";
    delay(1000);  
  }
  
  
}
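
The port calculation in the step==3 branch above (second-to-last number times 256, plus the last number) can also be checked on its own, off-device. This is only a sketch in standard C++: `parsePasvPort` is a hypothetical helper, not part of the Spark firmware or any FTP library, and it assumes the reply has the usual "(h1,h2,h3,h4,p1,p2)" form.

```cpp
#include <cstdio>
#include <cstring>

// Extracts the data port from an FTP "227 Entering Passive Mode
// (h1,h2,h3,h4,p1,p2)" reply. Returns -1 if the reply cannot be parsed.
// parsePasvPort is a hypothetical helper, not part of any library.
int parsePasvPort(const char* reply) {
    const char* open = std::strchr(reply, '(');
    if (open == nullptr) {
        return -1;  // no parenthesised address in the reply
    }
    unsigned h1, h2, h3, h4, p1, p2;
    if (std::sscanf(open, "(%u,%u,%u,%u,%u,%u", &h1, &h2, &h3, &h4,
                    &p1, &p2) != 6) {
        return -1;  // fewer than six numbers found
    }
    if (p1 > 255 || p2 > 255) {
        return -1;  // each port half must fit in one byte
    }
    return static_cast<int>(p1 * 256 + p2);
}
```

For the reply quoted in the defines, "227 Entering Passive Mode (148,251,48,69,181,150)", this gives 181 * 256 + 150 = 46486.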

Hi @Clorofilla

In order to isolate the problem, I would change from using the SD card library as a source to just writing a sequence of incrementing and then decrementing bytes (1,2,3…255,255,254,…,3,2,1). That way you can see if the problem is on the network side or the SD card side.
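
One possible way to generate that sequence, as a sketch only: `patternByte` and `fillPattern` are hypothetical names, and the stream-offset parameter is an assumption so that consecutive chunks continue the ramp rather than restarting it.

```cpp
#include <cstddef>
#include <cstdint>

// Length of one full ramp: 1..255 up, then 255..1 down.
const size_t kPatternPeriod = 510;

// Returns the pattern byte at an absolute position in the stream.
uint8_t patternByte(size_t position) {
    const size_t p = position % kPatternPeriod;
    return static_cast<uint8_t>(p < 255 ? p + 1 : kPatternPeriod - p);
}

// Fills buf with the slice of the pattern that starts at streamOffset, so
// consecutive chunks continue the ramp.
void fillPattern(uint8_t* buf, size_t len, size_t streamOffset) {
    for (size_t i = 0; i < len; ++i) {
        buf[i] = patternByte(streamOffset + i);
    }
}
```

In the upload loop, the file.read() call would be replaced by something like fillPattern(buf, BUF_SIZE, i * BUF_SIZE), and the receiving side can then verify each byte from its position alone.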