Simple UDP program breaks the core

SomeFixItDude · June 2, 2014, 4:02pm

Hey all,
Just wanted to reach out to see if anyone else is having this problem and/or if the Spark team is aware of these problems. If the spark core is sending udp data to an ip address on a regular basis and receives one datagram small enough to fit in the buffer that is not processed (read from) you get a CFOD.

Please note that the receive buffer is larger than the 12 bytes and I have patched the CC3000 to 1.28

#define RX_BUF_MAX_SIZE	512

Here is the setup:

Core transmitting udp packets on an interval.
Same core receives a packet from another source.
Within few seconds CFOD.
Resets does it again and again…

In my example below, spark core has 10.0.0.13 ip and the nodejs server is running on 10.0.0.3.

spark core code:

UDP Udp;

unsigned char TxMsg[12] = { 1, 254, 1, 254, 1, 254, 43, 212, 71, 184, 3, 252 };

void setup() {
    pinMode(D7, OUTPUT);
    digitalWrite(D7, LOW);

    Udp.begin(9000);
}

void loop() {
    Udp.beginPacket(IPAddress(10,0,0,3), 9000);
    Udp.write(TxMsg, 12);
    Udp.endPacket();
    
    digitalWrite(D7, LOW);
    delay(200);
    digitalWrite(D7, HIGH);
}

nodejs code:

var PORT = 9000;
var HOST = "10.0.0.13";

var dgram = require("dgram");
var message = new Buffer("sparky  CFOD");

var client = dgram.createSocket("udp4");
client.send(message, 0, message.length, PORT, HOST, function(err, bytes) {
    if (err) throw err;
    console.log("UDP message (len = " + message.length + ") sent to " + HOST +":"+ PORT);
    client.close();
});

What I would have assumed was… the packet would just sit there on the buffer and do absolutely nothing.

I also noticed a few other strange UDP behaviors.

Anomaly 1: I don’t seem to ever need to call Udp.endPacket(); If I call the Udp.write(char) method, I can not call it it multiple times to assemble the message I want. Instead each call to Udp.write(char) writes out a new datagram and it is delivered right away. I was expecting that… I would do beginPacket, write, write, write, endPacket and then the packet is sent. In this case you would have sent 3 packets. In order to send one datagram I need to have my own buffer/array and pass the Udp.write(buffer, size).

Anomaly 2: More than one packet on the received sitting on the buffer is processed as though it is one datagram. Spark core receives 2 datagrams both 12 bytes in my case from two different sources. A call to parsePacket seems to process the datagrams as one giant datagram 24 bytes. I can only retrieve source information such as ip and source port from the first datagram and not the second.

Anomaly 3: Whenever the above spark core program is running flashing the core via web ide is a pain. I basically have two choices. 1. Use my phone to try to push tinker to the core and then re-flash what I want. OR 2. Use the dfu to do it. What happens is the core starts to flash, does a small pause with the magenta light then goes dark. The core then resets running last program.

I am also expecting that if the receive buffer is full and more data is received that the new data is lost and does not crater the core. I haven’t tested this yet, but didn’t know if anyone has or not?

I find it really frustrating to see on waffle that the udp problems are sitting in the ideas category when udp and tcp functionality is what the spark core is mainly about.

I know I asked a lot of questions here but I am looking for some insight. Maybe I need to code things differently etc.

Thanks again all

bko · June 2, 2014, 5:00pm

Hi @SomeFixItDude

Since you never pick up the UDP data sent to the core, the TI CC3000 driver will run out of buffers in the shared buffer pool for UDP eventually and signal an error in some way, I think. I know they moved the pool used for ARP in the recent patch out of the TCP rx buffer pool, but I don’t recall where they moved it to.

Anomaly 1: With upd.write() you are talking almost directly to the TI CC3000 driver’s socket interface. The way it works now is how it is designed to work since it avoids the UDP object at the user level having a write buffer. If you need to buffer stuff up, have your own buffer and write it all in one shot. Obviously it could be designed to work the other way with udp.endPacket flushing a buffer.

Anomaly 2: Discussed in the forum and as an issue at

@peekay123 and I have asked the Spark team to investigate and I know it is on their list. The TI CC3000 datasheet errata seems to say that this may be hard to fix.

Feel free to add your comments to the issue.

Anomaly 3: Normally when I see this (and I do see this) it means that my code has overwritten something important in RAM and reflashing Tinker seems to be required. You may have some luck with using an input pin to put the main loop() into just calling SPARK_WLAN_Loop() so you can flash.

SomeFixItDude · June 2, 2014, 7:41pm

@bko thanks for your reply. I appreciate your insight.

I think more is going on here, only because I don't keep sending more data to the spark core. So all I have really taken up is 12 bytes + header. The nodejs program only sends one datagram and I don't have to send multiple to send the core in to CFOD tissy. Then again maybe the core/cc3000 does not function this way? Also, I have modified the program to read anything in the buffer and discard it and I still CFOD. So I suspect something else is wrong here.

No complaints here, just not sure if it is working as intended. I have since switched my code to buffer it myself. Which way does the arduino behave?

This was the issue I was referring to that I think is perceived to be on the back burner.

I like your work around, but the core should be flashable really. The above code isn't very complex and I am guessing uses very little ram. If I had something larger and was messing around with pointers, sure.

Thanks Again for your reply

AndyW · June 2, 2014, 7:53pm

I believe the way the CC3000 functions, for better or for worse, is that an incoming packet consumes a buffer, regardless of the packet length. So a 12 byte packet ties up as much buffer space as a 1500 byte packet, at least within the CC3000.

The CC3000 expects that packets are drained from the CC3000 as quickly as practical, in order to free resources on the module.

With TCP, the CC3000 can play games with the window to control the rate of incoming data; obviously there is no such alternative with UDP. However, the preferred action should be to drop an inbound packet; not get so mangled that the has to declare CFOD.

SomeFixItDude · June 2, 2014, 11:40pm

Thanks all for the information.

I would however like some help figuring out how I can transfer udp packets without hitting the CFOD. If you modify the above program to dump whatever it receives you still CFOD. Why? Is there some way to determine what happened? Also, I upgraded the firmware to 1.28. Are there host driver changes that needed to be implemented for the CC3000 to work properly? All in all I love the product, just would love it more if the core was not resetting every 10 - 15 minutes.

Thanks Again,
Brian

zachary · June 3, 2014, 6:30am

Haven’t had a chance to research yet with Maker Faire and JSConf, but wanted to chime in that the UDP issues are near the top of the list. Thank you and keep the feedback coming! Some behavior here is surely fixable. Totally agree that, even if we can’t control some of these issues, expected behavior needs to be documented very clearly.

bko · June 3, 2014, 11:25am

I just wanted to add that I played around with this for a while last night but not get any good results.

Adding instrumenting code to the above, I was able to see that for my tests, it would send 6 packets and then the user loop no long ran, and then things went further south. You do have to look at the return value from udp.write() since it can and does return -1 and -2 values indicating problems (I will write some doc on this). But in my tests, this only happened occasionally.

I don’t know what is different about this faster transmit rate vs. my NTP UDP code that sends typically one packet per hour that runs for days.

AndyW · June 3, 2014, 12:20pm

Hmm - 6 is a magic number.

There are approximately 6 buffers free when your average CC3000 starts. I suspect something is not ensuring the buffers are freed.

This could be coincidence, but I wanted to document this.

SomeFixItDude · June 3, 2014, 12:51pm

@bko thank you! Glad you ran the program and had the same problems. Well glad I am not nuts or my patching was not defective, you get the idea. What do the -1 and -2 values represent? And if you get that should we be taking some action or just keep carrying on? Also is there any merit to the comment about patching to 1.28 but didn’t update any of the host driver files. Not sure what version we have in the spark firmware but when I get home I’ll see if there are any diffs.

@AndyW Nice. How did you figure that out? Maybe I could add monitoring to that in the loop and we could see if it is a buffer issue.

AndyW · June 3, 2014, 4:10pm

There should be no host driver mods required for this CC3000 firmware patch - at least that is my understanding.

I spent a lot of time decoding the SPI traffic between the STM32 and the CC3000 as part of the other CFOD investigation.

SomeFixItDude · June 3, 2014, 8:35pm

Trying to do some deeper troubleshooting. How do I get the core debug prints to work? I tried defining DEBUG_BUILD and the output function but no luck… Anyone? And by the codes here I am getting HardFault

/********************* Red Flash Codes *********************
1(Faults,RGB_COLOR_RED,HardFault) 
12 (Faults,RGB_COLOR_RED,NMIFault)
3 (Faults,RGB_COLOR_RED,MemManage)
4 (Faults,RGB_COLOR_RED,BusFault)
5 (Faults,RGB_COLOR_RED,UsageFault)
6 (Cloud,RGB_COLOR_RED,InvalidLenth)
7 (System,RGB_COLOR_RED,Exit)
8 (System,RGB_COLOR_RED,OutOfHeap)
9 (System,RGB_COLOR_RED,SPIOverRun)
10 (Softare,RGB_COLOR_RED,AssertionFailure)
11 (Softare,RGB_COLOR_RED,InvalidCase)
************************************************************/

#define DEBUG_BUILD

UDP Udp;

unsigned char TxMsg[12] = { 1, 254, 1, 254, 1, 254, 43, 212, 71, 184, 3, 252 };
unsigned char recbuf[12];
bool isDebugReady = false;

void debug_output_(const char *p) { 
    if (isDebugReady)
        Serial.print(p); 
}

void setup() {
    Serial.begin(9600);
    pinMode(D7, OUTPUT);
    digitalWrite(D7, LOW);

    Udp.begin(9000);
    while (!Serial.available()) {
        RGB.color(0, 255, 0);
        delay(100);
        RGB.color(0, 0, 255);
        delay(100);
        RGB.color(255, 0, 0);
        delay(100);
        Serial.println("Send some data to continue to loop.");
    };
    isDebugReady = true;
}

void loop() {
    int32_t packetLen = Udp.parsePacket();
    
    //Dump any packets RX
    while (packetLen > 0) {
        Udp.read(recbuf, packetLen >= 12 ? 12 : packetLen);
        packetLen = Udp.parsePacket();
    }

    //TX send our test
    Udp.beginPacket(IPAddress(10,0,0,3), 9000);
    Udp.write(TxMsg, 12);
    Udp.endPacket();
    
    //I am still alive
    digitalWrite(D7, LOW);
    delay(200);
    digitalWrite(D7, HIGH);
}

bko · June 3, 2014, 9:01pm

Hi @SomeFixItDude

The red flashing panic codes are doc’ed here

You will need to do local builds in order to do a DEBUG_BUILD, I believe. Local build instructions are here:

SomeFixItDude · June 3, 2014, 11:39pm

I have the local build working but what I can’t find is how to enable the debug build. I defined it in the top of my program same one I just posted. I am somewhat of a cpp newbie. Every time I issue the make command I see in the parameters -DRELEASE_BUILD How do I actually get it to use the debug output statement I put in there. I think this would help debug not only this issue but other issues.

Thanks Again,
Brian

bko · June 4, 2014, 2:16am

Try this:

make DEBUG_BUILD=y clean all

SomeFixItDude · June 4, 2014, 4:37am

@bko that did it! Here are the results. The program ran a little bit longer but I think that was because I was staring at it.

Debug Output:

earlier:<DEBUG> int Spark_Receive(unsigned char*, int) (419):bytes_received 2
earlier:<DEBUG> int Spark_Receive(unsigned char*, int) (419):bytes_received 16

    0000339797:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000340013:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000340230:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000340445:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000340662:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000340873:<DEBUG> int Spark_Connect() (671):sparkSocket Now =0
    0000340879:<DEBUG> int Spark_Disconnect() (742):
    0000340885:<DEBUG> int Spark_Disconnect() (751):Close
    0000344490:<DEBUG> set_socket_active_status (837):Sd=0, Status SOCKET_STATUS_IACTIVE
    0000344499:<DEBUG> int Spark_Disconnect() (753):Closed retVal=0
    0000344506:<DEBUG> set_socket_active_status (837):Sd=0, Status SOCKET_STATUS_ACTIVE
    0000344515:<DEBUG> int Spark_Connect() (678):socketed sparkSocket=0
    0000344526:<DEBUG> int Spark_Connect() (732):connect
    0000352532:<ERROR> hci_event_handler (257):Timeout now 352532 start 344532 elapsed 8000 cc3000__event_timeout_ms 8000
    0000352544:<ERROR> hci_event_handler (258):Timeout waiting on tSLInformation.usRxEventOpcode 0x1007
    0000352554:<DEBUG> int Spark_Connect() (734):connected connect=-1
    0000352561:<DEBUG> int Internet_Test() (624):socket
    0000352567:<DEBUG> set_socket_active_status (837):Sd=2, Status SOCKET_STATUS_ACTIVE
    0000352576:<DEBUG> int Internet_Test() (626):socketed testSocket=2
    0000352583:<DEBUG> int Internet_Test() (648):connect
    0000360589:<ERROR> hci_event_handler (257):Timeout now 360589 start 352589 elapsed 8000 cc3000__event_timeout_ms 8000
    0000360601:<ERROR> hci_event_handler (258):Timeout waiting on tSLInformation.usRxEventOpcode 0x1007
    0000360611:<DEBUG> int Internet_Test() (650):connected testResult=-1
    0000360619:<DEBUG> int Internet_Test() (660):Close
    0000360625:<DEBUG> set_socket_active_status (837):Sd=2, Status SOCKET_STATUS_IACTIVE
    0000360634:<DEBUG> int Internet_Test() (662):Closed rv=0
    0000360640:<ERROR> void SPARK_WLAN_Loop() (684):Resetting CC3000 due to 2 failed connect attempts
    0000360661:<DEBUG> virtual size_t UDP::write(const uint8_t*, size_t) (148):sendto(buffer=20000000, size=12)=12
    0000360873:<DEBUG> void SPARK_WLAN_Loop() (528):Resetting CC3000!

Going to get some rest, but will look at this output tomorrow. Not sure if much is gained by it or not.

-Bri

SomeFixItDude · June 4, 2014, 3:13pm

Just quick update. Unless someone else tries before me, I am thinking it would be a great test to try this with disabling the cloud. Just to see if using udp by itself causes the CC3000 to die. If the core works fine then maybe it is a problem with cloud service itself or how the core interacts with the cloud?

-Bri

SomeFixItDude · June 4, 2014, 8:12pm

The cloud hates me!

So if I call Spark.disconnect() as my last line of the setup function, I have no problems at all with resetting. I decreased my delay and have sent lots of packets to and from the core (discarding them and not discarding them) and I don’t freeze or reset. I have been going strong now for 4+ hours.

So this makes me think the problem is in the cloud service layer or the heartbeat with the core. Perhaps when we are having communication problems with cloud instead of just resetting take a tiered approach. First try to rebuild the connection, next reattempt after some delay. Maybe even continually doing that retry loop all while allowing the user code to run. I think auto-resetting is a little harsh and doesn’t allow me to take care of any tasks to prepare for reset.

Anyway I think we should concentrate on the actual cloud interaction. I don’t seem to be tripping up the core with any network activity that is local. Which I think is great news, I think we can more easily change the cloud interactions.

Thoughts? Maybe some suggestions on how to debug cloud interactions?

psb777 · June 5, 2014, 2:17am

I know you describe how it works in Spark UDP but the OP is correct in his assumption that the no data should be sent until UDP.endPacket() is called. When, in the underlying C code, a sendto() is called on a UDP socket it is then that one UDP datagram is sent. Spark UDP is broken in this regard as the documentation says (or implies unmistakably) that one UDP datagram is sent for each beginPacket() endPacket() pair. It is imperative that the Spark UDP class does its own buffering, and that sendto() is called only once per UDP datagram.

psb777 · June 5, 2014, 2:20am

With UDP I would expect that the newer data should replace the older data - the unread buffer should be cleared if there is no more buffer space rather than the newer data being lost. With TCP the older data in the full buffer should be retained and the newer data discarded.

AndyW · June 5, 2014, 2:50am

Well, the CC3000 doesn’t always do what is expected, and we work with TI to try and fix as many of these problems as possible,

Topic		Replies	Views
Strange UDP bug	32	6432	March 10, 2014
I'm having a problem getting UDP traffic from my spark cores Troubleshooting	13	3323	July 13, 2014
Unreliable UDP: crashes/freezes when sending at high frequency Troubleshooting	28	10739	June 5, 2014
Spark Cloud: UDP Instability? Troubleshooting	0	1137	January 27, 2015
UDP problems with #include "spark_disable_cloud.h" Troubleshooting	11	2677	May 31, 2014

Simple UDP program breaks the core

Related topics