Bug bounty: Kill the 'Cyan flash of death'

kennethlimcp · January 19, 2014, 6:26am

@dorth I’ve been running it at 1-minute intervals for 3 hours now without any issues. Will keep it running till the first CFOD occurs XD

kennethlimcp · January 19, 2014, 12:18pm

Been running the Python script on my laptop since 19/01/14 11:37:07 AM, UTC+8 (9 hours and counting)

Time between each API call: 1 minute
Location: Singapore
Internet connection: 200 Mbps fibre broadband
Router: ASUS RT-N56U
Authentication Method: WPA-Personal
WPA Encryption: TKIP
Wireless mode: Auto
b/g protection: Auto
Channel bandwidth: 20/40 MHz
Channel: 6

I checked the log and there has not been a failure up till the time I’m posting this.

Running the Core in my room and router outside. Approx 5-7m apart.

My theory is that polling the Core at short intervals should prevent a CFOD, since the polling indirectly keeps the connection between the Core and the Cloud alive.

I’ll leave it running till tomorrow morning (before I go to class) and check the results. I’ll probably run the Python script at longer intervals to verify the results further this week!
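For anyone who wants to repeat the test, here is a minimal sketch of a polling script of this kind. It is not the script used above. The endpoint shape in `API_URL`, the function names, and the `device_id`, `variable` and `token` parameters are all assumptions for illustration; substitute whatever your own Cloud API call looks like.

```python
import time
import urllib.request
from datetime import datetime

# Assumed endpoint shape (not taken from this thread); fill in your own values.
API_URL = "https://api.spark.io/v1/devices/{device_id}/{variable}?access_token={token}"


def fetch_variable(device_id, variable, token, timeout=10):
    """Return True if the Cloud answers the variable request with HTTP 200."""
    url = API_URL.format(device_id=device_id, variable=variable, token=token)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def poll(check, interval_s, max_polls, sleep=time.sleep, log=print):
    """Call check() every interval_s seconds.

    Returns the index of the first failed poll, or None if all polls succeed.
    """
    for i in range(max_polls):
        ok = check()
        stamp = datetime.now().isoformat(timespec="seconds")
        log(f"{stamp} poll {i}: {'ok' if ok else 'FAILED'}")
        if not ok:
            return i
        if i < max_polls - 1:
            sleep(interval_s)
    return None
```

A one-minute run over 24 hours would then be something like `poll(lambda: fetch_variable(DEVICE_ID, "temperature", TOKEN), 60, 24 * 60)`, where `DEVICE_ID`, `TOKEN` and the variable name are placeholders. A failed poll only says the Cloud call did not succeed; you still have to look at the LED to confirm it was a CFOD.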

kennethlimcp · January 19, 2014, 12:21pm

Another testing method I have in mind is to:

Connect the Core to the Cloud after a power cycle
Turn off your router (or maybe unplug the WAN cable) and see if the Core still shows cyan

Maybe someone can tell us the results, because I can’t do the test while I’m running the Python script!

nika8991 · January 19, 2014, 4:14pm

Hi,

Yesterday evening my SparkCore was very unstable; I got the CFoD continuously. When I went to bed I reset it once more and put my MacBook into sleep mode. This morning when I woke up it was still running, but I had other things to do so I left it. More than 12 hours later it was still running. Then I ran the make command on my computer, which meant quite some processor activity, and I saw the CFoD again :-(.
The SparkCore is connected to my MacBook via the USB cable, at that moment only for the power supply.

Could it be that disturbance on the power supply causes the CFoD?

Best regards,
Henk

BTW.
I have now removed the USB cable and use a battery as the power supply. I also moved the external antenna (the one with the uFL connector) as far away from the SparkCore as possible (~15 cm / ~6 inches). My first impression is that it behaves more stably (more than 30 minutes without CFoD).

dorth · January 19, 2014, 6:19pm

I've been running mine via an external power source and I haven't noticed any difference in behavior. I do know, however, that if I don't poll the Spark from the cloud, it will run for a long time (greater than 24 hours) without a CFOD. If I then make just one Cloud "get" call, it will sometimes CFOD immediately.

I'd be interested in how your Spark behaves when polling every 10 seconds. Mine will CFOD within 30 minutes.

Dave O

wgbartley · January 19, 2014, 7:42pm

I've been using the same 5V 1A USB power supply the entire time. If I join the Core to my main network, it will CFOD within half an hour or so. If it's on my guest network, well, I haven't had it CFOD there yet (after a couple of weeks at least).

nika8991 · January 19, 2014, 7:45pm

@dorth,
In my situation the SparkCore is polled every 5 seconds and it has been running on the battery without problems for more than 4 hours now.

Br.,
Henk

kennethlimcp · January 19, 2014, 10:54pm

So I managed to keep the Core connected to the Cloud from 19/01/14 11:37:07 AM to 20/01/14 06:52:31 AM, UTC+8 without any CFOD. That’s just over 19 hours at 1-minute intervals…
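As a quick check on that duration, the two timestamps can be subtracted directly. This is only an illustrative sketch; the `uptime` helper and the format string are my own naming, and the poll count assumes exactly one API call per minute.

```python
from datetime import datetime

# Timestamp layout used in the post: day/month/two-digit year, 12-hour clock.
FMT = "%d/%m/%y %I:%M:%S %p"


def uptime(start, end, fmt=FMT):
    """Return the elapsed time between two timestamps as a timedelta."""
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


elapsed = uptime("19/01/14 11:37:07 AM", "20/01/14 06:52:31 AM")
polls = int(elapsed.total_seconds() // 60)  # one API call per minute

print(elapsed)  # 19:15:24
print(polls)    # 1155
```

So the run covered 19 hours 15 minutes, or roughly 1155 polls without a failure.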

I suspect that the issue might have to do with the internet connection…

Did a separate test as follows:

Reboot the router

Condition - Loss of WiFi network
Time taken for Core to react: 4s
Core feedback: Flashing Green

‘Pause’ the internet connection (ie. no WAN IP)

Condition - Loss of internet connection
Time taken for Core to react: 5-10s
Core feedback: Flashing Cyan

‘Resume’ the internet connection

Condition - internet connection available
Time taken for Core to react: 16-22s
Core feedback: Flashing red to Breathing Cyan

Have fun!

mtnscott · January 19, 2014, 11:14pm

Very strange behavior: I just cleared out my local repository and pulled fresh sources for the firmware, and now I’ve been running for over 1 hour without a CFOD. However, I have lost my serial connection: I don’t get any output, and when I disconnect and reconnect the serial port the data comes out of sequence. STRANGE. Maybe I downloaded an ‘in the works’ version from the repository? How do I get a ‘stable’ build?

UPDATE: The Core entered CFOD after 1:38. So unfortunately, since I got my Core several weeks ago, I can’t seem to keep it running for more than somewhere between 30 minutes and 1:38.

dermotos · January 20, 2014, 1:54am

Ok, so I set up a guest network on my Airport Extreme (local clients cannot see each other) and my Core has been up for about 48 hours with no failure... yet. The most I've got previously on the standard WiFi network of my Airport is about 24 - 30 hours.

wgbartley · January 20, 2014, 4:36am

It seems as if there is some network chatter going on that causes the Core to CFOD. I wish I could gradually take devices offline and reintroduce them to my network to see if something triggers it, but I’m pretty sure my wife and daughters would not tolerate their devices and movies and whatever else being down for hours or days. :-/

timb · January 20, 2014, 5:55am

Go the other way! Set up a guest network (with client-to-client communication allowed) and start with only the Spark Core, then switch various devices over to that network one by one.
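The one-by-one procedure can be written down as a small sketch. Everything here is hypothetical naming: `cfod_occurs` stands for whatever observation you make by hand (for example, watching the Core for a few hours after each device joins) and is not part of any Spark tooling.

```python
def find_trigger(devices, cfod_occurs):
    """Move devices onto the test network one at a time.

    devices: device names, in the order they will be switched over.
    cfod_occurs: callable taking the tuple of devices currently on the
        network and returning True if the Core hit a CFOD (hypothetical;
        in practice this is a manual observation).
    Returns the device whose arrival first coincided with a CFOD, or None.
    """
    on_network = []
    for device in devices:
        on_network.append(device)
        if cfod_occurs(tuple(on_network)):
            return device
    return None
```

Note that the returned device is only the last one added before the failure. If the CFOD needs a combination of devices, or is intermittent, you would have to repeat the run in a different order to narrow it down.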

zach · January 20, 2014, 6:17am

Hey all,

After much debugging, it seems that the source of this bug is internal to the CC3000. It’s possible that this is a Texas Instruments firmware bug, and it’s also possible that this is due to some issue in our implementation of their driver.

I just posted a thread on TI’s forum to get some help from their engineers; if you are so inclined, you can follow the thread here:

http://e2e.ti.com/support/low_power_rf/f/851/t/315682.aspx

The thread includes a detailed summary of everything we’ve learned from this thread and from our own testing. Stay tuned for more info!

wgbartley · January 20, 2014, 6:20am

Unfortunately, I am a nerd. My ecosystem assumes a single network for wired and wireless devices. If I move the phones to a guest network, they won’t be able to control stuff like the Rokus or AppleTV on the wired network. And I can’t switch the wired network to the guest network. For a good test, I’d need to be able to switch those devices as well, since I’m sure they also generate their own broadcast chatter throughout the subnet.

If I could just talk my wife into leaving the house with both kids for several hours…

BDub · January 20, 2014, 8:26pm

I’m following the TI response and without those logs it’s going to be hard to keep them motivated I think. Here’s a proposal to get to the debug pins:

sjunnesson · January 20, 2014, 8:52pm

@BDub looking at the datasheet for the CC3000, I think the debugging pins are two pins higher up on the board. More precisely, pin 6 for WL_RS232_TX and pin 8 for WL_RS232_RX. So that would mean cutting a couple more traces if you want to reach them from behind.

Another approach could be to reach them from the top. I’m not sure how the CC3000 looks under the shielding, but there might be some way to access the needed pins.

Dragonsshout · January 20, 2014, 8:54pm

It’s not clear which are the right pins… 2, 4, 6 or 8?

I read in the comments (CC3000 Logger) that: “NS_UART_DBG (Driver Logger) […] is connected to J4 Pin 2 on the TI CC3000 Module. WL_UART_DBG (Firmware Logger) […] is connected to J4 pin 4 on the TI CC3000 module.”

The datasheet gives this information:

BDub · January 20, 2014, 10:39pm

You guys gotta dig deeper than that!

@zach said, “We cannot capture these logs, as the debug pins are not exposed on our hardware.”

http://processors.wiki.ti.com/index.php/CC3000_Logger

http://e2e.ti.com/support/low_power_rf/f/851/p/251957/887108.aspx

timb · January 20, 2014, 11:26pm

I think the easiest way to do this would be using hot air to remove the CC3000, then rig up a little board with pogo pins on the bottom and mating headers for TI’s CC3000 EVM on the top. Then you just clamp the pogo board to the Core, in place of the CC3000 module!

I could have a working board up on Upverter in about an hour if anyone is interested. Shouldn’t be too expensive either.

mohit · January 20, 2014, 11:34pm

We do have an old version of the Core with the exposed debug pads but I would still have to add the buffer on the SPI channel that was added on the production version of the Core.

I’m planning to use an FTDI breakout board from Sparkfun to interface it to the PC. (The CC3000 requires a 1 Mbaud serial channel.)

Will update as soon as I make some progress.

p.s.: I wonder if taking off the metal shield from the top of CC3000 will expose these pins?
