Bug bounty: Kill the 'Cyan flash of death'

No improvement. Still got 'em…

:frowning:

Sometimes the LED fades, saying “I’m OK”, but loop() has stopped!

Frido.

@timb, would you mind telling me what display you have? I have been looking for a nice display for the Spark and was looking at serial displays, but I2C sounds much better.

Symptom: CFOD
Router: Asus RT-N66U running dd-wrt
Wireless Protocol: 802.11n
Location: PA
Network Security: WPA2 passphrase

Can you tell us how the Spark.Core is supposed to behave when the network is lost? I understand the CBoD that occurs when it is trying to reconnect, but why does the user application stop? I would think that we want the Spark.Core to continue running the user application and keep trying to re-establish a connection with the cloud. It would be great if we could register an event handler that gets notified when the cloud connection is either lost or re-established. Having a device that hangs when the network is down is not very useful. I understand the current problem with CBoD is related to having a good connection, but what happens when the real connection is down? We can’t have the system go into a state where the user application does not get any cycles. Anyway, those are my thoughts.

So I continue to get the CBOD using the updated firmware base. I have a Linksys E4200 router and a 1.5Mb DSL connection (sigh - I used to have a 40Mb connection back East). My application just blinks LEDs and responds to Cloud function calls to change the state and rate of blinking. It outputs an iteration counter to the serial port.

Hey @mtnscott,

There is separate work in progress to decouple the wifi connection from user code. See this thread:

I don’t know if this report will help, but I have not seen a CFOD since I stopped calling TCPClient.stop. My playing around does not currently use the cloud features other than for the build IDE and downloading to the core. I have several little toy applications that scrape web pages or RSS feeds for interesting things and display the info on a 2x16 LCD.

I was having the core keep a count of the number of times it hit a web page which I scheduled for every 5 minutes and display that count on the LCD. The biggest number I saw was 109, which is just over 9 hours of uptime.

I also have a UDP NTP client that I have been working on and it runs overnight without crashing as well. I don’t recall ever seeing UDP have a CFOD.

My apps DO spontaneously crash sometimes, but the core reboots gracefully and reconnects in a normal way as if I hit the reset button. During development, I have walked off the end of memory and had to reset the WiFi credentials and reflash tinker to get it to work sometimes. I would say these crashes are my fault, to the best of my understanding.

I am using an el-cheapo Netgear router with WPA2, since WEP didn’t work for me. I am having the core print its MAC and IP addresses to the display at startup, so I know that I am getting a 10.x.x.x address.

I don’t know if my good luck comes from the router (doubt it) or the lack of cloud IO or the lack of TCPClient.stop calls.

@mtnscott If you want a nice little graphical backpack that will work over UART, I2C or SPI, I’d take a look at Digole! They sell both a range of LCD’s and OLEDs with integrated backpack and just the backpacks by themselves. I’m using the 1.3" White OLED with the Core right now and it’s working great! One of the nice things is that the protocol is universal amongst all the different displays, so one library fits all.

I’d also recommend checking out the 1.8" Color OLED Module, 2.7" Backlit LCD Module, 1.8" White Backlight LCD Module and the Universal KS0108 Adapter.

Another handy feature is the built-in (user replaceable) UG8-compatible fonts and the ability to upload a startup bitmap or animation.

The raw command set is pretty simple and generally consists of ASCII characters followed by X number of option bytes. For example:

Wire.beginTransmission(0x4E);
Wire.print("CL");
Wire.print("SF");
Wire.write(18);
Wire.print("TT");
Wire.print("Cloud Uptime");
Wire.write(0x00);
Wire.endTransmission();

CL = Clear
SF = Set Font, 18 = Font
TT = Text (Followed by the text you want to display and 0x00 for EOL.)
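To make the framing concrete, here is a rough host-side sketch in Python that builds the same byte stream the Wire calls above send: command letters, then option bytes, then the text with its 0x00 terminator. The function name `digole_text_frame` is made up for illustration, and the sketch only covers the three commands described here.

```python
def digole_text_frame(text: str, font: int = 18) -> bytes:
    """Build the bytes for: clear screen, set font, print text.

    Hypothetical helper; it mirrors the CL / SF / TT sequence from the post.
    """
    frame = bytearray()
    frame += b"CL"                   # CL = clear the display
    frame += b"SF" + bytes([font])   # SF = set font, followed by one option byte
    frame += b"TT" + text.encode("ascii") + b"\x00"  # TT = text, 0x00 ends it
    return bytes(frame)


print(digole_text_frame("Cloud Uptime"))
# b'CLSF\x12TTCloud Uptime\x00'
```

Laying the frame out like this makes it easy to check what goes over I2C before porting a command to the Core.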

Anyway, I’ve almost got the full Digole Arduino Library ported over to the Spark Core. If you want to step up to a graphic display and don’t need a touchscreen, I highly recommend picking up at least one of the Digole OLED units! As an aside, Digole ships displays from Canada and China, it should tell you somewhere on the product page; the stuff coming from Canada normally ships to the US in about a week!

I am not sure whether my report will be helpful but I am experiencing CFOD as well since today.

I have not started using the web IDE yet, but I used the core with the relay shield and Tinker. It worked normally before, then I did not power it up for a few days. Today I powered it up and sent a few commands from Tinker, and it worked for a few seconds before the CFOD occurred. I thought something was wrong, so I used the reflash Tinker command. After that, no commands went through at all.

The LED cycle: white -> green -> breathing cyan for 30 seconds -> flashing cyan for a few minutes -> red -> flashing cyan again.

I do not think the problem is with the router since it worked before, but here are the WiFi details:

Router: TP-Link TL-WR41N
Mode: 11bgn mixed
Channel: Auto (current channel 7)
Location: MY
Network Range: 192.168.1.1/24
Security: WPA/WPA2 - Personal

I think that this problem is more than just losing connection with the cloud. Polling a variable read will CFOD my core consistently. Without polling, but still running the same program, it runs for days (although from my previous post you will see that it appears online but a variable read returned nonsense).

I was looking into this but so far I am unable to duplicate running my RGB brightness demo for 96 hours on the jtag shield with 1A usb wall power supply. I have never seen this problem occur actually.

Can anyone provide a reliable way to duplicate? Running tinker and polling variables at a fixed rate?

@dorth, can you provide the Python script you’re using to poll the core and report uptimes?

In my case, the PWS makes no difference, I have used a 1A PWS in addition to using it connected to my Macbook Pro. I expose some functions, no variables to the cloud. I get CBOD within 30m consistently, once it lasted for 1h, but never longer than that. If I let it sit, I will get a flashing red briefly during the CBOD, it then goes back to CBOD.

Here is the python script I run with cron (every minute) to query an analog value from the Tinker app running on the core. If I poll every 15 minutes, my core will run for well over 24 hours. If I run cron every minute, it will CFOD within a few hours (see my post above for my results).

The script uses "requests" for REST handling and you need to have that installed (http://docs.python-requests.org/en/latest/). This is running on an Ubuntu box, but should be fairly portable.

Python Script to Poll Spark
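The linked script is not reproduced in this thread. For anyone who wants to try the same kind of test, here is a minimal sketch of a one-shot poller meant to be run from cron. It is not the original script. It uses only the standard library instead of "requests", and the URL shape (a GET on /v1/devices/<id>/<variable> with an access_token query parameter), the "result" field and all the names are assumptions to check against the Spark API docs. A Tinker function call would need a POST instead.

```python
import json
import sys
import time
import urllib.error
import urllib.parse
import urllib.request

# Assumed endpoint shape; verify against the Spark Cloud API docs.
API_BASE = "https://api.spark.io/v1/devices"


def variable_url(device_id, variable, access_token):
    """Build the URL for reading one cloud variable (hypothetical helper)."""
    query = urllib.parse.urlencode({"access_token": access_token})
    return f"{API_BASE}/{device_id}/{variable}?{query}"


def poll_once(url, timeout=10.0):
    """Do one GET and return (ok, detail) without raising on network errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        return False, str(exc)
    if not isinstance(body, dict):
        return False, "unexpected response"
    return True, body.get("result")  # "result" field name is an assumption


def log_line(ok, detail, now=None):
    """Format one timestamped log line; now=None means the current time."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"{stamp} {'OK' if ok else 'FAIL'} {detail}"


# Usage: python poll.py DEVICE_ID VARIABLE ACCESS_TOKEN >> poll.log
if __name__ == "__main__" and len(sys.argv) == 4:
    print(log_line(*poll_once(variable_url(*sys.argv[1:4]))))
```

Run from cron every minute with the output appended to a file, the first FAIL line gives a rough timestamp for when the core stopped answering.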

Dave O

You just rocked my world with this find. This will actually be perfect for another project I have in my queue.

@dorth Thanks for posting the python script!

I have flashed the code from https://github.com/spark/core-firmware/blob/feature/debug-cfod/build/core-firmware.bin to my core.

The script is running at intervals of 1 minute. Hopefully it all turns out well with no disconnection :smiley:

Will keep you guys updated!

Awesome! I know a few other members here have had nothing but positive experiences with their displays. The backpacks are actually rather beefy little 64MHz PIC micros, so it handles all the actual drawing functions. You can just tell it “draw box of this size at these coordinates” and it handles the rest, which takes a ton of math off your device.

Oh, another cool feature is all of the displays (sans the color OLED) feature between 5 and 8 extra I/O ports! You use the DOUT command followed by a byte. (Each bit of the byte controls the state of each port.)
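As a rough illustration of the "each bit controls a port" idea, this Python sketch packs a list of on/off states into the single byte that follows DOUT. The bit order (port 0 in the least significant bit) and the helper name `dout_frame` are assumptions, so check the Digole manual for the actual mapping.

```python
def dout_frame(port_states):
    """Build a DOUT command from a list of booleans, one per port.

    Assumes port 0 maps to bit 0 (LSB); that ordering is not confirmed here.
    """
    states = list(port_states)
    if len(states) > 8:
        raise ValueError("a single byte can only describe 8 ports")
    mask = 0
    for bit, on in enumerate(states):
        if on:
            mask |= 1 << bit  # set the bit for every port that should be high
    return b"DOUT" + bytes([mask])


print(dout_frame([True, False, True]))  # ports 0 and 2 high
# b'DOUT\x05'
```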

Finally, you can pass raw data to the LCD/OLED controller with the MCD (Manual Command) and MDT (Manual Data) commands.

They really are neat little devices and very fairly priced, too!

Another (simpler) method is just to use a curl call (see http://docs.spark.io/#/api/basic-functions) in a shell loop that cycles every 10 seconds or so. That should pretty much replicate what the Python script is doing. I ended up with the script because I had wanted to log a bunch of data (1-wire temperature readings), but kept CFOD-ing the core.

Cycling every 10 seconds definitely reduces the core running time until CFOD. Mine will not run over 30 minutes when polling this frequently.

I also noticed (and I believe someone else mentioned) that the CFOD is interrupted every 2 minutes by 2 RED flashes, then goes back to CFOD. Just figured I’d mention that, as this is the first time I noticed it.

Dave O

@dorth I’m running it at 1 minute intervals for 3 hours now since earlier on without any issues yet. Will keep it running till the first CFOD occurs XD

Been running the python script on my laptop since: 19/01/14 11:37:07 AM, UTC +8 (9 hours and counting)

Time between each API call: 1 minute
Location: Singapore
Internet connection: 200 Mbps fibre broadband
Router: ASUS RT-N56U
Authentication Method: WPA-Personal
WPA Encryption: TKIP
Wireless mode: Auto
b/g protection: Auto
Channel bandwidth: 20/40 MHz
Channel: 6

I checked the log and there has not been a failure up till the time I’m posting this.

Running the Core in my room and router outside. Approx 5-7m apart.

My idea is that polling the Core at shorter intervals should not see a CFOD as a connection is kept alive with the polling, indirectly keeping the connection between the Core and Cloud.

Leaving it to run till tomorrow morning (before going for class) and see the results. I’ll probably run the python script at longer intervals to verify the results further this week! :smiley:

Another testing method I have in mind is to:

  1. Connect the core to the cloud after a power cycle
  2. Turn off your router (or maybe unplug the WAN cable) and see if the Core still shows a cyan colour

Maybe someone can tell us the results, because I can’t do the test while I’m running the python script testing!