Core losing firmware overnight?

I have been trying to run the same sketch on my Core for a couple of days now. It compiles and flashes fine, but sometime overnight it just stops working: I receive an “Error: Function not found” when I try to call my exposed function from the cloud API. If I simply re-flash the firmware from the web IDE, it gets back up and running again. I haven’t been able to write a script to monitor for exactly when it happens. The sketch clocks in at 945 lines (with liberal spacing for readability).

I created a quick Gist at https://gist.github.com/wgbartley/8301007.

Has anyone noticed similar problems?

I haven’t tried running overnight yet… I just see the constant flashing cyan issue sometimes if I’m blocking in code, and the constant magenta issue when reflashing. I haven’t had time to try and figure those out, so I just hit reset or factory reset to get back up and running. While there are issues, thank the Spark Gods for factory reset that always seems to work in a jam (fingers crossed)! Really good forward thinking.

Looking at your code… looks very nice! I like all of the routines you’ve implemented and your command parser. You might want to add the A,B,C constants, ADCrefV and MAX_ADC_VAL to your Thermistor constructor… that would be a very nice library for simple temperature measurement requirements.

Also looking at the code, I’m not sure if any of your sensor code is constantly running in the background, or just on demand when you call your Spark.function(), but one thing to look for is counters that are not defined with a large enough data type. That can easily cause delayed issues as they count up and overflow… not sure if you have any of that.
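To make the overflow concern concrete, here is a minimal sketch of how a too-small counter produces a delayed failure, and of a rollover-safe way to compare millisecond timestamps. The function names `tickSeconds` and `intervalElapsed` are hypothetical and are not taken from the Gist.

```cpp
#include <cstdint>

// Hypothetical illustration of the failure mode: a 16-bit counter
// incremented once per second wraps back to 0 after 65536 ticks,
// which is roughly 18.2 hours of uptime.
uint16_t tickSeconds(uint16_t counter) {
    return static_cast<uint16_t>(counter + 1);
}

// Hypothetical helper: true once interval_ms has elapsed since start_ms.
// Unsigned subtraction wraps modulo 2^32, so the comparison stays correct
// even after a 32-bit millisecond counter rolls over (about every 49.7 days).
bool intervalElapsed(uint32_t now_ms, uint32_t start_ms, uint32_t interval_ms) {
    return static_cast<uint32_t>(now_ms - start_ms) >= interval_ms;
}
```

A bug of the first kind would only show up after many hours, which is the sort of delay described above. Whether anything like it exists in the sketch is not confirmed here.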

You also might want to let it run over night with just one sensor at a time to narrow it down…

It’s the first C class that I have ever written, so it was mostly thrown together as a learning experiment. When we can get all of the kinks worked out, I’ll definitely flesh it out even more. I’ll also do the same for the Photocell/LDR. Thanks for the tips!

There really aren’t many global variables/counters, but I’ll double-check the OneWire and DHT stuff to make sure those things aren’t overflowing somewhere.

I didn’t actually do any polling last night. And my poor little Raspberry Pi locked up at some point the night before, so polling from that stopped around 2:00 AM. It probably wouldn’t hurt to pare it down and add the sensors back in on a night-by-night basis. I’ll also write a poller/monitor on a stable server (as opposed to the RPi).

Thanks for the tips! I’ll report back as I progress.

I’ve pared it down to just the thermistor for now and am polling for the value from my hosted server every minute. I’m also posting it to my Graphite install because you can’t just let good data go to waste!

I’m at 19 hours or so (which qualifies as overnight to me). No hiccups at all. I’ll add in the DS18B20 and let her run until tomorrow, or if I catch it having issues, whichever comes first.

Ran all night with the DS18B20. Next up is the DHT22!

I just realized that the way I’m polling the DHT22 is somewhat flawed. If I fire off a call to DHTHUMIDITY and then DHTTEMP back-to-back, it’s possible (probable?) that it will poll the sensor within the same 2-second window, which is warned against in everything I’ve read about the DHT22. I wouldn’t think it would cause the Spark to crash, but I’m going to fix it so that it only polls the sensor if the data is older than 2 seconds.
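A rough sketch of that fix, assuming both values are cached from a single sensor read and only refreshed once the cache is at least 2 seconds old. The `DhtCache` struct and helper names are made up for illustration, and the actual sensor read is left to whatever DHT code the sketch already uses.

```cpp
#include <cstdint>

// Hypothetical cache holding the most recent DHT22 reading.
struct DhtCache {
    float temperature = 0.0f;
    float humidity = 0.0f;
    uint32_t lastReadMs = 0;
    bool valid = false;
};

// Minimum spacing between DHT22 reads, per the 2-second guidance.
const uint32_t DHT_MIN_INTERVAL_MS = 2000;

// True when there is no cached reading yet, or the cached one is
// at least 2 seconds old. Unsigned subtraction keeps this correct
// across a rollover of the millisecond counter.
bool needsRefresh(const DhtCache& cache, uint32_t nowMs) {
    return !cache.valid ||
           static_cast<uint32_t>(nowMs - cache.lastReadMs) >= DHT_MIN_INTERVAL_MS;
}

// Stores both values from one sensor read along with its timestamp.
void storeReading(DhtCache& cache, float temperature, float humidity, uint32_t nowMs) {
    cache.temperature = temperature;
    cache.humidity = humidity;
    cache.lastReadMs = nowMs;
    cache.valid = true;
}
```

Both the DHTTEMP and DHTHUMIDITY handlers would check `needsRefresh(cache, millis())` first, read the sensor only if it returns true, and then answer from the cache, so back-to-back calls share one read.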

DHT22 temperature was also successful. Time for the final puzzle piece of polling DHT22 humidity.

Following your posts… so far so good! Keep at it :wink:

Looking good! What are you using to log and graph the data?

I’m now storing the data using Graphite. I currently have a simple shell script that runs once per minute to poll the sensors and then push that data into Graphite via a custom web service that I wrote.

Now that I think about it, I could possibly use the TCP client stuff to push directly into Graphite without the need for shell scripts or web services.
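For reference, Graphite's plaintext protocol expects one line per sample in the form `<metric path> <value> <timestamp>` followed by a newline, sent to the Carbon listener (port 2003 by default). Here is a minimal sketch of building such a line. The function name and the metric path in the usage note are placeholders, and it assumes the Core has a Unix timestamp available from somewhere.

```cpp
#include <cstdio>
#include <string>

// Formats one sample as a Graphite plaintext line:
//   "<metric path> <value> <unix timestamp>\n"
// The value is written with two decimal places. Output longer than the
// buffer would be truncated by snprintf, so keep metric paths short.
std::string graphiteLine(const std::string& path, double value, long timestamp) {
    char buf[128];
    std::snprintf(buf, sizeof(buf), "%s %.2f %ld\n", path.c_str(), value, timestamp);
    return std::string(buf);
}
```

For example, `graphiteLine("home.core.thermistor", 21.5, 1389943751L)` yields `home.core.thermistor 21.50 1389943751` plus a trailing newline, which could then be written to an open TCP connection.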

It’s still alive! The only changes left to test are the LDR class (even though it was never called from my original poller) and the DHT fix. I’ll add the LDR back in (and poll for it this time).

Even with the LDR, things ran fine overnight! I’m going to re-flash the original firmware that was disappearing and see if it’s still broken.

It’s kinda fun to see the sensors gradually being added back into the code.

Ah ha! It finally died at 2014-01-17 07:29:11, so after about 6 days. I just re-flashed it from the web build IDE with no problems and it’s back up. Is there a remote reset that could be issued? A function call to a non-responsive Spark would be, well, non-responsive.

It’s a cron job making 4 function calls per minute, and every response value is logged in a giant text file (67,837 lines and growing). I should probably log to sqlite to be able to query better for more detailed analytics!