Case of the Missing Variables

I am experiencing a problem where my core suddenly gets into a state where it exposes no variables or functions. The core is still flash-able in this state, but the missing variables are causing data loss for my data monitoring project!

I am running a script that contacts the core every 5 minutes to collect data. The requests sometimes time out (I get 408s), in which case I retry in 30 seconds, and things usually come back within a couple of minutes. It will work fine for hours and then I start getting 404s for all variable requests. The core is still breathing cyan. Resetting the core brings things back, but it is annoying to have to constantly monitor my data monitoring! When it is in this state I am still able to flash the Core.

What would cause the Core to get into this state!?!

Response from https://api.spark.io/v1/devices/COREID?access_token=ACCESS_TOKEN:

{
  "id": "53ff6c066667574848242567",
  "name": "SoftCore",
  "connected": true,
  "variables": {},
  "functions": [],
  "cc3000_patch_version": "1.29"
}

Log from Script

2014-10-14T01:17:58.981000, 49.8, 20.6, 70.0, 20.8, 161
Sleeping 300…
Reading data from sensor…
2014-10-14T01:23:03.247000, 50.4, 20.4, 70.0, 20.9, 6590
Sleeping 300…
Reading data from sensor…
2014-10-14T01:28:07.076000, 51.9, 20.7, 70.0, 20.7, 6753
Sleeping 300…
Reading data from sensor…
Error reading data: HTTP Error 408: Request Time-out
Sleeping 30…
Reading data from sensor…
Error reading data: HTTP Error 408: Request Time-out
Sleeping 30…
Reading data from sensor…
Error reading data: HTTP Error 408: Request Time-out
Sleeping 30…
Reading data from sensor…
Error reading data: HTTP Error 408: Request Time-out
Sleeping 30…
Reading data from sensor…
2014-10-14T01:37:11.009000, 56.3, 20.8, 71.0, 20.6, 171
Sleeping 300…
Reading data from sensor…
Error reading data: HTTP Error 404: Not Found
Sleeping 30…
Reading data from sensor…
Error reading data: HTTP Error 404: Not Found
Sleeping 30…
.
.
.
It never recovers from here…

Hi @ubermensch! I am guessing that there’s some interaction between the code you wrote and the background code that handles communications. Would you mind sharing your code, either by pasting it here or by putting it in a gist and posting the link?

What type of interaction could it be? Is this a known failure mode? In what circumstances does it happen? Have you ever seen this issue before? It would seem odd that code that has been running for hours could suddenly remove variables when there isn’t even an API for removing variables.

Here is the code: https://gist.github.com/aznel/10cf0690f7cd6168828e

Hope it helps.

In the gist, it looks like you intended to have curly braces around the indented code on lines 108-118.

That shouldn’t cause the variables to disappear, though; it would only mean tempC and humidity get set to 999 when they shouldn’t be.
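For illustration, the missing braces would produce something like the following (a hypothetical reconstruction, not the gist’s exact code; the error check is a stand-in):

// Without braces, only the first indented statement is guarded by the
// if; the second runs unconditionally on every pass.
if (result != DHT_GOOD)   // stand-in condition, not the gist's actual check
    tempC = 999;          // intended: sentinel only on a bad reading
    humidity = 999;       // actually: runs every time, clobbering good data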

I don’t see anything in that particular part of your code that should have any effect on the variables. This is not a known failure mode. As far as what circumstances cause it to happen, you’ll have to gather some data and let us know! :smile: I’ll ask the cloud team whether they can think of any way this could happen in the cloud.

Hi @ubermensch,

It looks like we’re also talking about this over email, so I’ll post my response here:

It seems like the program you’ve been running registers more than 4 Spark variables (I counted 8); I believe 4 is the limit. I’ve also noticed your core is frequently disconnecting and reconnecting, and that sometimes there is significant lag to and from your core. Is it possible your internet connection is dropping out?

I also noticed you’re using the DHT22 library, which can be tricky if you’re using more than one sensor. Which version of the library are you using? Any chance you could share your full code so we can try to reproduce and debug the issue here?

EDIT: Oops, just noticed the gist. Reading… I don’t believe that was the original full code, was it?

Thanks,
David

I suggest trying this also:

Thanks for the link @Rockvole.

@ubermensch After some discussion with the Spark team, it sounds much more likely to be a firmware issue. Also, the number of variables allowed is ten, as USER_VAR_MAX_COUNT shows in the link below, so your 8 should be fine.

If anyone else encounters this issue, do let us know.

https://github.com/spark/firmware/blob/master/inc/spark_utilities.h#L47

Registering too many variables should not cause the variables to exist for 16 hours and then suddenly disappear. The same is true for the internet connection dropping out. Also, if either of these were the cause of the problem, wouldn’t you have seen this situation before?

The code posted is the minimal code that I have confirmed exhibits this bug. I am running the version of the DHT22 library that the Web IDE provides. I have now run this locally and do see that the DHT22 code may be triggering the problem. The code is hanging after "Retrieving information from DHT22 sensor", presumably in the "while (DHT22.acquiring());" loop. However, this is a different failure mode from my own test program that simply runs an infinite loop. In that case the core is not flash-able and is reported as offline, which is a reasonable response from the API.
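If the hang really is in that loop, a bounded wait would at least keep the rest of the firmware responsive. A minimal sketch, assuming DHT22.acquiring() is the library’s polling call; the 2-second budget is my assumption, not a library constant:

unsigned long started = millis();
while (DHT22.acquiring()) {
    if (millis() - started > 2000) {
        // The sensor never finished; give up and treat this as a failed
        // reading instead of blocking loop() (and the cloud) forever.
        break;
    }
}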

I suspect the problem is related to the interrupt handling in the DHT22 code. Is there a synchronous version of the library available?

Heya @ubermensch,

The cloud asks your core for its list of variables/functions twice: once about 1.25 seconds after connecting, and again about 60 seconds later if the first attempt failed. If you’re blocking during setup, your firmware will miss the first call, and if you have random delays scattered elsewhere, you might occasionally miss the second call too. The symptoms would be erratic if your program’s delays and timings are a bit unpredictable. If you move your Spark.variable calls to the top of your setup function and don’t block in that function, you should be in good shape.
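In other words, something shaped like this (a minimal sketch; the variable names are illustrative, not the gist’s):

double tempC = 0;
double humidity = 0;

void setup() {
    // Register cloud variables first, before any blocking work, so the
    // cloud's variable/function query ~1.25 s after connect is answered.
    Spark.variable("tempC", &tempC, DOUBLE);
    Spark.variable("humidity", &humidity, DOUBLE);

    // Slow sensor/serial initialization goes after registration.
}

void loop() {
    // Keep delays short and waits bounded so the core can also service
    // the cloud's retry ~60 seconds later if the first query was missed.
}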

I think the original DHT22 library was synchronous… lemme see if I can dig it up…

I hope that helps!

Thanks,
David

You might want to try the newer one: