Detecting a disconnected core

richlyon · March 10, 2014, 4:58pm

GET /v1/devices/{DEVICE_ID} returns {'connected':True} even when the device is powered off.

How do I detect through the API that the device is disconnected?

kennethlimcp · March 11, 2014, 12:19am

Did you try https://api.spark.io/v1/devices/{device_id}?access_token=XXXXXXXXXXXXXX ?

Mine shows false when the cores are offline. I believe the results are polled when you do the API call so it should not have the behavior you mentioned.

However, it’s good that you pointed it out, so we can see if there’s an issue.

Can you give it another shot?

@Dave, I’d like to bring to your attention that this particular call takes about 15-20 s to respond:

https://api.spark.io/v1/devices/device_id?access_token=xxxxxxx

It is fast for:

https://api.spark.io/v1/devices?access_token=xxxxxxx

I got a time-out using an online API requester tool, and a slow response when making the request directly in my browser.

richlyon · March 11, 2014, 8:01am

Hi - thanks for the suggestion. My URL is:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

and the response is:

{
  "id": "48ff71065067555059281587",
  "name": "mutant_jetpack",
  "connected": true,
  "variables": {
    "temperature": "double",
    "voltage_temp": "int32",
    "TEMP_LOW": "double",
    "TEMP_HIGH": "double",
    "TEMP_MIN": "double",
    "TEMP_MAX": "double"
  },
  "functions": [
    "program"
  ]
}

I’m using Postman, and I get a response in around 240 ms.

kennethlimcp · March 11, 2014, 8:13am

So does it still report core connected as true even if you turn off the power?

richlyon · March 11, 2014, 8:16am

Interesting. I left it powered off overnight and ran the request this morning. It returned “connected:true” in 240 ms.

I ran the request again after I saw your reply (about 15 minutes after the first call) with it still disconnected, and now it shows “connected:false” with a 30 second response delay.

So the API does return ‘connected:false’ as expected, but there seem to be issues with caching and response time.

kennethlimcp · March 11, 2014, 8:28am

Did you use the exact same api request I posted?

I’m thinking it shouldn’t be happening since the API request checks before returning the result and shouldn’t be cached.

I wasn’t able to replicate it either…

@dave would need to give us more info

richlyon · March 11, 2014, 8:30am

I used the api request I posted, which looks identical to yours

kennethlimcp · March 11, 2014, 8:42am

Yours has no access token, or maybe it’s just a typo.

richlyon · March 11, 2014, 9:46am

I don't understand. I'm using:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

The access token is the part after ? that says access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

Obviously I'm redacting my own access token when posting here for security reasons.

Am I missing something?

bko · March 11, 2014, 11:12am

Hi @richlyon

I agree that the caching of state in the cloud is a bit mysterious, right now.

If you really need the state right now, what about using a challenge/response? You could have a Spark.function that takes a short string and returns the sum of that string as uint8_t’s? Spark.functions always take a String and return an int.
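For the client side of that challenge/response idea, here is a minimal Python sketch. It assumes the firmware function returns the plain sum of the string's bytes without wrapping at 255; the helper names `make_challenge`, `expected_sum` and `core_answered` are made up for illustration, and the call to the core itself is left out.

```python
import secrets
import string


def make_challenge(length=8):
    """Return a short random ASCII string to send as the function argument."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def expected_sum(challenge):
    """Sum of the challenge's bytes, each treated as an unsigned 8-bit value.

    Assumption: the firmware adds the bytes into an int and does not wrap.
    """
    return sum(challenge.encode("ascii"))


def core_answered(challenge, return_value):
    """True only if the core sent back the sum expected for this challenge."""
    return return_value == expected_sum(challenge)
```

Because the challenge changes on every call, a stale or cached answer would not match, so a correct sum suggests the core really handled the request just now.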

richlyon · March 11, 2014, 1:24pm

Thank you. My code locks up for a minute or so when I try to interrogate the core while it is offline. I thought I’d try to detect its state from the Cloud and abandon script initialisation if it is down, so challenge and response won’t help.

Dave · March 11, 2014, 3:12pm

@Dave here, happy to chime in and de-mystify the cloud -> core “are you online” message.

tl;dr: use “list devices” for a quick “is online” check

When you ask the API to see if your core is online, it checks to see if the server thinks it has a socket open to your core. When your core is connected and active, the response should be fairly quick; it’s when your core isn’t connected that things get interesting.

If your core disconnects with a broken socket, and doesn’t send a FIN packet, the socket on the server can stay open for a while, leading the API to think your core is still online. If your core reconnects quickly, the old socket is cleaned up right away. We decided to make core heartbeats optional on the server (requiring them would avoid this problem), so the core could be more flexible about how often it keeps its connection alive.

It also depends on which endpoint you’re checking. If you hit your “list devices” endpoint, the API is doing a ‘socket only’ check, and the timeouts are very short:

**fast-er**
https://api.spark.io/v1/devices/

If you’re doing a “device state” check, then the server needs to reach out to your core and see what variables / functions it has available. In this case, if the socket is half-disconnected, or your core is entirely offline, the cloud waits as long as the maximum reasonable lag would be to hear back from your core, about 15-20 seconds.

**slow-er when a core is offline**
https://api.spark.io/v1/devices/my_core_id_or_name

When I implemented variable list and function list on the CLI, I first checked the list devices endpoint, and then only checked on cores that reported being online to get the quickest response.
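A rough Python sketch of that two-step check follows. It is not the CLI's actual code. It assumes the list endpoint returns a JSON array of objects with `id` and `connected` fields, like the single-device response; the helper names and the 5 s / 30 s timeouts are illustrative choices.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.spark.io/v1/devices"


def online_ids(devices):
    """Return the ids of devices whose 'connected' flag is true."""
    return [d["id"] for d in devices if d.get("connected")]


def get_json(url, access_token, timeout):
    """GET a URL with the access token as a query parameter; parse the JSON body."""
    query = urllib.parse.urlencode({"access_token": access_token})
    with urllib.request.urlopen(f"{url}?{query}", timeout=timeout) as resp:
        return json.load(resp)


def describe_online_cores(access_token, fetch=get_json):
    """List devices first, then fetch details only for cores reported online."""
    devices = fetch(API_BASE, access_token, 5)  # quick socket-only check
    return {
        device_id: fetch(f"{API_BASE}/{device_id}", access_token, 30)
        for device_id in online_ids(devices)
    }
```

The slow per-device request is skipped for anything the list reports as offline, but the result is only as accurate as the server's view of the socket.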

Thanks!
David

richlyon · March 11, 2014, 3:49pm

David - the server is still reporting online cores at https://api.spark.io/v1/devices/ 15 minutes after pulling the plug on it.

You have a mechanism for handling when a core disconnects itself. There isn’t a mechanism for handling when a core is disconnected (physically, or by network interruption).

That’s a bit awkward - I’m relaying Spark data via a Python script to a database. That means I’m relaying corrupt data long after the core has gone down, because I can’t detect that it has.

bko · March 11, 2014, 3:55pm

Hi @richlyon

I thought that this was why the cloud includes a “last_heard” time stamp for variables. Without it, you wouldn’t know when the cloud’s copy of a variable was last updated, even when everything was working right.

{
  "cmd": "VarReturn",
  "name": "temperature",
  "result": 79.8125,
  "coreInfo": {
    "last_app": "",
    "last_heard": "2014-03-11T15:51:01.380Z",
    "connected": true,
    "deviceID": "<something>"
    }
}
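As a sketch of how a relaying script might use that field, the Python below treats a reading as stale when “last_heard” is older than a threshold. The 60-second default and the function names are arbitrary choices for illustration, and it assumes the timestamp always has the fractional-seconds UTC format shown above.

```python
from datetime import datetime, timedelta, timezone


def parse_last_heard(value):
    """Parse a timestamp such as '2014-03-11T15:51:01.380Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(var_response, now=None, max_age=timedelta(seconds=60)):
    """True if the core was last heard from more than max_age ago."""
    now = now or datetime.now(timezone.utc)
    last_heard = parse_last_heard(var_response["coreInfo"]["last_heard"])
    return now - last_heard > max_age
```

A script could skip writing to the database whenever `is_stale(response)` is true, whatever the “connected” flag says.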

Dave · March 11, 2014, 4:02pm

Hmm. Well, I don’t want to require a strict 15 second heartbeat from the core on the server side, since that might cause undesired interactions, and I don’t want to start closing sockets on cores that are just being very quiet.

I’m surprised this hasn’t been an issue yet, but I could incorporate a server socket test when the socket has been completely quiet for more than a minute. This would mean the server would detect a broken socket within 1-2 minutes, and it shouldn’t have any side effects. Thoughts?

Dave · March 11, 2014, 4:47pm

If nothing else, checking for the value of a variable, or calling a function on your core will definitely tell you if it’s online or not.

Thanks!
David

richlyon · March 11, 2014, 8:01pm

Dave - I can understand. And bko - I didn’t notice the ‘last_heard’ field and that will work. Thank you both.
