Detecting a disconnected core

GET /v1/devices/{DEVICE_ID} returns {'connected':True} even when the device is powered off.

How do I detect through the API that the device is disconnected?

@richlyon

Did you try https://api.spark.io/v1/devices/{device_id}?access_token=XXXXXXXXXXXXXX ?

Mine shows false when the cores are offline. I believe the results are polled when you do the API call so it should not have the behavior you mentioned.

However, it’s good that you pointed this out, and we can see if there’s an issue.

Can you give it another shot? :smiley:

@Dave, I would like to bring to your attention that this particular call takes around 15-20 s to respond:

https://api.spark.io/v1/devices/device_id?access_token=xxxxxxx

It is fast for:

https://api.spark.io/v1/devices?access_token=xxxxxxx

I got a time-out using an online API requester tool, and a slow response when requesting it directly in my browser.

Hi - thanks for the suggestion. My URL is:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

and the response is:

{
  "id": "48ff71065067555059281587",
  "name": "mutant_jetpack",
  "connected": true,
  "variables": {
    "temperature": "double",
    "voltage_temp": "int32",
    "TEMP_LOW": "double",
    "TEMP_HIGH": "double",
    "TEMP_MIN": "double",
    "TEMP_MAX": "double"
  },
  "functions": [
    "program"
  ]
}

I’m using Postman, and I get a response in around 240 ms.

So does it still report core connected as true even if you turn off the power?

Interesting. I left it powered off overnight and ran the request this morning. It returned “connected:true” in 240 ms.

I ran the request again after I saw your reply (about 15 minutes after the first call) with it still disconnected, and now it shows “connected:false” with a 30 second response delay.

So the API does return ‘connected:false’ as expected, but there seem to be issues with caching and response time.

Did you use the exact same api request I posted?

I’m thinking it shouldn’t be happening since the API request checks before returning the result and shouldn’t be cached.

I wasn’t able to replicate it either…

@dave would need to give us more info :slight_smile:

I used the api request I posted, which looks identical to yours

Yours has no access token, or maybe that’s just a typo :slight_smile:

I don’t understand. I’m using:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

The access token is the part after ? that says access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

Obviously I’m redacting my own access token when posting here for security reasons.

Am I missing something?

Hi @richlyon

I agree that the caching of state in the cloud is a bit mysterious, right now.

If you really need the state right now, what about using a challenge/response? You could have a Spark.function that takes a short string and returns the sum of that string as uint8_t’s? Spark.functions always take a String and return an int.
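On the client side, checking such a challenge/response could look roughly like the sketch below. It assumes a hypothetical Spark.function on the core that returns the sum of the bytes of the string it receives; the helper names `expected_response` and `core_answered` are made up for illustration, and the integer the core returned is passed in however your script retrieves it.

```python
def expected_response(challenge):
    """Sum the byte values of the challenge string.

    Assumption: the (hypothetical) core-side function computes the same
    sum over the characters it receives.
    """
    return sum(challenge.encode("ascii"))


def core_answered(challenge, return_value):
    """Return True if the integer the core sent back matches the challenge."""
    return return_value == expected_response(challenge)
```

Sending a different short string on every check makes it harder for a stale or cached reply to pass as a live one.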

Thank you. My code locks up for a minute or so when I try to interrogate the core when it is offline. I thought I’d try to detect its state from the Cloud and abandon script initialisation, so challenge and response won’t help.

@Dave here, happy to chime in and de-mystify the cloud -> core “are you online” message.

tl;dr: use “list devices” for a quick “is online” check

When you ask the API to see if your core is online, it checks to see if the server thinks it has a socket open to your core. When your core is connected and active, the response should be fairly quick; it’s when your core isn’t connected that things get interesting.

If your core disconnects with a broken socket, and doesn’t send a FIN packet, the socket on the server can stay open for a while, leading the API to think your core is still online. If your core reconnects quickly, the old socket is cleaned up right away. We decided to make the core heartbeats optional on the server (which would avoid this problem), so the core could be more flexible about how often it keeps its connection alive.

It also depends on which endpoint you’re checking. If you hit your “list devices” endpoint, the API is doing a ‘socket only’ check, and the timeouts are very short:

**fast-er**
https://api.spark.io/v1/devices/

If you’re doing a “device state” check, then the server needs to reach out to your core and see what variables / functions it has available. In this case, if the socket is half-disconnected, or your core is entirely offline, the cloud waits as long as the maximum reasonable lag would be to hear back from your core, about 15-20 seconds.

**slow-er when a core is offline**
https://api.spark.io/v1/devices/my_core_id_or_name

When I implemented variable list and function list on the CLI, I first checked the list devices endpoint, and then only checked on cores that reported being online to get the quickest response.
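A minimal Python sketch of that two-step approach follows. It is not the CLI’s actual code: the helper names are hypothetical, and it assumes the “list devices” endpoint returns a JSON array of objects that each carry an `id` and a `connected` field.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.spark.io/v1/devices"


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def online_ids(device_list):
    """Return the ids of devices the cloud currently reports as connected.

    Assumption: each entry is a dict with "id" and "connected" keys.
    """
    return [d["id"] for d in device_list if d.get("connected")]


def details_for_online_devices(access_token, fetch=fetch_json):
    """Hit the fast list endpoint first, then the slower per-device
    endpoint only for devices that were reported online."""
    devices = fetch(f"{API_BASE}?access_token={access_token}")
    return {
        device_id: fetch(f"{API_BASE}/{device_id}?access_token={access_token}")
        for device_id in online_ids(devices)
    }
```

The `fetch` parameter is there only so the logic can be exercised without network access; in normal use you would call `details_for_online_devices("<your token>")`.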

Thanks!
David


David - the server is still reporting online cores at https://api.spark.io/v1/devices/ 15 minutes after pulling the plug on it.

You have a mechanism for handling when a core disconnects itself. There isn’t a mechanism for handling when a core is disconnected (physically, or by network interruption).

That’s a bit awkward - I’m relaying spark data via a python script to a database. That means I’m relaying corrupt data long after the core has gone down, because I can’t detect that it has.

Hi @richlyon

I thought that this was why the cloud includes a “last_heard” time stamp for variables. Without this, you wouldn’t know when the cloud’s copy of the variable was last updated, even when everything was working right.

{
  "cmd": "VarReturn",
  "name": "temperature",
  "result": 79.8125,
  "coreInfo": {
    "last_app": "",
    "last_heard": "2014-03-11T15:51:01.380Z",
    "connected": true,
    "deviceID": "<something>"
    }
}
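In a Python relay script, a staleness check on that field might look like the sketch below. The function names and the two-minute threshold are assumptions chosen for illustration; the timestamp format is taken from the response above.

```python
from datetime import datetime, timedelta, timezone


def parse_last_heard(value):
    """Parse a timestamp such as '2014-03-11T15:51:01.380Z' as aware UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(response, max_age=timedelta(minutes=2), now=None):
    """Return True if the core was last heard from more than max_age ago.

    `response` is the decoded JSON of a variable request; `max_age` is an
    arbitrary threshold, so tune it to how often your core talks to the cloud.
    """
    now = now or datetime.now(timezone.utc)
    last_heard = parse_last_heard(response["coreInfo"]["last_heard"])
    return now - last_heard > max_age
```

A relay script could skip writing to the database whenever `is_stale(response)` is true, regardless of what `connected` says.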

Hmm. Well, I don’t want to require a strict 15 second heartbeat from the core on the server side, since that might cause undesired interactions, and I don’t want to start closing sockets on cores that are just being very quiet.

I’m surprised this hasn’t been an issue yet, but I could incorporate a server socket test when the socket has been completely quiet for more than a minute. This would mean the server would detect a broken socket within 1-2 minutes, and it shouldn’t have any side effects. Thoughts?

If nothing else, checking for the value of a variable, or calling a function on your core will definitely tell you if it’s online or not.

Thanks!
David

Dave - I can understand. And bko - I didn’t notice the ‘last_heard’ field and that will work. Thank you both.