Detecting a disconnected core

GET /v1/devices/{DEVICE_ID} returns {'connected':True} even when the device is powered off.

How do I detect through the API that the device is disconnected?

@richlyon

Did you try https://api.spark.io/v1/devices/{device_id}?access_token=XXXXXXXXXXXXXX ?

Mine shows false when the cores are offline. I believe the results are polled when you do the API call so it should not have the behavior you mentioned.

However, it's good that you pointed this out, and we can see if there's an issue.

Can you give it another shot? :smiley:

@Dave, I would like to bring to your attention that this particular call takes 15-20 s to return a response:

https://api.spark.io/v1/devices/device_id?access_token=xxxxxxx

It is fast for:

https://api.spark.io/v1/devices?access_token=xxxxxxx

I got a timeout using an online API requester tool, and a slow response when making the request directly in my browser.

Hi - thanks for the suggestion. My URL is:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

and the response is:

{
  "id": "48ff71065067555059281587",
  "name": "mutant_jetpack",
  "connected": true,
  "variables": {
    "temperature": "double",
    "voltage_temp": "int32",
    "TEMP_LOW": "double",
    "TEMP_HIGH": "double",
    "TEMP_MIN": "double",
    "TEMP_MAX": "double"
  },
  "functions": [
    "program"
  ]
}

I'm using Postman, and I get a response in around 240 ms.

So does it still report core connected as true even if you turn off the power?

Interesting. I left it powered off overnight and ran the request this morning. It returned "connected:true" in 240 ms.

I ran the request again after I saw your reply (about 15 minutes after the first call) with it still disconnected, and now it shows "connected:false" with a 30 second response delay.

So the API does return 'connected:false' as expected, but there seem to be issues with caching and response time.

Did you use the exact same api request I posted?

I'm thinking it shouldn't be happening, since the API request checks before returning the result and shouldn't be cached.

I wasn't able to replicate it either...

@dave would need to give us more info :slight_smile:

I used the api request I posted, which looks identical to yours

Yours has no access token, or maybe that's just a typo :slight_smile:

I don't understand. I'm using:

https://api.spark.io/v1/devices/48ff71065067555059281587?access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

The access token is the part after ? that says access_token=XXXXXXXXXXXXXXXXXXXXXXXXX

Obviously I'm redacting my own access token when posting here for security reasons.

Am I missing something?

Hi @richlyon

I agree that the caching of state in the cloud is a bit mysterious, right now.

If you really need the state right now, what about using a challenge/response? You could have a Spark.function that takes a short string and returns the sum of that string's characters as uint8_t's. Spark.functions always take a String and return an int.
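A minimal sketch of the client side of such a challenge/response, assuming the core's function sums the bytes of the string it receives and lets the total wrap around like a uint8_t. The helper names here are hypothetical, and the firmware side is not shown:

```python
import secrets


def make_challenge() -> str:
    """Return a short random ASCII string to send to the core."""
    return secrets.token_hex(4)  # 8 hex characters


def expected_checksum(challenge: str) -> int:
    """Sum the bytes of the challenge, wrapping at 256 like a uint8_t.

    Assumption: the core-side function computes the same sum.
    """
    return sum(challenge.encode("ascii")) % 256


def core_answered_correctly(challenge: str, return_value: int) -> bool:
    """Compare the int returned by the core's function with the expected sum."""
    return return_value == expected_checksum(challenge)
```

Because the challenge changes on every call, a matching answer can only have come from a core that was reachable at that moment, not from state cached in the cloud.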

Thank you. My code locks up for a minute or so when I try to interrogate the core while it is offline. I thought I'd try to detect its state from the Cloud and abandon script initialisation if it is down, so challenge and response won't help.

@Dave here, happy to chime in and de-mystify the cloud -> core "are you online" message.

tl;dr: use "list devices" for a quick "is online" check

When you ask the API to see if your core is online, it checks to see if the server thinks it has a socket open to your core. When your core is connected and active, the response should be fairly quick; it's when your core isn't connected that things get interesting.

If your core disconnects with a broken socket, and doesn't send a FIN packet, the socket on the server can stay open for a while, leading the API to think your core is still online. If your core reconnects quickly, the old socket is cleaned up right away. We decided to make the core heartbeats optional on the server (requiring them would avoid this problem), so the core could be more flexible about how often it keeps its connection alive.

It also depends on which endpoint you're checking. If you hit your "list devices" endpoint, the API is doing a 'socket only' check, and the timeouts are very short:

**fast-er**
https://api.spark.io/v1/devices/

If you're doing a "device state" check, then the server needs to reach out to your core and see what variables / functions it has available. In this case, if the socket is half-disconnected, or your core is entirely offline, the cloud waits as long as the maximum reasonable lag would be to hear back from your core, about 15-20 seconds.

**slow-er when a core is offline**
https://api.spark.io/v1/devices/my_core_id_or_name

When I implemented variable list and function list on the CLI, I first checked the list devices endpoint, and then only checked on cores that reported being online to get the quickest response.

Thanks!
David
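Here is a rough Python sketch of the two-step approach described above: hit "list devices" first, then only ask for the details of cores that report being online. It assumes the list endpoint returns a JSON array of objects with "id" and "connected" fields (that response isn't shown in this thread), and the function names are made up for illustration:

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.spark.io/v1/devices"


def fetch_json(url, timeout=5):
    """GET a URL and parse the body as JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def online_device_ids(devices):
    """Keep only the ids of devices the cloud reports as connected."""
    return [d["id"] for d in devices if d.get("connected")]


def describe_online_devices(access_token, fetch=fetch_json):
    """Query the fast list endpoint, then the slower per-device endpoint
    only for devices that were listed as online."""
    devices = fetch(f"{API_BASE}?access_token={access_token}")
    details = {}
    for device_id in online_device_ids(devices):
        details[device_id] = fetch(
            f"{API_BASE}/{device_id}?access_token={access_token}"
        )
    return details
```

The `fetch` parameter is only there so the logic can be exercised without touching the network. Note that this still trusts the socket-level "connected" flag, so it is quick but can be out of date for a core that dropped off without closing its socket.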


David - the server is still reporting online cores at https://api.spark.io/v1/devices/ 15 minutes after pulling the plug on it.

You have a mechanism for handling when a core disconnects itself. There isn't a mechanism for handling when a core is disconnected (physically, or by network interruption).

That's a bit awkward - I'm relaying spark data via a python script to a database. That means I'm relaying corrupt data long after the core has gone down, because I can't detect that it has.

Hi @richlyon

I thought that this was why the cloud includes a "last_heard" time stamp for variables. Without this, you wouldn't know when the cloud's copy of a variable was last updated, even when everything was working right.

{
  "cmd": "VarReturn",
  "name": "temperature",
  "result": 79.8125,
  "coreInfo": {
    "last_app": "",
    "last_heard": "2014-03-11T15:51:01.380Z",
    "connected": true,
    "deviceID": "<something>"
    }
}
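For a relay script like the one mentioned above, a staleness check on "last_heard" could look something like this. It is only a sketch: it assumes the response has the shape shown above, that "last_heard" is always UTC in that exact format, and the two-minute threshold is an arbitrary choice:

```python
from datetime import datetime, timedelta, timezone


def parse_last_heard(value):
    """Parse a timestamp such as '2014-03-11T15:51:01.380Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(response, max_age=timedelta(minutes=2), now=None):
    """Return True if the core was last heard from longer ago than max_age.

    `response` is the parsed JSON of a variable request; `now` can be
    passed in explicitly, otherwise the current UTC time is used.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    last_heard = parse_last_heard(response["coreInfo"]["last_heard"])
    return now - last_heard > max_age
```

The script could then skip writing to the database whenever `is_stale(response)` is true, regardless of what "connected" says.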

Hmm. Well, I don't want to require a strict 15 second heartbeat from the core on the server side, since that might cause undesired interactions, and I don't want to start closing sockets on cores that are just being very quiet.

I'm surprised this hasn't been an issue yet, but I could incorporate a server socket test when the socket has been completely quiet for more than a minute. This would mean the server would detect a broken socket in a 1-2 minute range, and shouldn't have any side effects. Thoughts?

If nothing else, checking for the value of a variable, or calling a function on your core will definitely tell you if it's online or not.

Thanks!
David
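If the worry is a script locking up while the cloud waits on an offline core, one option is to put a client-side timeout on the variable request and treat "no answer" as offline. A minimal sketch, assuming any HTTP error or timeout means the core didn't respond (how the cloud actually reports an offline core isn't covered in this thread); `read_variable` is a hypothetical helper:

```python
import json
import socket
from urllib.error import URLError
from urllib.request import urlopen


def read_variable(url, timeout=5, opener=urlopen):
    """Return the parsed JSON response, or None if the request fails
    or does not complete within `timeout` seconds."""
    try:
        with opener(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (URLError, socket.timeout, TimeoutError):
        # Assumption: HTTP errors and timeouts both mean "no answer".
        return None
```

A caller would pass the full variable URL, including the access token, and abandon initialisation when the result is None. The timeout value is a trade-off: too short and a slow but live core looks offline.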

Dave - I can understand. And bko - I didn't notice the 'last_heard' field and that will work. Thank you both.