Two different formats for one and the same event

I have here a very strange effect. I am new to Spark and the cloud stuff and I am experimenting with events. My partner set up a simple trigger, which fires every 10 seconds and sends alternately true and false. So far so good. I register to this event and get:

"10\r\nevent: Movement\n70\r\ndata: {"data":"true","ttl":"60","published_at":"2015-04-09T19:30:55.535Z","coreid":"xxxxxxxxxxxxxxxxxxxxxxx"}"

Fine. Easy to parse. No problem… if only this were the only format. Once in a while I get:

"80\r\nevent: Movement\ndata: {"data":"true","ttl":"60","published_at":"2015-04-09T19:31:15.533Z","coreid":"xxxxxxxxxxxxxxxxxxxxxxxx"}"

Again, no problem to parse… The question is: Why? Why do I have to anticipate two different event formats? The second form seems to arrive more or less randomly.


Just a wild guess: Is it possible that the Movement event sometimes does not send the amount of movement, resulting in an empty '70' part?

I don’t know. Hard for me to test. My partner with his Spark sits in Norway; I, on the other hand, sit in Germany. :smile:
But is such a bug even possible? Why should something like that happen randomly? And the missing 70 in front of the data seems to be added to the 10 in front of the event name at the beginning of the line. Apart from that, everything looks normal… timestamps, device IDs.

Well @tanuki
I live in Sweden :wink: So I am the man in the middle. Can you share the code? Copy and paste it into the reply, then apply formatting as explained in this post.
Ask your Norwegian friend to do the same, then we can have a look at it.

Thank you. But not today. He is a bit sick. :mask:

Hi @tanuki

If you are getting this from a subscribed core over the serial port, I think it is very likely there is some other Serial.print statement that is messing things up, like in an interrupt handler etc.

You should also check your terminal program (again assuming serial data from a core) to see if it is changing the line terminating characters. Usually there are settings for this.

I can see your Movement events in the public event stream and I logged them using curl. The published data on the web interface only has 0x0a linefeed characters (\n) and never has any 0x0d (\r) characters. I suspect very strongly that a core subscribing to these events would not get any 0x0d (\r) characters either.

I will pass this to my partner. He has access to the hardware. I am the high-level guy on the other end of the cloud. At the moment I just added a second regexp rule. Dirty, but it works for now :slight_smile:

If you are reading this from the cloud with your own program, then something is wrong with your program. When I log these events using curl, I do not see any of the problems you are having.

Strange. I see it in the raw, unchanged data I get directly out of the socket. Usually I am extremely careful with 'impossible', but in this case? Bugs in my code might skip an event, i.e. my regexp is wrong and certain lines get dropped because they don't match. But here a bug would have to rewrite a line, taking the 70 from the data and adding it to the 10 of the event name.

10\r\nevent: Movement\n70\r\ndata: {


80\r\nevent: Movement\ndata:

This is not simple corruption due to data loss. I would have to put in actual work to produce this effect.

My advice is to try it with curl and check the results.

Curl does not help here… at least not the way I use it. Is there a switch that lets me see the raw data?

When I use:

curl -H "Authorization: Bearer <my access_token>"

I get:

event: Movement
data: {"data":"false","ttl":"60","published_at":"2015-04-09T21:26:05.353Z","coreid":"51ff70065082554930061487"}

I don’t see the linebreak chars or any indices. Both forms probably look identical in curl.

OK… more info. I used curl and dumped the binary data with --trace into a file. There I was able to see exactly the same effect. So it happens not only in my program but also in curl. I think we can now rule out a problem on my side? :innocent:

Try the --verbose flag to curl.

I think I see what your problem is: the Transfer-Encoding is chunked so you are getting the chunk sizes periodically.

Does your program handle chunked HTTP connections?
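A minimal sketch of what this looks like from the receiver's side (an illustration, not the poster's actual program): if the raw socket bytes are fed straight into a line-oriented parser, the chunk-size headers surface as extra "lines" in front of the SSE fields.

```python
# Illustration only: feeding raw chunked socket bytes straight into a
# line splitter exposes the hex chunk sizes as bogus extra "lines".
raw = b"10\r\nevent: Movement\n"   # "10" is a chunk-size header (0x10 = 16 bytes)

print(raw.split(b"\n"))            # [b'10\r', b'event: Movement', b'']
```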

Yes, it handles chunked HTTP connections. As I wrote above… I now see exactly the same when I use curl and look at the raw data.

Hi @tanuki

OK, so curl and Javascript and Chrome and lots of other things work fine with this stream and don't have these problems in their output, so there is something curl is doing that your program is not doing, and you have to figure out what that is.

You understand that the chunk sizes are sent in the stream periodically, right?

I know chunked transfers. :smile:
And now, with my workaround, my program does not have problems either. The question is and was: why do I need this workaround? To me it looks like there are two different but valid protocol formats mixed in the stream. It isn't just that a chunk size gets added to the stream now and then. And if both forms really are different but valid protocol versions, it would not be surprising that curl, Javascript, or Chrome can handle them. It would only be surprising to have both versions in one and the same stream.

OK, I think I found the problem…
Looks like I now and then get some stuff that I did not expect. This really corrupted my chunk handling.


The 10 and 70 and the 80 are the chunk sizes in ASCII hex and are correct for the data being sent.

If you are asking why the chunk sizes are different rather than always being 80 or 81, I think the answer is that the encoding host is free to use whatever chunk size it wants, and the receiver has to deal with it.
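That framing can be sketched as follows (the payloads are stand-ins reconstructed from the excerpts above, so the exact byte strings are an assumption): the same two SSE lines may arrive as two chunks (0x10 + 0x70) or as one chunk, and de-chunking both yields identical data.

```python
def chunk(data: bytes) -> bytes:
    """Frame a byte string as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"

def dechunk(raw: bytes) -> bytes:
    """Minimal chunked-transfer decoder (no trailers, no chunk extensions)."""
    out, pos = b"", 0
    while pos < len(raw):
        eol = raw.index(b"\r\n", pos)
        size = int(raw[pos:eol], 16)      # chunk size is ASCII hex
        if size == 0:                     # a zero-length chunk ends the body
            break
        out += raw[eol + 2 : eol + 2 + size]
        pos = eol + 2 + size + 2          # skip the CRLF after the chunk data
    return out

event = b"event: Movement\n"              # 16 bytes -> chunk size "10"
data = b'data: {"data":"true", ...}\n'    # stand-in for the longer data line

# Sometimes the sender frames the lines as two chunks, sometimes as one:
two_chunks = chunk(event) + chunk(data)
one_chunk = chunk(event + data)

assert dechunk(two_chunks) == dechunk(one_chunk) == event + data
```

Once the chunk layer is stripped, the two observed formats are the same stream; only the chunk boundaries differ.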

Yep, I 'tailored' my code too closely to the Movement events. And yes, I got something wrong… not the initial chunk sizes, but the terminating chunk. I thought it had to be 0, but in this stream it is 1.

0000: 31 0d 0a 0a 0d 0a    1.....

This botched my algorithm a bit.
Adjusted… now I don’t need my workaround anymore.
Thanks for your help. :smile:

Hmm… a little addendum… I have been googling about the last chunk for quite a while now.
According to RFC 2616, section 3.6.1 Chunked Transfer Coding, the last chunk must be:

last-chunk = 1*("0") [ chunk-extension ] CRLF

I was looking for anything about "1*("1") [ chunk-extension ] CRLF", which is what caused my problems (this and my tunnel vision), but found nothing. Am I still overlooking something? Curl, Javascript, and Chrome don't seem to have problems with this wrong (?) last chunk.

In theory this stream never closes and sends "keep-alives", so there is no real last chunk.

I don't know how the "keep-alives" are sent; they show up as CRLF in the browser. Maybe @Dave knows if they are sent with chunk size 1.


Last mystery solved.


Note: The comment line can be used to prevent connections from timing out; a server can send a comment periodically to keep the connection alive.


Note: If a line doesn’t contain a colon, the entire line is treated as the field name, with an empty value string.

And empty value strings are ignored, so this can effectively be used as another way to keep a connection alive.
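Decoding the six bytes from the hex dump above supports this reading (a sketch of the interpretation, not anyone's actual parser): "1\r\n" is a chunk header announcing one byte, that byte is "\n" (an empty SSE line, i.e. a keep-alive), and the final CRLF just closes the chunk.

```python
# The six bytes from the earlier dump: 31 0d 0a 0a 0d 0a
raw = bytes([0x31, 0x0D, 0x0A, 0x0A, 0x0D, 0x0A])

header_end = raw.index(b"\r\n")
size = int(raw[:header_end], 16)                 # b"1" -> chunk of length 1
payload = raw[header_end + 2 : header_end + 2 + size]

assert size == 1 and payload == b"\n"
# So this is not a terminating chunk (that would be size 0) but an ordinary
# one-byte chunk carrying an empty SSE line -- the keep-alive.
```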

So, bko, you were right about the 'keep-alives'.