Websockets client!

Can you tell me how long it takes to pass a message to Spacebrew, receive it on the Spark, and parse it in your code?
I've been testing this by clicking on an HTML page with SpacebrewJS and then passing a boolean message to the Spark. The message arrives at the Spark quickly, but then it takes about half a second to parse... I'm thinking about how to speed this up...

I thought I'd merge your comment here into this thread. Yes, I'm getting the same timings as you mentioned. Details below.

I have a button wired to D3 that publishes a change in its state to Spacebrew as a new message.

I also have an LED wired to D1 that listens to incoming messages as a subscriber.


Timings and observations

Spacebrew to Spark (via HTML/JS boolean button example):

  1. Spacebrew JS button test - Pressed
  2. (0.6 seconds)
  3. serial onMessage displays valid JSON
  4. (0 seconds - almost instantaneous)
  5. LED lights up
  6. Spacebrew JS button test - Released
  7. (0.6 seconds)
  8. serial onMessage displays valid JSON

The freezes (and subsequent 'disconnects' from Spacebrew) happen when:

  • I click the HTML button rapidly to simulate a larger chunk of JSON data coming in
  • I use another Spacebrew source that generates streaming data into the Spark's subscriber

Spark to Spacebrew:

  1. Button connected to Spark - Pressed
  2. (0.4 seconds)
  3. serial onMessage displays valid JSON
  4. (0.4 seconds)
  5. Spacebrew JS button test - background turns green
  6. Button connected to Spark - Released
  7. (0.4 seconds)
  8. serial onMessage displays valid JSON
  9. (0.4 seconds)
  10. Spacebrew JS button test - background turns grey

The freezes (and subsequent 'disconnects' from Spacebrew) happen:

  • just by pressing the button; it freezes very easily most of the time

The 0.4s delay between the Spark button press and the onMessage serial output is interesting, because this delay is only present when the connection is good. There is no delay once the Spark has recovered from a freeze: serial onMessage traces are output almost instantaneously (but the Spark doesn't do anything besides tracing them over serial).

As I see it in my code, the 0x0A comes from the server, and it is the answer to a ping (i.e. the pong).

Do you understand that the CC3000 has its own little "brain"? As I understand it, it can "talk" to the cloud and hold a TCP connection by itself, and it only interrupts the main processor when it needs to (when a message arrives, for example). So either your send code is buggier than mine somewhere, or maybe you are using the function in the wrong way %) Not sure, because after reading all the send functions in the Spacebrew and websocket cpp files, they look the same as my code.

[quote="chuank, post:20, topic:3090"]
Event-triggered sends work, but not streamed data (it hangs up after the first/second send).
[/quote] What do you mean by "streamed data", continuous sending? I never tried this...

[quote="chuank, post:20, topic:3090"]
Whenever these hangs happen, my spacebrew admin page subsequently drops the Spark Core from the list of publishers/subscribers in its next refresh cycle.
[/quote] Of course; that's because it didn't get a pong back for its ping.

[quote="chuank, post:20, topic:3090"]
my spacebrew server received malformed websocket data, causing a panic on the spacebrew end? Thing is, I did not notice anything odd in my spacebrew logs.
[/quote] I'm not sure what would happen, because it's driven by the standard ws library, so I think it would just ignore bad packets.

[quote="chuank, post:20, topic:3090"]
could there have been a bug in the spacebrew server code, or just some incompatibilities in the websocket library that's messing up the communication?
[/quote] But mine works... Or do you still mean streamed data?

[quote="chuank, post:20, topic:3090"]
and yes, I am definitely moving into local USB flashing as I've pretty much had it with the long waits between each test
[/quote]Welcome! :)))

I have almost the same timings, but if you add some serial prints when the message arrives, you will see that the message arrives almost as soon as the button is pressed; it's the parsing that takes 0.4-0.5 seconds. This makes me sad, and I'm thinking about how to speed it up...

[quote="chuank, post:21, topic:3090"]
There is no delay once the Spark has recovered from a freeze: serial onMessage traces are output almost instantaneously
[/quote] Hmmm! This is interesting. It could mean that the lag is in TCPClient itself... not in the JSON parsing/building code.

Found where the lag is:

  for(int i = 0; i < len; i++) {
    _packet[i] = nextByte();
  }

I'm trying to change it to something like:

  _client.read(&_packet, len);

But there are problems with mismatched types; I'm still thinking about how to make it compatible.

Hell yeah!
If you replace this in my code:

  for(int i = 0; i < len; i++) {
    _packet[i] = nextByte();
  }

With

  uint8_t temppack[len];
  _client.read(temppack, len);

  for(int i = 0; i < len; i++) {
    _packet[i] = (char)temppack[i];
  }

You will get a much faster message-receiving process.
Let's also send this way to speed things up, in case that's where the problems are.
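
A bulk send could look roughly like this (just a sketch, assuming the same _client/_packet member names as in the library we're discussing, and the write(buffer, size) overload listed below):

  // Sketch only: replace a byte-by-byte send loop such as
  //   for(int i = 0; i < len; i++) { _client.write(_packet[i]); }
  // with a single bulk write of the whole payload:
  _client.write((const uint8_t*)_packet, len);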

Ah, by the way: I got the info about virtual int read(uint8_t *buffer, size_t size); from the file spark_wiring_tcpclient.h :smile: This file contains these interesting things:

virtual size_t write(uint8_t);
virtual size_t write(const uint8_t *buffer, size_t size);
virtual int read();
virtual int read(uint8_t *buffer, size_t size);

By the way @dave, I'm sorry to bother you, but you already answered on this post :)
Please have somebody fix the documentation for TCPClient: send(buffer, len) is undocumented!


I don't think we ever get a 0x0A from the spacebrew server, but rather, 0x09, as the ping. The library sends 0x8A out as the pong in response.

In the original Arduino Websocket library that you ported from, the outgoing pong opcode is also 0x8A. However, RFC6455 specifies a 0x0A for the pong. I'm aware that the library isn't 100% RFC6455 compliant, which is not crucial at this point anyway, but I was just curious about these opcodes as I went through the code trying to figure out what's causing my lockups.

[quote="ekbduffy, post:22, topic:3090"]
Do you understand that the CC3000 has its own little "brain"? As I understand it, it can "talk" to the cloud and hold a TCP connection by itself, and it only interrupts the main processor when it needs to (when a message arrives, for example). So either your send code is buggier than mine somewhere, or maybe you are using the function in the wrong way %) Not sure, because after reading all the send functions in the Spacebrew and websocket cpp files, they look the same as my code.[/quote]

Yes, I'm aware of how the CC3000 works and how processing cycles are delegated between the user's main firmware loop and Spark.process(). That's where I'm left wondering whether the shared TCP access between the Spark cloud processes and the user firmware is preventing websocket traffic from transmitting completely before the cloud processing kicks in. As you've seen from my code, yup, it's all just basic boolean sends/receives for me at the moment.

I've even tried SYSTEM_MODE(MANUAL); and manually handled the Spark.connect() and Spark.process() calls, but that didn't resolve the issue.
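
For reference, the manual-mode test was structured roughly like this (a minimal sketch; SYSTEM_MODE(MANUAL), Spark.connect(), Spark.connected() and Spark.process() are the Spark firmware calls of that era, everything else is illustrative):

  SYSTEM_MODE(MANUAL);              // keep the cloud connection under user control

  void setup() {
      Serial.begin(9600);
      Spark.connect();              // connect to the cloud explicitly
      // websocket / spacebrew setup goes here
  }

  void loop() {
      // websocket monitor() and button handling go here
      if (Spark.connected()) {
          Spark.process();          // give the cloud code time to service its traffic
      }
  }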

I'm also intentionally testing this setup on my spacebrew server a few continents away from where my Sparks are, in order to simulate a worst-case lag scenario. This could be contributing to the issues and I'm testing a local setup today to see if things improve. Will post updates on this later.

[quote="ekbduffy, post:22, topic:3090"]
What do you mean by "streamed data", continuous sending? I never tried this...[/quote]

Basically just polling, for example, an analog input and posting it as a spacebrew range datatype at short intervals. Knowing how the Spark splits processing cycles between cloud-specific code and user firmware, I tried relatively short intervals (100ms) first, then went all the way past our current measured lag time (600ms to 1s).

This is a repeatable error that will cause my Spark to freeze after the first/second message is sent.
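
For context, the 'streaming' publisher test is essentially a loop like this (illustrative sketch only; sb.monitor(), sb.send() and the publisher name are stand-ins for whatever your Spacebrew port exposes):

  void loop() {
      sb.monitor();                      // service the websocket connection
      int reading = analogRead(A0);      // poll an analog input
      sb.send("sensor", reading);        // publish it as a spacebrew 'range' value
      delay(100);                        // shortest interval tried; also tested up to ~1 s
  }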

[quote="ekbduffy, post:22, topic:3090"]
Of course; that's because it didn't get a pong back for its ping.[/quote]

Exactly, which is why I'm wondering: why would my Spark not return a pong? The ws connection is supposedly still alive when this stall in communication happens.

I can confirm that the WebSocketClient::monitor() function still runs, via serial. However, a clue here is with client.available() in that function: my Spark's 'freeze' is happening because client.available() turns out to be zero bytes, which stops the rest of WebSocketClient::monitor() from executing.

It sounds to me like the Spacebrew server was still expecting the websocket data to conclude, leaving both parties waiting on each other for a response, and therefore no new pings come from the Spacebrew server.

During the times when it's working (basically when I don't send any data out, but what's the point then? :)), I see the pongs going back to my spacebrew server without issue.

[quote="ekbduffy, post:22, topic:3090"]
But mine works... Or do you still mean streamed data?[/quote]

Could you try hooking up a button to send boolean messages, but then stress-testing it by rapidly pressing it to send a succession of spacebrew messages to the server? Just to check if your Spark locks up? It's the same as 'streaming' data to spacebrew at short intervals.

[quote="ekbduffy, post:22, topic:3090"]
I have almost the same timings, but if you add some serial prints when the message arrives, you will see that the message arrives almost as soon as the button is pressed; it's the parsing that takes 0.4-0.5 seconds. This makes me sad, and I'm thinking about how to speed it up...[/quote]

Yes indeed, same for me, the string parsing and concatenation to send data out as a publisher is near instantaneous.

[quote="ekbduffy, post:22, topic:3090"]
Hmmm! This is interesting. It could mean that the lag is in TCPClient itself... not in the JSON parsing/building code.[/quote]

Correct: whenever my Spark freezes up, the outgoing JSON strings handled by the WebSocketClient::send function are processed as expected, with hardly any delay. When my spacebrew is connected, I do get the ~0.5s delays mentioned above.

Very interesting discovery of the virtual functions!

Replacing the code as you suggested speeds up responsiveness; there's now noticeably less lag between incoming spacebrew data and the traced output on the Spark = excellent. :thumbsup:

However, the data transmission issues mentioned in my previous post above are still present. :worried:

Try loading the Spacebrew rangeslider JS example, and have your Spark spacebrew code configured with a range subscriber, printing out received range values.

Drag the JS slider around – this should create a queue of websocket data destined for the Spark.

I get consistent hard fault SOS codes every time. Do you get the same result, and are you running Spacebrew on localhost?

Ok, tried running Spacebrew on my local machine. As expected, a much better response all round, but the same freezes happen.

After replacing with _client.read(temppack,len) as per your suggestion, I now get hard fault SOSes very consistently, so despite the critical error, I consider this a step closer towards finding the culprit.

I did an analysis of the incoming message char*; here's a snippet of my serial output:

(length:86) : {"message":{"name":"LED","type":"boolean","value":"false","clientName":"NormalNinja"}}
(length:85) : {"message":{"name":"LED","type":"boolean","value":"true","clientName":"NormalNinja"}}
(length:33) : {"message":{"name":"LED","t

(I added a message length count at the beginning of each received message, as I wanted to monitor and keep all messages within 125 bytes, the maximum websocket payload size that doesn't require extended payload lengths.)

In the third line above, it looks almost as if the incoming data got truncated halfway through, perhaps by the next data frame. Again this hints at potential data transmission conflicts, perhaps between the websocket library and TCPClient?

@ekbduffy it'll be great if you can run through the above tests and see if you can reproduce the 'freezes'.


I think I now understand the 0x8A used in the 'pong' byte: it's actually the pong opcode 0x0A OR'ed with the FIN bit 0x80 to complete the 'pong' response to the websocket server.
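
In code terms (a throwaway sketch, not taken from the library itself):

  // RFC 6455: the first byte of a frame is the FIN bit OR'ed with the opcode.
  const uint8_t WS_FIN  = 0x80;            // final (and only) fragment of the message
  const uint8_t WS_PONG = 0x0A;            // pong opcode
  uint8_t pongHeader = WS_FIN | WS_PONG;   // = 0x8A, which is what the library sends out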


More updates to my tests… calling Spark.disconnect() on startup has a positive effect in stabilising the websocket connection. Since I’m uploading firmware from dfu, staying connected to the cloud in this case is no longer important.

Also tried an alternative websocket library from @leo3linbeck. After some adjustments to make it talk to the spacebrew library, I actually encounter similar freezes. It doesn’t look good, as I suspect this has to do with the buffer size in TCPClient.

Looking at spark_wiring_tcpclient.h, I noticed that TCPCLIENT_BUF_MAX_SIZE is set to 128 bytes. Could bumping this up provide a bit more legroom? Typical payloads for a single Spacebrew subscriber come in at around 85 bytes per frame, which will very likely overflow the buffer, especially if you have multiple spacebrew subscribers set up on the Spark Core and/or short-interval updates (slider range values, for example) coming in to the little guy.
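
For a sense of scale (numbers from the measurements above; whether it's a #define or a constant in the header, the value is what matters):

  #define TCPCLIENT_BUF_MAX_SIZE 128   // bytes available for buffered reads
  // one spacebrew boolean frame is roughly 85-95 bytes, so two back-to-back
  // frames already exceed the buffer before the firmware reads them out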

@satishgn, is there any chance that TCPCLIENT_BUF_MAX_SIZE could be increased in a future update? And how large can this buffer conceivably go?

@ekbduffy i suppose finding ways to optimise the websocket library’s read calls from TCPClient is what I’ll be looking into now…

Yes, I can use the excellent Spark Cloud functionality to build up my sensor network, and I’ll definitely use those for relatively ‘closed’ projects. The websocket-based Spacebrew to me however is the ideal glue in setting up sensor ‘playgrounds’ (existing JS, P5, ofx, Unity implementations) without having to write up everything from scratch, and I think it has the possibility of becoming very good buddies with Spark devices – once this websocket/TCPClient issue is resolved. :sunny:


Checked, and mine too. If I use the range input from the HTML page, it can crash the core even on the third message. It looks just like a cut-off message, and the core SOS'ed because of this unexpected end of string. So I think we just need to add checks on the length of every message part here.

Hmm, looking at the format, we can just check for the existence of "}}" in the message :smile:

if (strstr(message, "}}") != NULL)

Adding that at the beginning of void Spacebrew::onWSMessage helped! But not by much, because then I hit an error during websocket receiving :wink:
Got this error:

Masking not yet supported (RFC 6455 section 5.3)

and the core SOS'ed again :smile:

I removed "free(_packet)" from the websocketclient's monitor(). This got rid of the SOS on the core, but it still gets stuck.

So, as I understand it:
when the TCPClient _buffer reaches TCPCLIENT_BUF_MAX_SIZE, the core just overwrites it. Or am I wrong? @satishgn, you wrote this part of the core firmware, as I see in the copyrights; can you tell us how we can fix this problem?

I also managed to confirm the error is happening on the Spark Core side.

Here's a snippet of my spacebrew server log. Just before the SOS, the websocket message was sent out in its entirety:

info: [wss.onmessage] new message received
info: [wss.onmessage] text message content: {"message":{"clientName":"TestButton","name":"buttonPress","type":"boolean","value":"false"}}
info: [handleMessageMessage] message sent to: 'NormalNinja' msg: {"message":{"name":"LED","type":"boolean","value":"false","clientName":"NormalNinja"}}
> info: [handleMessageMessage] message sent to: 'TestButton' msg: {"message":{"name":"toggleBackground","type":"boolean","value":"false","clientName":"TestButton"}}

Spacebrew::onWSMessage gets the data from WebSocketClient, which relies on TCPClient methods, so we should see how we can handle the buffer overruns in TCPClient. I'm exploring all means first before resorting to changing TCPCLIENT_BUF_MAX_SIZE and recompiling the build binaries.

My work-in-progress so far:

I also did something similar to your strstr check, but put it within WebSocketClient::monitor, just before sending out _onMessage. I got the same results as you. I have not encountered the masking error this way (which will eventually have to be resolved together with the Base64 encoding authentication; that's for later :smile:).
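
For clarity, my variant of the check sits just before the callback is fired, roughly like this (a sketch only; _packet and the _onMessage(*this, ...) callback style are the names used in the websocket library under discussion, the placement is mine):

  // Only dispatch what looks like a complete spacebrew JSON message:
  if (strstr(_packet, "}}") != NULL) {
      _onMessage(*this, _packet);
  }
  // otherwise: drop the truncated frame instead of letting it crash downstream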

Doing strstr checks to reject incomplete messages will break any Spacebrew messages that rely on multiple packets (payloads > 125 bytes) – this is especially true for incoming string messages. Try sending overall messages that are >125 bytes – you'll see what I mean.

I suppose a limit can be set so that messages > 125 bytes are discarded as 'malformed', but it's not ideal moving forward, if Spacebrew's websocket implementation allows multi-frame data.

Yay, some progress made! Thanks for your input @ekbduffy; two heads are better than one! :smile:

WebsocketClient::monitor: all reads off the TCPClient buffer larger than 1 byte no longer use nextByte() (this is a continuation of your earlier suggestion to use a temporary uint8_t array to read data in chunks of length len).

I've implemented a rather kludgy mechanism to catch buffer overruns. Just before an overrun, flush() is called, followed by a disconnect from the spacebrew server. WebSocketClient::monitor() then attempts reconnection and sends an 'apology' empty spacebrew data frame to flush the server-side buffer, finally re-establishing the connection with Spacebrew. No more SOS, and the Spark now recovers from the overrun! There's a 1-second delay between reconnects, but it works for now.
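
In rough outline the recovery path looks like this (paraphrased pseudocode, not a drop-in snippet; overrunImminent(), sendEmptyFrame() and the reconnect bookkeeping are stand-in names for my own helpers):

  if (overrunImminent()) {              // hypothetical check: incoming data would exceed the TCP buffer
      _client.flush();                  // dump whatever is left in the TCPClient buffer
      disconnect();                     // drop the spacebrew/websocket connection cleanly
      _reconnectAt = millis() + 1000;   // wait ~1 s before trying again
  } else if (!connected() && millis() > _reconnectAt) {
      connect(_hostname, _port);        // re-handshake with the spacebrew server
      sendEmptyFrame();                 // 'apology' frame to flush the server-side buffer
  }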

Due to these changes, my version of the websocket library is now somewhat specific only towards Spacebrew servers. I'll see if I can clean this up and contribute a Spacebrew library for the Spark community.


Some additional notes:

Short-interval outgoing data 'streams' (tested with 0-10ms delays) no longer cause buffer overruns. In fact the latency is really low here; the JS spacebrew sliders were butter smooth as they reacted to the published analog inputs.

I'm getting much more consistent and reliable connections now, and the SOS has been eliminated through my fix described above.

As it stands now, the buffer overruns occur when there's a large amount of incoming data. I'd say adjusting TCPCLIENT_BUF_MAX_SIZE is the next thing to try. Managing the frequency of incoming data from other apps will be the other recourse.

(fyi, I tried this on a locally hosted spacebrew server; haven't tested this on my remote server yet.)

my test setup:

Publishers
1 boolean output publisher, simple button – OK
1 analog output, simple voltage divider between 2 LDRs constant output at 0-10ms intervals – OK

Subscribers
1 boolean listener, output to an LED <-- buffer overrun with rapid-succession JS button clicks. wsclient reconnects, recovers!
1 string listener, output to serial <-- can't send messages > 125 bytes (due to malformed data checks)
1 range listener, output to serial <-- guaranteed buffer overrun when JS slider dragged and spewing range (int) values into the subscriber, wsclient reconnects, recovers!


I've tried many ways to fix my issues, but even with almost no actions performed after the data arrives, the core can still go into SOS mode.

Strange situation:
running this code in loop():

if (_client.available() > 2) {
	_total = _client.available();
	#ifdef TRACE
	Serial.print("total = ");
	Serial.println(_total);
	#endif

	// read everything waiting in the TCPClient buffer in one go
	_buffer = (uint8_t*) calloc(_total, sizeof(uint8_t));
	_total = _client.read(_buffer, _total);
	#ifdef TRACE
	Serial.print("realtotal = ");
	Serial.println(_total);
	#endif
	#ifdef STEPBYSTEP
	delay(100);
	#endif

	#ifdef TRACE
	Serial.print("message = ");
	Serial.println((char*)_buffer);
	#endif
	// ... (rest of the handling continues here)
}

And this is the response I'm getting to pings from the websocket server:

So I'm stuck. Where am I going wrong? Why does _buffer get strange garbage attached at the end?

Fixed. I started adding a 0x0 at the end of _buffer:

_buffer[_total] = 0x0;
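
One side note (my suggestion, not from the post above): since _buffer was calloc'd with exactly _total bytes, writing the terminator at _buffer[_total] lands one byte past the allocation. Reserving the extra byte up front keeps it safe:

  _buffer = (uint8_t*) calloc(_total + 1, sizeof(uint8_t));   // +1 leaves room for the terminator
  _total = _client.read(_buffer, _total);
  _buffer[_total] = 0x0;                                      // now safely inside the allocation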

sorry @ekbduffy, didn’t have the time to get back to my core, with other projects getting in the way.

i’m looking forward to resuming work on the code soon, and it’s great to know that you resolved your bug.

i’m curious: are you rewriting a generic websocketclient implementation?

Yes, I'm still trying to fix issues and speed up the generic websocketclient.
I went this way:
copy all the contents of the TCPClient buffer into one string, then pass it to all the preparing functions...
This works pretty fast, but there are still issues when data arrives in bursts. I'll update my GitHub repo right now.

I'm having trouble getting the library to work. I get a compiler error just with the sample code via the web IDE.

I created a blank application, included the Spark-Websockets library (0.0.5), and replaced my .ino file contents with that of the library’s test.cpp.

The compiler vomits this:

In file included from ../inc/spark_wiring.h:29:0,
from ../inc/spark_wiring_stream.h:36,
from ../inc/spark_wiring_usbserial.h:30,
from Spark-Websockets/Spark-Websockets.cpp:69:
../../core-common-lib/SPARK_Firmware_Driver/inc/config.h:12:2: warning: #warning "Defaulting to Release Build" [-Wcpp]
#warning "Defaulting to Release Build"
^
Spark-Websockets/Spark-Websockets.cpp: In member function 'void WebSocketClient::reconnect()':
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 3 has type 'byte* {aka unsigned char*}' [-Wformat=]
sscanf(_hostname, "%hu.%hu.%hu.%hu", &ip[0], &ip[1], &ip[2], &ip[3]);
^
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 4 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 5 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 6 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 3 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 4 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 5 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:111:70: warning: format '%hu' expects argument of type 'short unsigned int*', but argument 6 has type 'byte* {aka unsigned char*}' [-Wformat=]
Spark-Websockets/Spark-Websockets.cpp:131:43: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
_onError(*this, "Connection Failed!");
^
Spark-Websockets/Spark-Websockets.cpp: In member function 'void WebSocketClient::monitor()':
Spark-Websockets/Spark-Websockets.cpp:229:48: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
_onError(*this, "Masking not supported");
^
Spark-Websockets/Spark-Websockets.cpp:271:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for(int i = 0; i < _packetLength; i++) {
^
Spark-Websockets/Spark-Websockets.cpp:315:58: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
_onError(*this, "Binary Messages not supported");
^
Spark-Websockets/Spark-Websockets.cpp: In member function 'size_t WebSocketClient::base64Encode(byte*, size_t, char*, size_t)':
Spark-Websockets/Spark-Websockets.cpp:495:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[0]];
^
Spark-Websockets/Spark-Websockets.cpp:496:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[1]];
^
Spark-Websockets/Spark-Websockets.cpp:497:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[2]];
^
Spark-Websockets/Spark-Websockets.cpp:498:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[3]];
^
Spark-Websockets/Spark-Websockets.cpp:516:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[0]];
^
Spark-Websockets/Spark-Websockets.cpp:517:47: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[1]];
^
Spark-Websockets/Spark-Websockets.cpp:521:48: warning: array subscript has type 'char' [-Wchar-subscripts]
target[datalength++] = b64Alphabet[output[2]];
^
In file included from ../inc/spark_wiring.h:29:0,
from ../inc/application.h:29,
from dirtysocks.cpp:2:
../../core-common-lib/SPARK_Firmware_Driver/inc/config.h:12:2: warning: #warning "Defaulting to Release Build" [-Wcpp]
#warning "Defaulting to Release Build"
^
dirtysocks.cpp:3:16: error: variable or field 'onMessage' declared void
void onMessage(WebSocketClient client, char* message);
^
dirtysocks.cpp:3:16: error: 'WebSocketClient' was not declared in this scope
dirtysocks.cpp:3:40: error: expected primary-expression before 'char'
void onMessage(WebSocketClient client, char* message);
^
dirtysocks.cpp: In function 'void loop()':
dirtysocks.cpp:28:28: warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
}
^
make: *** [dirtysocks.o] Error 1