Particle.publish() failing even tho ACK received

I’ve gleamed heaps of good advice and solved a lot of my issues with existing topics in this forum. Thanks all! Hoping you can come through for me with this curly one.

Context:

  • Electron
  • Firmware 0.8.0-rc.10
  • Testing from Bangladesh (in case carrier issues are involved)

I was having trouble with Particle.publish() events not making it to the cloud. Small events (data 5 chars, name 16 chars) consistently make it but slightly large ones (data 47/48/49 chars) would intermittently not turn up. After trying a lot of different things I added the WITH_ACK flag to publish() and suddenly that fixed it. But that’s not ideal as publish() regularly blocks my code for 20 seconds, and when looking through the logs with trace enabled it looks like the ACK was received.

Also, when publish(..., WITH_ACK) returns false I retry but more often than not in this situation the event did actually make it to the cloud. So I end up with duplicated events (which I know is always possible but it’s now happening for me more than half the time).

At the end of this post is the consecutive trace log for one of these events failing for real (event did not make it to the cloud). The relevant sections (in my mind) are:

The publish being called (first time):

0000028390 [comm.coap] TRACE: sending message id=19a
0000028391 [system] TRACE: send 115

The ACK being received 7 seconds later:

0000035765 [system] TRACE: received 33
0000035765 [comm.coap] TRACE: recieved ACK for message id=19a
0000035767 [comm.protocol] INFO: rcv'd message type=15

Publish returns false 13 seconds after that:

0000048407 [app.metricsPublisher] ERROR: Publish failed, will retry

So I send the same event again:

0000048428 [comm.coap] TRACE: sending message id=19b
0000048428 [system] TRACE: send 102

That succeeds as expected:

0000051700 [system] TRACE: received 33
0000051700 [comm.coap] TRACE: recieved ACK for message id=19b
0000051702 [comm.protocol] INFO: rcv'd message type=13
0000051702 [app.metricsPublisher] INFO: Published 48 chars, 0 bytes remaining

I don’t know why the retry attempt reports a smaller ‘send’ size (102 vs. 115) than the first attempt. I’m passing the same params to publish(). But that is maybe why the 2nd attempt always succeeds? (as per above, smaller events always work).

Questions:

  • How come the ACK can be received but publish() doesn’t return?
  • When I receive the ACK, but the event doesn’t appear in the Particle console, or my webhook, or the output of CLI particle subscribe, does the ACK message contain a ‘failure’ status inside it?

And this last one might only be a symptom of publish() not ‘handling’ the ACK (from 1st question) but:

  • When the event makes it to the cloud and I receive the ACK why does publish(..., WITH_ACK) regularly then return false in this situation?

After experiencing this problem about 20+ times throughout yesterday afternoon and evening, coming back to today to finish this post and create a small reproducible example I can no longer make it happen (with the same code as last night). So maybe it was triggered by a network glitch (wouldn’t be surprising in this part of the world). But still, I think given what the logs show above my questions are still valid.Either I’m misunderstanding something or ACK isn’t being handled by publish properly.


Complete log from sending the message, the first event not making it to the cloud, and then the retry succeeding:

0000028389 [app.metricsPublisher] TRACE: Returning next event of 1 entries (48 chars):
1539346221,4.1,9e+01,0.2867,0.2868,0.2846,0.2862
0000028390 [comm.coap] TRACE: sending message id=19a
0000028391 [system] TRACE: send 115
socketSendTo(0,35.153.73.148,5684,,115)
    28.390 AT send      37 "AT+USOST=0,\"35.153.73.148\",5684,115\r\n"
    28.401 AT read  >    3 "\r\n@"
    28.451 AT send     115 "\xfe\xfe\xfd\x00\x01\x00\x00\x00\x00\x01\xf6\x00Y\x00\x01\x00\x00\x00\x00\x01\xf6\xa2A\x81\x99\xff\x81\x9fN=\x1b\"\xc8\xe7\xb5+L\x06_\xcb!\xd1ew\x147\x00\x83\xa3w\xf3\xde\xf1-0\xd9\v\x91\xc73\xaf\xfbY\x8b\xa2\xffy\x11\xae\xa9\xd0\xef\x96_|\xbb1l\x11\xcc`m\x10chd\x17[8\x85\x8d\x16q\x1e\xa0\xeb\xc1aB6Z\xed(\x00#\x00\x0eG734641\f"
    28.594 AT read  +   17 "\r\n+USOST: 0,115\r\n"
    28.604 AT read OK    6 "\r\nOK\r\n"
    28.604 AT send      14 "AT+USORF=0,0\r\n"
    28.614 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    28.624 AT read OK    6 "\r\nOK\r\n"
    28.844 AT read  +   17 "\r\n+UUSORD: 0,33\r\n"
Socket 0: handle 0 has 33 bytes pending
    28.856 AT send      17 "AT+USORF=0,1024\r\n"
    28.876 AT read  +   71 "\r\n+USORF: 0,\"35.153.73.148\",5684,33,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x002\x00\x14\x00\x01\x00\x00\x00\x00\x002U\xa4\x93\xc5\xf35\t\x17D\xee\x89|\""
    28.890 AT read OK    6 "\r\nOK\r\n"
    28.890 AT send      14 "AT+USORF=0,0\r\n"
    28.900 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    28.910 AT read OK    6 "\r\nOK\r\n"
0000028911 [system] TRACE: received 33
0000028911 [comm.coap] TRACE: recieved ACK for message id=199
0000028913 [comm.protocol] INFO: rcv'd message type=13
    29.122 AT read  +   17 "\r\n+UUSORD: 0,33\r\n"
Socket 0: handle 0 has 33 bytes pending
    29.134 AT send      17 "AT+USORF=0,1024\r\n"
    29.154 AT read  +   71 "\r\n+USORF: 0,\"35.153.73.148\",5684,33,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x002\x00\x14\x00\x01\x00\x00\x00\x00\x002U$\x93\xc6\xb6q\aA\xc1\xef\x9c\b\""
    29.168 AT read OK    6 "\r\nOK\r\n"
    29.168 AT send      14 "AT+USORF=0,0\r\n"
    29.178 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    29.188 AT read OK    6 "\r\nOK\r\n"
0000029189 [system] TRACE: received 33
mbedtls/library/ssl_tls.c:3806: replayed record
mbedtls/library/ssl_tls.c:4003: discarding unexpected record (header)
0000034041 [system] TRACE: send 102
socketSendTo(0,35.153.73.148,5684,,102)
    34.040 AT send      37 "AT+USOST=0,\"35.153.73.148\",5684,102\r\n"
    34.052 AT read  >    3 "\r\n@"
    34.102 AT send     102 "\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x01\xf7\x00Y\x00\x01\x00\x00\x00\x00\x01\xf7)\xaa\xe81\x1eLp\xc9K4Z\x9d\x16\xa5\xef\xe11\xf9\xed\x15{h\xf1\xaf\xd1\f\xfb\xe0r\xb4:\"@\xb4\x01\x80[)9X\xf5\x9bP\xcf\xc7\x98\xb9\x94\x9e\xbe\xb0\xc8\x05\x80\xf7\x9b\xb0C\x9b\x96\x0e\xf9>6\xb1\xe2{{\xd6 \xc7\xae\x9b~\xeds\xce\x9e\xa48I"
    34.248 AT read  +   17 "\r\n+USOST: 0,102\r\n"
    34.258 AT read OK    6 "\r\nOK\r\n"
    34.258 AT send      14 "AT+USORF=0,0\r\n"
    34.268 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    34.278 AT read OK    6 "\r\nOK\r\n"
    35.698 AT read  +   17 "\r\n+UUSORD: 0,33\r\n"
Socket 0: handle 0 has 33 bytes pending
    35.710 AT send      17 "AT+USORF=0,1024\r\n"
    35.730 AT read  +   71 "\r\n+USORF: 0,\"35.153.73.148\",5684,33,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x003\x00\x14\x00\x01\x00\x00\x00\x00\x003\xd6R\xc4\x80\xbf\x06\x9e\x9e\xf6\xea\xfa\xbc\""
    35.744 AT read OK    6 "\r\nOK\r\n"
    35.744 AT send      14 "AT+USORF=0,0\r\n"
    35.754 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    35.764 AT read OK    6 "\r\nOK\r\n"
0000035765 [system] TRACE: received 33
0000035765 [comm.coap] TRACE: recieved ACK for message id=19a
0000035767 [comm.protocol] INFO: rcv'd message type=15
0000048407 [app.metricsPublisher] ERROR: Publish failed, will retry
0000048427 [app.metricsPublisher] TRACE: Returning next event of 1 entries (48 chars):
1539346221,4.1,9e+01,0.2867,0.2868,0.2846,0.2862
0000048428 [comm.coap] TRACE: sending message id=19b
0000048428 [system] TRACE: send 102
socketSendTo(0,35.153.73.148,5684,,102)
    48.428 AT send      37 "AT+USOST=0,\"35.153.73.148\",5684,102\r\n"
    49.979 AT read  >    3 "\r\n@"
    50.029 AT send     102 "\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x01\xf8\x00Y\x00\x01\x00\x00\x00\x00\x01\xf8\xc2O\x03Om\x0f\xd8\xb5\x1f\xdd/\xf2\xec\x8a\x15\xaa\x13\xc6\x913`\xfa~\xcd\xe4\xdf\xb1s\xcb\t<\x95c\x90\x18\xb4\x8d\xf5\xb27\x81\xe6\xf1\xe5\xa6\xa7\x85@\x0e\x82o\xaa\x83S\x9a\xfe\xaa\xd7\xeaf\xa8x\xde<\xc0\fG\xe8\xa1LgX\x94\xff\xdc\xc7W\xacR\xaf\xab"
    50.172 AT read  +   17 "\r\n+USOST: 0,102\r\n"
    50.182 AT read OK    6 "\r\nOK\r\n"
    50.182 AT send      14 "AT+USORF=0,0\r\n"
    50.192 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    50.203 AT read OK    6 "\r\nOK\r\n"
    51.633 AT read  +   17 "\r\n+UUSORD: 0,33\r\n"
Socket 0: handle 0 has 33 bytes pending
    51.645 AT send      17 "AT+USORF=0,1024\r\n"
    51.665 AT read  +   71 "\r\n+USORF: 0,\"35.153.73.148\",5684,33,\"\x17\xfe\xfd\x00\x01\x00\x00\x00\x00\x004\x00\x14\x00\x01\x00\x00\x00\x00\x004\x1c\xb6E\xf1\xc1\xb85\xc12\x19\x0f\xe0\""
    51.679 AT read OK    6 "\r\nOK\r\n"
    51.679 AT send      14 "AT+USORF=0,0\r\n"
    51.689 AT read  +   15 "\r\n+USORF: 0,0\r\n"
Socket 0: handle 0 has 0 bytes pending
    51.699 AT read OK    6 "\r\nOK\r\n"
0000051700 [system] TRACE: received 33
0000051700 [comm.coap] TRACE: recieved ACK for message id=19b
0000051702 [comm.protocol] INFO: rcv'd message type=13
0000051702 [app.metricsPublisher] INFO: Published 48 chars, 0 bytes remaining

For future prosperity after discussion with Particle support it turns out this was probably due to a server error. Still seems strange to get into this situation but it hasn’t happened since.