ACK not working with 3rd party SIM?

I’m using the Electron with a 3rd party SIM. In general, it works. The device deep sleeps for 10 minutes, then wakes and publishes the battery voltage (about 41 bytes of application data in total, if I calculated correctly). However, I see two problems:

  1. Overhead data usage is higher than with the Particle SIM. Something similar was reported here (Electron Data Usage Overhead), but I’m using automatic mode, which seems to help. It is still not as good as with the Particle SIM. Could it be a double ACK for the 3rd party SIM, or a ping plus an ACK? Any idea why the overhead for the 3rd party SIM (~148 bytes TX, 122 bytes RX) differs from the Particle SIM (65 bytes TX, 61 bytes RX)?

  2. My larger problem is that I occasionally lose published data. I am not using NO_ACK, so I assume that Electron should get an ACK and retry if the publish fails. I don’t see that. The packet just doesn’t appear in the cloud. I modified my code to track data usage for each publish and found that I don’t get any RX data when a packet is dropped. Any suggestions on what I can do to ensure packets are received in the cloud?
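To put the overhead figures from item 1 in perspective, here is a rough sketch of the per-wake and per-day totals. It assumes the 41-byte payload counts entirely toward TX, that the overhead numbers quoted above are exact, and that the device wakes exactly every 600 seconds. The struct and function names are made up for illustration and are not part of the Particle firmware.

```cpp
// Rough data-usage arithmetic for one publish per wake cycle.
// All names here are hypothetical helpers, not Particle APIs.
struct SessionUsage {
    long txBytes;  // bytes sent per wake cycle (payload + overhead)
    long rxBytes;  // bytes received per wake cycle
};

// 41 bytes of application data plus the overhead quoted in item 1.
constexpr SessionUsage kThirdPartySim{41 + 148, 122};
constexpr SessionUsage kParticleSim{41 + 65, 61};

// Total bytes the carrier would see for one wake cycle.
constexpr long sessionTotal(SessionUsage u) {
    return u.txBytes + u.rxBytes;
}

// Number of wake cycles in 24 hours for a given sleep period.
constexpr long wakesPerDay(long sleepSeconds) {
    return (24L * 60L * 60L) / sleepSeconds;
}

// Estimated bytes per day, ignoring the time spent awake.
constexpr long dailyBytes(SessionUsage u, long sleepSeconds) {
    return sessionTotal(u) * wakesPerDay(sleepSeconds);
}
```

Under these assumptions the 3rd party SIM comes to 311 bytes per wake (about 44.8 kB per day at 144 wakes), versus 167 bytes per wake (about 24.0 kB per day) for the Particle SIM. The real interval is slightly longer than 600 seconds because of the time spent awake, so the actual daily totals would be a little lower.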

Here’s a snippet of my Electron’s serial output (IP address masked). Notice that I normally TX 189 bytes and RX 122 bytes (see last two values deltaTX, deltaRX). When I see TX 189 and RX 0, I know I’ve lost a packet. It’s missing from the cloud (M2X in this case). [Note: I can confirm the data usage from the 3rd party SIM cloud provider. Because I deep sleep, each publish is listed as a separate session. Usually it shows 311 bytes (189 + 122), but on fail it shows 189 bytes.]
Cellular.ready() returns true and Particle.publish() returns true (success), even for the dropped packet. RSSI (-75) and QUAL (43) don’t really change.

<,Timestamp,Init_CID,Tx,Rx,Tx,Rx,.ready,localIP,Reset_CID,Tx,Rx,Tx,Rx,batt_value,pubSuccess,RSSI,QUAL,Publish_CID,T,R,T,R,deltaTx,deltaRx,>
<,2016-10-30T17:26:25Z,31,74,0,74,0,true,10.52.xxx.xxx,31,0,0,0,0,3.9450,1,-75,43,31,189,122,189,122,189,122,>
<,2016-10-30T17:37:00Z,31,74,0,74,0,true,10.52.xxx.xxx,31,0,0,0,0,3.9450,1,-73,43,31,189,122,189,122,189,122,>
<,2016-10-30T17:47:35Z,31,74,0,74,0,true,10.52.xxx.xxx,31,0,0,0,0,3.9450,1,-75,43,31,189,0,189,0,189,0,>
<,2016-10-30T17:58:12Z,31,74,0,74,0,true,10.52.xxx.xxx,31,0,0,0,0,3.9450,1,-75,43,31,189,122,189,122,189,122,>
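For anyone who wants to scan a long capture of this output for dropped publishes, here is a minimal host-side sketch (plain C++, not Electron firmware). It assumes the line layout shown in the header above, where the last two values before the closing `>` are deltaTx and deltaRx. The function names are hypothetical.

```cpp
#include <stdexcept>
#include <sstream>
#include <string>
#include <vector>

// Split one "<,...,>" log line into its comma-separated fields.
std::vector<std::string> splitCsv(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Returns true when a line looks like a dropped publish:
// bytes were sent (deltaTx > 0) but nothing came back (deltaRx == 0).
// Header lines and malformed lines return false.
bool isDroppedPublish(const std::string& line) {
    const std::vector<std::string> f = splitCsv(line);
    if (f.size() < 4 || f.front() != "<" || f.back() != ">") {
        return false;
    }
    try {
        const long deltaTx = std::stol(f[f.size() - 3]);
        const long deltaRx = std::stol(f[f.size() - 2]);
        return deltaTx > 0 && deltaRx == 0;
    } catch (const std::exception&) {
        return false;  // non-numeric fields, e.g. the header line
    }
}
```

Applied to the four data lines above, only the 17:47:35Z entry would be flagged.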

Code below. Running 0.6.0-rc.1.

//rev1 10/28/16: Adds .ready(), .publish return, localIP, RSSI, and QUAL
//rev2 10/29/16: Streamline output to use csv format (all data on single line)
//  <,Timestamp,Init_CID,T,R,T,R,.ready,localIP,Reset_CID,T,R,T,R,batt_value,pubSuccess,RSSI,QUAL,Publish_CID,T,R,T,R,deltaT,deltaR,>

#include "cellular_hal.h"
STARTUP(cellular_credentials_set("apn.konekt.io", "", "", NULL));
// Connects to a cellular network by APN only

void setup()
{
    bool readyOK;
    Serial.begin(9600);
    delay(5000);    //delay 5 sec to allow user serial connection
    //Serial.print("******** Init **********");
    Serial.print("<,");
    Serial.print(Time.format(Time.now(), TIME_FORMAT_ISO8601_FULL));
    Serial.print(",");
    PrintCellUsage();   //values on init
    Serial.print(",");
    Cellular.resetDataUsage();
    if(Cellular.ready())
        Serial.print("true");
    else
        Serial.print("false");
    //Serial.print("localIP = ");
    Serial.print(",");
    Serial.print(Cellular.localIP());
}


void loop()
{
    FuelGauge fuel;
    float value;
    bool success;
    int tx1, rx1, tx2, rx2, deltatx, deltarx;
    
    //Serial.print("Before Publish: ");
    Serial.print(",");
    CellularData data;
    if (!Cellular.getDataUsage(data)) {
        Serial.print("-1,-1,-1,-1,-1");
    }
    else {
        Serial.print(data); // printed as CID,TX,RX,TX,RX
    }
             
    tx1 = data.tx_total;
    rx1 = data.rx_total;
                
    value = fuel.getVCell();
    String output = "{\"batt-value\": \"" + String(value) + "\"}";
    Serial.print(",");
    Serial.print(value,4);
    success = Particle.publish("fuel-level1", output);
    //Serial.print("publish success = ");
    Serial.print(",");
    Serial.print(success);
    CellularSignal sig = Cellular.RSSI();
   // Serial.print("RSSI,QUAL = ");
    Serial.print(",");
    Serial.print(sig);
    //delay(1);
    delay(1000);    //flush serial buffer
    Serial.print(",");
    if (!Cellular.getDataUsage(data)) {
        Serial.print("-1,-1,-1,-1,-1");
    }
    else {
        Serial.print(data); // data usage counter after publish
    }
             
    tx2 = data.tx_total;
    rx2 = data.rx_total;
    
    deltatx = tx2 - tx1;
    deltarx = rx2 - rx1;
    Serial.print(",");
    Serial.print(deltatx);
    Serial.print(",");
    Serial.print(deltarx);
    Serial.println(",>");    //END OF CSV LINE
    delay(1000);    //allow serial flush before sleep
    
    //NOTE: This currently bricks the device for OTA... 
    System.sleep(SLEEP_MODE_DEEP, 600);    //seconds: 600 = 10 min, 1200 = 20 min
    
    
}

// Prints the current data usage counters, or -1 placeholders on failure
void PrintCellUsage() {
    CellularData data;
    if (!Cellular.getDataUsage(data)) {
        Serial.print("-1,-1,-1,-1,-1");
    }
    else {
        Serial.print(data); // printed as CID,TX,RX,TX,RX
    }
}

Thanks all. :smile:

Could you test with 0.6.0-rc.2? I’m not sure whether the fix to finish sending all queued UDP packets before going to sleep was already part of 0.6.0-rc.1.
You could also stretch the time from publish to sleep to 5 seconds, which should be enough to flush the UDP queue.

The extra data usage might come from the shorter keep-alive period of 3rd party services. With Particle SIMs you can sleep up to 23 minutes before you need to do a full handshake.

Thanks for the ideas. The firmware release notes led me to believe the UDP flush was already fixed in rc1. Is it this one? "UDP.flush() and TCP.flush() now conform to the Stream.flush() behavior from Arduino 1.0 Wiring. The current (correct) behavior is to wait until all data has been transmitted. Previous behavior discarded data in the buffer."

I updated to rc2 anyway and am running the test now. If that fails, I’ll increase the delay before sleep too. I’ll post in a day or two with results.

Another thing I just thought of (duh) as a workaround is to use the RX data usage to detect the problem and manually retry to publish. I’d still like to explain the problem though.
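A minimal sketch of that workaround, written as plain C++ so the logic can be checked off-device. The three callbacks are hypothetical stand-ins: on the Electron, `publishFn` would wrap `Particle.publish(...)`, `rxTotalFn` would return `data.rx_total` after `Cellular.getDataUsage(data)`, and `waitFn` would be a `delay(...)` long enough for the reply to arrive. Treating "RX bytes went up" as "the publish was acknowledged" is only a heuristic based on the logs above, not documented firmware behavior.

```cpp
#include <functional>

// Outcome of a publish-with-retry attempt.
struct PublishResult {
    bool acked;    // true if RX bytes increased after some attempt
    int attempts;  // number of publish calls made
};

// Publish, wait, and check whether any bytes were received.
// If not, retry up to maxAttempts times in total.
PublishResult publishWithRxCheck(const std::function<bool()>& publishFn,
                                 const std::function<long()>& rxTotalFn,
                                 const std::function<void()>& waitFn,
                                 int maxAttempts) {
    PublishResult result{false, 0};
    for (int i = 0; i < maxAttempts; ++i) {
        const long rxBefore = rxTotalFn();
        const bool sent = publishFn();
        ++result.attempts;
        waitFn();  // give the reply time to arrive
        const long rxAfter = rxTotalFn();
        if (sent && rxAfter > rxBefore) {
            result.acked = true;  // something came back; assume it was the ACK
            break;
        }
    }
    return result;
}
```

Two caveats with this approach: any incoming traffic would count as an ACK, and a retry could produce a duplicate event in the cloud if the first publish was actually delivered but its reply was lost.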

Update:

  1. Changing to 0.6.0-rc.2 did not fix the problem. For a dropped packet, data usage showed 189 TX, 0 RX, and the packet was not in the cloud. Interestingly, the carrier usage log did not show ANY entry for that session; there was just a gap in the data. So the Electron thought it had sent 189 bytes, but the carrier didn’t charge for it.

  2. Changed code to add additional delay after publish. Still testing.
    2a) For the first hour, the data usage was huge (relatively speaking). Serial output below. Notice the huge TX and RX numbers after the timestamp. I have no clue why this happened or why it stopped after one hour. These numbers were roughly corroborated by carrier usage data (below) of 4354, 1795, 1795, 1795, and 1795 bytes before returning to normal (311 bytes).
    2b) Haven’t seen a dropped packet so far (13 hours).

Serial output
<,2016-11-02T07:03:19Z,31,2723,2075,2723,2075,true,10.52.168.193,31,0,0,0,0,4.0170,1,-83,25 ,31,102,61,102,61,102,61,>
<,2016-11-02T07:14:03Z,31,1288,443,1288,443,true,10.52.168.193,31,0,0,0,0,3.9389,1,-75,43 ,31,102,61,102,61,102,61,>
<,2016-11-02T07:24:47Z,31,1288,443,1288,443,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,37 ,31,102,61,102,61,102,61,>
<,2016-11-02T07:35:32Z,31,1288,443,1288,443,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,43 ,31,102,61,102,61,102,61,>
<,2016-11-02T07:46:19Z,31,1210,443,1210,443,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,43 ,31,180,61,180,61,180,61,>
<,2016-11-02T07:56:59Z,31,74,0,74,0,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,43 ,31,189,122,189,122,189,122,>
<,2016-11-02T08:07:39Z,31,74,0,74,0,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,43 ,31,189,122,189,122,189,122,>
<,2016-11-02T08:18:19Z,31,74,0,74,0,true,10.52.168.193,31,0,0,0,0,3.9377,1,-75,43 ,31,189,122,189,122,189,122,>
Carrier data
{"linkid":9985,"record_id":62897625,"session_begin":"2016-11-02 07:56:55","timestamp":"2016-11-02 07:57:10","bytes":311,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62897544,"session_begin":"2016-11-02 08:07:35","timestamp":"2016-11-02 08:07:50","bytes":311,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62895251,"session_begin":"2016-11-02 07:46:08","timestamp":"2016-11-02 07:46:29","bytes":1795,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62894292,"session_begin":"2016-11-02 07:24:39","timestamp":"2016-11-02 07:24:58","bytes":1795,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62891887,"session_begin":"2016-11-02 07:35:23","timestamp":"2016-11-02 07:35:42","bytes":1795,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62889709,"session_begin":"2016-11-02 07:13:55","timestamp":"2016-11-02 07:14:13","bytes":1795,"network_name":"AT&T Mobility - CG"},
{"linkid":9985,"record_id":62889623,"session_begin":"2016-11-02 07:03:04","timestamp":"2016-11-02 07:03:28","bytes":4354,"network_name":"AT&T Mobility - CG"}
Code
...
    success = Particle.publish("fuel-level1", output);
    Serial.print(",");
    Serial.print(success);
    CellularSignal sig = Cellular.RSSI();
    Serial.print(",");
    Serial.print(sig);
    delay(5000);    //flush serial buffer (and UDP buffer?)
    Serial.print(",");
...