@ScruffR - Yes… definitely something with PublishQueue. I think it’s time for someone smarter than me (i.e. @rickkas7) to see what might be happening. Do I submit a support ticket to Particle somehow or what is the next steps to engage the Particle Engineering/Support team for this. I have full log files to share as well just let me know what I can do to help! Hopefully Rick can replicate the issue and see what is happening given the info below:
My only course of action right now is to go back to SYSTEM_THREAD(ENABLED) and put up with what then seems like a pseudo reset event that occurs what seems to be at the same time. I will go back to Systeup_threaded(enabled) with logging set to ALL and see if I can then capture one of the pseudo reset events while logging.
I can now replicated the issue quite frequently in the bare bones sketch shared below. The cloud disconnects are quite persistent when using PublishQueue() and non-existent when using Particle.Publish()
My setup:
- Particle Boron running LTS 2.0.1
- Particle Boron in free space (not physically connected to hardware
-Standard USB cable connected from usb port on boron to USB port on PC
- Serial Monitoring being done via PuTTY
- 6600 mAh Lipo battery connected directly to JST connector
- Reported cellular signal strength when conducting test
-Publish Event is being kicked off by manually calling the function “Stat” from the Console.Particle.IO
"signal":{
"at":"LTE Cat-M1"
"strength":32.99
"strength_units":"%"
"strengthv":-109
"strengthv_units":"dBm"
"strengthv_type":"RSRP"
"quality":38.23
"quality_units":"%"
"qualityv":-13.5
"qualityv_units":"dB"
"qualityv_type":"RSRQ"
}
I am also happy to say the issue is repeatable while serial monitoring when log level is set to “LOG_LEVEL_ALL”. So I was able to capture some logs. I am happy to provide the complete log file but not sure easiest way to do that. Do I email it to somewhere? Here is some quick snippets of the log file around the issue. Top line log file indicates Event Loop Error 3 typically when it’s trying to immediately publish an event from publishQueue.Publish() or Event loop error 18 when it’s publishing an event from the queue.
0000063727 [app] INFO: stat message: {"1":1609818841,"2":22,"3":33,"4":44,"5":55,"6":66,"7":1}
0000063728 [app.pubq] INFO: queueing eventName=Stat data={"1":1609818841,"2":22,"3":33,"4":44,"5":55,"6":66,"7":1} ttl=60 flags1=1 flags2=8 size=72
0000063729 [app.pubq] INFO: publishing Stat {"1":1609818841,"2":22,"3":33,"4":44,"5":55,"6":66,"7":1} ttl=60 flags=9
0000063730 [comm.protocol] TRACE: rcv'd message type=2
0000063733 [comm.protocol] ERROR: Event loop error 3
0000063736 [system] WARN: Communication loop error, closing cloud socket
0000063740 [system] INFO: Cloud: disconnecting
0000063743 [system] INFO: Cloud: disconnected
0000063747 [system] INFO: Cloud: connecting
I also receive this trace:
0000114930 [app.pubq] INFO: publishing Stat {"1":1609818893,"2":22,"3":33,"4":44,"5":55,"6":66,"7":11} ttl=60 flags=9
0000114941 [comm.protocol] ERROR: Event loop error 18
0000114941 [system] WARN: Communication loop error, closing cloud socket
0000114944 [system] INFO: Cloud: disconnecting
0000114948 [system] INFO: Cloud: disconnected
0000114948 [system] INFO: Cloud: connecting
Here is the exact sketch I was using to test this:
/*
*
* Description: Bare bones sketch to replicate issue with using PublishQueue in LTS 2.0.1
* Author: jgskarda
* Date: 1/4/2021
*/
//Include necassary libraries
#include "Particle.h"
#include "PublishQueueAsyncRK.h"
#include <JsonParserGeneratorRK.h>
SerialLogHandler logHandler(LOG_LEVEL_ALL);
STARTUP(System.enableFeature(FEATURE_RESET_INFO));
STARTUP(System.enableFeature(FEATURE_RETAINED_MEMORY));
//define the system modes to use
//System threading is disabled. When system threading is enabled, the frequency is much less however, instead of a cloud disconnect, it results in a pseudo reset event where setup() is called agian.
// SYSTEM_THREAD(ENABLED);
SYSTEM_MODE(MANUAL);
//Define variables needed for Publish Queue library.
retained uint8_t publishQueueRetainedBuffer[2048];
PublishQueueAsync publishQueue(publishQueueRetainedBuffer, sizeof(publishQueueRetainedBuffer));
int8_t msgCnt = 0;
/*************************************************************************************/
// SETUP
/*************************************************************************************/
void setup()
{
//Define all particle functions.
Particle.function("Stat", Stat);
//Delay 10 seconds to give time for serial debugging
delay(10s);
publishQueue.setup(); // Needed for queueing of Publish Events
}
/*************************************************************************************/
// LOOP
/*************************************************************************************/
void loop()
{
Particle.process();
static uint32_t msLastConTime = 0;
if (!Particle.connected() && millis() - msLastConTime < 60000) {
Particle.connect();
msLastConTime = millis();
}
static uint32_t msLastPublish = 0;
if (!Particle.connected() && millis() - msLastPublish < 10000) {
Stat("args");
msLastPublish = millis();
}
}
/*******************************************************************************
* Function Name : stat()
* Description :
* Return : Publishes event to the cloud
*******************************************************************************/
int Stat(String arg){
// This creates a buffer to hold up to 256 bytes of JSON data (good for Particle.publish)
JsonWriterStatic<256> jw;
{
JsonWriterAutoObject obj(&jw);
jw.insertKeyValue(String(1), Time.now());
jw.insertKeyValue(String(2), 22);
jw.insertKeyValue(String(3), 33);
jw.insertKeyValue(String(4), 44);
jw.insertKeyValue(String(5), 55);
jw.insertKeyValue(String(6), 66);
jw.insertKeyValue(String(7), msgCnt++);
}
Log.info("stat message: %s", jw.getBuffer());
//Using publishQueue.Publish will result in frequent cloud disconnects
publishQueue.publish("Stat", jw.getBuffer(), 60, PRIVATE, WITH_ACK);
//Using Particle.Publish() will run without issue even hammering on the request
// Particle.publish("Stat", jw.getBuffer(), 60, PRIVATE, WITH_ACK);
return 1;
}
A few other noteworthy observations:
-
Does using NO_ACK vs WITH_ACK flag have an impact? No - Slightly different behavior but disconnects still occur. As you’d expect, the events are no longer duplicated to the cloud over and over again however, the device still disconnects on nearly every call to PublishQueue
-
Does different logging levels have an impact?: YES - it appears to. Issue impacted by different logging levels but unable to correct issue. Just occurs at different frequencies and level of impact. This test was done with my full original program, Threading: NOT Enabled, , Only using publishQueue. Observations:
LOG_LEVEL_ALL : - Disconnect constantly and continues to try and publish the same event to the cloud.
LOG_LEVEL_TRACE : Happens but not very frequent. One event in a 3 hour period.
LOG_LEVEL_INFO : Happens but not as frequent about the same as Log_Level_Trace
LOG_LEVEL_WARN : Disconnect constantly and continues to try and publish the same event to the cloud.
LOG_LEVEL_ERROR : Disconnect constantly and continues to try and publish the same event to the cloud.
LOG_LEVEL_NONE : Disconnect constantly and continues to try and publish the same event to the cloud.