Is this a Particle OS bug, or my bad? Serial vs Log

I have a program built that started acting funny today out of nowhere. Things were humming along nicely when all of a sudden the device appears to have gone into an infinite loop that caused it to spew out a mixture of Particle OS debug information and my own debug information. I came back to my desk about 1 hour after it started and it had spewed about 300 MB of this...

Here is the relevant debug log snippet that shows the debug output just before and just after entering the spew loop of doom:

[2023-10-03 08:18:42.326] Link measurements:
[2023-10-03 08:18:42.326] Bus Voltage: 24938 mV | Charging power: 33.9 W | MPPT: -0.3 A (-8.0 W)
[2023-10-03 08:18:42.326] 
[2023-10-03 08:18:42.326] 0042030142 [app.linknet] TRACE: Checking for new LinkNet nodes...
[2023-10-03 08:18:42.326] 0042030143 [app.linknet] TRACE: linknet transaction - cmd = DIRECTORY_BROADCAST (0X01) | addr = 0XFE | pyld_tx_size = 18 | pyld_rx_size = 18 | timeout = 1000 ms
[2023-10-03 08:18:43.323] 
[2023-10-03 08:18:43.326] 0042031147 [app.linknet] TRACE: linknet_rsp_wait returned NO_RESPONSE (0xE0) - delay of 1001 ms
[2023-10-03 08:18:43.326] 0042031147 [app.linknet] INFO: No unadressed linknet nodes detected
[2023-10-03 08:18:43.326] 0042031148 [app.boss] TRACE: Updating Boss 0 (update #20842)
[2023-10-03 08:18:43.329] 0042031148 [app.linknet] TRACE: linknet transaction - cmd = WEBHOOK_GET_TRACKER (0X30) | addr = 0X00 | pyld_tx_size = 0 | pyld_rx_size = 16 | timeout = 2000 ms
[2023-10-03 08:18:43.447] 0042031270 [app.linknet] TRACE: linknet_rsp_wait returned NONE (0x00) - delay of 120 ms
[2023-10-03 08:18:43.449] 0042031270 [app.linknet] TRACE: linknet transaction - cmd = APP_GET_ANALOG_READINGS (0X10) | addr = 0X00 | pyld_tx_size = 0 | pyld_rx_size = 1928 | timeout = 2000 ms
[2023-10-03 08:18:43.800] 0042031613 [app.linknet] TRACE: linknet_rsp_wait returned NONE (0x00) - delay of 340 ms
[2023-10-03 08:18:43.800] 0042031614 [app.linknet] TRACE: linknet transaction - cmd = APP_SET_CHARGING_CONFIG (0X21) | addr = 0X00 | pyld_tx_size = 17 | pyld_rx_size = 0 | timeout = 2000 ms
[2023-10-03 08:18:43.942] 0042031761 [app.linknet] TRACE: linknet_rsp_wait returned NONE (0x00) - delay of 144 ms
[2023-10-03 08:18:43.942] 0042031761 [app.boss] INFO: requesting Boss linknet error tracker...
[2023-10-03 08:18:43.948] 0042031762 [app.linknet] TRACE: linknet transaction - cmd = APP_GET_NETWORK_STATS (0X14) | addr = 0X00 | pyld_tx_size = 0 | pyld_rx_size = 48 | timeout = 2000 ms
[2023-10-03 08:18:44.178] 0042032000 [app.linknet] TRACE: linknet_rsp_wait returned NONE (0x00) - delay of 236 ms
[2023-10-03 08:18:44.178] 0042032000 [app.boss] TRACE: pkt_rx_starts = 66768
[2023-10-03 08:18:44.182] 0042032001 [app.boss] TRACE: pkt_headers_received = 66768
[2023-10-03 08:18:44.182] 0042032001 [app.boss] TRACE: pkts_received_total = 66768
[2023-10-03 08:18:44.182] 0042032002 [app.boss] TRACE: pkts_received_this = 66768
[2023-10-03 08:18:44.182] 0042032002 [app.boss] TRACE: pkts_received_other = 0
[2023-10-03 08:18:44.186] 0042032003 [app.boss] TRACE: rx_overflows = 0
[2023-10-03 08:18:44.186] 0042032003 [app.boss] TRACE: rx_timeouts = 0
[2023-10-03 08:18:44.186] 0042032003 [app.boss] TRACE: rx_bytes_missed = 0
[2023-10-03 08:18:44.186] 0042032004 [app.boss] TRACE: bad_crcs = 0
[2023-10-03 08:18:44.186] 0042032004 [app.boss] TRACE: uart_errors = 0
[2023-10-03 08:18:44.186] 0042032004 [app.boss] TRACE: max_linknet_latency_ms = 263
[2023-10-03 08:18:44.186] 0042032005 [app.boss] TRACE: max_subnet_latency_ms = 262
[2023-10-03 08:18:44.186] 0042032005 [app.boss] INFO: ------
[2023-10-03 08:18:44.186] 0042032005 [app.boss] INFO: Failed transactions to this boss: 119
[2023-10-03 08:18:44.186] 0042032006 [app.linknet] TRACE: linknet transaction - cmd = APP_GET_HEARTBEAT (0X13) | addr = 0X00 | pyld_tx_size = 0 | pyld_rx_size = 612 | timeout = 2000 ms
[2023-10-03 08:18:44.448] 0042032271 [app.linknet] TRACE: linknet_rsp_wait returned NONE (0x00) - delay of 263 ms
[2023-10-03 08:18:44.448] ========================================
[2023-10-03 08:18:44.462] =============Boss Heartbeat=============
[2023-10-03 08:18:44.462] ========================================
[2023-10-03 08:18:44.462] tstamp_start = Tuesday, October 3rd, 2023 at 12:37 AM
[2023-10-03 08:18:44.462] 0042032275 [comm.protocol] TRACE: Reply recieved: type=2, code=0
[2023-10-03 08:18:44.462] 0042032275 [comm.papp_flags_sticky = 0X00000000
[2023-10-03 08:18:44.462] page_erase_kb = 0
[2023-10-03 08:18:44.462] reset_count = 0
[2023-10-03 08:18:44.462] por_count = 0
[2023-10-03 08:18:44.462] crash_count = 0
[2023-10-03 08:18:44.462] pack_sessions = 1
[2023-10-03 08:18:44.462] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.462] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.462] webhook_bytes_generated = 0
[2023-10-03 08:18:44.462] webhooks_delivered = 0
[2023-10-03 08:18:44.462] self_check_watts = 0
[2023-10-03 08:18:44.462] self_check_loaded_mv = 0
[2023-10-03 08:18:44.488] self_check_unlpack_sessions = 1
[2023-10-03 08:18:44.488] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.488] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.488] webhook_bytes_generated = 0
[2023-10-03 08:18:44.488] webhooks_delivered = 0
[2023-10-03 08:18:44.488] self_check_watts = 0
[2023-10-03 08:18:44.488] self_check_loaded_mv = 0
[2023-10-03 08:18:44.488] self_check_unlpack_sessions = 1
[2023-10-03 08:18:44.488] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.488] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.488] webhook_bytes_generated = 0
[2023-10-03 08:18:44.488] webhooks_delivered = 0
[2023-10-03 08:18:44.488] self_check_watts = 0
[2023-10-03 08:18:44.488] self_check_loaded_mv = 0
[2023-10-03 08:18:44.488] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.488] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.488] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.488] webhook_bytes_generated = 0
[2023-10-03 08:18:44.488] webhooks_delivered = 0
[2023-10-03 08:18:44.488] self_check_watts = 0
[2023-10-03 08:18:44.488] self_check_loaded_mv = 0
[2023-10-03 08:18:44.488] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.488] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.488] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.488] webhook_bytes_generated = 0
[2023-10-03 08:18:44.488] webhooks_delivered = 0
[2023-10-03 08:18:44.503] self_check_watts = 0
[2023-10-03 08:18:44.503] self_check_loaded_mv = 0
[2023-10-03 08:18:44.503] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.503] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.503] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.503] webhook_bytes_generated = 0
[2023-10-03 08:18:44.503] webhooks_delivered = 0
[2023-10-03 08:18:44.503] self_check_watts = 0
[2023-10-03 08:18:44.503] self_check_loaded_mv = 0
[2023-10-03 08:18:44.503] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.503] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.503] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.503] webhook_bytes_generated = 0
[2023-10-03 08:18:44.503] webhooks_delivered = 0
[2023-10-03 08:18:44.503] self_check_watts = 0
[2023-10-03 08:18:44.503] self_check_loaded_mv = 0
[2023-10-03 08:18:44.503] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.503] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.503] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.503] webhook_bytes_generated = 0
[2023-10-03 08:18:44.503] webhooks_delivered = 0
[2023-10-03 08:18:44.503] self_check_watts = 0
[2023-10-03 08:18:44.503] self_check_loaded_mv = 0
[2023-10-03 08:18:44.503] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.503] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.503] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.503] webhook_bytes_generated = 0
[2023-10-03 08:18:44.520] webhooks_delivered = 0
[2023-10-03 08:18:44.520] self_check_watts = 0
[2023-10-03 08:18:44.520] self_check_loaded_mv = 0
[2023-10-03 08:18:44.520] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.520] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.520] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.520] webhook_bytes_generated = 0
[2023-10-03 08:18:44.520] webhooks_delivered = 0
[2023-10-03 08:18:44.520] self_check_watts = 0
[2023-10-03 08:18:44.520] self_check_loaded_mv = 0
[2023-10-03 08:18:44.520] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.520] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.520] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.520] webhook_bytes_generated = 0
[2023-10-03 08:18:44.520] webhooks_delivered = 0
[2023-10-03 08:18:44.520] self_check_watts = 0
[2023-10-03 08:18:44.520] self_check_loaded_mv = 0
[2023-10-03 08:18:44.520] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.520] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.520] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.520] webhook_bytes_generated = 0
[2023-10-03 08:18:44.520] webhooks_delivered = 0
[2023-10-03 08:18:44.520] self_check_watts = 0
[2023-10-03 08:18:44.520] self_check_loaded_mv = 0
[2023-10-03 08:18:44.520] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.520] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.520] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.534] webhook_bytes_generated = 0
[2023-10-03 08:18:44.534] webhooks_delivered = 0
[2023-10-03 08:18:44.534] self_check_watts = 0
[2023-10-03 08:18:44.534] self_check_loaded_mv = 0
[2023-10-03 08:18:44.534] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.534] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.534] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.534] webhook_bytes_generated = 0
[2023-10-03 08:18:44.534] webhooks_delivered = 0
[2023-10-03 08:18:44.534] self_check_watts = 0
[2023-10-03 08:18:44.534] self_check_loaded_mv = 0
[2023-10-03 08:18:44.534] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.534] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.534] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.534] webhook_bytes_generated = 0
[2023-10-03 08:18:44.534] webhooks_delivered = 0
[2023-10-03 08:18:44.534] self_check_watts = 0
[2023-10-03 08:18:44.534] self_check_loaded_mv = 0
[2023-10-03 08:18:44.534] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.534] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.534] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.534] webhook_bytes_generated = 0
[2023-10-03 08:18:44.534] webhooks_delivered = 0
[2023-10-03 08:18:44.534] self_check_watts = 0
[2023-10-03 08:18:44.534] self_check_loaded_mv = 0
[2023-10-03 08:18:44.534] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.534] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.534] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.564] webhook_bytes_generated = 0
[2023-10-03 08:18:44.564] webhooks_delivered = 0
[2023-10-03 08:18:44.564] self_check_watts = 0
[2023-10-03 08:18:44.564] self_check_loaded_mv = 0
[2023-10-03 08:18:44.564] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.564] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.564] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.564] webhook_bytes_generated = 0
[2023-10-03 08:18:44.564] webhooks_delivered = 0
[2023-10-03 08:18:44.564] self_check_watts = 0
[2023-10-03 08:18:44.564] self_check_loaded_mv = 0
[2023-10-03 08:18:44.564] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.564] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00
[2023-10-03 08:18:44.564] 0042032277 [comm.protocol] TRACE: rcv'd message = 0
[2023-10-03 08:18:44.564] webhook_bytes_generated = 0
[2023-10-03 08:18:44.564] webhooks_delivered = 0
[2023-10-03 08:18:44.564] self_check_watts = 0
[2023-10-03 08:18:44.564] self_check_loaded_mv = 0
[2023-10-03 08:18:44.564] self_check_unloack_sessions = 1
[2023-10-03 08:18:44.564] webhooks_generatedrotocol] TRACE: message id 1691 complete with code 0.00

The relevant spot in source code that matches the debug output in my app is here:

	printHeaderBreak("Boss Heartbeat");
	Serial.printlnf("tstamp_start = %s", getTimeStr(this->beat.tstamp_start));
	Serial.printlnf(
		"app_flags_sticky = 0X%08lx\n"
		"page_erase_kb = %lu\n"
		"reset_count = %lu\n"
		"por_count = %lu\n"
		"crash_count = %lu\n"
		"pack_sessions = %lu\n"
		"webhooks_generated = %lu\n"
		"webhook_bytes_generated = %lu\n"
		"webhooks_delivered = %lu\n"
		"self_check_watts = %lu\n"
		"self_check_loaded_mv = %lu\n"
		"self_check_unloaded_mv = %lu\n"
		,this->beat.app_flags_sticky.val
		,this->beat.page_erase_kb
		,this->beat.reset_count
		,this->beat.por_count
		,this->beat.crash_count
		,this->beat.pack_sessions
		,this->beat.webhooks_generated
		,this->beat.webhook_bytes_generated
		,this->beat.webhooks_delivered
		,this->beat.self_check_watts
		,this->beat.self_check_loaded_mv
		,this->beat.self_check_unloaded_mv
	);
	Serial.flush();

(and so on... for 300 MB).

Pressing the reset button fixed everything.

I'm guessing that Particle OS is in fact not in factreceiving messages constantly as is implied, and that my application is not in fact repeatedly printing out the Serial.printlnf() statement either.

Could this behavior be due to mixing of the Particle Logging class and the Serial class on the same USB peripheral? On the Serial.print vs. Log.info webpage it seems to indicate that bad things happen when you mix Serial and Logging together:

Serial is not thread-safe. It's possible that if you log simultaneously with both Serial and Logging calls, the device may crash.

I tried cloning the Particle OS repo on my local machine and doing some digging on the webhooks_generatedrotocol log messages but I couldn't find any matches. Expanding the search, a bit, it seems likely that this message comes from code that gets auto-generated by a python script during the build process....

Should I go through my whole app and replace all of the Serial.print statements with logging statements?

Also, if I do need to replace all of my Serial.print statements with Log statements, then how do I refactor code that prints out data structures using loops (i.e. when I don't want the new line and the Logging class prepended timestamp/loggername format)?

For example, how would I print out the following without using Serial.print() structures? Is there a way to make Serial.print thread-safe with Logging?

------------------------
    CHANNEL-->     1   |
------------------------
-
  b0 matrix--> 6019990 |
-
  b0 cMode -->    FULL |
-
 b0 pckp_mW-->     -26 |
 b0 pckp_mV-->   13232 |
  b0 bat_mV-->   12471 |
  b0 bat_mA-->      -2 |
b0 a_bat_mA-->      -3 |
-
     b0 tap-->       0 |
b0 tChanges-->    1445 |
b0 FET_temp-->    24.1 |
b0 BMS_temp-->    28.9 |
-
     b0 eIn-->   63.88 |
    b0 eOut-->   58.60 |
  b0 degCap-->    5414 |
b0 desgnCap-->    6000 |
   b0 VFSOC-->      88 |
  b0 REPSOC-->      99 |
-
  b0 cell_1-->    4148 |
  b0 cell_2-->    4158 |
  b0 cell_3-->    4158 |
-
     b0 bms-->       1 |
b0 protstat-->    2000 |
   b0 flags-->45E81880 |
    b0 vers-->   10435 |
-
 b0 chargeT-->    7381 |
   b0 FULL -->     377 |
 b0 NORMAL -->    7382 |
b0 BMS_OFF -->       0 |
  b0 CTEMP -->       0 |
  b0 JTEMP -->       0 |
    b0 ECO -->       0 |
  b0 SPORT -->       0 |
b0 BESSWRN -->       0 |
b0 BESSLOK -->       0 |
------------------------

One thing you could do is separate out the trace logs from the OS so you're less likely to be interrupted (even though you're using Serial.print() statements),

SerialLogHandler logHandler(115200, LOG_LEVEL_INFO, { // Logging level for non-application messages
    { "app", LOG_LEVEL_ALL } // Logging level for application messages
});

I was trying to use the logger the other day any during certain points near startup, it wouldn't log no matter what I did. I thought the system logging might be using Serial while I was trying to, and I was able to get around this by using a mutex,

WITH_LOCK(Serial) {
      Log.info("log message");
}

I'd imagine you can just do the same thing with Serial.print() to prevent the OS from interrupting you, may at least be worth a try.

It's not thread-safe to mix Serial.print and Log.* calls. At best, the messages will intermix, and at worse the device will crash. The weird behavior you are seeing could be related, but possibly it's something else.

Locking Serial around Log calls won't help, because Serial.print doesn't lock the serial port.

We recommend using the standard logging format because that's what's expected by formatted loggers like the syslog logger, but if you're only outputting to serial, you can output unformatted data using Log.print(). That call does not add the end-of-line characters.

2 Likes

How can I flush the Logger in between Log.print() calls when I am printing multiple long buffers to the Log sequentially?

Can I use Serial.flush(); ?

That looks like it should work because the SerialLogHandler::write() method just calls the stream write method. You might still have a threading issue, because there won't be a lock around your multiple Log.print calls, so the data will still be jumbled. But it probably won't crash.

OK seems to be mostly working.

What about printing out long printf stuff to the debug monitor?

It seems that when using Logger class instead of Serial, when I want to dump the contents of, for example, a 1000-character webhook being sent, then Logger always clips the characters and appends a

~

to the output... like this:

{"link":{"ms":1180015,"hw":1001002,"fw":10102,"fl":0},"tm":1696560347,"pid":28789,"ad":0,"fw":32,"id":"00313953323251343037303439303330","bcrc":"25EF186A","oc~

How can I dump the whole contents of a long character buffer like this to the serial terminal in a thread-safe way using Logger class without getting the output clipped?

my best idea so far is something like this:

	myLog.printf(
		"app_flags_sticky = 0X%08lx\n"
		"page_erase_kb = %lu\n"
		"reset_count = %lu\n"
		"por_count = %lu\n"
		"crash_count = %lu\n"
		"pack_sessions = %lu\n"
		,this->beat.app_flags_sticky.val
		,this->beat.page_erase_kb
		,this->beat.reset_count
		,this->beat.por_count
		,this->beat.crash_count
		,this->beat.pack_sessions
	);	
	Serial.flush();
	myLog.printf(
		"webhooks_generated = %lu\n"
		"webhook_bytes_generated = %lu\n"
		"webhooks_delivered = %lu\n"
		"self_check_watts = %lu\n"
		"self_check_loaded_mv = %lu\n"
		"self_check_unloaded_mv = %lu\n"
		,this->beat.webhooks_generated
		,this->beat.webhook_bytes_generated
		,this->beat.webhooks_delivered
		,this->beat.self_check_watts
		,this->beat.self_check_loaded_mv
		,this->beat.self_check_unloaded_mv
	);

... but this will require a lot of refactoring....

You can use Log.write(const char* data, size_t size) or Log.print(const char* data). This goes through logging thread-safety mechanisms, and directs the whole data buffer to the output log handler:

constexpr size_t logBufSize = 4096; // arbitrarily large enough buffer size
auto tmp = std::make_unique<char[]>(logBufSize);
if (tmp) {
    auto outSize = snprintf(tmp, logBufSize, "<format>", arg1, arg2, arg2, ...);
    if (outSize > 0) {
        Log.write(tmp, outSize);
    }
}
1 Like

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.