[SOLVED] sscanf slow on large character arrays, looking for alternatives

Hey all,

I’m writing a program where one of the problems I need to solve is how to decode hundreds of records in an Intel hex file.

Here is an example of a record:

:100000000C9437010C945F010C945F010C945F0118
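
For reference, that record breaks down into the standard Intel hex fields like this:

   :                                   start code
   10                                  byte count (0x10 = 16 data bytes)
   0000                                load address offset
   00                                  record type (00 = data record)
   0C9437010C945F010C945F010C945F01    the 16 data bytes
   18                                  checksum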

My app has two modes of operation:

  1. Decode/send hex file stored on local SD card
  2. Decode/send hex file stored in Electron program memory

I have got the program working in both modes, but now am looking to optimize it. One strange thing I noticed is that it's taking the program memory version of the app about 45 seconds to decode/send a 15KB file, while the SD version only takes about 12 seconds.

This is indeed strange because in theory the progmem version should be faster (it doesn't have to load hex file records from the SD card; it already has them all in program memory).

Putting in a crap-load of serial debug logging has revealed that the extra processing time is due to a dramatic increase in the time it takes to execute sscanf() calls in the program memory version.

Here is the hex file record decoding function/class:

/*=============================================>>>>>
=  Class for storing a hex file record and decoding it into its constituent parts =
===============================================>>>>>*/
class HexFileRecord{
public:
   bool decode(); //Decodes ascii hex file record into constituent parts
   const char* ascii_line; //Pointer to the character string containing the ascii representation of this hex record
   byte byteCount = 0; //Number of data bytes
   uint16_t address = 0;   //beginning memory address offset of the data block (2-byte word-oriented)
   byte recordType = 0; //0x46 = flash
   const char* data = 0; //pointer to where the data bytes start
   byte checkSum = 0;
};
/*=============================================>>>>>
= Function to decode an ascii hex file string into its constituent elements =
===============================================>>>>>*/
bool HexFileRecord::decode(){
   if(ascii_line[0] == ':'){
      //Decode byte count (%2hhx because byteCount is a single byte; plain %x would write a full int)
      uint32_t microsTimer = micros();
      if(1 == sscanf(ascii_line + 1, "%2hhx", &byteCount)) {
         myLog.trace("sscanf took %lu µs", (micros() - microsTimer));
         // myLog.trace("byteCount = %#02X", byteCount);
         //Decode address (%4hx matches the 16-bit address field)
         if(1 == sscanf(ascii_line + 3, "%4hx", &address)) {
            // myLog.trace("address = %#02X", address);
            //Decode record type
            if(1 == sscanf(ascii_line + 7, "%2hhx", &recordType)) {
               // myLog.trace("recordType = %#02X", recordType);
               //Calculate location of data
               data = ascii_line + 9;
               // myLog.trace("data = %.*s", byteCount*2, data);
               //Decode checksum
               if(1 == sscanf(data + (byteCount * 2), "%2hhx", &checkSum)) {
                  //TODO: see if checksum makes sense
                  // myLog.trace("checkSum = %#02X", checkSum);
                  return true;
               }
            }
         }
      }
   }
   Serial.print("Invalid hex file record: ");
   Serial.println(ascii_line);
   return false;
}
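
On the TODO in decode(): an Intel hex checksum can be verified by summing every byte in the record, including the checksum byte itself; the low byte of the total should come out to zero. A sketch of such a check (hexPair and checksumOk are illustrative helpers, not part of the app above):

//Hypothetical helper: convert two ASCII hex digits into one byte (no input validation shown)
static byte hexPair(const char* p){
   char buf[3] = { p[0], p[1], '\0' };
   return (byte)strtoul(buf, NULL, 16);
}

//Valid Intel hex records sum to 0x00 modulo 256
bool checksumOk(const HexFileRecord& r){
   byte sum = r.byteCount + (byte)(r.address >> 8) + (byte)(r.address & 0xFF) + r.recordType + r.checkSum;
   for(byte i = 0; i < r.byteCount; i++){
      sum += hexPair(r.data + i * 2);
   }
   return sum == 0;
}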

In the SD version, the ascii_line member of the HexFileRecord object points to a buffer that is 45 characters long.

In the progmem version, the ascii_line member of the HexFileRecord object points to a moving location somewhere in the middle of a 14KB char array.

Here are the differences in decode times for these two cases:

SD Version

[Thu Nov 15 10:40:15.761 2018] 0000005241 [app.STK500] TRACE: sscanf took 25 µs
[Thu Nov 15 10:40:15.847 2018] 0000005242 [app.STK500] TRACE: Hex record took 429 µs to decode

Program Memory Version

[Thu Nov 15 10:49:16.321 2018] 0000054651 [app.STK500] TRACE: sscanf took 1716 µs
[Thu Nov 15 10:49:16.328 2018] 0000054656 [app.STK500] TRACE: Hex record took 7079 µs to decode

So, I'm guessing that under the hood, arm-gcc's sscanf implementation is doing some crazy wizardry that makes processing huge char arrays with it impractical.

Is there an alternative that I could use, or should I just copy the hex file record chars into a temporary buffer and use sscanf on that small manageable buffer?

When you're using sscanf(), why not use a combined format string?
e.g. sscanf(ascii_line + 1, "%2hhx%4hx%2hhx", &byteCount, &address, &recordType)

Also reading from flash is slower than reading from RAM.


Hi @jaza_tom

I think with your in-memory version, sscanf is reading until the end of the Intel hex file every time you call it. A typical sscanf implementation starts by taking strlen() of its input to set up an internal stream, so every call walks all the way to the terminating '\0' of the 14KB array, no matter how few characters it actually converts.

Here is what I would try, along the lines of your small-buffer alternative above: make a char array of length 45 (or 46 for the terminating zero) and use strncpy to copy just the part you want into it, then call sscanf on that. If you allocate the smaller char array dynamically inside a function, you could get heap fragmentation problems, so I favor global allocation for things like this.
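
Something like this, for illustration (a minimal sketch; recordBuf and decodeRecord are placeholder names, not from the actual app):

static char recordBuf[46]; //global: 45 record characters + terminating '\0', no per-call allocation

bool decodeRecord(const char* src){
   strncpy(recordBuf, src, sizeof(recordBuf) - 1);
   recordBuf[sizeof(recordBuf) - 1] = '\0'; //strncpy does not null-terminate if src is too long
   //sscanf now scans at most a 45-character string instead of the whole 14KB image
   unsigned int byteCount;
   if(1 != sscanf(recordBuf + 1, "%2x", &byteCount)){
      return false;
   }
   //... decode the remaining fields from recordBuf exactly as before ...
   return true;
}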


Okay, thanks for the suggestions.

What I ended up doing is refactoring my code so that the hex file stored in program memory is pre-divided into hex records; each hex record is then assigned to ascii_line in turn, instead of ascii_line pointing into one huge char array.

So instead of declaring my hex file in progmem like this (note there are no commas, so the adjacent string literals all concatenate into one huge string):

const char progMemImage[] = {
   ":100000000C949C000C94AE000C94AE000C94AE00CA"
   ":100010000C94AE000C94AE000C94AE000C94AE00A8"
   ":100020000C94AE000C94AE000C94AE000C94AE0098"
   ":00000001FF"
};

I have declared it like this:

const char progMemHexRecords[][MAX_CHARS_PER_HEX_RECORD] = {
   ":100000000C9437010C945F010C945F010C945F0118",
   ":100010000C945F010C94F3060C945F010C945F0147",
   ":100020000C945F010C941D070C945F010C945F010C",
   ":0839A00000050913020D0A00E5",
   ":00000001FF",
};
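
Each entry is now its own short, null-terminated string, so sscanf never walks past the end of the record it is parsing. The send loop just indexes through the array (a sketch; everything except progMemHexRecords is an illustrative name):

HexFileRecord record;
for(size_t i = 0; i < sizeof(progMemHexRecords) / sizeof(progMemHexRecords[0]); i++){
   record.ascii_line = progMemHexRecords[i];
   if(record.decode()){
      //send the decoded record to the target here
   }
}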

That brought the DFU process back down to about 12 seconds from 45!

(Also @ScruffR I implemented your tweak to do all the sscanf parsing in a single call; however, it doesn't appear to have sped things up at all, presumably because each call is already cheap now that the strings are short.)
