Particle asset read() very slow after initial read

I am using the Particle OTA Asset feature set to package binary files to be sent to co-processor as part of my firmware release.

From my testing, it seems that the time it takes to execute the read() function on the asset is extremely slow after the asset has been read through once.

For example, I have a function which calculates the CRC of the asset, and it executes in about 500 ms the first time I calculate the CRC of the asset.

The second time it takes about 24,000 ms!

Here is the function:

static Logger myLog("app.asset.crc");


uint32_t compute_CRC32(
	ApplicationAsset& asset, uint32_t pad_sz, crc32_parameters_t *params
){
    myLog.info("Calculating asset CRC of %s", asset.name().c_str());
    asset.reset();
    myLog.info("Done resetting asset");
    uint32_t crc_result = params->initval;
    unsigned int loops = 0;
    unsigned int total_asset_bytes = asset.size();
    unsigned int asset_bytes_remaining = total_asset_bytes;
    unsigned int total_bytes = total_asset_bytes + pad_sz;
    unsigned int total_bytes_remaining = total_bytes;
    bool did_debug = false;
    while (asset_bytes_remaining) {
        uint8_t buf[2048];
        //Determine how many bytes to read from the asset
        size_t bytes_to_read = min(asset_bytes_remaining, sizeof(buf));
        int read = asset.read((char*) buf, bytes_to_read);
        if (read < 0) {
            LOG(ERROR, "Error %d reading binary from asset", read);
        }
        if(!did_debug) {
            did_debug = true;
            //Print out first few bytes of the asset
            myLog.info("First 5 bytes of asset are: %02X %02X %02X %02X %02X",
                buf[0], buf[1], buf[2], buf[3], buf[4]
            );
        }
        crc_result = compute_CRC32(
            buf, bytes_to_read, 0, loops == 0, params
        );
        asset_bytes_remaining -= bytes_to_read;
        loops++;
        if (loops % 10 == 0) {
            myLog.info("CRC32 loop %u, progress: %3.1f %%", loops,  (float(total_asset_bytes - asset_bytes_remaining)*100)/total_asset_bytes );
        }
    }
    if(pad_sz > 0) {
        crc_result = compute_CRC32(
            nullptr, 0, pad_sz, loops == 0, params
        );
    }
    return crc_result;
}

And here is the output from the first CRC calculation:

[2025-06-12 14:09:06.358] 0000001678 [app.asset.crc] INFO: Calculating asset CRC of BossApp.bin
[2025-06-12 14:09:06.363] 0000001679 [app.asset.crc] INFO: Done resetting asset
[2025-06-12 14:09:06.375] 0000001690 [app.asset.crc] INFO: First 5 bytes of asset are: B8 58 02 20 59
[2025-06-12 14:09:06.443] 0000001759 [app.asset.crc] INFO: CRC32 loop 10, progress: 12.3 %
[2025-06-12 14:09:06.518] 0000001836 [app.asset.crc] INFO: CRC32 loop 20, progress: 24.6 %
[2025-06-12 14:09:06.593] 0000001908 [app.asset.crc] INFO: CRC32 loop 30, progress: 36.9 %
[2025-06-12 14:09:06.656] 0000001972 [app.asset.crc] INFO: CRC32 loop 40, progress: 49.2 %
[2025-06-12 14:09:06.725] 0000002040 [app.asset.crc] INFO: CRC32 loop 50, progress: 61.4 %
[2025-06-12 14:09:06.781] 0000002095 [app.asset.crc] INFO: CRC32 loop 60, progress: 73.7 %
[2025-06-12 14:09:06.818] 0000002138 [app.asset.crc] INFO: CRC32 loop 70, progress: 86.0 %
[2025-06-12 14:09:06.857] 0000002172 [app.asset.crc] INFO: CRC32 loop 80, progress: 98.3 %
[2025-06-12 14:09:07.007] 0000002322 [app.boss] INFO: BossApp img bytes: 0x28B04  -  pad bytes: 0x174FC  - total: 0x40000 - crc = 66c9f394

and here is the second calculation:

[2025-06-12 14:09:07.843] 0000003159 [app.asset.crc] INFO: Calculating asset CRC of BossApp.bin
[2025-06-12 14:09:07.843] 0000003160 [app.asset.crc] INFO: Done resetting asset
[2025-06-12 14:09:07.950] 0000003267 [app.asset.crc] INFO: First 5 bytes of asset are: B8 58 02 20 59
[2025-06-12 14:09:09.175] 0000004490 [app.asset.crc] INFO: CRC32 loop 10, progress: 12.3 %
[2025-06-12 14:09:11.515] 0000006831 [app.asset.crc] INFO: CRC32 loop 20, progress: 24.6 %
[2025-06-12 14:09:15.412] 0000010727 [app.asset.crc] INFO: CRC32 loop 30, progress: 36.9 %
[2025-06-12 14:09:18.889] 0000014204 [app.asset.crc] INFO: CRC32 loop 40, progress: 49.2 %
[2025-06-12 14:09:22.366] 0000017682 [app.asset.crc] INFO: CRC32 loop 50, progress: 61.4 %
[2025-06-12 14:09:27.585] 0000021508 [app.asset.crc] INFO: CRC32 loop 60, progress: 73.7 %
[2025-06-12 14:09:31.835] 0000027150 [app.asset.crc] INFO: CRC32 loop 70, progress: 86.0 %
[2025-06-12 14:09:31.906] 0000027227 [app.asset.crc] INFO: CRC32 loop 80, progress: 98.3 %
[2025-06-12 14:09:32.359] 0000027674 [app.dfu] INFO: Done calculating expected CRC of new image: 66C9F394

Why does reading from an asset work so much faster the first time, and so much slower on subsequent readthroughs?

@rickkas7

I don't know why that would occur. I could see some variability because the assets are stored compressed on the flash file system, but that does seem slower than what I would expect.

I have been trying to pin down at what point in my app the readback from asset gets slowed down.

It seems to be after calling PublishQueuePosix::instance().setup(); which initializes the PublishQueuePosix library.

I'm guessing that this library has some kind of conflict with assets?

It shouldn't conflict, but because it's doing stuff in the background, it could be slowing things down. But it probably should not slow down that much. It will probably be necessary to isolate exactly what is affecting the speed because it's not obvious what it could be yet.

I think I figured it out:

If i used a scheme such as this:


void function1(){
	auto assets = System.assetsAvailable();
	for (auto& asset: assets) {
		if (asset.name() == "Test.bin") {
			bencmark_asset_read(asset);
		}
	}
	function2();
}

void function2(){
	auto assets = System.assetsAvailable();
	for (auto& asset: assets) {
		if (asset.name() == "Test.bin") {
			bencmark_asset_read(asset);
		}
	}
}


Then the benchmark in function2() would run much more slowly. I assume this is because of some type of block or mutex thing happening in the background when there is an unfreed variable storing the System.assetsAvailable() value.

I was able to solve the issue in my case by amending the code to have enclosing braces around the assets variable once I was done with it. Such as this:


void function1(){
	{
		auto assets = System.assetsAvailable();
		for (auto& asset: assets) {
			if (asset.name() == "Test.bin") {
				bencmark_asset_read(asset);
			}
		}
	}
	function2();
}

void function2(){
	auto assets = System.assetsAvailable();
	for (auto& asset: assets) {
		if (asset.name() == "Test.bin") {
			bencmark_asset_read(asset);
		}
	}
}


This appears to have caused the original assets variable to get freed from the stack before the next one got invoked in function2()

I have been trying to isolate and diagnose this asset reading latency issue for about ten hours, and have distilled things down to the following example program, which demonstrates how tiny seemingly insignificant change in code result in asset reading performance latency going up by more than 60x.

Here is a repository I created which contains a sample project which shows this:

@rickkas7 if you have any ideas about what I can do to eliminate this latency issue, it would be greatly appreciated!

Project properties

name=AssetTrouble
assetOtaDir=assets
dependencies.PublishQueuePosixRK=0.0.7

AssetTrouble.cpp

// Include Particle Device OS APIs
#include "Particle.h"

// Let Device OS manage the connection to the Particle Cloud
SYSTEM_MODE(AUTOMATIC);

// Run the application and system concurrently in separate threads
SYSTEM_THREAD(ENABLED);


//Declare some helper types and functions for calculating CRC32
typedef struct{
	uint32_t polynomial; //The polynomial to use in the CRC calculation
	uint32_t initval; //The "reset" value of the CRC calculation
	bool input_reflect; //Chooses whether bit-reversal is performed on bytes that are fed into the calculation
	bool output_reflect; //Chooses whether bit-reversal is performed on the result of the calculation before returning it
	uint32_t output_xor; //Chooses the value to xor result of the calculation with when returning it
}crc32_parameters_t;
uint32_t compute_CRC32( ApplicationAsset& asset, uint32_t pad_sz, crc32_parameters_t *params );
uint32_t compute_CRC32(const void *data, uint32_t data_sz, uint32_t pad_sz, bool reset, crc32_parameters_t *params);

static uint8_t pad_byte = 0xFF;
crc32_parameters_t crc32_mpeg2 = {
    .polynomial = 0x04C11DB7,
    .initval = 0xFFFFFFFF,
    .input_reflect = false,
    .output_reflect = false,
    .output_xor = 0x00000000
};


//Declare funciton for benchmarking the asset readback process
void asset_readback_benchmark(unsigned int linenum);


//Declare logging function to output log info to serial monitor
SerialLogHandler logHandler(LOG_LEVEL_INFO);




//This function's benchmarks are fast at first (sub 300 ms) but jump to more than 16000 ms towards end of setup function
void setup1() {
    // Put initialization like pinMode and begin functions here
    Serial.begin(230400);
    waitFor(Serial.isConnected, 1000);
    asset_readback_benchmark(__LINE__); //! 245 ms
    //Proxy for linknet_init();
    Serial1.begin(230400);
    asset_readback_benchmark(__LINE__); //! 239 ms
    //Calculate the CRC32 of the Asset
    auto assets = System.assetsAvailable();
    for (auto& asset: assets) {
        if (asset.name() == "BossApp.bin") {
            uint32_t crcResult;
            crcResult = compute_CRC32(
                asset, (512*1024) - asset.size(), &crc32_mpeg2
            );
            Log.info("CRC32 of %s is 0x%08X", asset.name().c_str(), crcResult);
        }
    }
    asset_readback_benchmark(__LINE__); //! 16595 ms
}


//This function's benchmarks are fast throughout the function after adding enclosing scope brackets around the auto assets block
void setup2() {
    // Put initialization like pinMode and begin functions here
    Serial.begin(230400);
    waitFor(Serial.isConnected, 1000);
    asset_readback_benchmark(__LINE__); //! 233 ms
    //Proxy for linknet_init();
    Serial1.begin(230400);
    asset_readback_benchmark(__LINE__); //! 234 ms
    //Calculate the CRC32 of the Asset
    { //! <--- The only difference between setup1() and setup2() is that this block is wrapped in scope brackets
        auto assets = System.assetsAvailable();
        for (auto& asset: assets) {
            if (asset.name() == "BossApp.bin") {
                uint32_t crcResult = compute_CRC32(
                    asset, (512*1024) - asset.size(), &crc32_mpeg2
                );
                Log.info("CRC32 of %s is 0x%08lX", asset.name().c_str(), crcResult);
            }
        }
    } //! <--- The only difference between setup1() and setup2() is that this block is wrapped in scope brackets
    asset_readback_benchmark(__LINE__); //! 270 ms (still fast)
}


//This function's 
void setup3() {
    // Put initialization like pinMode and begin functions here
    Serial.begin(230400);
    waitFor(Serial.isConnected, 1000);
    asset_readback_benchmark(__LINE__); //! 233 ms
    //Proxy for linknet_init();
    Serial1.begin(230400);
    asset_readback_benchmark(__LINE__); //! 234 ms
    //Calculate the CRC32 of the Asset
    auto assets = System.assetsAvailable();
    for (auto& asset: assets) {
        if (asset.name() == "BossApp.bin") {
            uint32_t crcResult = 0x12345678;  //! <---- This time we are not calculating the CRC32, but just setting a dummy value
            Log.info("CRC32 of %s is 0x%08lX", asset.name().c_str(), crcResult);
        }
    }
    asset_readback_benchmark(__LINE__); //! 240 ms (still fast)
}

void setup(){
    // setup1(); //Benchmark slow at end of setup, but fast at beginning, and back to fast again during loop
    // setup2(); //Remains fast throughout setup and loop, because of addition of enclosing scope brackets?
    setup3(); //Remains fast throughout setup and loop, because no CRC32 calculation was performed?
}

// loop() runs over and over again, as quickly as it can execute.
void loop() {
  if(Serial.available()){
    System.dfu();
  }
  delay(1000);
  asset_readback_benchmark(__LINE__); //! 429 ms
}


unsigned int asset_readback_test(ApplicationAsset& asset)
{
    //Print out information about the asset
    asset.reset();
    size_t assetSize = asset.size();
    Log.info("------------------------------------");
    Log.info("Asset name: %s", asset.name().c_str());
    Log.info("Asset size: %u bytes", assetSize);
    //Now loop throug the asset, reading 1024 bytes at a time
    char buffer[1024];
    size_t bytesRead = 0;
    unsigned int loops = 0;
    unsigned int startTime = millis();
    while (bytesRead < assetSize) {
        Watchdog.refresh(); // Reset the watchdog timer
        size_t bytesToRead = min(assetSize - bytesRead, sizeof(buffer));
        int readResult = asset.read(buffer, bytesToRead);
        if (readResult != (int)bytesToRead) {
            Log.error("Failed to read %u bytes from asset, got %d error code", bytesToRead, readResult);
            break;
        }
        bytesRead += bytesToRead;
    }
    Log.info("Read %u bytes from asset in %u ms", bytesRead, millis() - startTime);
    Log.info("------------------------------------");
   return millis() - startTime;
}

void asset_readback_benchmark(unsigned int linenum)
{
    //This function simply loops through the OTA assets, and tries to read back the first 10 bytes of each 
    //bin file and print them to the log.
    Log.info("Starting asset readback unit test (linenum %u)", linenum);
    auto assets = System.assetsAvailable();
    for (auto& asset: assets) {
        if (asset.name() == "BossApp.bin") {
            asset_readback_test(asset);
        }
    }
}


uint32_t compute_CRC32(
	ApplicationAsset& asset, uint32_t pad_sz, crc32_parameters_t *params
){

    
    asset.reset();
    Log.info("Calculating asset CRC of %s", asset.name().c_str());
    uint32_t crc_result = params->initval;
    unsigned int loops = 0;
    unsigned int total_asset_bytes = asset.size();
    unsigned int asset_bytes_remaining = total_asset_bytes;
    unsigned int total_bytes = total_asset_bytes + pad_sz;
    unsigned int total_bytes_remaining = total_bytes;
    while (asset_bytes_remaining) {
        uint8_t buf[2048];
        //Determine how many bytes to read from the asset
        size_t bytes_to_read = min(asset_bytes_remaining, sizeof(buf));
        int read = asset.read((char*) buf, bytes_to_read);
        if (read < 0) {
            Log.error("Error %d reading binary from asset", read);
            return 0; // Return 0 on error
        }
        crc_result = compute_CRC32(
            buf, bytes_to_read, 0, loops == 0, params
        );
        asset_bytes_remaining -= bytes_to_read;
        loops++;
        if (loops % 10 == 0) {
            Log.info("CRC32 loop %u, progress: %3.1f %%", loops,  (float(total_asset_bytes - asset_bytes_remaining)*100)/total_asset_bytes );
        }
    }
    if(pad_sz > 0) {
        crc_result = compute_CRC32(
            nullptr, 0, pad_sz, loops == 0, params
        );
    }
    return crc_result;
}

static uint8_t Reflect8(uint8_t val)
{
	uint8_t res = 0;
	for (int i = 0; i < 8; i++) {
		if (val & (1 << i)) {
			res |= (1 << (7 - i));
		}
	}
	return res;
}

static uint32_t Reflect32(uint32_t val)
{
	uint32_t res = 0;
	for (int i = 0; i < 32; i++) {
		if (val & (1 << i)) {
			res |= (1 << (31 - i));
		}
	}
	return res;
}

uint32_t compute_CRC32(
	const void *data, uint32_t data_sz, uint32_t pad_sz, bool reset, crc32_parameters_t *params
)
{
	static uint32_t _crc32 = params->initval;
	uint8_t *read_ptr;
	if(data)
		read_ptr = (uint8_t *)data;

	if (reset)
		_crc32 = params->initval;
	uint8_t Byte;
	uint8_t curByte;
	do {
		if (data_sz) {
			Byte = *(read_ptr++);
			data_sz--;
		}
		else {
			Byte = pad_byte;
			pad_sz--;
		}
		/* reflect input byte if specified, otherwise input byte is taken as it is */
		curByte = (params->input_reflect ? Reflect8(Byte) : Byte);
		_crc32 = _crc32 ^ (curByte << 24);
		for (int bit = 0; bit < 8; bit++) {
			if (_crc32 & (1L << 31))
				_crc32 = (_crc32 << 1) ^ (params->polynomial);
			else
				_crc32 = (_crc32 << 1);
		}
	} while (data_sz || pad_sz);
	/* reflect result _crc32if specified, otherwise calculated _crc32value is taken as it is */
	_crc32 = (params->output_reflect ? Reflect32(_crc32) : _crc32);
    /* Xor the crc value with specified final XOR value before returning */
    return (uint32_t)(_crc32 ^ params->output_xor);
}

Debug outputs

I compiled and ran this project 3 times, using the 3 different setup() functions (setup1() , setup2() , setup3() )

There are three debug output logs which illustrate the various results:

setup1() log

setup2() log

In setup2(), the only difference is that enclosing scope brackets were added around the code block that calculates the crc32 of the asset

setup3() log