Method to reduce data operations | Publish an array of JSON

jgskarda · August 21, 2021, 6:38pm

Earlier in the year I posted a question to the community about a conceptual method to reduce data operations, and I finally had some quiet time to focus on it. The prior post is closed, so I figured I'd start a new thread. I thought others might find this useful for similar use cases, and if you see a way to make it better or cleaner, feel free to post here as well.

Background:

Basically, what I want to do is this: instead of publishing something like the following with PublishQueue.publish() 12 times, where each reading is 5 minutes apart:
{ "Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000246}
{ "Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000546}
{ "Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000846}
…

I would publish a single JSON array like this once:
[ {"Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000246}, {"Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000546}, {"Sensor1": 1234, "Sensor2": 5678, "DateTime": 1623000846}...]

I'd then update my backend to process the array of data instead of an individual webhook for every data point. The benefits are: 1) fewer data operations on the Particle Cloud, now that data operations are the unit of measure; 2) reduced load on my backend, as it only needs to process 1 webhook containing an array of data rather than 12 separate webhooks.
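As a rough check on how much batching can save, the arithmetic can be sketched in standard C++. This assumes the compact form of the sample record above (53 bytes with whitespace removed) and a 1024-byte event limit; both numbers are assumptions for illustration, and `recordsPerEvent` and `publishesPerDay` are hypothetical helper names, not part of any Particle API.

```cpp
#include <cassert>
#include <cstddef>

// How many records of recordSize bytes fit in one JSON array event.
// The array costs 2 bytes for '[' and ']' plus one ',' between records:
//   2 + n*recordSize + (n-1) <= maxEventSize
std::size_t recordsPerEvent(std::size_t recordSize, std::size_t maxEventSize) {
    assert(recordSize > 0);
    if (maxEventSize < recordSize + 2) {
        return 0; // not even one record fits
    }
    return (maxEventSize - 1) / (recordSize + 1);
}

// Number of publishes needed per day, rounding up.
std::size_t publishesPerDay(std::size_t readingsPerDay, std::size_t readingsPerPublish) {
    assert(readingsPerPublish > 0);
    return (readingsPerDay + readingsPerPublish - 1) / readingsPerPublish;
}
```

Under those assumptions, recordsPerEvent(53, 1024) gives 18, so the 12 readings from one hour fit in a single event, and 288 readings a day (one every 5 minutes) go from 288 publishes to publishesPerDay(288, 12) = 24.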

Here is the code I used:

#include <fcntl.h>
#include <sys/stat.h>

Step 1 (create a JSON object and write it to the file system): Call this as many times as needed (once every 5 minutes). fileCnt++ increments the file name on each call, so there is 1 file per JSON object of sensor data.

          JsonWriterStatic<256> jwInner;
          {
            JsonWriterAutoObject obj(&jwInner);
            jwInner.insertKeyValue("Sensor1", 1234);
            jwInner.insertKeyValue("Sensor2", 5678);
            jwInner.insertKeyValue("DateTime", 1623000846);
          }

          JsonWriterStatic<256> jwOuter;
          {
            JsonWriterAutoObject obj(&jwOuter);
            jwOuter.insertKeyJson("Data", jwInner.getBuffer());
          }

          String Data = jwOuter.getBuffer();

          //Increment the file counter by 1 for each new file that gets written
          //Index is reset to 0 when all data is published
          fileCnt++;

          //Open the file, if it doesn't exist create it, truncate it to length 0, then write the data to the file and close it. 
          int fd = open(String(fileCnt), O_RDWR | O_CREAT | O_TRUNC);
          if (fd != -1) {
              write(fd, Data.c_str(), Data.length());
              close(fd);
          }

          //Publish the data: COMMENTED OUT AS NOW WRITING A FILE TO THE FILE SYSTEM
          // Log.info("Lora message: %s", jwInner.getBuffer());
          // publishQueue.publish("Data", jwInner.getBuffer(), 60, PRIVATE, NO_ACK);

Step 2: Create a function to publish all files. This function checks the max publish length (Particle.maxEventDataSize() requires Device OS 3.1; otherwise you could just hardcode it). It keeps reading files and appending them to the JSON array until the max publish length is reached. Once it's reached, it publishes the event and clears the buffer; if it is never reached, it publishes the event once all files are read. This continues until all files are published.

/*******************************************************************************
 * Function Name  : publishFiles()
 * Description    : Publishes all files from the File System as a JSON Array
 * Return         : 1 once all files have been queued for publishing
 *******************************************************************************/
int publishFiles(){

  //Obtain the maxEventDatasize. This can change based on firmware version, device and modem firmware
  int maxEventDataSize = Particle.maxEventDataSize();
  int remainingBytes;
  Log.info("eventDataSize=%d", Particle.maxEventDataSize());
  
  JsonWriterStatic<1024> jw;
  {
    JsonWriterAutoObject obj(&jw);
    jw.init();
    jw.startArray();

  //For each file, open it, read the contents, add the contents to the JSON array and close the file. 
  for(int i = 0; i < fileCnt; i++){
    int fd = open(String(i+1), O_RDONLY);
    if (fd != -1) {
      char fileData[256];
      memset(fileData, 0, sizeof(fileData));

      //Determine the length of the file we are reading
      struct stat statbuf;
      int fileSize = 0;
      if (fstat(fd, &statbuf) == 0) {
        //Read the entire file, but never more than the buffer can hold (1 byte is kept for the null terminator)
        fileSize = (int)statbuf.st_size;
        if (fileSize > (int)sizeof(fileData) - 1) {
          fileSize = (int)sizeof(fileData) - 1;
        }
        read(fd, fileData, fileSize);
      }
      close(fd);

      Log.info("maxEventDataSize: %i | Current Write Offset: %i | Current File Size: %i", maxEventDataSize, (int)jw.getOffset(), fileSize);
      remainingBytes = maxEventDataSize - (int)jw.getOffset() - fileSize;

      //If less than 10 bytes are left after adding the current file, before adding the file, publish the Array to the queue and then re-initialize the buffer 
      if(remainingBytes < 10 ){
        Log.info("Buffer is full: Publish Buffer and reset it");
        jw.finishObjectOrArray();
        publishQueue.publish("DataArray", jw.getBuffer(), 60, PRIVATE, NO_ACK);
        jw.init();
        jw.startArray();
      }
      
      jw.insertCheckSeparator();
      jw.insertJson(fileData);
    }
  }
  
  jw.finishObjectOrArray();
  }

  //Now publish the JSON Array to the cloud. 
  publishQueue.publish("DataArray", jw.getBuffer(), 60, PRIVATE, NO_ACK);
  fileCnt = 0;

  return 1;
 }
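
The batching logic in publishFiles() can also be tried off-device. The following is a minimal sketch in standard C++ with no Particle or JsonParserGeneratorRK calls; `batchRecords` and its parameters are hypothetical names, and each record is assumed to already be a valid JSON string. It also skips any record that could never fit in an event by itself, which is a choice made for this sketch rather than something the function above does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pack JSON records into as few JSON arrays as possible, each at most
// maxEventSize bytes long. A record too large to fit in an array on its
// own is skipped.
std::vector<std::string> batchRecords(const std::vector<std::string>& records,
                                      std::size_t maxEventSize) {
    assert(maxEventSize >= 2); // room for "[]" at minimum
    std::vector<std::string> events;
    std::string current = "[";

    for (const std::string& rec : records) {
        if (rec.size() + 2 > maxEventSize) {
            continue; // would not fit even alone
        }
        const bool empty = (current.size() == 1);
        // Size if we append: current + optional ',' + record + closing ']'
        const std::size_t needed = current.size() + (empty ? 0 : 1) + rec.size() + 1;
        if (needed > maxEventSize) {
            // Close and emit the current array, then start a new one
            current += ']';
            events.push_back(current);
            current = "[";
        }
        if (current.size() > 1) {
            current += ',';
        }
        current += rec;
    }

    // Emit whatever is left, unless nothing was added
    if (current.size() > 1) {
        current += ']';
        events.push_back(current);
    }
    return events;
}
```

Each returned string stands in for the payload of one publish call. The sketch checks the size before appending, so no emitted array ever exceeds maxEventSize.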

Once the device is connected to the cloud I simply call the function to publish all the files:
publishFiles();

Now, instead of 12 individual publish events, I receive 1 or 2 properly formatted JSON arrays. (If the total is > 1012 bytes, the data is split up into multiple publish events.)

I'll probably limit the number of files to something like 100. If it ever gets that full without publishing the data to the cloud first, I'm OK with just overwriting prior data. That way I shouldn't be able to fill the file system.

The next step for me is to update my backend to ingest the array of JSON data rather than individual elements. That should be fairly easy with a Python for loop.

A few questions for those smarter than me:

Is there an easier way to do this? I liked the file system as it seemed easy to keep track of and I can store the entire JSON right to it.
Is there an impact from using the file system repeatedly like this, i.e. reading/writing a file with the same name over and over again (once every 5 minutes)?
Should the file name be something other than just an integer?
Should I make the files more of a "ring buffer", i.e. write new files continually from 0 --> 100 before looping back around to 0? I currently start over at 0 every time I publish data, so file 1 gets used a lot more than file 20, and file 50+ may never get used. Would a ring buffer be better for the longevity of the memory being read and written?
Would simply deleting each file after reading it be an OK method instead of a ring buffer?
Any other thoughts for improvement?
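
On the ring-buffer question, the bookkeeping itself is small. Here is a minimal sketch in standard C++ of a fixed-capacity ring of file slots that overwrites the oldest entry when full; `FileRing`, `push`, and `pop` are hypothetical names, and the actual file reads and writes are left out.

```cpp
#include <cassert>
#include <cstddef>

// Tracks which numbered file slots hold unpublished data.
// Slots are used in rotation, so writes are spread across all file names.
class FileRing {
public:
    explicit FileRing(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns the slot (file number) to write next. When the ring is full,
    // the oldest unpublished slot is overwritten.
    std::size_t push() {
        const std::size_t slot = (head_ + count_) % capacity_;
        if (count_ == capacity_) {
            head_ = (head_ + 1) % capacity_; // drop the oldest entry
        } else {
            ++count_;
        }
        return slot;
    }

    // Stores the oldest unpublished slot in `slot`; returns false if empty.
    bool pop(std::size_t& slot) {
        if (count_ == 0) {
            return false;
        }
        slot = head_;
        head_ = (head_ + 1) % capacity_;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;  // oldest unpublished slot
    std::size_t count_ = 0; // number of unpublished slots
};
```

With a capacity of 100, push() would pick the file name for each new reading and pop() would be called in a loop when publishing. Because the head keeps advancing instead of resetting to 0 after each publish, every slot gets used in turn. Whether that matters for flash longevity depends on the file system's own wear leveling, which this sketch does not model.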

jgskarda · August 24, 2021, 4:07am

That seemed to work quite well, and updating my backend was super simple too. I saw a significant drop in the event traffic coming from the device.

The spike on the 21st is from when I was doing the development and continually testing it, but I’d say it cut my event traffic to at least 1/3 of what it was, maybe even 1/4. The time it takes my own backend to process each webhook did increase a little (since it now has to process several data packets as an array): it went from ~80-90 ms per event to ~200 ms per event. Overall, I’ll take 1/4 of the events to process even if a single event takes 2x as long. This should work well for my application as long as there are no long-term impacts from continuously using the file system.

joel · August 25, 2021, 3:46pm

I believe the flash is rated at 100k write cycles across a 2MB embedded filesystem. Luckily, the filesystem does employ some very basic wear leveling (stochastic in the free flash). Frequently creating and deleting many small files is pretty inefficient in terms of flash utilization and will result in a lot more overhead than some of the alternatives. But at one every 5 minutes, I wouldn’t expect it to be a large concern.

jgskarda · August 27, 2021, 9:05pm

@joel Thanks for letting me know. After thinking about it some more, I think I’m better off with a simpler approach that doesn’t use the file system. Rather than writing files to the file system, I decided to just keep appending JSON objects to an array using the very nice JsonParserGeneratorRK library. I just declare the JSON writer as a global.

Here are the current snippets of code that I put together.

Before Setup - Create a Global Object:

//Define Buffer for JSON Array and Initialize it
JsonWriterStatic<1024> JSONArray;

//Start with a max Event Data size of 256. Update it once the cellular modem is on and connected. 
int maxEventDataSize = 256;

In Setup() Initialize the start of the Array:

  //start an array object
  JSONArray.startArray();

Each time I want to add data to the JSON array (typically every 5 minutes):

//Create a new locally scoped JSON object to Insert into the Array
JsonWriterStatic<256> jw;
          {
            JsonWriterAutoObject obj(&jw);
            jw.insertKeyValue("Datetime", Time.now());
            jw.insertKeyValue("Sensor1", 123);
            jw.insertKeyValue("Sensor2", 456);
            jw.insertKeyValue("Sensor3", 789);
          }

          //If adding the new JSON object extends the array beyond the max bytes of a Publish event, close the array, publish it via PublishQueue, clear it and start another array. 8 bytes are left as spare for Particle pre/ or post data. 
          if ((maxEventDataSize - int(JSONArray.getOffset()) - int(jw.getOffset())) < 8 ){
            JSONArray.finishObjectOrArray();
            publishQueue.publish("DataArray", JSONArray.getBuffer(), 60, PRIVATE, NO_ACK);
            JSONArray.init();
            JSONArray.startArray();
          }

          //Add new data to the JSON Array
          JSONArray.insertCheckSeparator();
          JSONArray.insertJson(jw.getBuffer());

          //If another object of the same size would extend the array beyond the max bytes of a Publish event, close the array, publish it, clear it and start another array 
          //If the data came in during an irregular time period, publish the data immediately. 
          if (!rptWindow || (maxEventDataSize - int(JSONArray.getOffset()) - int(jw.getOffset())) < 8){
            JSONArray.finishObjectOrArray();
            publishQueue.publish("DataArray", JSONArray.getBuffer(), 60, PRIVATE, NO_ACK);
            JSONArray.init();
            JSONArray.startArray();
          }

When I’m done with all sensor readings for this period (i.e. every 20 or 60 minutes), publish the Current JSON Array to the Cloud:

            //If there is data in the JSON Array: End the Array, Publish it, clear the array and start a new JSON array for new data:
            if (JSONArray.getOffset()>4){
              JSONArray.finishObjectOrArray();
              publishQueue.publish("DataArray", JSONArray.getBuffer(), 60, PRIVATE, NO_ACK);
              JSONArray.init();
              JSONArray.startArray();
            }

            //Update the Max Event Data Size if needed. This cannot be done in setup() as the cellular modem must be ON. The value is first initialized to 256. This should allow updating it once. 
            if (maxEventDataSize <= 256) {
              maxEventDataSize = Particle.maxEventDataSize();
            }
                       
            //Wait until all Events in the PublishQueue are sent successfully. Wait up to 30 seconds
            int i = 0;
            while (publishQueue.getNumEvents() > 0 and i <= 30) {
                  softDelay(1000);
                  i = i+1;
            }

The result in the Particle Console is identical to my earlier example: an array of JSON objects where each member is a set of sensor readings. In most cases, these are sensor readings spaced 5 minutes apart and then published at either a 20-minute or a 60-minute interval.

Overall, this JSON array seems like a cleaner approach than using the flash file system. It won’t have any lasting effects on the flash this way either. The only downside I see is lost data if the device resets, but that should be rare and is not critical for my application.

system · February 26, 2022, 9:05am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.
