User NVM - non-volatile memory support with the serial flash

I’ve written a library to give organized user access to the serial flash. Basic wear leveling and CRC checking of restored data are supported. There is also a stream mode, which I have planned but not yet implemented.

From the header file:

NVM is a non-volatile storage class to handle user interaction with storing
blocks or streams in the external sFLASH, of which 1.5MB is available for
the user.

There are two ways to interact with NVM: block writes and stream writes.
Blocks are written in their entirety all at once. A block is CRC checked
and this check is performed when restoring the block. When a block is
written, any previously stored content of the block is replaced.

Streams can be written as little as one byte at a time. Streams have a fixed
maximum number of bytes that can be written to the stream record based
on the NVM map definition. Streams can only be appended or reset (to empty).
Once all available bytes of a stream have been written, the stream record
must be flushed to return it to empty. Streams are not checked with a CRC.

The NVM map defines how flash is laid out in terms of blocks and streams.
If the NVM map changes from one version of software to another, all data 
stored in the NVM is reset.

Wear leveling is performed by writing copies of the block data following
the already-written block until the page is full; then the page is erased.
Every block or stream write causes a new info block to be written, which is
64 bytes long. Given 4K pages, this block can be written 64 times before the
page is erased. Pages can be erased up to 100k times, allowing at most
6.4 million writes before wearing out the info page.
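To make the wear-leveling scheme concrete, here is a simplified sketch of the append-until-full idea (a hypothetical helper built on the ST-style sFLASH driver calls, not the library's exact internals):

#include <stdint.h>

// From the sFLASH driver (ST-style signatures):
extern void sFLASH_EraseSector(uint32_t SectorAddr);
extern void sFLASH_WriteBuffer(uint8_t *pBuffer, uint32_t WriteAddr, uint16_t NumByteToWrite);

#define PAGE_SIZE 4096

// Append the new copy of the block after the previous one; erase the
// page and start over only when the page is full.
uint32_t writeBlockLeveled(uint32_t pageBase, uint32_t offset,
                           uint8_t *data, uint16_t len)
{
    if (offset + len > PAGE_SIZE) {      // page exhausted
        sFLASH_EraseSector(pageBase);    // one erase per ~PAGE_SIZE/len writes
        offset = 0;
    }
    sFLASH_WriteBuffer(data, pageBase + offset, len);
    return offset + len;                 // where the next copy will go
}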

I would like to get some feedback on this software:

  • Interest in merging this library?
  • Interest level in supporting a stream mode (I don’t need this for my own projects but if there is strong interest I would be willing to implement it)

Total .text size is about 800 bytes, and the library uses 260 bytes of RAM, with support for up to 15 NVM records. If you want to check it out, merge mattande/core-firmware@nvm and mattande/core-common-lib@nvm_lib. There is a sample application.cpp that exercises the NVM APIs.

I’ve also issued a pull request with more general cleanups to the sFLASH driver. It fixes bugs in the existing driver when a write starts at an odd flash address and/or covers an odd number of bytes. There is also a modest improvement in code space utilization and a slight speedup when writing more than 4K bytes in a single write.


Wow, this looks great @mattande! I don’t see any reason why we wouldn’t want to incorporate this into the base code, as it should only take up memory if you use the library.

Would be interesting to know what might need to be done to make the 4MB version of this chip compatible with the library: http://www.digikey.com/product-search/en?pv7=6&k=SST25VF032B-80-4I-S2AF

I’m probably going to need a micro SD card for some of the projects I have in mind, but it’s interesting to see how much on board storage we can get.

I don’t fully understand the Stream mode. “Once all available bytes of a stream have been written, the stream record must be flushed to return it to empty.” I’m imagining streaming bytes into Flash memory… but if they get erased when full, what is the point of streaming them in? I must not be thinking about this correctly yet :smile:

If you allocate a stream record and give it a maximum number of bytes, say 20k, you can write up to 20k bytes in pieces as small as 1 byte at a time. The NVM library keeps track of how many bytes have been written and would provide some convenience methods for reading back the stored bytes. Once you’ve written all 20k bytes, the stream record is full and would have to be flushed (erasing all of the related flash pages and resetting the length of the stream to 0) before more bytes could be written.

At least that’s how I was thinking of implementing a stream mode.

Oh and if you had a different size serial flash chip, just override the value of USER_NVM_START_ADDR and USER_NVM_SIZE in your build to reflect the flash addresses that the userNVM object would use. Or create a 2nd NVM instance that covers the additional space in the chip.

Cool well that sounds like a snap to double up the memory then.

I guess I understand that the Streamed memory would have to be erased before continuing to write to it, but it seems like a strange use case where memory is growing and growing and then just POOF it’s gone. What would be a typical use case that does this? Also, there could be an option to warn when Stream memory space runs out, putting the onus on the user to kick off the Stream record erase.

// pseudocode
if (NVM.streamWrite(byte) == -1) {   // Stream record full!
  if (BACKUP_FIRST) {
    // back up some stuff first
  }
  NVM.eraseStreamRecord();
  if (NVM.streamWrite(byte) == -1) { // retry the byte write
    // error! could not write!
  }
}

I can see why you might not want to continue developing Stream… I can’t think of good use cases for it. I completely understand the block erase/write … and use that in Flash memory for calibration data fairly regularly.

The stream mode potentially becomes useful when logging larger amounts of data. Say you want to log a temperature every minute. If you use an array of points in a block record, the entire record needs to be rewritten every time it’s updated (slow), and you will quickly run out of free RAM to store the array as the number of points grows.

Using a stream record you would append your new point every minute, requiring only as much RAM as the size of the data point. You then have potentially up to ~1.5MB of space available for point records.

Yes, the user would need to detect when the stream is full and either erase it or stop logging points. I didn’t intend for the stream record to be cleared automatically when full. The user could double buffer their data using two stream records: switch to the other record when one fills up, and once a full record has been downloaded, erase it to make it available for points again.
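A rough sketch of that double-buffering pattern, extending the hypothetical stream API from the pseudocode above with a record index:

// Hypothetical API sketch -- NVM.streamWrite(record, byte) returns -1
// when that record is full.
int active = 0;                       // which stream record we append to

void logPoint(uint8_t b) {
    if (NVM.streamWrite(active, b) == -1) {     // active record is full
        // Leave the full record intact for download; switch to the other
        // record, which was erased after its last download.
        active = 1 - active;
        if (NVM.streamWrite(active, b) == -1) {
            // Both records full: stop logging until one is downloaded
            // and erased with NVM.eraseStreamRecord(active).
        }
    }
}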

Then again, if a stream mode isn’t all that useful then I won’t spend time implementing it…

Ok good! I was thinking you had some use case where auto erase made sense… like logging data just to log it xD

Yeah the single access part of Stream is obviously very useful for async logging / log retrieval uses. I’m sure many people would want that. Thanks for your efforts on this! :slight_smile:

This is great, but a pity we couldn’t have collaborated on this, since I’m also writing the same library.

CRCs

I notice the code uses CRC16 to check for changes on block writes. This seems a little dangerous - a 1-in-64k chance of a change not being detected (and therefore not written) is not insignificant.

Streaming

I would say stream support is important for anyone logging data or wanting a simple device-neutral method to store and retrieve data in the flash.

It would be good to allow streams to be arbitrary in length - an optional limit can be set to safeguard client code from overwriting into unwanted areas, but it’s not a key part of the implementation - the stream can simply continue to grow into new blocks. When the stream is closed, the data at the end of the last block written is then restored from the original block, allowing callers to “rewrite” data in the middle of flash. (The library itself can determine whether the last page written to by the stream was clean or had prior data in it, so it only needs to make a copy of the page if it contained data.)

Efficient Writes / Random Access

Since flash memory can be written multiple times, so long as the data doesn’t attempt to flip a 0 back to 1, you can implement schemes that allow incremental updates and modification to data structures without requiring an erase each time. This could be used as the basis for providing random access like the Arduino EEPROM.write() function without requiring a block erase for each access.
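For example, a value could be kept in a slot journal where each update appends a new record and only ever clears bits, so an erase is needed only when the page is exhausted (a sketch using ST-style driver calls, not an existing API):

#include <stdint.h>

// ST-style sFLASH driver calls:
extern void sFLASH_EraseSector(uint32_t SectorAddr);
extern void sFLASH_WriteBuffer(uint8_t *pBuffer, uint32_t WriteAddr, uint16_t NumByteToWrite);
extern void sFLASH_ReadBuffer(uint8_t *pBuffer, uint32_t ReadAddr, uint16_t NumByteToRead);

// Each 2-byte slot: a marker byte plus a value byte. Erased flash reads
// 0xFF; writing 0x00 over the marker only clears bits (1 -> 0), no erase.
#define PAGE_SIZE      4096
#define SLOTS_PER_PAGE (PAGE_SIZE / 2)

static uint32_t findNextFree(uint32_t pageBase) {
    for (uint32_t i = 0; i < SLOTS_PER_PAGE; i++) {
        uint8_t marker;
        sFLASH_ReadBuffer(&marker, pageBase + 2*i, 1);
        if (marker == 0xFF) return i;   // first unused slot
    }
    return SLOTS_PER_PAGE;              // journal exhausted
}

int readValue(uint32_t pageBase, uint8_t *out) {
    uint32_t next = findNextFree(pageBase);
    if (next == 0) return -1;                          // nothing written yet
    sFLASH_ReadBuffer(out, pageBase + 2*(next-1) + 1, 1);  // last slot's value
    return 0;
}

void writeValue(uint32_t pageBase, uint8_t v) {
    uint32_t next = findNextFree(pageBase);
    if (next >= SLOTS_PER_PAGE) {        // journal full
        sFLASH_EraseSector(pageBase);    // only now do we pay for an erase
        next = 0;
    }
    uint8_t rec[2] = { 0x00, v };        // cleared marker + new value
    sFLASH_WriteBuffer(rec, pageBase + 2*next, 2);
}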

I’d love to see more people using this before we consider merging in the NVM library, and it might be more appropriate to keep it as an external library, since it requires more C++ knowledge to use than the typical Arduino user has.

The pull request to clean up sFLASH is much appreciated! In the coming weeks we’ll be releasing a standardized library packaging, sharing, and importing mechanism, so definitely package this up once that’s out in the wild.

Speaking of which, I've stabilized and finished my pulseIn support; should I submit a pull request to add it to spark_wiring.cpp to retain parity with Arduino?

For a block record a CRC-32 is computed, and yes, the result is truncated and stored in 16 bits. This was a trade-off to keep the information block small: since the info block is written every time a block or stream record is updated, its size can be a limiting factor in the wear leveling of the device. For a block update to be skipped because a change went undetected, the new data would have to produce the same low 16 bits of the CRC-32 as the previous contents — a 1-in-2^16 event. I agree the potential for a hash collision is not zero, but I contend it's insignificant.

That said, the number of bits used for the CRC can be expanded beyond 16. The other field in the structure, the offset of the most recent block, does not need all 16 bits: if only one page is used for the block record and (worst case) the block record is 1 byte in length, the offset only needs to take on values between 0 and 4095 -- 12 bits. The remaining 4 bits can be returned to the CRC, yielding (again, in this most simplified case) a 1-in-2^20, or 1-in-1,048,576, chance of a hash collision. I will make an update to extend the CRC width to 20 bits.
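For illustration, the repacked fields could look something like this (a sketch of the field split described above, not the actual info-block layout):

#include <stdint.h>

// Pack a 12-bit page offset and a 20-bit truncated CRC-32 into 32 bits.
static inline uint32_t packInfo(uint16_t offset, uint32_t crc32) {
    return ((uint32_t)(offset & 0x0FFF) << 20)   // offset 0..4095 in the top 12 bits
         | (crc32 & 0x000FFFFF);                 // low 20 bits of the CRC-32
}

static inline uint16_t unpackOffset(uint32_t packed) { return (uint16_t)(packed >> 20); }
static inline uint32_t unpackCrc(uint32_t packed)    { return packed & 0x000FFFFF; }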

I agree streaming is useful. In fact I got the idea from your previous post on the subject. The size of the stream can be defined to any length within the constraints of the size allocated for the NVM region and whatever other record types have already been allocated in it.

As far as having a stream record overwrite previously written pages instead of becoming full… maybe this is a different type of record - a ring buffer record? Or, as I suggested previously, the user could double buffer with two stream records to achieve a similar effect.

You will find that the NVM library already does this. When writing a new block record, the new data is placed after any previous updates to the block within the page. A reference to the most recent block (and its CRC) is stored in the nvmInfo record using a similar scheme. The page is erased only when it's full and a new block record needs to be stored.

Random access gets tricky. As I'm sure you're aware, if you intend to modify bytes within an already written page, either the entire page needs to be copied to RAM (and up to 4K of RAM is hard to come by given the current firmware constraints), or copied to a different page of flash, requiring complete page erases and writes. This is inefficient in terms of time (erasing and writing whole pages is slow, at around 30 ms per page) and in terms of device lifetime (a single random-access write to a complete page requires at least one full erase of the page). Of course there are many other possible implementations with their own advantages and disadvantages.

But I wasn't trying to provide a true "eeprom"-like random access read/write interface to the serial flash. My goal was to simplify the interface so a user only needs to think in terms of: 1) I have some data structure (or structures) that I want to persist (registration -> nvmEntry); 2) I want to restore this structure from some previous copy (restoreBlock); and 3) I have a new version of this structure and I want to save it to restore in the future (writeBlock). All of the details of the underlying organization, erasing, and (simplified, but still effective) leveling of the sflash memory are taken care of with the smallest possible overhead in terms of execution time, code space, and RAM.
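Usage was intended to look roughly like this (a sketch only -- the real signatures may differ; see the sample application.cpp in the branch for the actual API):

// Hypothetical sketch of the three-step flow, assuming a global userNVM
// object, registration via nvmEntry(), and a success-reporting restoreBlock().
struct Calibration { float gain; float offset; };

Calibration cal;

void setup() {
    userNVM.nvmEntry(&cal, sizeof(cal));     // 1) register the structure
    if (!userNVM.restoreBlock(&cal)) {       // 2) restore a previous copy
        cal.gain = 1.0f;                     //    no valid copy: use defaults
        cal.offset = 0.0f;
        userNVM.writeBlock(&cal);            // 3) persist for next boot
    }
}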

I think both the NVM library and your EEPROM library have different objectives and are complementary. A user who wants true random read/write access to the sflash (or has an existing Arduino sketch that references the EEPROM library) would use the EEPROM library.

Thanks for the feedback.

Just to clarify - I’m not pursuing the eeprom approach, since my attempts ended up causing the core to lose its handshake with the cloud and later no longer connect to wifi. So, I’m building a flash library.

I think you misunderstand streaming - it should be just another interface to the data, not needing to be segregated to its own space. It’s another way of accessing the data - you can read/write blocks of data at addresses, or you can open a stream.

Making a circular buffer will involve connected streams so that the write stream doesn’t write over what is not read by the read stream, and the read stream doesn’t read beyond what is written.
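Something along these lines, sketching just the index bookkeeping over a fixed flash region (hypothetical; a real implementation would also have to erase pages before the write position wraps back into them):

#include <stdint.h>

extern void sFLASH_WriteBuffer(uint8_t *pBuffer, uint32_t WriteAddr, uint16_t NumByteToWrite);
extern void sFLASH_ReadBuffer(uint8_t *pBuffer, uint32_t ReadAddr, uint16_t NumByteToRead);

// Connected read/write streams over one flash region: the write stream may
// not lap the read stream, and the read stream may not pass the write stream.
struct RingStream {
    uint32_t base, size;    // flash region covered by the ring
    uint32_t rd, wr;        // logical positions; wrap modulo size
    uint32_t count;         // bytes written but not yet read
};

bool ringWrite(RingStream &s, uint8_t b) {
    if (s.count == s.size) return false;            // would overwrite unread data
    sFLASH_WriteBuffer(&b, s.base + (s.wr % s.size), 1);
    s.wr++; s.count++;
    return true;
}

bool ringRead(RingStream &s, uint8_t *b) {
    if (s.count == 0) return false;                 // nothing left to read
    sFLASH_ReadBuffer(b, s.base + (s.rd % s.size), 1);
    s.rd++; s.count--;
    return true;
}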

Definitely @timb! Yay for detecting impulses!

I will keep an eye out for the new user library support!

Do you know what speed grade of the sst25vf is on the board? I’m going to assume it’s the 50MHz part. I may try out increasing the SPI clock when interacting with the chip.

Also, FYI, your current core-schematic.pdf calls out the sst25vf04 - the 4Mbit version - not the sst25vf16, the 16Mbit version that is actually on the board.

Ah, thanks @mattande! @mohit, please update that schematic PDF.

The datasheet’s in the repo, but I don’t know the speed off the top of my head. We’ve tried bumping up the SPI and recently settled on a prescaler of 4 as the fastest we can do. Keep in mind the SPI line is shared with the CC3000, so the clock speed must be less than the lesser of the two components’ maxima.

I ran into something similar while writing test routines for the sflash. If you used too much RAM - and 4K was enough - the heap allocations in the crypto library would fail, causing an inability to connect with the cloud.

No, I follow what you mean, but I disagree. The block model and the stream model have different advantages when working with flash memory, depending on the user's needs in their application, so I've created them as separate data models.

When the CC3000’s chip select is high, there should be no limitation on the SPI clock. It’s set to 9MHz currently and could be turned up to 18MHz (the max SPI rate, given that APB1 Pclk/2 is the smallest clock divider available). Potentially (somewhat less than) 2x faster flashing from the cloud…
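For reference, the change would be a one-field tweak in the flash SPI setup, assuming the STM32F10x Standard Peripheral Library init and that the flash shares SPI2 with the CC3000 as on the Core (a sketch, not the actual driver code):

#include "stm32f10x.h"

// APB1 runs at 36 MHz, so: prescaler 4 -> 9 MHz (current),
// prescaler 2 -> 18 MHz (proposed).
void sFLASH_SPIConfig_sketch(void)
{
    SPI_InitTypeDef SPI_InitStructure;
    SPI_StructInit(&SPI_InitStructure);       // defaults for the other fields
    SPI_InitStructure.SPI_Mode = SPI_Mode_Master;
    SPI_InitStructure.SPI_BaudRatePrescaler = SPI_BaudRatePrescaler_2;  // was _4
    SPI_Init(SPI2, &SPI_InitStructure);
}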

Also, I noticed the routines that write the OTA image and backup image to sflash, and that write the OTA image back into STM32 program flash, are not taking full advantage of the sflash auto-increment write or block reads. There is potentially a meaningful speedup there as well in the time it takes to flash an image OTA. If I get time this week I’ll test this out and send you a pull request.


Awesome @mattande, good call. Switching the SPI speed back and forth as we select the CC3000 then the flash chip and back sounds tricky to get right. Looking forward to that PR!

Even with reflashing a factory image, my core doesn’t come online. It’s not a memory issue, but I imagine some eeprom corruption. I just get a white flashing LED. I’m waiting for my JTAG shield from the Spark team to try some debugging to find the cause.

Please explain more why you disagree about streaming being a layer on top.

Regarding this library, it would be a good idea to write some unit tests. I know there are unit-test frameworks for Arduino; has anyone tried these on the Spark? I wonder how the Spark team automates testing?

I wrote my library with abstractions so that the flash is abstracted in the build. This allowed me to compile the same code to run on my desktop to test the parts that are not hardware specific (which is actually most of the library, since the hardware is already abstracted by the sFLASH function calls).
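On the desktop, the sFLASH calls can be backed by a RAM buffer that models NOR semantics, so the library logic runs in ordinary unit tests (a sketch assuming ST-style function names; the real signatures may differ):

#include <stdint.h>
#include <string.h>

// Host-side test double for the serial flash (2MB, like the 16Mbit part).
#define SFLASH_SIZE (2 * 1024 * 1024)
#define PAGE_SIZE   4096

static uint8_t fakeFlash[SFLASH_SIZE];

void sFLASH_EraseSector(uint32_t addr) {
    // Erase resets the whole page to 0xFF.
    memset(&fakeFlash[addr & ~(uint32_t)(PAGE_SIZE - 1)], 0xFF, PAGE_SIZE);
}

void sFLASH_WriteBuffer(uint8_t *buf, uint32_t addr, uint16_t len) {
    // Model NOR behavior: writes can only clear bits (1 -> 0).
    for (uint16_t i = 0; i < len; i++)
        fakeFlash[addr + i] &= buf[i];
}

void sFLASH_ReadBuffer(uint8_t *buf, uint32_t addr, uint16_t len) {
    memcpy(buf, &fakeFlash[addr], len);
}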