Strings And Program Space

rickpbush · May 13, 2014, 10:09pm

Hi, I have a(nother) question. I have an SD card I want to read some strings from. I have 8 strings, delimited by line feeds, something like this:

String One\n
Another String\n
Something Else\n
etc, etc…8 times.

I want to read these strings out and store them in a structure that is a property of a settings class. I initially decided to make the strings fixed length, 10 chars each and so created an array of string objects to achieve this like this

String camOnePresets[8];

I’m thinking though, this structure must require a very large amount of memory to be reserved, by my reckoning, 1 byte per char so (8 * 10)*8=640bytes.

So, I’ve decided I don’t require the strings to be fixed length. I read up on vectors but I’m not sure how they compare memory-size-wise to strings. I can’t use a 2d array of chars as they are fixed length. I’m using an SD card library which reads chars one at a time; I’m buffering them, then putting them into this structure. Does anyone have any suggestions or advice? I’m pretty new to C++, and I’m using the Spark Core as a learning tool. Thanks, Rick.

ScruffR · May 14, 2014, 8:05am

@rickpbush, there are some misconceptions - some of which are not uncommon - in your post I’d like to address.

First, String is a C++ class that does actually support dynamic length strings, but since it is a class it does require more RAM than a char* of the same length (for its internal fields). With the current implementation, the minimum RAM required for one instance of String is 32 bytes (16 for the fields and 16 for heap ‘granularity’). And if your Strings are growing often you might cause heap fragmentation, and when they shrink you have to be aware that the allocated memory on the heap stays as large as the largest string was at any time since the String instance was created.
Maybe there will be some garbage collection to free up unused heap space in the Core firmware in future, but I’m not aware of anything like it yet.

On the other hand, if you declare one string as char x[10]; it uses ten bytes, and eight of these use 80 bytes (not 640).

But when you use char* sPtr[8]; and allocate eight strings with ten chars each, you will use at least the 80 bytes for the strings plus 8*4 = 32 bytes for the pointers (80+32 = 112 bytes) - and it might be more if heap granularity is assumed to be 16 bytes (8*16 + 8*4 = 160 bytes).
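The arithmetic above can be checked with sizeof. This is only a sketch: the array names are made up for illustration, and the pointer size depends on the target (4 bytes on a 32-bit MCU like the Core, typically 8 on a desktop).

```cpp
#include <cstddef>

// Eight fixed-length slots of ten chars each, stored contiguously.
char fixedPresets[8][10];

// Eight pointers; the strings themselves would live elsewhere (e.g. on the heap).
char* pointerPresets[8];

// Bytes used by the 2D array: always 8 * 10 = 80.
constexpr std::size_t fixedBytes = sizeof(fixedPresets);

// Bytes used by the pointer table alone: 8 * sizeof(char*),
// i.e. 32 on a 32-bit target, 64 on a 64-bit desktop.
constexpr std::size_t pointerTableBytes = sizeof(pointerPresets);
```

Note that sizeof cannot see the heap blocks the pointers would refer to, so any allocator granularity comes on top of pointerTableBytes.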

The next thing would be your delimiter. If you only store strings - and want to use them with str functions - you’ll need a \0 delimiter which would require one extra byte, unless you drop the \n and put the \0 in its place.
If you need the \n you could always add it on the fly.

rickpbush · May 14, 2014, 10:17am

Hi ScruffR, firstly, thanks for the reply. I have read through it a few times. I don’t suppose I am trying to do anything new here, loading up a list from a file that is, except I am working with limited resources in terms of memory.

So, that said, if I were coding for a PC, I would not hesitate to use a fixed length array of strings, as the strings would dynamically resize to fit their contents. However, from your explanation, I assume that an array of 8 Strings would allocate 32x8=256 bytes and that’s before I’ve put anything in them, am I correct?

So, I think that I would be better off using a 2d array of chars with fixed length and width, ie char xxx[8][20] as this would use 20x8=160, assuming I can live with each ‘string’ being 20 chars long. It does however mean I have to mess with pointers if I want to return the array, as in the case of a getter method etc.
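For the getter case, one possible shape is to keep the 2d array private and hand out a read-only pointer to a single row, so the array itself never has to be returned. The class and member names below (PresetStore, set, get) are hypothetical, not from any Spark library.

```cpp
#include <cstring>

// Hypothetical settings class; all names are illustrative only.
class PresetStore {
public:
    static const int kCount = 8;
    static const int kMaxLen = 20;   // includes the terminating '\0'

    // Copy a label into slot i, truncating if needed and always terminating.
    void set(int i, const char* text) {
        if (i < 0 || i >= kCount) return;
        std::strncpy(labels_[i], text, kMaxLen - 1);
        labels_[i][kMaxLen - 1] = '\0';
    }

    // Return a read-only pointer to slot i (nullptr if out of range).
    const char* get(int i) const {
        if (i < 0 || i >= kCount) return nullptr;
        return labels_[i];
    }

private:
    char labels_[kCount][kMaxLen] = {};  // 8 * 20 = 160 bytes, zero-initialised
};
```

The caller just writes something like const char* label = store.get(0); and prints it. No copy is made, so the pointer is only valid while the store object exists.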

bko · May 14, 2014, 11:32am

Hi @rickpbush

I use a 16 character by 2 line LCD display but I want to display multiple “pages” of data on it, so I use an array of char arrays to hold that. The nice thing is that you can refer to each char string with just the first index and the compiler knows you mean the entire second index part. Here is a small sample of how I use it:

#define DISPLAYLINES 10
#define DISPLAYCHARS 16

char UIStr[DISPLAYLINES][(DISPLAYCHARS+1)]; 
...
strcpy(UIStr[writeUILine], p);
//or
strcat(UIStr[writeUILine], " ");
strcat(UIStr[writeUILine], p);
strcat(UIStr[writeUILine], "/");

So you don’t have to deal with pointers if you don’t want to!

ScruffR · May 14, 2014, 12:27pm

@rickpbush, to my mind, working with pointers isn't nearly as difficult as it is often made out to be.

If you could explain a bit more what your expected troubles with returning the array - when dealing with pointers - would be, I'd be happy to assist.
Maybe you also could provide some code snippet, to illustrate your use case.

BTW, a lot of people who find pointers hard to work with but have no trouble with arrays seem to forget that they are using pointers all along, anyway (e.g. with char x[10]; the name x decays to a pointer to the first element in most expressions, and x[3] is shorthand for *(x+3)).
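A tiny sketch of that equivalence, with a made-up function name. It also shows the one place where the array and the pointer differ, which is sizeof.

```cpp
#include <cstddef>

// Returns true if indexing and pointer arithmetic reach the same element.
bool indexingMatchesPointerMath() {
    char x[10] = "abcdefghi";   // 9 letters plus the terminating '\0'
    char* p = x;                // the array name converts to a pointer to x[0]
    return x[3] == *(x + 3)     // both read 'd'
        && &x[3] == p + 3       // and both refer to the same address
        && p[3] == 'd';
}

// The array type still carries its size; a pointer to its first element does not.
constexpr std::size_t arrayBytes = sizeof(char[10]);   // 10
constexpr std::size_t pointerBytes = sizeof(char*);    // 4 on a 32-bit MCU, 8 on most desktops
```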

On the other hand, depending on your use case, you could even consider all your messages as one long String or a single char[]. This way you'd only introduce the String class 'overhead' once, having all the separate messages back to back and not wasting space on fixed length strings. Then you could break it up into the substrings on the fly or once on initialization - keeping hold of the delimiter offsets in a uint8_t[] array.
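A rough sketch of the single-buffer idea, assuming the whole text fits in 64 bytes so that a uint8_t offset is wide enough. The buffer size and all names here (packed, offsets, indexMessages, message) are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const int kMaxMessages = 8;

char packed[64];                       // all messages stored back to back
std::uint8_t offsets[kMaxMessages];    // start index of each message in packed
int messageCount = 0;

// Copy raw text such as "a\nb\n" into packed, turn every '\n' into '\0',
// and remember where each message starts.
void indexMessages(const char* raw) {
    std::strncpy(packed, raw, sizeof(packed) - 1);
    packed[sizeof(packed) - 1] = '\0';
    const std::size_t len = std::strlen(packed);

    messageCount = 0;
    bool atLineStart = true;
    for (std::size_t i = 0; i < len; ++i) {
        if (atLineStart && messageCount < kMaxMessages) {
            offsets[messageCount++] = static_cast<std::uint8_t>(i);
        }
        atLineStart = (packed[i] == '\n');
        if (atLineStart) packed[i] = '\0';   // delimiter becomes terminator
    }
}

// Pointer to message number index, or nullptr if there is no such message.
const char* message(int index) {
    return (index >= 0 && index < messageCount) ? packed + offsets[index] : nullptr;
}
```

Each message costs its own length plus one terminator plus one offset byte, so short labels no longer pay for the longest one.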

There are always sooooo many ways to do things ...

rickpbush · May 15, 2014, 11:52am

Hi, ok, I’ll give you a little history to this. I’m using the Spark as a sort of platform to learn C++, to have a real reason to do so while having a little fun along the way. I have been coding for years in C#. I did a degree in computer science in 2000, part of which was a C module which did include pointers, which I understand. Having finished my course, I never used C again and used C# instead, as it was much easier to achieve very quick results with and most of the lower-level stuff I didn’t have to deal with.

My mecca though has always been to learn C++ as I view this as the daddy of languages, able to get down low and dirty when needed and also able to operate without frameworks, multi-platform and, depending on level of experience, very efficient code is just keystrokes away …

So, I also have an interest in the real world application of micro controllers. Back along I bought some PICs and tried to code onto them using MPLAB, way beyond me, machine code wo wo wo, um, no. So when Arduino came along I was in heaven, that led me to Spark and here we are.

I am now happily learning a great deal about electronics and C++. The only issue is that coming from a C# background, things in C++ are, shall we say, more problematic.

The specific problem I have, which relates to why I asked this question, is this. I have a class I wrote whose job it is to load a list from a file. Initially I thought of doing this in a 2d char array. I don’t like using Strings because in my Arduino days they were frowned upon as being inefficient and the library which just about supported them was buggy, so Strings were a bit of a no-no to me. So, the more complicated char array it is then. This brings me to functions. In C#, I could just return any old object from a function, not so it seems in C++. Well, I found out that’s not entirely true. In C# you can pass by ref like this

void x(out int y)

or by value

int x(int y)

This is how my brain works and so C++ is bending it with new concepts. In order to work with my desired char arrays, I have to be able to return a char array, I now realise I can do this as follows

void x(char y[])

this being the same as passing a pointer to the array, which means the function will operate on the contents of the memory pointed to by said pointer, thus changing the original array. No return value, ok.

The problem comes when we start throwing the dereferencing * and address-of & operators around, I get really confused. Questions like, in a param list, does it tag on the end of the type like this char* y or does it go on the name like this
char *x,
do we even need it, is
char *x or char x the same as char x[], bloody asterisk, leave me alone!
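For what it's worth, the spellings in the question can be compared directly by the compiler. In a parameter list, char* y, char *y and char y[] all declare a pointer to char, while a plain char y is a single character passed by value. The function names below are hypothetical.

```cpp
#include <type_traits>

// Three spellings of the same parameter type (hypothetical function names).
void takesStarOnType(char* text);
void takesStarOnName(char *text);
void takesBrackets(char text[]);

// The compiler treats all three as "pointer to char".
static_assert(std::is_same<decltype(takesStarOnType), decltype(takesStarOnName)>::value,
              "spacing around * does not matter");
static_assert(std::is_same<decltype(takesStarOnType), decltype(takesBrackets)>::value,
              "char text[] in a parameter list means char* text");

// Because only the address is passed, the function edits the caller's array.
void capitaliseFirst(char text[]) {
    if (text != nullptr && text[0] >= 'a' && text[0] <= 'z') {
        text[0] = static_cast<char>(text[0] - 'a' + 'A');
    }
}
```

So a call such as capitaliseFirst(label) changes label itself, which is the "no return value" behaviour described above.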

Anyway, this is a long post and I haven’t even got to my specific problem yet. I have now encountered null terminated strings. Oh my goodness, another fly in the ointment. So we have std strings, Strings, character arrays, null terminated character arrays, C strings and goodness only knows what other types of strings. Why on earth did someone go to all this effort to create multiple types whose only collective job in life is to hold an array of characters!

So, to my problem. Some code.

This method reads a file using the SD library. The data in said file is \n delimited, in rows of 8 strings per camera (the preset names). So passing the camera number, ie 1, 2, 3 etc, is supposed to read the first 8 lines for cam 1, lines 9-16 for cam 2 etc, which it does.

In the data header I declare an array of Strings thus

String camPresets[8];

    int DataManager::load(int cam){
        int index = 0;
        currentCam = cam;
        myFile = SD.open("test.txt");
        if(!myFile)
            return 2;
        
        int line = 0;
        int row = 0;
        char buff[10];    
        std::fill_n(buff, 10, ' ');
        while (myFile.available()) 
        {
            char c = myFile.read();
            if(c == '\n')
            {
                if(line >= ((currentCam - 1) * 8) && line <= ((currentCam * 8) - 1))
                {
                    camPresets[index] = buff;
                    index++;
                }
                
                if(line > ((currentCam * 8) - 1))
                    break;
                
                line++;
                row = 0;
            }
            else
            {
                buff[row] = c;
                row++;
            }
        }
        myFile.close();
        return 0;
        fail:   // note: nothing jumps to this label, so this line is never reached
          return 3;
    }

The following method then prints the returned string to the screen. Except there’s a problem. The screen is in fact displaying an editable char array which the user can traverse using a rotary encoder. So the display object has a global char array which it needs to display. Because of my lack of ability to pass char arrays around, I opted to use String whenever I needed to return a, well, string from a function. So, the data gets read into a char array, converted to a String, then passed to the display, then converted back into a char array and output to the screen and copied to the global char array using a for loop. This is great except not only do I see the string on the screen, I also see a bunch of garbage on the end of the string.

String x = data.getPresetLabel(0);  // gets a string from datas array camPresets[]
PrintTextEntryScreen(x); //call the display method.
void Display::PrintTextEntryScreen(String label)
{
  digole.setPrintPos(0, 2, _TEXT_);
  for(i=0; i < label.length(); i++)
  {
    userInput[i] = label.charAt(i);
    digole.print(label.charAt(i));
  }
}

For input string “StringOne” I get “StringOne p” on the screen.

Sorry this is such a long post and if you haven’t given up and gone home by this point then I applaud you.

I’m guessing there are two things going on here. 1. Using my understanding of being able to pass by ref, I should do away with the Strings and work only with char arrays, and 2. the garbage on the end of my output, why is C++ doing this to me?

rickpbush · May 15, 2014, 9:29pm

Ok, sorry for the war and peace above :).

I have figured it out, I answered my own question really. I have rid my code of outside library support and re-written all my code to use 2d char arrays. I hit a bit of a stumbling block trying to assign one 2d array to another, so instead of this

char xx[10][20];
char yy[10][20];
yy = xx;

I used a nested for loop. I’m on a learning curve here, getting frustrated with the idiosyncrasies of C++, but I’ll get there with a little help from my friends. Thanks for being there, guys and girls.

bko · May 15, 2014, 9:56pm

Instead of a nested loop, I would have used memcpy to make yy a copy of xx, since they are both statically allocated.

If I only wanted to do one C string from xx into yy, I would use something like strcpy(yy[2],xx[7]);
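Spelled out, the two copies could look roughly like this. The helper names copyAll and copyRow are made up for the sketch; xx and yy are the arrays from the post above.

```cpp
#include <cstring>

char xx[10][20];
char yy[10][20];

// Copy the whole table in one call; both arrays have identical type and size.
void copyAll() {
    static_assert(sizeof(yy) == sizeof(xx), "arrays must be the same size");
    std::memcpy(yy, xx, sizeof(yy));   // 10 * 20 = 200 bytes
}

// Copy a single row as a C string; the source row must be '\0'-terminated
// and, being 20 chars at most, always fits the destination row.
void copyRow(int dst, int src) {
    std::strcpy(yy[dst], xx[src]);
}
```

Using sizeof(yy) rather than a hand-written 200 keeps the call correct if the dimensions change later.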

ScruffR · May 16, 2014, 12:18am

@rickpbush, I’m not aware of the CS syllabi in 2000, as I got my CS education back in the late 80s (last century), but I quite liked ASM, and one good thing about it was that you really had to think the way the machine does.
This seems to get lost a bit with high level languages like C# and JAVA which hide a lot of that from you.

As for your quarrel with null-terminated-strings (sz or C string), there is a good reason for this - and this will also answer your question about the string garbage at the end of your string.
Actually the machine does not care what a string is. It only cares about bytes and maybe ‘words’ (meaning 16/32/64 bits depending on the CPU). If you don’t want to tell the machine how long this obscure human thing called a string is each time you work with one (either by passing the length as a separate value or prepending it to the string as in a PASCAL string), you just tell the machine to keep on going till it finds something the humans don’t want in their strings - and what would fit this description better than the informationless \0?
So if you forget to end your string with a \0, the machine will keep on going and going, doing what you told it to do with each individual byte until it finds a \0 - spitting out garbage.
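Applied to a line reader, that means writing the \0 yourself and leaving room for it in the buffer. The sketch below reads from a plain char pointer as a stand-in for the SD file, so readLine and its parameters are assumptions and not part of the SD library.

```cpp
#include <cstddef>

// Reads characters from src up to '\n' or the end of the text, stores at
// most size-1 of them in buff, and always terminates buff with '\0'.
// Returns a pointer to the start of the next line.
const char* readLine(const char* src, char* buff, std::size_t size) {
    std::size_t row = 0;
    while (*src != '\0' && *src != '\n') {
        if (row + 1 < size) {
            buff[row++] = *src;     // keep one slot free for the '\0'
        }
        ++src;                      // characters that do not fit are skipped
    }
    if (size > 0) {
        buff[row] = '\0';           // without this, printing runs on into garbage
    }
    if (*src == '\n') {
        ++src;                      // step over the delimiter
    }
    return src;
}
```

With a 10 byte buffer, "StringOne" fits exactly (9 characters plus the terminator), and a longer line is cut to 9 characters instead of overrunning the buffer.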

As far as I’m aware of, even C# and JAVA do work with C strings, since it is a handy way of doing the job without actually limiting the max length of a string (unlike a PASCAL string that is/was restricted by the prepended len field - typically an uint16) but still only sacrificing one byte (not two like PASCAL).
And even if you have a String class that wraps up some fancy features to manipulate this thing, you will find a C-string somewhere inside it, since this is the actual essence of the thing.

And for the comparison of a char array vs. a string, one could say what’s the difference between a bunch of letters and a word (as in a book, not a collection of bits)?
While a char array can store a string, a char array as such is focused on each individual letter, while a string only has its worth/meaning as an entirety - just like ‘b’, ‘o’, ‘o’, ‘k’ compared to “book”.
Sure, the intellectual capability of a human reader clearly recognizes that both can be read and understood as this old fashioned thing with lots of individual paper sheets and some more or less random characters scattered all over them; but a machine only sees one byte after the other.

And for C (and the likes) the \0 turns a bunch of chars into a (maybe) meaningful string.

ScruffR · May 16, 2014, 12:42am

After all this long-winded philosophizing about strings, I'd just like to throw in one heretical question:
Does the display object actually, really have to have a global char array?

The picture I got from your description above is that you have a bunch of strings for a bunch of cams which you pull in from your SD and store in RAM, then display and possibly edit this or the other of them, and once edited presumably at some point write back to the SD.
If so, why not just pass around the pointer to the one and only instance of the respective string, manipulate it in place and write it back from that one place onto SD (providing you have no dynamic length)?
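As a sketch of what passing the pointer around might look like, assuming fixed-length labels of 10 visible characters. labelFor and editChar are hypothetical helpers standing in for the data and display classes.

```cpp
#include <cstring>

const int kPresets = 8;
const int kLabelLen = 11;              // 10 visible chars + '\0'
char camPresets[kPresets][kLabelLen];  // the one and only copy of each label

// Hand out a pointer to the stored label; no second buffer is created.
char* labelFor(int index) {
    return (index >= 0 && index < kPresets) ? camPresets[index] : nullptr;
}

// Hypothetical editor step: overwrite the character under the cursor.
// The length is left alone, so the terminating '\0' stays where it is.
void editChar(char* label, int pos, char c) {
    if (label != nullptr && pos >= 0 && pos < static_cast<int>(std::strlen(label))) {
        label[pos] = c;
    }
}
```

The display would print labelFor(n) directly and the encoder handler would call editChar on the same pointer, so whatever is written back to the SD is already the edited text.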

This way you could spare yourself the hassle of repeatedly copying to and fro.

rickpbush · May 16, 2014, 12:42am

Hi bko, yes, I toyed with the idea of using memcpy, as well as

std::copy(xx[0], xx[0] + rows*columns, yy[0]);

but in the end, being as I don’t know C++ well enough and I don’t know if bringing std into my code will make it larger, I opted for a good ol’ fashioned nested for loop as I can see straight away what it is supposed to be doing. As I get more involved with the language I will no doubt get familiar with memcpy and the like, but at the stage I’m at now it’s just too dangerous and difficult for me to debug if I get the syntax wrong.

rickpbush · May 16, 2014, 1:03am

ScruffR, your suggestion is excellent. At some point I probably would have come to the conclusion that using a pointer is the way forward too, but I’m just too green at the mo. I’m trying to design an object model and missing the intricacy I guess, my thinking is still too high-level and C#-like.

I just want to say though, for me, the discussion is really good. I am learning a lot, pulling things apart, and hearing what people have to say and the different viewpoints is great. Since leaving uni, my coding tends to be a solitary affair. Being able to discuss with people who have possibly been in the business of coding for decades really helps direct my research and learning and keeps me from going off on a crazy tangent, so I thank you most sincerely for your time.

rickpbush · May 23, 2014, 11:07am

Hi, quick question, if I #include array, or std, does that add to the code space I’m using?

On that note, can I include any C++ libraries and use them on the Core or indeed the Arduino?
