Hi All,

I have a problem with std::string memory usage.

I would like to create a dynamic string matrix:

C++
//For example:
int X_range=100000; // hundreds of thousands of rows
int Y_range=10; // ten columns
string **matrix;

matrix = new *string[X_range];
for (int x=0;x<X_range;x++)
 {
   matrix = new string[Y_range];
   matrix[x][y]="ABCD"
 }


This code works fine. I can access the strings in the matrix and use them.
My problem is that this matrix uses ~60MB of memory.
I think it should use only 100000*10*4 bytes = ~4MB of memory.
I've checked every string's capacity() and size(),
and both functions returned 4.
So I don't understand why ~60MB of memory is allocated for ~4MB of text.

Do you have any idea? :)
Thanks in advance!

First: how do you measure these 60MB?
Second: your size calculation is wrong.
What you have to calculate:
1) you allocate an array of 100000 pointers
2) for each X, you allocate an array of 10 strings, i.e. 10 * sizeof(string) bytes
3) each string is initialized from a character array of 5 characters (four letters plus the terminating zero character).

The implementation of string does not guarantee any particular size for its bookkeeping (pointer to dynamic content, size, ...). So a string is at least a pointer and a length, probably of type size_t. In addition, the string implementation may have chosen to allocate in chunks rather than exactly as much as needed.

A pointer and size_t on a 64 bit system are - 64 bits in size ;-)

So, calculating the pure skeleton for strings with no spare capacity would result in
1) 100000 * sizeof(pointer) = 800000
2) one array of 10 strings is at least 10 * sizeof(string) = at least 10 * (8 + 8) = 160 (or more)
3) the content of each string is at least 5 bytes (probably more due to memory management considerations) = 5 (or more)

Summing up: the minimum expected memory size = 100000 * (8 + (10 * ((8 + 8) + (5 * 1)))) = 21800000 bytes ~ 21MB
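The lower bound above can be written as a small function, which makes it easy to plug in other sizes. This is only a sketch of the arithmetic: the name skeleton_bytes and its parameters are made up for illustration, and the byte counts are the assumptions from the list above, not measured values.

```cpp
#include <cstddef>

// Lower-bound estimate: one table of row pointers, plus the string objects
// of every row, plus the characters stored for every string.
// All byte counts are assumptions supplied by the caller.
constexpr std::size_t skeleton_bytes(std::size_t rows,
                                     std::size_t cols,
                                     std::size_t pointer_bytes,
                                     std::size_t string_object_bytes,
                                     std::size_t content_bytes)
{
    return rows * (pointer_bytes + cols * (string_object_bytes + content_bytes));
}
```

With the figures assumed above, skeleton_bytes(100000, 10, 8, 8 + 8, 5) gives 21800000 bytes. Passing sizeof(void*) and sizeof(std::string) instead of 8 and 16 shows what a particular compiler would need at minimum.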

Now, some implementations (the compiler's runtime and the heap allocator) may allocate a few more bytes for each array, usually before the returned address. This is mainly to allow delete[] to get the size of the data to delete.

Each dynamic array will add, say, 8 bytes for this memory management bookkeeping:
- 1 * X-array
- 100000 * Y array
- 1000000 * dynamic string content
= 1100001 * 8 bytes ~ 8.8MB
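The same count, as a sketch: every new[] in the matrix code is one allocation, and so is every string's dynamic content. The function names are hypothetical, and the per-allocation cost is an assumed figure that depends on the allocator.

```cpp
#include <cstddef>

// Number of separate heap allocations made for the matrix:
// one pointer table, one string array per row, one buffer per string.
constexpr std::size_t allocation_count(std::size_t rows, std::size_t cols)
{
    return 1 + rows + rows * cols;
}

// Bookkeeping overhead, given an assumed cost per allocation.
constexpr std::size_t overhead_bytes(std::size_t rows,
                                     std::size_t cols,
                                     std::size_t bytes_per_allocation)
{
    return allocation_count(rows, cols) * bytes_per_allocation;
}
```

For 100000 rows and 10 columns this gives 1100001 allocations, and 8800008 bytes of overhead at 8 bytes each.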

My calculation comes to a memory usage of 30MB or more, depending on the capacity the string reserves.

I guess your 60MB is still too large - how do you measure that? Processes also acquire memory from the operating system in chunks, so the process memory size is not a fine-grained enough indicator of the memory usage of this matrix.

How to optimize: if memory usage is critical, access may be slow (and is seldom needed), and the data is constant:
make a class that abstracts the whole matrix. Store the content in one large character array, with each string (e.g. "ABC") terminated by a zero character (i.e. '\0'), and access the content by two indices (x, y). To find an entry, skip x * 10 + y '\0' characters from the start of the array; the searched string begins right after the last skipped '\0' and spans to the next '\0'.
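A minimal sketch of such a class follows. The name PackedStringMatrix and its members are invented for this example, and it assumes the cells are appended in row-major order and contain no embedded '\0' characters.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical class: all cells live in one character buffer,
// each cell followed by its '\0' terminator.
class PackedStringMatrix {
public:
    explicit PackedStringMatrix(std::size_t cols) : cols_(cols) {}

    // Append the next cell in row-major order, terminator included.
    void append(const std::string& cell) {
        data_.insert(data_.end(), cell.begin(), cell.end());
        data_.push_back('\0');
    }

    // Return cell (x, y) by skipping x * cols + y terminated strings.
    // The cost grows with the number of characters before the cell.
    // Precondition: the cell has already been appended.
    const char* at(std::size_t x, std::size_t y) const {
        const char* p = data_.data();
        for (std::size_t skip = x * cols_ + y; skip > 0; --skip)
            p += std::strlen(p) + 1;  // jump past one string and its '\0'
        return p;
    }

    // Bytes used for the content, terminators included.
    std::size_t bytes() const { return data_.size(); }

private:
    std::size_t cols_;
    std::vector<char> data_;
};
```

For the matrix in the question (1000000 cells of "ABCD"), bytes() would report 5000000, at the price of a linear scan on every lookup. Storing an additional offset per row would speed up access while costing 100000 extra integers; that is a variation, not part of the suggestion above.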

Cheers
Andi
 
Comments
TibiiBot 21-Jan-14 15:06pm    
Dear Andi! Thank you for your correct explanation, I understand it clearly. I've tried your suggestion to use a character array and it worked fine. Data access is a little bit slower, but not everything can be perfect :) So thanks again!
I think the additional memory usage is due to the initial capacity of the string (it is implementation dependent; see, for instance, this page: "std::string length and capacity").

Please note, posted code has mistakes, it should be
C++
int X_range=100000; // hundreds of thousands of rows
int Y_range=10; // ten columns
string **matrix;

matrix = new string * [X_range];
for (int x=0;x<X_range;x++)
{
     matrix[x]= new string[Y_range];
     for (int y=0; y<Y_range; y++)
            matrix[x][y]="ABCD";
}


How did you check matrix memory usage?
On my system (Win 8 64 bits, with VS 2012) the capacity of the strings is 15.
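To see what a given compiler does, the object size, size() and capacity() can be collected for a sample string. This is a small sketch; StringFootprint and footprint are names invented here, and the numbers it reports differ between standard library implementations.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: the three numbers that matter for the estimate.
struct StringFootprint {
    std::size_t object_bytes;  // sizeof(std::string), the fixed part
    std::size_t size;          // characters actually stored
    std::size_t capacity;      // characters the string can hold without reallocating
};

inline StringFootprint footprint(const std::string& s)
{
    return StringFootprint{sizeof(std::string), s.size(), s.capacity()};
}
```

Calling footprint(std::string("ABCD")) always yields size 4; object_bytes and capacity are whatever the library chose, which is exactly the part the 4MB estimate in the question leaves out.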
 
Comments
TibiiBot 21-Jan-14 15:20pm    
Dear CPallini! Sorry, my posted code was wrong, but in my real code it was correct: matrix[x]= new string[Y_range]. I used GetProcessMemoryInfo(): before creating the matrix my program used ~2MB of memory, and after creating it ~60MB. I checked the strings directly with the capacity() and size() functions. The capacity was the same as the size: 4. I'm using the Dev C++ IDE on Win7 64bit. I've tried Andi's solution (a character array for the strings) and it works fine, but if you have a few minutes for any other solution, I welcome it :) Thank you and have a nice evening!
CPallini 21-Jan-14 15:49pm    
It is very strange that the capacity is wrong in your test. However, with std::string there is no solution: the shrink_to_fit method, for instance, is a non-binding request (and actually has no effect on my system). Nice evening to you.
TibiiBot 22-Jan-14 16:28pm    
Yes, I also think that it's very strange. Perhaps my Dev C++ IDE is not so perfect... I don't know. shrink_to_fit() is not working for me: "std::string has no member shrink_to_fit". Another interesting thing :) So finally I could fix my problem with a char array; it works fine, just a little bit slower and more complicated than string. Thanks again for your help.
CPallini 22-Jan-14 16:34pm    
shrink_to_fit is a C++11 thing; it might be that your compiler is just a bit outdated.
With character arrays the code should indeed be more complicated, but actually faster than the one with std::strings.
You are welcome and good luck!
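For readers with a C++11 compiler, the shrink_to_fit behaviour mentioned in these comments can be observed with a sketch like the following. The function name is invented for this example; the only thing that can be relied on is that the capacity never drops below the size.

```cpp
#include <cstddef>
#include <string>

// Grow the capacity on purpose, ask the string to give the excess back,
// and report what capacity remains. Requires C++11 for shrink_to_fit.
inline std::size_t capacity_after_shrink(std::string s, std::size_t reserved)
{
    s.reserve(reserved);  // request room for at least `reserved` characters
    s.shrink_to_fit();    // non-binding request to drop unused capacity
    return s.capacity();
}
```

Comparing capacity_after_shrink("ABCD", 1000) with 4 shows whether the library honoured the request; a larger result is permitted.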

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


