Efficient way to truncate a std::vector<char> to length N to free memory

I have some large std::vector<char>s holding bytes loaded from binary files.

When my program runs out of memory, I need to clean up some of the memory used by these vectors. These vectors are pretty much all my memory, and they are just caches for local and network files, so it's safe to just grab the largest one and slice it in half or so.

The only approach I'm currently using is vector::resize followed by vector::shrink_to_fit, but that seems to require extra memory first (I assume for the reallocation to the new size), then a bunch of time (destroying the now-discarded elements, which I'd have thought would be free for char?), and then copying the rest into the new vector. Note that this was observed on Windows in a Debug build, so elements may not be destroyed one by one in a Release build or on other platforms.
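For reference, here is roughly what I'm doing now (truncate_cache is just an illustrative name):

    #include <cstddef>
    #include <vector>

    // Keep only the first n bytes, then ask the vector to return the excess.
    void truncate_cache(std::vector<char>& cache, std::size_t n) {
        if (cache.size() > n)
            cache.resize(n);     // trivial for char, but Debug builds still walk the tail
        cache.shrink_to_fit();   // non-binding; typically allocates a smaller block,
                                 // copies the first n bytes, then frees the old one
    }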

Is there something I can do to just say "C++, tell the OS that I no longer need the memory beyond position N in this vector"?

Alternatively, is there another container I would be better off using? I do need random access though, or else I'd have to put in the effort of devising a way to store the iterators pointing to the places I want to read next, which would be possible but not easy, so I'd rather not use a std::list.





3 answers


resize and shrink_to_fit are your best bets if we are talking about standard C++, but, as you noticed, they may not help at all when you are already low on memory: since the allocator interface does not provide a realloc operation, the vector is forced to allocate a new block, copy the data into it, and free the old block.

Now I see essentially four simple ways out:



  • discard entire vectors, not just parts of them, perhaps with an LRU policy or the like; for large allocations the C++ allocator usually just forwards memory-management calls to the OS, so the memory does go back to the OS (a sketch follows the list);
  • write your own container that uses malloc/realloc or OS-specific functionality (see the realloc sketch after this list);
  • use std::deque; you lose guaranteed contiguity of the data, but since deques usually allocate storage in separate fixed-size chunks, resize + shrink_to_fit should be pretty cheap: all the unused chunks at the end are simply freed, with no need for a massive reallocation (see the example after this list);
  • just leave this task to the OS. As mentioned in the comments, the OS already has a file cache, and in normal cases it can manage it better than you or me: it has a far better idea of how much physical memory is left, which files are "hot" for most applications, and so on. Also, since you are working in a virtual address space, you cannot even guarantee that your memory will actually stay in RAM; the moment the machine comes under memory pressure, pages you don't use very often get swapped out to disk, so all your performance gain is lost (and you waste paging-file space for data that is already on disk).
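For the first option, a minimal sketch (discard_largest is an illustrative name; it picks the victim by size rather than true LRU, matching what the question proposes):

    #include <algorithm>
    #include <vector>

    // Swapping with an empty temporary guarantees the old block is actually
    // deallocated; clear() alone would keep the capacity around.
    void discard_largest(std::vector<std::vector<char>>& caches) {
        auto it = std::max_element(caches.begin(), caches.end(),
            [](const std::vector<char>& a, const std::vector<char>& b) {
                return a.size() < b.size();
            });
        if (it != caches.end())
            std::vector<char>().swap(*it);  // frees the whole block immediately
    }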
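For the second option, a sketch of a byte buffer that shrinks in place via realloc (ByteBuffer and truncate are illustrative names; whether the freed tail actually goes back to the OS still depends on the allocator, though most trim large blocks):

    #include <cstddef>
    #include <cstdlib>
    #include <new>

    struct ByteBuffer {
        char*       data = nullptr;
        std::size_t size = 0;

        explicit ByteBuffer(std::size_t n)
            : data(static_cast<char*>(std::malloc(n))), size(n) {
            if (!data && n) throw std::bad_alloc();
        }
        ~ByteBuffer() { std::free(data); }
        ByteBuffer(const ByteBuffer&) = delete;
        ByteBuffer& operator=(const ByteBuffer&) = delete;

        char& operator[](std::size_t i) { return data[i]; }  // random access

        // Keep only the first n bytes; a shrinking realloc normally succeeds
        // in place, so no extra memory is needed at the worst possible moment.
        void truncate(std::size_t n) {
            if (n >= size) return;
            if (n == 0) { std::free(data); data = nullptr; size = 0; return; }
            if (char* p = static_cast<char*>(std::realloc(data, n)))
                data = p;
            size = n;
        }
    };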
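And the third option in action; shrinking a deque only has to free whole chunks past the end, with no massive reallocation:

    #include <deque>

    // Random access (operator[]) still works, but storage is split into
    // fixed-size chunks that can be freed piecemeal.
    void halve(std::deque<char>& cache) {
        cache.resize(cache.size() / 2);  // drop the tail elements
        cache.shrink_to_fit();           // still a request, but typically just
                                         // releases the now-empty chunks
    }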

An additional way would be to simply use memory-mapped files: the system still does its own caching as usual, but you avoid any syscall overhead as long as the file's pages stay in memory.
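A minimal POSIX sketch of that idea (on Windows the equivalent calls are CreateFile/CreateFileMapping/MapViewOfFile):

    #include <cstddef>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    // Map a whole file read-only; the OS pages it in on demand and may
    // evict the pages under memory pressure without any work on our side.
    const char* map_file(const char* path, std::size_t& size_out) {
        int fd = open(path, O_RDONLY);
        if (fd == -1) return nullptr;
        struct stat st;
        if (fstat(fd, &st) == -1) { close(fd); return nullptr; }
        size_out = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, size_out, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping stays valid after the descriptor is closed
        return p == MAP_FAILED ? nullptr : static_cast<const char*>(p);
    }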





std::vector::shrink_to_fit() is not guaranteed to reduce memory usage at all, and may temporarily need more memory when it does.

C++11 defines shrink_to_fit() as follows:

void shrink_to_fit();
Remarks: shrink_to_fit is a non-binding request to reduce capacity() to size(). [Note: The request is non-binding to allow latitude for implementation-specific optimizations. - end note]



As the note says, shrink_to_fit() may, but does not have to, actually free memory; the standard gives C++ implementations a free hand to reuse and optimize memory usage internally as they see fit. C++ does not make it mandatory for shrink_to_fit() and the like to actually release RAM, and in many cases the C++ runtime library may not even be able to, as I'll explain in a moment. The C++ runtime library is allowed to keep freed memory and reuse it internally, handing it out automatically for future allocation requests (explicit news or container growth).
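A quick way to see what your implementation actually does (the printed capacities are implementation-dependent):

    #include <iostream>
    #include <vector>

    int main() {
        std::vector<char> v(100 * 1024 * 1024);  // ~100 MB cache
        v.resize(v.size() / 2);
        std::cout << "capacity after resize:        " << v.capacity() << '\n';
        v.shrink_to_fit();  // a request only; the implementation may ignore it
        std::cout << "capacity after shrink_to_fit: " << v.capacity() << '\n';
    }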

Most modern operating systems are not designed to allocate and release blocks of memory of arbitrary sizes. The details differ, but typically the operating system allocates and frees memory only in whole chunks, usually 4 KiB or more, aligned at memory-page addresses. If you allocate a new object just a few hundred bytes long, the C++ library requests an entire page of memory, carves the first few hundred bytes out of it for the new object, and keeps the spare memory for future new requests.

Likewise, even if shrink_to_fit() or delete frees a few hundred bytes, they cannot be returned to the operating system immediately, but only once an entire contiguous, suitably aligned 4 KiB range (or whatever page size the operating system uses) is completely unused. Only then can the process return that page to the operating system. Until then, the library keeps track of the freed ranges and uses them for future new requests, without asking the operating system to allocate more memory pages to the process.





I think you could do this by rolling your own vector with a method that uses placement new. The bottom line: you do the usual thing, using new[] to grab memory on first assignment or when you need to grow. When you want to cut memory usage, you first call delete[] to free the memory, then placement-new a smaller array at the same location you just freed. This might be a little risky, since I'm not sure what guarantees the standard gives that the bits are not changed immediately after a delete[] of an array of primitives, but if it works it should give you very good performance.









