Which one of vsize, size and rss should I use when detecting a memory leak?

Which of the three values reported by ps (vsize, size and rss) is suitable for quick memory leak detection? For my purposes, if a process has been running for several days and its memory keeps growing, that is a good enough indication of a memory leak. I understand that a tool like valgrind should eventually be used, but it is intrusive and therefore not always desirable.

To improve my understanding, I wrote a simple piece of C code that allocates 1 MiB of memory, frees it, and then allocates 1 MiB again. It also sleeps for 10 seconds before each step, giving me time to see the output of ps -p <pid> -ovsize=,size=,rss=. Here it is:

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>

#define info(...) printf(__VA_ARGS__)

char* bytes(char* str, uint32_t size, uint32_t n)
{
    char* unit = "B";

    if (n > 1000) {
        n /= 1000;
        unit = "KB";
    }
    if (n > 1000) {
        n /= 1000;
        unit = "MB";
    }

    snprintf(str, size, "%u %s", n, unit);
    return(str);
}

void* xmalloc(size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    void *p = NULL;

    info("Allocating %s\n", bytes(msg, max, size));
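    p = malloc(size);
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    memset(p, '1', size);  /* touch every page so it becomes resident */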
    return(p);
}


void* xfree(void* p, size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    info("Freeing %s\n", bytes(msg, max, size));
    free(p);
    return(NULL);
}

void nap(void)
{
    const int dur = 10;
    info("Sleeping for %d seconds\n", dur);
    sleep(dur);
}

int main(void)
{
    int err = 0;
    size_t kb = 1024;
    size_t block = 1024 * kb;
    char* p = NULL;

    nap();
    p = xmalloc(block);
    nap();
    p = xfree(p, block);
    nap();
    p = xmalloc(block);
    nap();

    return(err);
}

      

I then ran ps every two seconds from a shell script that also printed a timestamp for each measurement. Its output:

# time vsize size rss
1429207116   3940   188   312
1429207118   3940   188   312
1429207120   3940   188   312
1429207122   3940   188   312
1429207124   3940   188   312
1429207126   4968  1216  1364
1429207128   4968  1216  1364
1429207130   4968  1216  1364
1429207132   4968  1216  1364
1429207135   4968  1216  1364
1429207137   3940   188   488
1429207139   3940   188   488
1429207141   3940   188   488
1429207143   3940   188   488
1429207145   5096  1344  1276
1429207147   5096  1344  1276
1429207149   5096  1344  1276
1429207151   5096  1344  1276
1429207153   5096  1344  1276

      

From the above values, and considering the descriptions in the ps(1) man page, it seems to me that vsize is the best measure. Is this understanding correct? Note that the man page states that size is a measure of total dirty pages and rss is the number of pages in physical memory. Both can be much smaller than the total memory used by the process.

These experiments were run on Debian 7.8 with GNU/Linux 3.2.0-4-amd64.


1 answer


In general, the total virtual size (vsize) of your process is the main metric for the size of the process. rss is just the part that currently occupies real memory. size is a measure of how many pages have actually been modified.

An increasing vsize, with relatively stable or cyclic values of size and rss, may indicate heap fragmentation or a poor heap allocation algorithm.

A constantly increasing vsize and size, with a relatively stable rss, may suggest a memory leak, heap fragmentation, or a poor heap allocation algorithm.

You will need to understand how a given program uses memory resources in order to use only these external measures of process resource utilization to assess whether it is leaking memory or not.

Some of this requires a little knowledge of how the heap is managed by the C library routines malloc() and free(), including the additional memory they may need internally to manage the list of active allocations, how they deal with heap fragmentation, and how they can release unused portions of the heap back to the operating system.



For example, your test shows that both the total virtual size of the process and the number of dirty pages it requires increased slightly the second time the program allocated the same amount of memory. This probably shows some of the overhead of malloc(), that is, the amount of memory its internal data structures have needed up to that point. It would be interesting to see what happens if the program did another free() and sleep() before exiting. It can also be instructive to modify your code to call sleep() between the calls to malloc() and memset(), and then observe the results from ps.

So, a simple program that should only require a fixed amount of memory, or that allocates memory to do a specific unit of work and then frees all of that memory once the unit of work is finished, should show a relatively stable vsize, assuming it never processes more than one unit of work at a time and does not have a "bad" allocation pattern that would lead to heap fragmentation.

As you noted, a tool like valgrind, along with intimate knowledge of the program's internal implementation, is needed to show actual memory leaks and to prove that the program alone is responsible for them.

(By the way, you can simplify your code a little. In particular, don't use unnecessary macros like info(). For this type of example, trying to print values in larger units, using extra variables for size calculations, and so on is more obfuscation than help. Too many printfs also obscure the code: only use the ones you need to see where the program is and to see values that are not known at compile time.)
