Which one of vsize, size and rss should I use when detecting a memory leak?
Which of the three values vsize, size and rss reported by ps is suitable for quick memory leak detection? For my purposes, if a process has been running for several days and its memory keeps growing, that is a good enough indicator of a memory leak. I understand that a tool like valgrind should eventually be used, but using it is intrusive and therefore not always desirable.
To improve my understanding, I wrote a simple piece of C code that allocates 1 MiB of memory, frees it and then allocates 1 MiB again. It also sleeps for 10 seconds before each step, giving me time to look at the output of ps -p <pid> -ovsize=,size=,rss=. Here it is:
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>

#define info(...) printf(__VA_ARGS__)

/* Format a byte count into str using rough decimal units (B, KB, MB). */
char* bytes(char* str, uint32_t size, uint32_t n)
{
    char* unit = "B";
    if (n > 1000) {
        n /= 1000;
        unit = "KB";
    }
    if (n > 1000) {
        n /= 1000;
        unit = "MB";
    }
    snprintf(str, size, "%u %s", n, unit);
    return(str);
}

/* Allocate size bytes and write to every byte so the pages are really used. */
void* xmalloc(size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    void *p = NULL;
    info("Allocating %s\n", bytes(msg, max, size));
    p = malloc(size);
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    memset(p, '1', size);
    return(p);
}

/* Free p and return NULL so the caller can reset its pointer. */
void* xfree(void* p, size_t size)
{
    char msg[64];
    size_t max = sizeof(msg);
    info("Freeing %s\n", bytes(msg, max, size));
    free(p);
    return(NULL);
}

/* Pause long enough to sample the process with ps. */
void nap(void)
{
    const int dur = 10;
    info("Sleeping for %d seconds\n", dur);
    sleep(dur);
}

int main(void)
{
    int err = 0;
    size_t kb = 1024;
    size_t block = 1024 * kb;   /* 1 MiB */
    char* p = NULL;
    nap();
    p = xmalloc(block);
    nap();
    p = xfree(p, block);
    nap();
    p = xmalloc(block);
    nap();
    return(err);
}
Meanwhile, ps was run every two seconds from a shell script that also printed a timestamp for each measurement. Its output:
# time vsize size rss
1429207116 3940 188 312
1429207118 3940 188 312
1429207120 3940 188 312
1429207122 3940 188 312
1429207124 3940 188 312
1429207126 4968 1216 1364
1429207128 4968 1216 1364
1429207130 4968 1216 1364
1429207132 4968 1216 1364
1429207135 4968 1216 1364
1429207137 3940 188 488
1429207139 3940 188 488
1429207141 3940 188 488
1429207143 3940 188 488
1429207145 5096 1344 1276
1429207147 5096 1344 1276
1429207149 5096 1344 1276
1429207151 5096 1344 1276
1429207153 5096 1344 1276
From the above values, and considering the descriptions in the ps(1) man page, it seems to me that vsize is the best measure. Is this understanding correct? Note that the man page describes size as a measure of the total amount of dirty pages and rss as the number of pages in physical memory. Both can be much smaller than the total memory used by the process.
These experiments were run on Debian 7.8 with GNU/Linux 3.2.0-4-amd64.
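For reference, the two-second polling described above could also be done from C instead of a shell script around ps. The following is only a sketch under assumptions: it is Linux-specific, it reads the first two fields of /proc/<pid>/statm (total program size and resident set size, in pages), and read_mem_kib is a hypothetical helper name. The values should correspond to the vsize and rss columns; it makes no attempt to reproduce the size column.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): read the virtual size and
 * resident set size of process `pid`, in KiB, from /proc/<pid>/statm.
 * Linux only. Returns 0 on success, -1 if the file cannot be read. */
int read_mem_kib(pid_t pid, long *vsize_kib, long *rss_kib)
{
    char path[64];
    unsigned long size_pages, resident_pages;
    long page_kib = sysconf(_SC_PAGESIZE) / 1024;
    FILE *f;
    int fields;

    assert(vsize_kib != NULL && rss_kib != NULL);

    snprintf(path, sizeof path, "/proc/%ld/statm", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* The first two fields are counted in pages, not bytes. */
    fields = fscanf(f, "%lu %lu", &size_pages, &resident_pages);
    fclose(f);
    if (fields != 2)
        return -1;

    *vsize_kib = (long)size_pages * page_kib;
    *rss_kib = (long)resident_pages * page_kib;
    return 0;
}
```

Calling read_mem_kib() in a loop with sleep(2) between calls, and printing time(NULL) next to the two values, would give a table similar to the one above.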
In general, the total virtual size (vsize) of your process is the main metric for the size of the process. rss is just the part that currently occupies real memory. size is a measure of how many pages have actually been modified.
An increasing vsize, with relatively stable or cyclic values of size and rss, may indicate heap fragmentation or a poor heap allocation algorithm.
A constantly increasing vsize and size, with a relatively stable rss, may suggest a memory leak, heap fragmentation, or a poor heap allocation algorithm.
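The two rules of thumb above can be sketched as a small check over a series of ps samples. This is an illustration only: the function and enum names are hypothetical, and the test for "increasing" (every sample in the later half is larger than every sample in the earlier half) is my own simplification, not a standard definition. A verdict from it is a hint, not proof.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts that mirror the two rules of thumb. */
enum verdict {
    LOOKS_STABLE,           /* vsize is not steadily increasing */
    SUSPECT_FRAGMENTATION,  /* vsize up; size and rss stable or cyclic */
    SUSPECT_LEAK,           /* vsize and size up; rss relatively stable */
    UNCLASSIFIED            /* a combination the rules do not cover */
};

/* Assumed definition of "increasing": the smallest sample in the later
 * half is larger than the largest sample in the earlier half. */
static int keeps_growing(const long *kib, size_t n)
{
    size_t half = n / 2, i;
    long early_max, late_min;

    if (n < 2)
        return 0;
    early_max = kib[0];
    late_min = kib[n - 1];
    for (i = 0; i < half; i++) {
        if (kib[i] > early_max)
            early_max = kib[i];
        if (kib[n - 1 - i] < late_min)
            late_min = kib[n - 1 - i];
    }
    return late_min > early_max;
}

/* Classify n samples of vsize, size and rss (all in KiB). */
enum verdict classify(const long *vsize, const long *size,
                      const long *rss, size_t n)
{
    int size_up, rss_up;

    assert(vsize != NULL && size != NULL && rss != NULL);

    if (!keeps_growing(vsize, n))
        return LOOKS_STABLE;
    size_up = keeps_growing(size, n);
    rss_up = keeps_growing(rss, n);
    if (size_up && !rss_up)
        return SUSPECT_LEAK;
    if (!size_up && !rss_up)
        return SUSPECT_FRAGMENTATION;
    return UNCLASSIFIED;
}
```

Note that SUSPECT_LEAK still covers fragmentation or a poor allocator as possible causes. With one vsize sample per phase of the test program in the question (3940, 4968, 3940, 5096), keeps_growing() is false, so that run is classified as LOOKS_STABLE.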
To judge whether a program is leaking memory from these external measurements alone, you need to understand how that program uses memory.
Part of that requires some knowledge of how the heap is managed by the C library routines malloc() and free(), including the additional memory they may need internally to manage the list of active allocations, how they deal with heap fragmentation, and how they may release unused portions of the heap back to the operating system.
For example, your test shows that both the total virtual size of the process and the number of dirty pages it requires increased slightly the second time the program allocated the same amount of memory. This probably shows some of the overhead of malloc(), that is, the amount of memory its internal data structures needed up to that point. It would be interesting to see what happens if the program did another free() and sleep() before exiting. It could also be instructive to modify your code to call sleep() between the calls to malloc() and memset(), and then observe the results from ps.
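A minimal sketch of the two modifications suggested above might look like the following. The function name run_experiment and its pause_seconds parameter are hypothetical; the 1 MiB block size is taken from the question. Each pause is there so that ps can be sampled from another terminal.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical variant of the test program: pause between malloc() and
 * memset(), and free and pause once more before returning.
 * Returns 0 on success, -1 if an allocation fails. */
int run_experiment(unsigned pause_seconds)
{
    const size_t block = 1024 * 1024;   /* 1 MiB, as in the question */
    char *p;
    int round;

    for (round = 1; round <= 2; round++) {
        sleep(pause_seconds);

        p = malloc(block);
        if (p == NULL)
            return -1;
        /* Printing the address shows where the block came from and makes
         * it harder for an optimizing compiler to drop the allocation. */
        printf("round %d: allocated at %p, not yet written\n",
               round, (void *)p);
        fflush(stdout);
        sleep(pause_seconds);

        memset(p, '1', block);
        printf("round %d: written\n", round);
        fflush(stdout);
        sleep(pause_seconds);

        free(p);
        printf("round %d: freed\n", round);
        fflush(stdout);
    }

    /* Final pause so the state after the last free() can be observed. */
    sleep(pause_seconds);
    return 0;
}
```

Calling run_experiment(10) from main() keeps the 10-second pauses of the original program. Comparing the ps samples taken between "not yet written" and "written" should show which of the three values react to the allocation itself and which only react once the pages are touched.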
So, a simple program that only requires a fixed amount of memory, or that allocates memory to do a specific unit of work and then frees all of it once that unit of work is finished, should show a relatively stable vsize, assuming it never processes more than one unit of work at a time and does not have a "bad" allocation pattern that leads to heap fragmentation.
As you noted, a tool like valgrind, along with intimate knowledge of the program's internal implementation, is needed to show actual memory leaks and to prove that the program itself is solely responsible for them.
(By the way, you could simplify your code a little. In particular, avoid unnecessary macros like info(). For this type of example, printing values in larger units and using additional variables for size calculations obscures more than it helps. Too many printfs also obscure the code; only use the ones you need to see where the program is and to show values that are not known at compile time.)