Reducing garbage collection time when using large data structures in a functional language
How can you shorten garbage collection time when using large data structures in a functional language?
(I'm using Racket, but the question applies to any functionally oriented language with a garbage collector.)
The basic idiom of functional programming is to write functions that work on a copy of the data and return the processed data as a result, rather than mutating data structures from afar. You don't worry about the extra copying, because the garbage collector reclaims the memory that is no longer in use.
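To make the idiom concrete, here is a minimal Racket sketch (the `account` struct and `deposit` function are invented for illustration): `struct-copy` returns a new struct that shares all unchanged fields with the original.

```racket
#lang racket

;; Functional update: return a new value instead of mutating the old one.
(struct account (owner balance) #:transparent)

(define (deposit acct amount)
  ;; struct-copy builds a fresh account, sharing the unchanged fields
  (struct-copy account acct
               [balance (+ (account-balance acct) amount)]))

(define a (account "alice" 100))
(define b (deposit a 50))  ; a is untouched; b is a new value
```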
That's great as far as it goes. But as I've started writing programs that process large data structures, I find that garbage collection takes up a large share of execution time (25-35%, as against the 10-15% I found typical with smaller structures).
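For what it's worth, Racket's `time` form prints GC time alongside CPU and real time, which is an easy way to see this share. A minimal sketch (the numbers in the final comment are purely illustrative):

```racket
#lang racket

;; Build a large list of freshly allocated pairs, then read the GC share
;; off the line that `time` prints.
(define (build-big n)
  (for/list ([i (in-range n)])
    (cons i (* i i))))

(time (void (build-big 1000000)))
;; e.g. cpu time: 340 real time: 342 gc time: 170
```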
This is not surprising, because more data is copied on each function call, and therefore there is more garbage to collect.
But it also makes the program harder to speed up, because the time the garbage collector takes is largely outside your control.
The obvious solution would be to avoid copying the data altogether. But that brings you back to mutating data from afar, which goes against the functional idiom. (Though I know this can be done to some extent in Racket, using boxed values or even parameters.)
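For reference, a minimal sketch of those two Racket escape hatches:

```racket
#lang racket

;; A box is a single mutable cell:
(define total (box 0))
(set-box! total (+ (unbox total) 10))
(unbox total)                      ; => 10

;; A parameter gives dynamically scoped rebinding without threading a
;; value through every call:
(define verbosity (make-parameter 0))
(parameterize ([verbosity 2])
  (verbosity))                     ; => 2 within this extent, 0 outside
```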
Beyond that, the options that occur to me are:
- Design your data structures to encode the information more compactly.
- Rather than passing entire data structures to functions, extract the subset of the data that needs processing (though this of course assumes the cost of splitting off and rejoining the subset saves enough garbage-collection time to be worth it).
- Where possible, consolidate multiple functions (which would each pass a large data structure from one to the next, copying it each time) into a single tail-recursive function (which won't); see the sketch after this list.
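Here is a hypothetical sketch of that third option: two chained passes over a list, each building an intermediate list that immediately becomes garbage, versus one fused tail-recursive loop. The function names are invented for the example.

```racket
#lang racket

;; Chained version: map and filter each allocate an intermediate list.
(define (sum-even-squares/chained xs)
  (apply + (map (lambda (x) (* x x))
                (filter even? xs))))

;; Fused version: one tail-recursive loop, no intermediate lists.
(define (sum-even-squares xs)
  (let loop ([xs xs] [acc 0])
    (if (null? xs)
        acc
        (let ([x (car xs)])
          (loop (cdr xs)
                (if (even? x) (+ acc (* x x)) acc))))))
```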
Are these effective approaches? Are there others I'm missing?
It's not a perfect answer, but it's more on-topic than other resources I've found. The paper Functional Data Structures for Typed Racket discusses alternative functional data structures that perform better than standard lists in certain contexts, and gives concrete timing results that include garbage-collection time (HT Greg Hendershott):
http://www.ccs.neu.edu/racket/pubs/sfp10-kth.pdf
And here is the code for the implementations:
There are functional data structures designed to reduce the cost of copying: for example, if a function "modifies" one branch of a tree, the new tree shares the nodes of the unaffected branches, and only the path to the modified branch needs to be copied.
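As a rough sketch of that technique (often called path copying), here is a hypothetical unbalanced binary search tree in Racket. Only the nodes on the path from the root to the change are copied; `eq?` on an untouched subtree of the old and new trees returns `#t`.

```racket
#lang racket

;; #f represents the empty tree.
(struct node (key val left right) #:transparent)

(define (tree-set t k v)
  (cond
    [(not t) (node k v #f #f)]
    [(< k (node-key t))
     (node (node-key t) (node-val t)
           (tree-set (node-left t) k v)   ; rebuild the left path
           (node-right t))]               ; share the right subtree
    [(> k (node-key t))
     (node (node-key t) (node-val t)
           (node-left t)                  ; share the left subtree
           (tree-set (node-right t) k v))]
    [else (node k v (node-left t) (node-right t))]))
```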
Chris Okasaki's Purely Functional Data Structures is the best work on this subject that I know of, but there is probably more recent research I'm not aware of (e.g. the ctrie, which I only know about via Wikipedia).
Creating a lot of unnecessary garbage is, of course, bad. If the profiler shows that you are allocating a lot of memory and that the GC is successfully reclaiming a lot of it, you should think about generating less garbage. This is where your (1) comes in:
Design your data structures to encode the information more compactly.
This is really important in general. If GC time is hurting your performance, try using more compact data structures. If GC time is not hurting your performance, try using more compact data structures anyway: compact data structures improve both GC performance and cache utilization. If you can fit more of your information into the processor's cache by using fewer bytes, you can often get significant speedups.
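As a hedged example of what "more compact" can mean in Racket: a list of flonums costs roughly a pair plus a heap-allocated flonum per element, while an `flvector` packs the same numbers into one flat block.

```racket
#lang racket
(require racket/flonum)

(define n 1000000)

;; ~2 heap objects per element: a pair and a boxed flonum.
(define as-list
  (for/list ([i (in-range n)]) (exact->inexact i)))

;; One flat block of unboxed flonums.
(define as-flvector
  (let ([fv (make-flvector n)])
    (for ([i (in-range n)])
      (flvector-set! fv i (exact->inexact i)))
    fv))
```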
Your (2) and (3), on the other hand, look rather suspicious, and suggest to me that you don't have a clear picture of how values are actually represented in memory. Passing a structure to a function does not copy it in typical functional-language implementations: what gets passed is a reference. That said, "pulling out a subset of the data you need to process" can still make sense, because if you don't retain references to the rest, it can become garbage sooner, which is good.
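A quick way to convince yourself of this in Racket:

```racket
#lang racket

(define v (make-vector 1000000))

;; The callee receives a reference to the very same vector, not a copy.
(define (same-object? x) (eq? x v))

(same-object? v)  ; => #t
```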
One last thing: 25-35% GC time may sound bad, but it's not as bad as it can get. The worst cases I've seen are when the "weak generational hypothesis" is violated, that is, when you allocate a lot of memory that does not become garbage quickly. That's bad because the garbage collector keeps running collections, tracing more and more live data without reclaiming much. I've personally seen GC times of around 75% in such cases, and I wouldn't be surprised if it could get worse.
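If you want to observe this yourself, here is a hypothetical sketch using Racket's `current-gc-milliseconds`, which reports the process's cumulative GC time; the second loop retains everything it allocates, so collections trace ever more live data.

```racket
#lang racket

(define (gc-time-of thunk)
  (define before (current-gc-milliseconds))
  (thunk)
  (- (current-gc-milliseconds) before))

;; Short-lived garbage: each list dies right away, so the young
;; generation reclaims it cheaply.
(gc-time-of
 (lambda ()
   (for ([i (in-range 1000)])
     (make-list 10000 i))))

;; Long-lived data: everything survives, violating the hypothesis.
(define survivors '())
(gc-time-of
 (lambda ()
   (for ([i (in-range 1000)])
     (set! survivors (cons (make-list 10000 i) survivors)))))
```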