Ruby requests more memory when there are many free heap slots

We have a server with

  • Sidekiq 4.2.9
  • Rails 4.2.8
  • MRI 2.1.9

This server periodically imports data from external APIs, does some calculations on it, and stores the resulting values in the database.

About 3 weeks ago the server started hanging. As I can see from New Relic (and when SSH'ed into it), it consumes more and more memory over time, eventually taking up all the available RAM, and then the server freezes.

I've read several articles on how the Ruby GC works, but I still can't figure out why at ~5:30 AM the heap size jumps from ~2.3M to 3M slots when there are 1M free heap slots still available (with the default GC settings).

[graph: heap slots around 5:30 AM]

Similar behavior at 15:35: [graph]

So the questions:

  • How do I make Ruby fill the free heap slots instead of requesting new slots from the OS?
  • How do I make it release free heap slots back to the OS?


1 answer


"How do I make Ruby fill the free heap slots instead of requesting new slots from the OS?"

Your graph doesn't have "full" fidelity. It is a lot to assume that GC.stat was called by New Relic (or whatever collects it) at exactly the right moment.

It's incredibly likely that you ran out of slots between two samples, the heap grew, and since heaps don't shrink in Ruby, you're now stuck with a somewhat bloated heap.
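To see this effect outside the app, here is a minimal sketch that samples GC.stat around a burst of allocations. The helper name `slot_counts` and the figure of 500,000 objects are only illustrative, and the key names differ between versions (MRI 2.1 uses `:heap_live_slot` / `:heap_free_slot`, later versions use the plural forms), so the sketch tries both.

```ruby
# Read live/free slot counts in a way that works on MRI 2.1 and newer.
# `slot_counts` is a hypothetical helper, not part of Ruby.
def slot_counts
  stat = GC.stat
  live = stat[:heap_live_slots] || stat[:heap_live_slot]
  free = stat[:heap_free_slots] || stat[:heap_free_slot]
  { live: live, free: free, total: live + free }
end

before = slot_counts

# Retain a burst of objects so the GC cannot reuse their slots and
# has to grow the heap once the existing free slots run out.
objects = Array.new(500_000) { Object.new }
during = slot_counts

# Drop the references and force a full GC. The slots become free again,
# but the heap itself usually stays close to its grown size.
objects = nil
GC.start
after = slot_counts

[["before", before], ["during", during], ["after", after]].each do |label, counts|
  puts format("%-6s live=%d free=%d total=%d", label, counts[:live], counts[:free], counts[:total])
end
```

The exact numbers depend on the Ruby version and GC settings. The point is only that a sample taken before or after the burst can show plenty of free slots even though the heap had to grow in between.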



To ease some of the pain, you can cap RUBY_GC_HEAP_GROWTH_MAX_SLOTS at a sane number; something like 100,000 will do. I'm trying to lobby for a default for this setting in Ruby core.
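One detail worth keeping in mind: the GC reads this variable when the interpreter boots, so it has to be in the environment of the process that starts Ruby; assigning to ENV inside an already running worker is too late. A small sketch of that, where the child script is just a stand-in for the real Sidekiq command:

```ruby
require "rbconfig"

# The value suggested above. It must be set before the interpreter starts.
gc_env = { "RUBY_GC_HEAP_GROWTH_MAX_SLOTS" => "100000" }

# Stand-in for the real worker command (for example `bundle exec sidekiq`).
child_script = 'print ENV.fetch("RUBY_GC_HEAP_GROWTH_MAX_SLOTS")'

# Start a fresh interpreter with the variable in its environment and
# read back what that child process sees.
seen_by_child = IO.popen(gc_env, [RbConfig.ruby, "-e", child_script], &:read)

puts "child interpreter booted with RUBY_GC_HEAP_GROWTH_MAX_SLOTS=#{seen_by_child}"
```

In a real deployment the variable would more likely be exported by whatever launches Sidekiq (init script, service unit, Procfile) than spawned from Ruby like this.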

Also:

  • Create a persistent log of which jobs start and when (duration, etc.), and collect GC.stat before and after each job runs.

  • Split your jobs up: run one queue on one server and another queue on another, and see which queue and which job is responsible for the problem.

  • Profile the different jobs you have with a plugin or other profiling tools.

  • Reduce the number of concurrent jobs you run as an experiment, or put a mutex between certain job types. It is possible that one "job a" at a time is OK-ish, while 20 concurrent "job a"s will bloat memory.
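The first suggestion (logging GC.stat before and after each job) could be sketched as a Sidekiq server middleware. The class name `GcStatLogger` and the log format are made up for illustration; the only things taken from Sidekiq are the `call(worker, job, queue)` middleware signature and the `server_middleware` registration.

```ruby
require "logger"

# Hypothetical middleware: logs slot counts and GC runs around every job.
class GcStatLogger
  def initialize(logger = Logger.new($stdout))
    @logger = logger
  end

  # Sidekiq passes the worker instance, the job payload hash and the queue name.
  def call(_worker, job, queue)
    before = snapshot
    started = now
    begin
      yield
    ensure
      log_job(job, queue, before, snapshot, now - started)
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Key names are singular on MRI 2.1 and plural on later versions.
  def snapshot
    stat = GC.stat
    { live: stat[:heap_live_slots] || stat[:heap_live_slot],
      free: stat[:heap_free_slots] || stat[:heap_free_slot],
      gc_runs: stat[:count] }
  end

  def log_job(job, queue, before, after, seconds)
    @logger.info(
      "job=#{job['class']} jid=#{job['jid']} queue=#{queue} " \
      "duration=#{seconds.round(3)}s " \
      "live=#{before[:live]}->#{after[:live]} " \
      "free=#{before[:free]}->#{after[:free]} " \
      "gc_runs=+#{after[:gc_runs] - before[:gc_runs]}"
    )
  end
end

# Registration, only when Sidekiq is actually loaded.
if defined?(Sidekiq)
  Sidekiq.configure_server do |config|
    config.server_middleware { |chain| chain.add GcStatLogger }
  end
end

# Stand-alone demo with a fake job payload, so the sketch runs without Sidekiq.
GcStatLogger.new.call(nil, { "class" => "ImportJob", "jid" => "demo" }, "default") do
  Array.new(100_000) { Object.new }
end
```

GC.stat is process-wide, so with several Sidekiq threads running at once the deltas mix the allocations of all concurrent jobs. The numbers are easiest to read with concurrency turned down, which fits the last suggestion in the list.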
