Ruby on Rails - Slow loading and wasting huge amounts of time in the garbage collector

I have a large Rails application and I am looking to improve (depressing) performance.

Working with ruby-prof doesn't help me much, I get a similar result (works in production mode on thin):

Thread ID: 9322800
Total: 1.607768
Sort by: self_time

 %self     total     self     wait    child    calls   name
 26.03      0.42     0.42     0.00     0.00     1657   Module#define_method 
  8.03      0.13     0.13     0.00     0.00      267   Set#initialize 
  4.41      0.07     0.07     0.00     0.00       44   PG::Result#values 
  4.28      0.07     0.07     0.00     0.00     1926   ActiveSupport::Callbacks::Callback#start 
  4.21      0.07     0.07     0.00     0.00    14835   Kernel#hash 
  4.13      0.08     0.07     0.00     0.01      469   Module#redefine_method 
  4.11      0.07     0.07     0.00     0.00       63  *<Class::ActiveRecord::Base>#with_scope 
  4.02      0.07     0.06     0.00     0.00      774   ActiveSupport::Callbacks::Callback#_compile_options 
  3.24      0.05     0.05     0.00     0.00       30   PG::Connection#async_exec 
  2.31      0.40     0.04     0.00     0.37     2130  *Module#class_eval 
  1.47      0.02     0.02     0.00     0.00        6   PG::Connection#unescape_bytea 
  1.03      0.05     0.02     0.00     0.03      390  *Array#select 

* indicates recursively called methods

      

I guessed that maybe he spends a lot of time in the garbage collector since I am running REE . I decided to try GC.enable_stats for more information. I added the following to my app controller:

around_filter :enable_gc_stats

private

def enable_gc_stats
  GC.enable_stats

  begin
    yield
  ensure
    GC.disable_stats
    GC.clear_stats
  end
end

      

On a relatively large page running on my machine here, in production mode with REE and a thin webserver (ruby-prof is disabled as it makes it a little slower), I get:

Completed 200 OK in 1093ms (Views: 743.1ms | ActiveRecord: 139.2ms)

GC.collections: 11
GC.time: 666299 us 666.299 ms
GC.growth: 461 KB

GC.allocated_size: 152 MB
GC.num_allocations: 1,924,773
ObjectSpace.live_objects: 1,015,195
ObjectSpace.allocated_objects: 12,393,644

      

So, for a page that took 1093ms, it seems like almost 700ms was wasted in the garbage collector. Has anyone had this problem before? I understand that you cannot help with my application in particular (it is quite large with a lot of gems and things) - but are there methods or tools to better understand why so much garbage is being generated?

Any ideas would be much appreciated!

+3


source to share


1 answer


Your rails log shows most of the time (75%) in view code.

Your profile report shows three obvious hotspots: Module#define_method

for independent time, Module#class_eval

for total time, and Set#initialize

.

define_method

and class_eval

point out that there is probably a lot of dynamic code execution that seems overkill to me - usually you want to generate that code earlier and reuse it instead of re-creating it. This is almost certainly part of the problem with excessive object placement problems. Generating a graph report instead of a flat report should help you find parent methods that fall into these expensive paths, and it can give you a pointer to where you could optimize.

Set#initialize

could be a real artifact of what your code is supposed to do, or it could be a sign that some meaningful Set[...]

or Set::new

set inline creation calls can be made once and assigned to a constant or var instance / class for reuse.



ruby-prof is ok, but you can also try perftools.rb , which can be easily connected to rack rails using rack-perftools_profiler . Perftools has some advanced renderers that can make hot paths much easier to understand.

Since you are using REE and extensive object allocation (and hence garbage collection) this is a problem, you can try memprof to get some understanding of what and where all these allocations come from.

If you can't find a way to reduce the number of allocated objects, you can ease the load on the GC by using more process memory by tuning the GC to predefine a heap large enough to meet typical query allocation requirements. Unicorn offers rack module for out of range GC . You might be able to adapt this modular approach to working with thin and move all GC time between requests - you'll still pay the cost of the processor, but at least you won't postpone your responses to garbage collection.

+4


source







All Articles