How can Array # sort in Ruby so fast?

C # has BitArray, C has bit fields. I couldn't find an equivalent in Ruby core. Google showed me a class BitField

that Peter Cooper wrote for him.

I was reading Jon Bentley Programming Pearls and while trying one of the first examples that deals with sorting a BitMap, I need a type that is an array of bits. I used Peter class

class BitMapSort
  def self.sort(list, value_range_max)
    bitmap =

    list.each{|x| bitmap[x] = 1 }

    sorted_list = []
    0.upto(value_range_max-1){ |number|
      sorted_list << number if (bitmap[number] == 1)


Running this set of 1M unique numbers in the range [0, 10,000,000] gave interesting results,

                         user     system      total        real
bitmap               11.078000   0.015000  11.093000 ( 11.156250)
ruby-system-sort      0.531000   0.000000   0.531000 (  0.531250)
quick-sort           21.562000   0.000000  21.562000 ( 21.625000){|x|"bitmap"){ ret = BitMapSort.sort(series, 10_000_000);}"ruby-system-sort"){ ret = series.sort; }"quick-sort"){ ret = QuickSort.sort( series, 0, series.length-1); }


How is the default sort for ruby ​​by default 22x faster than 1M bitsField.set + 1 for a vector of 10M bits? Is there a more efficient bit / array field in Ruby? How does Ruby's default sort achieve this level of performance .. does it jump to C to do this?


source to share

5 answers

Array # sort is implemented in C, see. rb_ary_sort

In array.c

It also has some checks to compare Fixnums, so sorting an array of integers doesn't even need to look up methods.



How does Ruby's default sort achieve this performance level .. does it jump to C to do it?

All the main classes and methods in the default ruby ​​implementation are implemented in C.



The reason this is much faster is probably because it is implemented in the C ruby ​​implementation.



I think the real problem is that you are doing 10M comparisons, 10M fetching arrays, 10M from a lot of things, whereas a properly optimized sort routine does a lot less operations since it works on a fixed set of 1M elements.

Basic operations such as sorting are highly optimized in the Ruby VM and are difficult to work around with a pure-Ruby alternative.



Jumps to C are definitely correct. Arrays and hashes have C implementations of many methods to improve performance. Integer and float literals also have some tricky code optimizations. When you convert them to bitmap, you lose this optimization as well.

With a compiled language like C or Java, it really makes sense to look for complex optimization patterns. with interpreted languages, the cost of interpreting each command makes this counter productive.



All Articles