Ruby compare two hash arrays, with specific keys

There are two hash arrays and I want to remove the "common" elements from the two arrays based on certain keys. For example:

array1 = [{a: '1', b:'2', c:'3'}, {a: '4', b: '5', c:'6'}]
array2 = [{a: '1', b:'2', c:'10'}, {a: '3', b: '5', c:'6'}]

      

and the comparison keys are a and b. So when I compute something like

array1 - array2 (there is no need to override '-' if there is a better approach)

      

I expect to get [{a: '4', b: '5', c: '6'}], since a and b are used as the comparison criteria. The first element is removed because its values for a and b match; the second element is kept because the value for a differs between array1.last and array2.last.



2 answers


As I understand it, you are given two arrays of hashes and a set of keys. You want to reject every element (hash) of the first array whose values match those of some element (hash) of the second array for all of the specified keys. You can do that as follows.

Code

require 'set'

def reject_partial_dups(array1, array2, keys)
  set2 = array2.each_with_object(Set.new) do |h,s|
     s << h.values_at(*keys) if (keys-h.keys).empty? 
  end
  array1.reject do |h|
    (keys-h.keys).empty? && set2.include?(h.values_at(*keys))
  end
end

      

The line:

(keys-h.keys).empty? && set2.include?(h.values_at(*keys))

      

can be simplified to:

set2.include?(h.values_at(*keys))

      

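provided that no hash in array2 has nil as the value of any of the keys (values_at returns nil for a missing key, so a hash in array1 that lacks a key could otherwise match by accident). I created a set (rather than an array) from array2 to speed up the lookup of h.values_at(*keys) in that line.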
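
A minimal sketch of the case where the guard makes a difference. The data here is made up for illustration and is not taken from the question; it assumes one hash in array2 stores nil under :b while a hash from array1 lacks :b altogether.

```ruby
require 'set'

keys   = [:a, :b]
array2 = [{ a: 1, b: nil }]   # has both keys, but the value of :b is nil
h      = { a: 1 }             # an element of array1 that lacks :b

# Build the lookup set the same way the answer does.
set2 = array2.each_with_object(Set.new) do |g, s|
  s << g.values_at(*keys) if (keys - g.keys).empty?
end

# With the guard: h is missing :b, so it is never treated as a match.
with_guard = (keys - h.keys).empty? && set2.include?(h.values_at(*keys))

# Without the guard: values_at yields [1, nil], which is in set2.
without_guard = set2.include?(h.values_at(*keys))

p with_guard     # false
p without_guard  # true
```

If such data cannot occur, the shorter test behaves identically.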

Example

keys = [:a, :b]
array1 = [{a: '1', b:'2', c:'3'}, {a: '4', b: '5', c:'6'}, {a: 1, c: 4}]
array2 = [{a: '1', b:'2', c:'10'}, {a: '3', b: '5', c:'6'}]
reject_partial_dups(array1, array2, keys)
  #=> [{:a=>"4", :b=>"5", :c=>"6"}, {:a=>1, :c=>4}] 

      

Explanation

First, create set2:

e0 = array2.each_with_object(Set.new)
  #=> #<Enumerator: [{:a=>"1", :b=>"2", :c=>"10"}, {:a=>"3", :b=>"5", :c=>"6"}]
  #     :each_with_object(#<Set: {}>)>

      

Pass the first element of e0 to the block and evaluate the block:

h,s = e0.next
  #=> [{:a=>"1", :b=>"2", :c=>"10"}, #<Set: {}>]
h #=> {:a=>"1", :b=>"2", :c=>"10"} 
s #=> #<Set: {}> 
(keys-h.keys).empty?
  #=> ([:a,:b]-[:a,:b,:c]).empty? => [].empty? => true

      

so we compute:



s << h.values_at(*keys)
  #=> s << {:a=>"1", :b=>"2", :c=>"10"}.values_at(*[:a,:b])
  #=> s << ["1","2"] => #<Set: {["1", "2"]}> 

      

Pass the second (last) element of e0 to the block:

h,s = e0.next
  #=> [{:a=>"3", :b=>"5", :c=>"6"}, #<Set: {["1", "2"]}>] 
(keys-h.keys).empty?
  #=> true

      

so we compute:



s << h.values_at(*keys)
  #=> #<Set: {["1", "2"], ["3", "5"]}> 

set2
  #=> #<Set: {["1", "2"], ["3", "5"]}> 
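
For comparison, a minimal sketch that builds the same set in one chained expression instead of each_with_object. This is an alternative phrasing, not what the answer's method does internally; it relies on Enumerable#to_set from the set library.

```ruby
require 'set'

keys   = [:a, :b]
array2 = [{ a: '1', b: '2', c: '10' }, { a: '3', b: '5', c: '6' }]

set2 = array2
  .select { |h| (keys - h.keys).empty? }  # keep only hashes that have every key
  .map    { |h| h.values_at(*keys) }      # project each hash onto the key values
  .to_set                                 # collect the value tuples into a Set

p set2.include?(['1', '2'])  # true
p set2.include?(['4', '5'])  # false
```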

      

Reject items from array1

We now iterate through array1, rejecting the elements for which the block evaluates to true.

e1 = array1.reject
  #=> #<Enumerator: [{:a=>"1", :b=>"2", :c=>"3"},
  #                  {:a=>"4", :b=>"5", :c=>"6"}, {:a=>1, :c=>4}]:reject> 

      

The first element of e1 is passed to the block:

h = e1.next
  #=> {:a=>"1", :b=>"2", :c=>"3"} 
a = (keys-h.keys).empty?
  #=> ([:a,:b]-[:a,:b,:c]).empty? => true
b = set2.include?(h.values_at(*keys))
  #=> set2.include?(["1","2"]) => true
a && b
  #=> true

      

so the first element of e1 is rejected. Next:

h = e1.next
  #=> {:a=>"4", :b=>"5", :c=>"6"}
a = (keys-h.keys).empty?
  #=> true
b = set2.include?(h.values_at(*keys))
  #=> set2.include?(["4","5"]) => false
a && b
  #=> false

      

therefore the second element of e1 is not rejected. Finally:

h = e1.next
  #=> {:a=>1, :c=>4} 
a = (keys-h.keys).empty?
  #=> ([:a,:b]-[:a,:c]).empty? => [:b].empty? => false

      

so the block returns false (meaning the last element of e1 is not rejected). Because && short-circuits, there is no need to compute:

b = set2.include?(h.values_at(*keys))
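
The three decisions traced above can be reproduced in one pass. This is only a sketch for checking the walkthrough; the variable names has_all_keys, rejected and decisions are introduced here for illustration.

```ruby
require 'set'

keys   = [:a, :b]
array1 = [{ a: '1', b: '2', c: '3' }, { a: '4', b: '5', c: '6' }, { a: 1, c: 4 }]
set2   = Set[['1', '2'], ['3', '5']]  # the set built from array2 earlier

decisions = array1.map do |h|
  has_all_keys = (keys - h.keys).empty?
  # && short-circuits: include? is not evaluated when has_all_keys is false
  rejected = has_all_keys && set2.include?(h.values_at(*keys))
  [h, rejected]
end

decisions.each { |h, rejected| puts "#{h.inspect} rejected? #{rejected}" }
```

Only the first hash is flagged, which agrees with the result shown in the Example section.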

      



So, you really should try this yourself, because I will basically solve it for you.

General approach:



  • For each item in array1
  • Check whether the corresponding item in array2 has any keys and values in common with it
  • If so, remove it

You will probably end up with something like:

array1.each_with_index { |h, i| h.delete_if { |k, v| array2[i].has_key?(k) && array2[i][k] == v } }
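
Note that the one-liner compares hashes by position and deletes individual key/value pairs in place, which is not quite what the question asks for. A minimal sketch of the same bullet points applied to whole hashes, comparing against every element of array2 on the chosen keys only. The method name reject_matching is hypothetical, and this simple version scans array2 once per element of array1.

```ruby
# Hypothetical helper: drop each hash in array1 that agrees with
# some hash in array2 on every one of the given keys.
def reject_matching(array1, array2, keys)
  array1.reject do |h|
    array2.any? do |g|
      # A key missing from either hash never counts as a match.
      keys.all? { |k| h.key?(k) && g.key?(k) && h[k] == g[k] }
    end
  end
end

array1 = [{ a: '1', b: '2', c: '3' }, { a: '4', b: '5', c: '6' }]
array2 = [{ a: '1', b: '2', c: '10' }, { a: '3', b: '5', c: '6' }]

result = reject_matching(array1, array2, [:a, :b])
p result.length      # 1
p result.first[:a]   # "4"
```

Unlike delete_if on the hashes, reject returns a new array and leaves array1 untouched.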
