Manipulating records returned by an ActiveRecord scope
TL;DR: Is there a way to define a scope so that I can manipulate the records found by a query that uses that scope before they are returned? Can I use the data returned by a query to prepopulate arbitrary values in a collection of records, just like Rails can "preload" association data?
Basically, I have a database table that contains hierarchical information, so each row has a parent, and there are many times when I have to iterate up and down the hierarchy to get the parent or child nodes. To improve performance, we make heavy use of PostgreSQL's WITH RECURSIVE query, which lets us quickly fetch all descendants for a given set of node ids. On my actual model, I have two key methods that use this type of query: an instance method descendants and a scope find_with_all_descendants(*ids). However, if I have a collection of these models and I want to loop through and get the descendants of each by calling descendants, I end up issuing a query for each record. So my current code looks like this:
collection = Node.find_with_all_descendants(1,2,3,4)
# collection gets passed around to other parts of the program ...
collection.each do |node|
  # other parts of the program do stuff with node.descendants, resulting in
  # a select N+1 issue as the query for descendants fires
  node.descendants
end
It would be great if I could call Node.find_with_all_descendants(*ids) and then pre-build the collection of descendants, so that subsequent calls to descendants on any of the returned records hit the cached data rather than trigger another query. So my method Node#descendants might look like this:
def descendants
  return @cached_descendants if @cached_descendants
  # otherwise execute big sql statement I'm not including
end
Then I just need to find a place where I can set @cached_descendants for the records returned by queries that use find_with_all_descendants. But given that this is a scope and all I get back is an ActiveRecord relation, I don't see where I could set this cached value. Is there any hook where I can run code after any query that uses my scope find_with_all_descendants returns its records?
UPDATE: Including the relevant methods as requested. For completeness, I am also including the module we monkey patch into the relation so that we can load the depth and path of the nodes.
scope :find_with_all_descendants, -> (*ids) do
  tree_sql = <<-SQL
    WITH RECURSIVE search_tree(id, path, depth) AS (
      SELECT id, ARRAY[id], 1
      FROM #{table_name}
      WHERE #{table_name}.id IN(#{ids.join(', ')})
      UNION ALL
      SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
      FROM search_tree
      JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
      WHERE NOT #{table_name}.id = ANY(path)
    )
    SELECT id, depth, path FROM search_tree ORDER BY path
  SQL
  if ids.any?
    # extend is a public method, so there is no need to go through send
    select("*")
      .joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
      .extend(NodeRelationMethods)
  else
    Node.none
  end
end

def descendants
  self.class.find_with_all_descendants(self.id).where.not(id: self.id)
end
# This defines the methods we're going to monkey patch into the relation returned by
# find_with_all_descendants so that we can get the path and the depth of nodes
module NodeRelationMethods
  # All nodes found by original ids will have a depth of 1
  # depth is accessible by calling node.depth
  def with_depth
    # Because rails is a magical fairy unicorn, just adding this select statement
    # automatically adds the depth attribute to the data nodes returned by this
    # scope
    select("tree.depth as depth")
  end

  def with_path
    # Because rails is a magical fairy unicorn, just adding this select statement
    # automatically adds the path attribute to the data nodes returned by this
    # scope
    self.select("tree.path as path")
  end
end
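For readers less familiar with WITH RECURSIVE, the traversal that the search_tree CTE performs can be sketched in plain Ruby. This is only an illustration under stated assumptions: PARENTS is a hypothetical in-memory stand-in for the table (id => parent_id), and search_tree is a hypothetical helper, not part of the model above.

```ruby
# Hypothetical stand-in for the nodes table: id => parent_id
PARENTS = { 1 => nil, 2 => 1, 3 => 2, 4 => nil, 5 => 4 }.freeze

# Mimics the recursive CTE: start from the requested ids, then repeatedly
# join children onto the rows found so far, tracking path and depth.
def search_tree(parents, root_ids)
  # Seed rows, like: SELECT id, ARRAY[id], 1
  frontier = root_ids.map { |id| { id: id, path: [id], depth: 1 } }
  results = []

  until frontier.empty?
    results.concat(frontier)
    frontier = frontier.flat_map do |row|
      # Children of this row that are not already on its path (cycle guard,
      # like: WHERE NOT id = ANY(path))
      child_ids = parents.select do |id, parent_id|
        parent_id == row[:id] && !row[:path].include?(id)
      end.keys

      child_ids.map do |id|
        { id: id, path: row[:path] + [id], depth: row[:depth] + 1 }
      end
    end
  end

  # Like: ORDER BY path
  results.sort_by { |row| row[:path] }
end

tree_rows = search_tree(PARENTS, [1, 4])
tree_rows.each { |row| puts row.inspect }
```

Each returned row carries the full path from the requested root down to the node, which is what makes it possible to work out descendants in memory later without another query.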
It looks like this can be done by overriding exec_queries on the relation (see http://apidock.com/rails/v3.2.3/ActiveRecord/Relation/exec_queries). Here's some sample code boiled down to the bare essentials:
scope :find_with_all_descendants, -> (*ids) do
  #load all your records here...
  where(#...).extend(IncludeDescendants)
end

module IncludeDescendants
  def exec_queries
    records = super
    records.each do |r|
      #pre-populate/manipulate records here before returning
    end
    # return the processed records explicitly
    records
  end
end
Basically, Rails calls Relation#exec_queries just before returning records. By extending the relation that we return from our scope, we can override exec_queries. In the overridden method we call super to get the original results, process them, and then return them.
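The extend-and-super mechanism can be tried out without a database. The following is a minimal sketch, assuming a hypothetical FakeRelation class standing in for ActiveRecord::Relation and a hypothetical Row struct standing in for model instances; the real exec_queries is an internal Rails method, so its signature may differ between Rails versions.

```ruby
# Hypothetical stand-in for a loaded model instance
Row = Struct.new(:id, :path, :cached_descendants)

# Hypothetical stand-in for ActiveRecord::Relation
class FakeRelation
  def initialize(rows)
    @rows = rows
  end

  # In Rails this would run the SQL; here it just returns the canned rows
  def exec_queries
    @rows
  end

  def to_a
    exec_queries
  end
end

module IncludeDescendants
  def exec_queries
    records = super
    records.each do |record|
      # A row is a descendant of `record` if record's id appears on its path
      record.cached_descendants = records.select do |other|
        other.id != record.id && other.path.include?(record.id)
      end
    end
    records
  end
end

rows = [
  Row.new(1, [1]),
  Row.new(2, [1, 2]),
  Row.new(3, [1, 2, 3]),
  Row.new(4, [4])
]

# Only this one relation object gets the override; the class is untouched
relation = FakeRelation.new(rows).extend(IncludeDescendants)
loaded = relation.to_a
loaded.each { |r| puts "#{r.id}: #{r.cached_descendants.map(&:id).inspect}" }
```

Because extend only affects the single relation object, other queries on the same model keep the default behaviour.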
If you add path[1] to the select, you can use Ruby's group_by (not AR's group, which is for SQL GROUP BY) to group the selected records by the top-level parent ID. I wrote an example of this below, with some scope refactoring to take advantage of chaining:
def self.all_descendants
  tree_sql = <<-SQL
    WITH RECURSIVE search_tree(id, path, depth) AS (
      SELECT id, ARRAY[id], 1
      FROM (#{where("1=1").to_sql}) tmp
      UNION ALL
      SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
      FROM search_tree
      JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
      WHERE NOT (#{table_name}.id = ANY(path))
    )
    SELECT id, depth, path FROM search_tree ORDER BY path
  SQL
  unscoped.select("*, tree.depth as depth, tree.path as path, tree.path[1] AS top_parent_id")
          .joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
end

def descendants
  self.class.where(id: id).all_descendants.where.not(id: id)
end
This way you can do the following:
collection = Node.where(id: [1,2,3,4]).all_descendants
collection.group_by(&:top_parent_id).each do |top_parent_id, descendant_group|
  top_parent = descendant_group.detect{|n| n.id == top_parent_id}
  # Array#- needs an array on the right-hand side, so wrap the single record
  top_parent_descendants = descendant_group - [top_parent]
  # do stuff with top_parent_descendants
end
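The grouping step itself is plain Ruby and can be checked without a database. Here is a small sketch, assuming a hypothetical TreeNode struct in place of the loaded Node records; note that PostgreSQL arrays are 1-indexed, so the SQL path[1] corresponds to path.first in Ruby.

```ruby
# Hypothetical stand-in for the records loaded by all_descendants
TreeNode = Struct.new(:id, :path) do
  # Equivalent of the SQL column tree.path[1] (PostgreSQL arrays start at 1)
  def top_parent_id
    path.first
  end
end

nodes = [
  TreeNode.new(1, [1]),
  TreeNode.new(2, [1, 2]),
  TreeNode.new(3, [1, 2, 3]),
  TreeNode.new(4, [4]),
  TreeNode.new(5, [4, 5])
]

# Enumerable#group_by works in memory on the already loaded records,
# so no further queries are needed to split them per top-level parent.
descendants_by_root = nodes.group_by(&:top_parent_id).transform_values do |group|
  group.reject { |node| node.id == node.top_parent_id }
end

descendants_by_root.each do |root_id, descendants|
  puts "#{root_id}: #{descendants.map(&:id).inspect}"
end
```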
This may be some way from what you want, but I ran into very similar problems, and I wonder whether the activerecord-hierarchical_query gem was considered, whether it was available at the time, and whether it would suit your needs in this case. I was hoping to avoid monkey patching the core class, and ideally to avoid overriding a method in ActiveRecord. This gem seems to be a solid DSL-style extension that solves what I think is a fairly common problem:
https://github.com/take-five/activerecord-hierarchical_query