Finding the least common ancestor from the transitive closure table

I have a table representing the transitive closure of an organizational hierarchy (i.e. its tree with a single root):

create table ancestry (
    ancestor   integer,
    descendant integer,
    distance   integer
);

      

I have another table that contains the organizations that each user has access to:

create table accessible (
    user         integer,
    organization integer
);

      

The system shows the user a roll-up of the costs associated with each organization the user can access. I could always start by showing the user a view of the company (i.e. Root) showing the user a list of direct subsidiaries and how many of their organizations contribute to the total. In most cases, there will be one child and the user will need to expand several levels before seeing multiple children. I would prefer to start my presentation with the first organization that shows multiple children (i.e. LCA).

For a given user, I can easily find a set of root paths, but I am having a hard time finding the least common ancestor. I am using postgresql 9.1 but prefer a solution that is database agnostic. In the worst case, I can pull the paths back into the application code and calculate the LCA there.

+3


source to share


2 answers


I took another look at this and came up with the following solution. I used a common-table expression to make it easier to understand how it works, but it can be easily written using a subquery.

with
hit (id, count) as (
    select
        ancestry.ancestor
       ,count(ancestry.descendant)
    from
        accessible
        inner join ancestry
            on accessible.organization = ancestry.descendant
    where
        accessible.user = @user_id
    group by
        ancestry.ancestor
)
select
    ancestry.descendant as lca
from
    hit
    inner join ancestry
        on ancestry.descendant = hit.id
       and ancestry.ancestor = @company_id
order by
    hit.count desc
   ,ancestry.distance desc
limit 1
;

      

The CTE column lists, for each organization in the hierarchy, the number of paths from the child to the root that crosses the organization. LCA is the organization with the most traversal. In the case of a tie, the organization farthest from the root (i.e. Max (distance)) is the actual LCA. This is best illustrated by an example.

        A
        |
        B
       / \
      C   D

      

Assuming we want to find the LCA of nodes C and D from the tree above. Hit CTE makes the following calculations:

Node    Count
  A       2
  B       2
  C       1
  D       1

      



The main query adds distance:

Node    Count    Distance
  A       2         0
  B       2         1
  C       1         2
  D       1         2

      

The main query then orders the results in descending number and distance

Node    Count    Distance
  B       2         1
  A       2         0
  C       1         2
  D       1         2

      

LCA is the first item in the list.

+2


source


Just a guess, not dn agnostic (SQL Server) but adaptable

SELECT TOP 1
       a1.ancestor
FROM   ancestor a1
       INNER JOIN
       ancestor a2 ON a1.ancestor=a2.ancestor
WHERE  a1.descendent = @Dec1
       AND
       a2.descendent = @Dec2
ORDER BY a1.distance DESC

      



If you want to put some data in a SQLFiddle I can play with it.

0


source







All Articles