Query to find the second highest value from each group

I have three tables:

  • project: project_id, project_name

  • milestone: milestone_id, milestone_name

  • project_milestone: id, project_id, milestone_id, completed_date

I want to get the second highest completed_date and its milestone_id from project_milestone, grouped by project_id. That is, for each project I want the milestone_id of the row with the second highest completed_date. What would be the correct query for this?



2 answers


I think you can do what you want with the project_milestone table and row_number():

select pm.*
from (select pm.*,
             row_number() over (partition by project_id order by completed_date desc) as seqnum
      from project_milestone pm
      where pm.completed_date is not null
     ) pm
where seqnum = 2;
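To try this without a PostgreSQL server, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.25 or later (needed for window functions). The sample rows are made up for illustration, and dates are stored as ISO text, which sorts chronologically.

```python
import sqlite3

# In-memory database; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_milestone ("
    "id INTEGER PRIMARY KEY, project_id INTEGER, "
    "milestone_id INTEGER, completed_date TEXT)"
)

# Made-up sample rows (assumption, not from the question).
rows = [
    (1, 1, 1, "2000-01-01"),
    (2, 1, 2, "2000-01-02"),
    (3, 1, 5, "2000-01-03"),
    (4, 2, 3, "2000-02-01"),
    (5, 2, 4, "2000-02-02"),
    (6, 2, 7, None),  # not completed yet; removed by the WHERE clause
]
conn.executemany("INSERT INTO project_milestone VALUES (?, ?, ?, ?)", rows)

# Number the completed milestones of each project from newest to oldest,
# then keep the row numbered 2.
query = """
SELECT project_id, milestone_id, completed_date
FROM (SELECT pm.*,
             row_number() OVER (PARTITION BY project_id
                                ORDER BY completed_date DESC) AS seqnum
      FROM project_milestone pm
      WHERE pm.completed_date IS NOT NULL
     ) AS ranked
WHERE seqnum = 2
ORDER BY project_id
"""
second_highest = conn.execute(query).fetchall()
print(second_highest)
# [(1, 2, '2000-01-02'), (2, 3, '2000-02-01')]
```

Because row_number() assigns distinct numbers even to equal dates, a tie for the highest date would make one of the tied rows come back as "second".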

      



If you need to include all projects, even those that do not have two completed milestones, you can use a left join:

select p.project_id, pm.milestone_id, pm.completed_date
from project p left join
     (select pm.*,
             row_number() over (partition by project_id order by completed_date desc) as seqnum
      from project_milestone pm
      where pm.completed_date is not null
     ) pm
     on p.project_id = pm.project_id and pm.seqnum = 2;
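The effect of the left join can be checked with a small sketch, again using Python's sqlite3 module (assuming SQLite 3.25 or later). The table names follow the question (project, project_milestone). The data is made up, with a hypothetical "Project C" that has only one completed milestone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (project_id INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE project_milestone (
    id INTEGER PRIMARY KEY, project_id INTEGER,
    milestone_id INTEGER, completed_date TEXT);
-- Made-up sample data: Project C has only one completed milestone.
INSERT INTO project VALUES (1, 'Project A'), (2, 'Project B'), (3, 'Project C');
INSERT INTO project_milestone VALUES
    (1, 1, 1, '2000-01-01'), (2, 1, 2, '2000-01-02'), (3, 1, 5, '2000-01-03'),
    (4, 2, 3, '2000-02-01'), (5, 2, 4, '2000-02-02'),
    (6, 3, 9, '2000-03-01');
""")

# The seqnum = 2 condition sits in the ON clause, not in WHERE, so projects
# without a second milestone are kept and padded with NULLs.
query = """
SELECT p.project_id, pm.milestone_id, pm.completed_date
FROM project p LEFT JOIN
     (SELECT pm.*,
             row_number() OVER (PARTITION BY project_id
                                ORDER BY completed_date DESC) AS seqnum
      FROM project_milestone pm
      WHERE pm.completed_date IS NOT NULL
     ) pm
     ON p.project_id = pm.project_id AND pm.seqnum = 2
ORDER BY p.project_id
"""
result = conn.execute(query).fetchall()
print(result)
# [(1, 2, '2000-01-02'), (2, 3, '2000-02-01'), (3, None, None)]
```

Moving pm.seqnum = 2 into a WHERE clause would turn the left join back into an inner join and drop Project C.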

      



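Using LATERAL (PostgreSQL 9.3+) may perform better than the window function version, especially if there is an index on project_milestone (project_id, completed_date). Note that CROSS JOIN LATERAL returns only projects that have at least two milestones; LEFT JOIN LATERAL ... ON true keeps the others, with NULLs in the milestone columns.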



SELECT * FROM project;
 project_id | project_name 
------------+--------------
          1 | Project A
          2 | Project B

SELECT * FROM project_milestone;
 id | project_id | milestone_id |     completed_date     
----+------------+--------------+------------------------
  1 |          1 |            1 | 2000-01-01 00:00:00+01
  2 |          1 |            2 | 2000-01-02 00:00:00+01
  3 |          1 |            5 | 2000-01-03 00:00:00+01
  4 |          1 |            6 | 2000-01-04 00:00:00+01
  5 |          2 |            3 | 2000-02-01 00:00:00+01
  6 |          2 |            4 | 2000-02-02 00:00:00+01
  7 |          2 |            7 | 2000-02-03 00:00:00+01
  8 |          2 |            8 | 2000-02-04 00:00:00+01


SELECT *
FROM project p
CROSS JOIN LATERAL (
    SELECT milestone_id, completed_date
    FROM project_milestone pm
    WHERE pm.project_id = p.project_id
      AND pm.completed_date IS NOT NULL
    ORDER BY completed_date DESC
    LIMIT 1
    OFFSET 1
) second_highest;
 project_id | project_name | milestone_id |     completed_date     
------------+--------------+--------------+------------------------
          1 | Project A    |            5 | 2000-01-03 00:00:00+01
          2 | Project B    |            7 | 2000-02-03 00:00:00+01

      
