Deterministic sort order for window functions

I have a table status

and I want to get the latest information.

Slno |   ID | Status | date
  1  |   1  | Pass   | 15-06-2015 11:11:00 - this is inserted first
  2  |   1  | Fail   | 15-06-2015 11:11:00 - this is inserted second 
  3  |   2  | Fail   | 15-06-2015 12:11:11 - this is inserted first
  4  |   2  | Pass   | 15-06-2015 12:11:11 - this is inserted second

      

I am using a window function with partition by ID order by date desc

to get the first value.
Excluded output:

2  |   1  | Fail   | 15-06-2015 11:11:00 - this is inserted second
4  |   2  | Pass   | 15-06-2015 12:11:11 - this is inserted second

      

Actual output:

1  |   1  | Pass   | 15-06-2015 11:11:00 - this is inserted first
3  |   2  | Fail   | 15-06-2015 12:11:11 - this is inserted first

      

According to [ http://docs.aws.amazon.com/redshift/latest/dg/r_Examples_order_by_WF.html] adding a second column ORDER BY

to the window function can fix the problem. But I have no other column to distinguish the lines!
Is there any other approach to solving the problem?

EDIT: I added slno

here for clarity. I do not have slno

as such in the table!

My SQL:

with range as (
  select id from status where date between 01-06-2015 and 30-06-2015
), latest as ( 
  select status, id, row_number() OVER (PARTITION BY id ORDER BY date DESC) row_num
)
select * from latest where row_num = 1

      

+1


source to share


1 answer


If not in your table slno

, then you have no reliable information about which row was inserted first. There is no natural order in the table, the physical order of the rows can change at any time (with any update or with help VACUUM

, etc.)

You can use an unreliable trick: Inner order ctid

.

select *
from  (
   select id, status
        , row_number() OVER (PARTITION BY id
                             ORDER BY date, ctid) AS row_num
   from   status  -- that your table name??
   where  date >= '2015-06-01'  -- assuming column is actually a date
   and    date <  '2015-07-01'
  ) sub
where  row_num = 1;
      

  • In the absence of any other information, which line came first (this is a design error , fix it!), You can try to keep what you can use with the internal tuple IDctid

    Rows will be in physical order if inserted initially, but this could change at any time with any table write VACUUM

    or other event.
    This is an extreme measure and it will break.

  • Your submitted query was invalid for multiple counters: no column name in 1st CTE, no table name in 2nd CTE, ...

  • You don't need a CTE for this.



Simplification with DISTINCT ON

(considerations for ctid

apply the same):

SELECT DISTINCT ON (id)
       id, status
FROM   status
WHERE  date >= '2015-06-01'
AND    date <  '2015-07-01'
ORDER  BY id, date, ctid;

      

0


source







All Articles