Oracle: select missing dates

I have a table with (among other things) dates in a field.

I need to get a list of all dates that are later than the oldest date, older than the most recent date, and are completely missing from the table.

So, if the table contains:

2012-01-02
2012-01-02
2012-01-03
2012-01-05
2012-01-05
2012-01-07
2012-01-08

      

I need a query that returns:

2012-01-04
2012-01-06

      

+3


source to share


4 answers


Something like this (assuming your table has a name your_table

and your date column has a name the_date

):

with date_range as (
      select min(the_date) as oldest, 
             max(the_date) as recent, 
             max(the_date) - min(the_date) as total_days
      from your_table
),
all_dates as (
   select oldest + level - 1 as a_date
   from date_range
   connect by level <= (select total_days from date_range)
)
select ad.a_date
from all_dates ad
  left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;  

      

Edit: The
clause WITH

is called "common table expression" and is equivalent to a view ("inline view").

He looks like

select * 
from ( 
     ..... 
) all_dates
join your_table ...

      

The second CTE simply creates a list of dates on the fly using an undocumented Oracle implementation function connect by

.

Reusing select (as with calculating the first and last date) is a little easier (and IMHO more readable) than using views.



Edit 2:

This can be done with a recursive CTE as well:

with date_range as (
      select min(the_date) as oldest, 
             max(the_date) as recent, 
             max(the_date) - min(the_date) as total_days
      from your_table
),
all_dates (a_date, lvl) as (
   select oldest as a_date, 1 as lvl
   from date_range 
   union all
   select (select oldest from date_range) + lvl, lvl + 1
   from all_dates 
   where lvl < (select total_days from date_range)
)
select ad.a_date, lvl
from all_dates ad    
  left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;  

      

Which should work in all DBMSs that support recursive CTEs (PostgreSQL and Firebird are more standard compatibility, but should be used recursive

).

Notice the hack select (select oldest from date_range) + lvl, lvl + 1

in the recursive part. This is not required, but Oracle still has some bugs about DATEs in recursive CTEs. In PostgreSQL, the following works without issue:

....
all_dates (a_date, lvl) as (
   select oldest as a_date, 0 as lvl
   from date_range 
   union all
   select a_date + 1, lvl + 1
   from all_dates 
   where lvl < (select total_days from date_range)
)
....

      

+11


source


I would choose this option because it is more efficient:

with all_dates_wo_boundary_values as
( select oldest + level the_date
    from ( select min(the_date) oldest
                , max(the_date) recent
             from your_table
         )
 connect by level <= recent - oldest - 1
)
select the_date
  from all_dates_wo_boundary_values
 minus
select the_date
  from your_table

      

And here's some evidence. Install first:

SQL> create table your_table (the_date)
  2  as
  3  select date '2012-01-02' from dual union all
  4  select date '2012-01-02' from dual union all
  5  select date '2012-01-03' from dual union all
  6  select date '2012-01-05' from dual union all
  7  select date '2012-01-05' from dual union all
  8  select date '2012-01-07' from dual union all
  9  select date '2012-01-08' from dual
 10  /

Table created.

SQL> exec dbms_stats.gather_table_stats(user,'your_table')

PL/SQL procedure successfully completed.

SQL> alter session set statistics_level = all
  2  /

Session altered.

      

Horse:



SQL> with date_range as
  2  ( select min(the_date) as oldest
  3         , max(the_date) as recent
  4         , max(the_date) - min(the_date) as total_days
  5      from your_table
  6  )
  7  , all_dates as
  8  ( select ( select oldest from date_range) + level as a_date
  9      from dual
 10   connect by level <= (select total_days from date_range)
 11  )
 12  select ad.a_date
 13    from all_dates ad
 14         left join your_table yt on ad.a_date = yt.the_date
 15   where yt.the_date is null
 16   order by ad.a_date
 17  /

A_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00

2 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  gaqx49vb9gz9k, child number 0
-------------------------------------
with date_range as ( select min(the_date) as oldest        , max(the_date) as recent        , max(the_date) - min(the_date) as total_d
ays     from your_table )

, all_dates as ( select ( select oldest from date_range) + level as a_date     from dual  connect by level <= (select total_days from
date_range) ) select

ad.a_date   from all_dates ad        left join your_table yt on ad.a_date = yt.the_date  where yt.the_date is null  order by ad.a_date

Plan hash value: 1419150012

------------------------------------------------------------------------------------------------------------------------------------------------------------------------    
| Id  | Operation                         | Name                        | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | Writes |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   1 |  TEMP TABLE TRANSFORMATION        |                             |      1 |        |      2 |00:00:00.01 |      22 |      1 |    1 |       |       |          |
|   2 |   LOAD AS SELECT                  |                             |      1 |        |      1 |00:00:00.01 |       7 |      0 |    1 |   262K|   262K|  262K (0)|
|   3 |    SORT AGGREGATE                 |                             |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
|   4 |     TABLE ACCESS FULL             | YOUR_TABLE                  |      1 |      7 |      7 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
|   5 |   SORT ORDER BY                   |                             |      1 |      1 |      2 |00:00:00.01 |      12 |      1 |    0 |  2048 |  2048 | 2048  (0)|
|*  6 |    FILTER                         |                             |      1 |        |      2 |00:00:00.01 |      12 |      1 |    0 |       |       |          |
|*  7 |     HASH JOIN OUTER               |                             |      1 |      1 |      7 |00:00:00.01 |      12 |      1 |    0 |  1048K|  1048K|  707K (0)|
|   8 |      VIEW                         |                             |      1 |      1 |      6 |00:00:00.01 |       9 |      1 |    0 |       |       |          |
|   9 |       CONNECT BY WITHOUT FILTERING|                             |      1 |        |      6 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
|  10 |        FAST DUAL                  |                             |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |    0 |       |       |          |
|  11 |        VIEW                       |                             |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
|  12 |         TABLE ACCESS FULL         | SYS_TEMP_0FD9D660C_81240964 |      1 |      1 |      1 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
|  13 |      TABLE ACCESS FULL            | YOUR_TABLE                  |      1 |      7 |      7 |00:00:00.01 |       3 |      0 |    0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("YT"."THE_DATE" IS NULL)
   7 - access("YT"."THE_DATE"=INTERNAL_FUNCTION("AD"."A_DATE"))


32 rows selected.

      

And my suggestion:

SQL> with all_dates_wo_boundary_values as
  2  ( select oldest + level the_date
  3      from ( select min(the_date) oldest
  4                  , max(the_date) recent
  5               from your_table
  6           )
  7   connect by level <= recent - oldest - 1
  8  )
  9  select the_date
 10    from all_dates_wo_boundary_values
 11   minus
 12  select the_date
 13    from your_table
 14  /

THE_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00

2 rows selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7aavxmzkj7zq7, child number 0
-------------------------------------
with all_dates_wo_boundary_values as ( select oldest + level the_date     from ( select min(the_date) oldest
  , max(the_date) recent              from your_table          )  connect by level <= recent - oldest - 1 ) select
the_date   from all_dates_wo_boundary_values  minus select the_date   from your_table

Plan hash value: 2293301832

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
|   1 |  MINUS                          |            |      1 |        |      2 |00:00:00.01 |       6 |       |       |          |
|   2 |   SORT UNIQUE                   |            |      1 |      1 |      5 |00:00:00.01 |       3 |  9216 |  9216 | 8192  (0)|
|   3 |    VIEW                         |            |      1 |      1 |      5 |00:00:00.01 |       3 |       |       |          |
|   4 |     CONNECT BY WITHOUT FILTERING|            |      1 |        |      5 |00:00:00.01 |       3 |       |       |          |
|   5 |      VIEW                       |            |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   6 |       SORT AGGREGATE            |            |      1 |      1 |      1 |00:00:00.01 |       3 |       |       |          |
|   7 |        TABLE ACCESS FULL        | YOUR_TABLE |      1 |      7 |      7 |00:00:00.01 |       3 |       |       |          |
|   8 |   SORT UNIQUE                   |            |      1 |      7 |      5 |00:00:00.01 |       3 |  9216 |  9216 | 8192  (0)|
|   9 |    TABLE ACCESS FULL            | YOUR_TABLE |      1 |      7 |      7 |00:00:00.01 |       3 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------


22 rows selected.

      

Regards,
Rob.

+1


source


We can use a simple hierarchical query as shown below:

WITH CTE AS
(SELECT (SELECT MIN(COL1) FROM T)+LEVEL-1 AS OUT FROM DUAL
CONNECT BY (LEVEL-1) <= (SELECT MAX(COL1) - MIN(COL1) FROM T))
SELECT OUT FROM CTE WHERE OUT NOT IN (SELECT COL1 FROM T);

      

+1


source


You need a table Calendar

(persistent or created on the fly). Then you can do simple:

SELECT c.my_date
FROM 
        calendar c
    JOIN
        ( SELECT MIN(date_column) AS min_date 
               , MAX(date_column) AS max_date 
          FROM tableX
        ) mm
      ON c.mydate BETWEEN min_date AND max_date
WHERE
    c.my_date NOT IN
    ( SELECT date_column
      FROM tableX
    )

      

0


source







All Articles