Intersection Complement in SQL

Question

I am using Oracle SQL and I have a basic question regarding the join command.

I have 5 tables. Each of them has the same column as the primary key: ID (int). Let's take a look at the following queries:

select count(*) from table_a -- 100 records
select count(*) from table_b -- 200 records
select count(*) from table_c -- 150 records
select count(*) from table_d -- 100 records
select count(*) from table_e -- 120 records

select * -- 88 records
 from table_a a
  inner join table_b b
    on a.id = b.id
  inner join table_c c
    on a.id = c.id
  inner join table_d d
    on a.id = d.id
  inner join table_e e
    on a.id = e.id

In this case, a record is left out of the output whenever one of the tables does not contain its identifier, even if all the other tables do. How can I find out what these "bad" records are? I think this is actually the complement of the intersection.

I want to know, for each case, which records are problematic and which tables they are missing from. For example: ID 123 is a bad record because it is not included in table_c, but is included in the other tables. ID 321 is problematic because it is included in all tables except table_d.

+3

sql join oracle intersection

Omri 14 oct. 14 at 12:46

5 answers

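You can try the following query:

 -- UNION ALL keeps one row per table, so the count is the number of tables containing each id
 SELECT id, COUNT(id) AS id_num FROM (
 SELECT id FROM table_a
 UNION ALL
 SELECT id FROM table_b
 UNION ALL
 SELECT id FROM table_c
 UNION ALL
 SELECT id FROM table_d
 UNION ALL
 SELECT id FROM table_e
 )
 -- Oracle does not accept the column alias in HAVING, so the aggregate is repeated
 GROUP BY id HAVING COUNT(id) < 5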

+2

geoandri 14 oct. 14 at 13:50

Try the following:

SELECT id FROM (
    SELECT id FROM table_a
    UNION
    SELECT id FROM table_b
    UNION
    SELECT id FROM table_c
    UNION
    SELECT id FROM table_d
    UNION
    SELECT id FROM table_e
) result
WHERE id NOT IN (
    SELECT a.id FROM table_a a
        INNER JOIN table_b b
            ON a.id = b.id
        INNER JOIN table_c c
            ON a.id = c.id
        INNER JOIN table_d d
            ON a.id = d.id
        INNER JOIN table_e e
            ON a.id = e.id
)

+1

www 14 oct. 14 at 13:05

If I understand correctly, you can use outer joins to determine which rows do not have matching primary (or unique) keys. For example, use a left join to find the ids that are present in table a but missing from table b:

select a.id from a left join b on a.id=b.id where b.id is null

Conversely, to find the ids that are present in table b but missing from table a:

select b.id from a right join b on a.id=b.id where a.id is null

0

ron tornambe 14 oct. 14 at 14:01

This solution will tell you which tables do not have rows for each ID:

SELECT   *
FROM     (SELECT id, 'table_a' AS table_name FROM table_a
          UNION ALL
          SELECT id, 'table_b' FROM table_b
          UNION ALL
          SELECT id, 'table_c' FROM table_c
          UNION ALL
          SELECT id, 'table_d' FROM table_d
          UNION ALL
          SELECT id, 'table_e' FROM table_e) PIVOT (COUNT (*)
                                             FOR table_name
                                             IN  ('table_a' AS table_a,
                                                 'table_b' AS table_b,
                                                 'table_c' AS table_c,
                                                 'table_d' AS table_d,
                                                 'table_e' AS table_e))
WHERE    table_a + table_b + table_c + table_d + table_e < 5
ORDER BY id

Result example:

ID | TABLE_A | TABLE_B | TABLE_C | TABLE_D | TABLE_E
0  |       1 |       0 |       0 |       1 |       0
1  |       0 |       1 |       0 |       1 |       0
2  |       1 |       1 |       0 |       0 |       0
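
PIVOT was introduced in Oracle 11g. As a rough sketch for older versions, or for other databases, the same presence matrix can be built with conditional aggregation. This is an illustration rather than part of the answer above, and it assumes, as the question states, that id is the primary key of every table, so each table contributes at most one row per id:

```sql
-- Sketch only: conditional aggregation in place of PIVOT.
-- Assumes id is unique within each table (primary key).
SELECT   id,
         COUNT(CASE WHEN table_name = 'table_a' THEN 1 END) AS table_a,
         COUNT(CASE WHEN table_name = 'table_b' THEN 1 END) AS table_b,
         COUNT(CASE WHEN table_name = 'table_c' THEN 1 END) AS table_c,
         COUNT(CASE WHEN table_name = 'table_d' THEN 1 END) AS table_d,
         COUNT(CASE WHEN table_name = 'table_e' THEN 1 END) AS table_e
FROM     (SELECT id, 'table_a' AS table_name FROM table_a
          UNION ALL
          SELECT id, 'table_b' FROM table_b
          UNION ALL
          SELECT id, 'table_c' FROM table_c
          UNION ALL
          SELECT id, 'table_d' FROM table_d
          UNION ALL
          SELECT id, 'table_e' FROM table_e)
GROUP BY id
-- Fewer than 5 rows for an id means at least one table is missing it
HAVING   COUNT(*) < 5
ORDER BY id;
```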

0

Allan 14 oct. 14 at 14:39

Sylvain Leroux · Accepted Answer · 2014-10-14T16:50:54+0000

You are probably looking for a symmetric difference between all of your tables.

To solve this problem without being too clever, you will need FULL OUTER JOIN ... USING:

SELECT id
    FROM table_a
    FULL OUTER JOIN table_b USING(id) 
    FULL OUTER JOIN table_c USING(id) 
    FULL OUTER JOIN table_d USING(id) 
    FULL OUTER JOIN table_e USING(id) 
WHERE table_a.ROWID IS NULL
   OR table_b.ROWID IS NULL
   OR table_c.ROWID IS NULL
   OR table_d.ROWID IS NULL
   OR table_e.ROWID IS NULL;

FULL OUTER JOIN will return all rows that satisfy the join condition (as a normal JOIN does), as well as all rows without matching rows. The USING clause inserts an implicit COALESCE on the equijoin column.
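
To make the implicit COALESCE visible, here is a minimal sketch of roughly the same join written with ON instead of USING, limited to three tables for brevity. The aliases a, b and c are only for illustration, and the test on id rather than ROWID assumes id is a non-null primary key, as in the question:

```sql
-- Sketch only: spelling out what USING(id) does behind the scenes.
SELECT COALESCE(a.id, b.id, c.id) AS id
    FROM table_a a
    FULL OUTER JOIN table_b b ON b.id = a.id
    -- the third table is matched against whichever earlier side supplied the id
    FULL OUTER JOIN table_c c ON c.id = COALESCE(a.id, b.id)
WHERE a.id IS NULL
   OR b.id IS NULL
   OR c.id IS NULL;
```

With USING, the id column in the select list already plays the role of the COALESCE expression, which is why it can be selected without a table qualifier.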

Another option is to use an anti-join:

SELECT id
    FROM table_a
    FULL OUTER JOIN table_b USING(id) 
    FULL OUTER JOIN table_c USING(id) 
    FULL OUTER JOIN table_d USING(id) 
    FULL OUTER JOIN table_e USING(id) 
WHERE id NOT IN (
    SELECT id
        FROM table_a
        INNER JOIN table_b USING(id) 
        INNER JOIN table_c USING(id) 
        INNER JOIN table_d USING(id) 
        INNER JOIN table_e USING(id) 
)

Basically, this yields the union of all sets minus the intersection of all sets.
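
The "union minus intersection" idea can also be written directly with Oracle's set operators. This is a sketch of an alternative formulation, not taken from the answer itself; the parentheses make the evaluation order explicit:

```sql
-- Sketch only: (A ∪ B ∪ C ∪ D ∪ E) minus (A ∩ B ∩ C ∩ D ∩ E)
(
    SELECT id FROM table_a
    UNION
    SELECT id FROM table_b
    UNION
    SELECT id FROM table_c
    UNION
    SELECT id FROM table_d
    UNION
    SELECT id FROM table_e
)
MINUS
(
    SELECT id FROM table_a
    INTERSECT
    SELECT id FROM table_b
    INTERSECT
    SELECT id FROM table_c
    INTERSECT
    SELECT id FROM table_d
    INTERSECT
    SELECT id FROM table_e
)
ORDER BY id;
```

MINUS is Oracle's spelling; most other databases use EXCEPT for the same operator.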

Graphically, you can compare INNER JOIN and FULL OUTER JOIN (on 3 tables just for presentation convenience):

[Figures: INNER JOIN, FULL OUTER JOIN]

Given this test case:

ID    TABLE_A TABLE_B TABLE_C TABLE_D TABLE_E
1     *       -       -       -       -
2     -       *       *       *       *
3     *       -       -       *       -
4     *       *       *       *       *

(* = entry in the table, - = no entry in the table)

Both queries will return:

ID
1
3
2

If you want a tabular result, you can adapt one of these queries by adding a bunch of CASE expressions. Something like this:

SELECT ID,
    CASE when table_a.rowid is not null then 1 else 0 END table_a,
    CASE when table_b.rowid is not null then 1 else 0 END table_b,
    CASE when table_c.rowid is not null then 1 else 0 END table_c,
    CASE when table_d.rowid is not null then 1 else 0 END table_d,
    CASE when table_e.rowid is not null then 1 else 0 END table_e
FROM table_a
    FULL OUTER JOIN table_b USING(id) 
    FULL OUTER JOIN table_c USING(id) 
    FULL OUTER JOIN table_d USING(id) 
    FULL OUTER JOIN table_e USING(id) 
WHERE table_a.ROWID IS NULL
   OR table_b.ROWID IS NULL
   OR table_c.ROWID IS NULL
   OR table_d.ROWID IS NULL
   OR table_e.ROWID IS NULL;

Output:

ID    TABLE_A TABLE_B TABLE_C TABLE_D TABLE_E
1     1       0       0       0       0
3     1       0       0       1       0
2     0       1       1       1       1

(1 = entry in the table, 0 = no entry in the table)
