Fetch a joined set of multiple tables related to the same (super) table

Question

I came up with an OO-like design for my database tables: a "super table" contains the columns that all of my sub tables have in common, and each "sub table" refers to its row in the super table by rowid.

Like this:

CREATE TABLE 'SuperTable' (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  created DATETIME
);

CREATE TABLE 'SubTable1' (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  super_id INTEGER, -- reference to SuperTable
  additionalData TEXT
);

CREATE TABLE 'SubTable2' (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  super_id INTEGER, -- reference to SuperTable
  moreData BLOB
);

For every record in any of the sub tables there is exactly one corresponding record in SuperTable, and vice versa.

Now I'd like to query several of the sub tables at once, getting one row for each record in SuperTable, with the associated data from the corresponding sub table.

I came up with this:

SELECT * FROM SuperTable
LEFT OUTER JOIN SubTable1 ON SubTable1.super_id = SuperTable.id
LEFT OUTER JOIN SubTable2 ON SubTable2.super_id = SuperTable.id
WHERE
  SubTable1.super_id IS NOT NULL OR
  SubTable2.super_id IS NOT NULL

I noticed that without the WHERE part I would get quite a few rows in which the columns of both sub tables are NULL. That is a consequence of the OUTER JOIN, because SuperTable is also used by other sub tables that I did not include in this query.

Here's an example of output without a WHERE clause:

id          created     id          super_id    additionalData  id          super_id    moreData  
----------  ----------  ----------  ----------  --------------  ----------  ----------  ----------
1           a                                                                                     
2           b                                                   1           2           more of 1 
3           c                                                                                     
4           d           3           4           additional 3                                      
5           e                                                   2           5           more of 2

Rows 1 and 3 above are empty and should be removed from the results, which I currently achieve with the WHERE clause.
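To make the effect of the WHERE clause easy to reproduce, here is a minimal sketch using Python's sqlite3 module and an in-memory database. The Python wrapper and the variable names are only for illustration; the rows are the ones from the sample output above.

```python
import sqlite3

# In-memory database filled with the sample rows from the output above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SuperTable (id INTEGER PRIMARY KEY AUTOINCREMENT, created DATETIME);
CREATE TABLE SubTable1 (id INTEGER PRIMARY KEY AUTOINCREMENT, super_id INTEGER, additionalData TEXT);
CREATE TABLE SubTable2 (id INTEGER PRIMARY KEY AUTOINCREMENT, super_id INTEGER, moreData BLOB);
INSERT INTO SuperTable (id, created) VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
INSERT INTO SubTable1 (id, super_id, additionalData) VALUES (3, 4, 'additional 3');
INSERT INTO SubTable2 (id, super_id, moreData) VALUES (1, 2, 'more of 1'), (2, 5, 'more of 2');
""")

JOINS = """
SELECT SuperTable.id FROM SuperTable
LEFT OUTER JOIN SubTable1 ON SubTable1.super_id = SuperTable.id
LEFT OUTER JOIN SubTable2 ON SubTable2.super_id = SuperTable.id
"""
FILTER = "WHERE SubTable1.super_id IS NOT NULL OR SubTable2.super_id IS NOT NULL"
ORDER = " ORDER BY SuperTable.id"

# Without the filter every SuperTable row comes back, including the empty ones.
unfiltered_ids = [row[0] for row in conn.execute(JOINS + ORDER)]
# With the filter only rows that have data in at least one sub table remain.
filtered_ids = [row[0] for row in conn.execute(JOINS + FILTER + ORDER)]

print(unfiltered_ids)  # [1, 2, 3, 4, 5]
print(filtered_ids)    # [2, 4, 5]
```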

I wonder if there is a better way to select the rows for the chosen sub tables, for example one that doesn't end up first collecting all the rows from SuperTable and only then filtering out those that had no match in the joined tables.

I'm using SQLite at the moment, but a more general answer would be appreciated as well.

BTW, here's the test database I'm using with the examples above: SO_30595895.sqlite

sql sqlite left-join

Thomas Tempelmann, 02 June 15 at 12:19

2 answers

joop · Answer 1 · 2015-06-02T13:10:46+0000

There are two ways to avoid duplicates (which are caused by the FKs not being unique): 1) use EXISTS:

SELECT s.*
FROM supertable s
WHERE EXISTS ( SELECT 1 FROM subtable1 x
     WHERE x.super_id = s.id)
OR EXISTS ( SELECT 1 FROM subtable2 x
     WHERE x.super_id = s.id)
-- OR EXISTS ...

Or, 2) first UNION the FKs of the sub tables and then JOIN the result with supertable:

SELECT s.*
FROM supertable s
JOIN ( SELECT DISTINCT super_id AS id
      FROM subtable1
     UNION
     SELECT DISTINCT super_id AS id
      FROM subtable2
     -- union ...
     ) x ON x.id = s.id
     ;
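As a quick check that both variants return each supertable row only once, here is a small sketch with Python's sqlite3 module. The sample rows are made up for illustration (super row 1 is referenced twice, row 3 not at all), and only the id column is selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supertable (id INTEGER PRIMARY KEY, created DATETIME);
CREATE TABLE subtable1 (id INTEGER PRIMARY KEY, super_id INTEGER, additionalData TEXT);
CREATE TABLE subtable2 (id INTEGER PRIMARY KEY, super_id INTEGER, moreData BLOB);
-- Hypothetical rows: super row 1 is referenced twice, row 3 not at all.
INSERT INTO supertable (id, created) VALUES (1,'a'),(2,'b'),(3,'c');
INSERT INTO subtable1 (super_id, additionalData) VALUES (1,'x'),(1,'y');
INSERT INTO subtable2 (super_id, moreData) VALUES (2,'z');
""")

# Variant 1: EXISTS only tests for a match, so it cannot multiply rows.
exists_ids = [r[0] for r in conn.execute("""
SELECT s.id FROM supertable s
WHERE EXISTS (SELECT 1 FROM subtable1 x WHERE x.super_id = s.id)
   OR EXISTS (SELECT 1 FROM subtable2 x WHERE x.super_id = s.id)
ORDER BY s.id
""")]

# Variant 2: UNION already removes duplicate FKs before the join.
union_ids = [r[0] for r in conn.execute("""
SELECT s.id FROM supertable s
JOIN (SELECT super_id AS id FROM subtable1
      UNION
      SELECT super_id AS id FROM subtable2) x ON x.id = s.id
ORDER BY s.id
""")]

# For comparison: a plain join repeats super row 1 once per matching sub row.
plain_join_ids = [r[0] for r in conn.execute("""
SELECT s.id FROM supertable s
JOIN subtable1 x ON x.super_id = s.id
ORDER BY s.id
""")]

print(exists_ids, union_ids, plain_join_ids)  # [1, 2] [1, 2] [1, 1]
```

Since UNION (without ALL) deduplicates on its own, the DISTINCT keywords in the subquery are harmless but probably not needed.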

UPDATE: 3) if you also want a (boolean) indicator for existence in each of the sub tables, you can use EXISTS (...) as a scalar subquery in the select list:

SELECT s.*
  , (EXISTS ( SELECT 1 FROM subtable1 x
     WHERE x.super_id = s.id)) AS exists_in_1
  , (EXISTS ( SELECT 1 FROM subtable2 x
     WHERE x.super_id = s.id)) AS exists_in_2
  -- , ...
FROM supertable s
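Here is a small sketch of what the indicator columns look like in SQLite, run through Python's sqlite3 module. It assumes the sample rows from the question; SQLite has no separate boolean type, so the flags come back as the integers 0 and 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supertable (id INTEGER PRIMARY KEY, created DATETIME);
CREATE TABLE subtable1 (id INTEGER PRIMARY KEY, super_id INTEGER, additionalData TEXT);
CREATE TABLE subtable2 (id INTEGER PRIMARY KEY, super_id INTEGER, moreData BLOB);
-- Sample rows taken from the question's example output.
INSERT INTO supertable (id, created) VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
INSERT INTO subtable1 (id, super_id, additionalData) VALUES (3, 4, 'additional 3');
INSERT INTO subtable2 (id, super_id, moreData) VALUES (1, 2, 'more of 1'), (2, 5, 'more of 2');
""")

# One row per supertable row, with a 0/1 flag per sub table.
flags = conn.execute("""
SELECT s.id
  , (EXISTS (SELECT 1 FROM subtable1 x WHERE x.super_id = s.id)) AS exists_in_1
  , (EXISTS (SELECT 1 FROM subtable2 x WHERE x.super_id = s.id)) AS exists_in_2
FROM supertable s
ORDER BY s.id
""").fetchall()

for row in flags:
    print(row)
```

Note that this variant returns every supertable row, including those with both flags at 0, so it would still need a WHERE clause if the empty rows are to be dropped.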

shA.t · Answer 2 · 2015-06-02T13:33:00+0000

I should point out what happens when you have a dataset like this:

[SuperTable]     [SubTable1]     [SubTable2]
ID               ID | stID       ID | stID
----             ---+-------     ---+-------
1                1  | 1          1  | 2
2                2  | 1          2  | 2

the result of using multiple LEFT JOINs is as follows:

ID  | ID    | stID  | ID    | stID
----+-------+-------+-------+-------
1   | 1     | 1     | NULL  | NULL
1   | 2     | 1     | NULL  | NULL
2   | NULL  | NULL  | 1     | 2
2   | NULL  | NULL  | 2     | 2
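The duplication can be reproduced with a short sketch using Python's sqlite3 module. It assumes the dataset above, with the column names taken from the question's schema (super_id instead of stID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SuperTable (id INTEGER PRIMARY KEY);
CREATE TABLE SubTable1 (id INTEGER PRIMARY KEY, super_id INTEGER);
CREATE TABLE SubTable2 (id INTEGER PRIMARY KEY, super_id INTEGER);
INSERT INTO SuperTable (id) VALUES (1),(2);
-- Both SubTable1 rows point at super row 1, both SubTable2 rows at super row 2.
INSERT INTO SubTable1 (id, super_id) VALUES (1,1),(2,1);
INSERT INTO SubTable2 (id, super_id) VALUES (1,2),(2,2);
""")

rows = conn.execute("""
SELECT s.id, SubTable1.id, SubTable1.super_id, SubTable2.id, SubTable2.super_id
FROM SuperTable s
LEFT OUTER JOIN SubTable1 ON SubTable1.super_id = s.id
LEFT OUTER JOIN SubTable2 ON SubTable2.super_id = s.id
ORDER BY s.id, SubTable1.id, SubTable2.id
""").fetchall()

# Each super row shows up once per matching sub row; NULL becomes None.
for row in rows:
    print(row)
```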

Therefore, I suggest you use this query:

SELECT s.*, SubTable1.*, SubTable2.*
FROM SuperTable s
    LEFT OUTER JOIN 
    (SELECT MIN(id) id, super_id
     FROM SubTable1
     GROUP BY super_id) s1 
    JOIN SubTable1 ON s1.id = SubTable1.id ON s1.super_id = s.id
    LEFT OUTER JOIN 
    (SELECT MIN(id) id, super_id 
     FROM SubTable2
     GROUP BY super_id) s2 
    JOIN SubTable2 ON s2.id = SubTable2.id ON s2.super_id = s.id
WHERE
    COALESCE(s1.super_id, s2.super_id, -2) <> -2
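The nested JOIN ... ON ... ON form above is T-SQL style and, as far as I can tell, SQLite does not parse it. Below is a sketch of the same idea (keep only the sub row with the smallest id per super row) written as a flat chain of LEFT JOINs and run through Python's sqlite3 module. The aliases t1 and t2, the extra super row 3 and the IS NOT NULL filter in place of COALESCE are assumptions added for illustration, not the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SuperTable (id INTEGER PRIMARY KEY);
CREATE TABLE SubTable1 (id INTEGER PRIMARY KEY, super_id INTEGER);
CREATE TABLE SubTable2 (id INTEGER PRIMARY KEY, super_id INTEGER);
-- Super row 3 is hypothetical: it has no sub rows and should be filtered out.
INSERT INTO SuperTable (id) VALUES (1),(2),(3);
INSERT INTO SubTable1 (id, super_id) VALUES (1,1),(2,1);
INSERT INTO SubTable2 (id, super_id) VALUES (1,2),(2,2);
""")

rows = conn.execute("""
SELECT s.id, t1.id, t2.id
FROM SuperTable s
    -- s1/s2 pick the smallest sub-row id per super row
    LEFT OUTER JOIN (SELECT MIN(id) AS id, super_id
                     FROM SubTable1 GROUP BY super_id) s1 ON s1.super_id = s.id
    LEFT OUTER JOIN SubTable1 t1 ON t1.id = s1.id
    LEFT OUTER JOIN (SELECT MIN(id) AS id, super_id
                     FROM SubTable2 GROUP BY super_id) s2 ON s2.super_id = s.id
    LEFT OUTER JOIN SubTable2 t2 ON t2.id = s2.id
WHERE s1.super_id IS NOT NULL OR s2.super_id IS NOT NULL
ORDER BY s.id
""").fetchall()

print(rows)  # [(1, 1, None), (2, None, 1)]
```

Each super row now appears at most once, at the cost of silently dropping the other sub rows that point at it, so this only fits if one representative sub row per super row is enough.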
