Self joins: does it matter whether the join is inner, outer, or left?
I'm curious about this because I'm only joining one table to itself, so the join type doesn't seem to make a difference. I read this question: Self-Joins Explained. There are several answers, and they use different types of joins for seemingly the same task.
So does it matter or not? If so, can you provide an example of how?
It all depends on what you want to do with the data. This answer does a good job of detailing what an inner self-join might look like. I recently wrote a report that needed to compare grades from two courses a student took. It went something like this:
For a table student_course:
STUDENT_ID  COURSE  GRADE
1           MTH251  A
1           MTH252  B
2           MTH251  A
2           MTH252  A
3           MTH251  B
3           MTH252  C
Query:
SELECT course1.student_id
     , course1.course AS course1
     , course1.grade  AS grade1
     , course2.course AS course2
     , course2.grade  AS grade2
FROM student_course course1
INNER JOIN student_course course2
        ON course1.student_id = course2.student_id
WHERE course1.course = 'MTH251'
  AND course2.course = 'MTH252';
Here's the script. Sorry, the PostgreSQL script didn't work for me, so I used Oracle for testing. The PostgreSQL equivalent should look approximately the same.
Now say I also wanted to see students who may not have taken MTH252. You can do it like this:
SELECT course1.student_id
     , course1.course AS course1
     , course1.grade  AS grade1
     , course2.course AS course2
     , course2.grade  AS grade2
FROM student_course course1
LEFT OUTER JOIN student_course course2
             ON course1.student_id = course2.student_id
            AND course2.course = 'MTH252'
WHERE course1.course = 'MTH251';
The first query shows students who took BOTH MTH251 and MTH252, and the second shows students who took MTH251 whether or not they took MTH252.
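To see the two queries actually diverge, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module (an assumption on my part; the answer itself used Oracle). With only the six rows listed above, both queries return the same three students, so the sketch adds a hypothetical student 4 who took MTH251 only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_course (student_id INTEGER, course TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO student_course VALUES (?, ?, ?)",
    [
        (1, "MTH251", "A"), (1, "MTH252", "B"),
        (2, "MTH251", "A"), (2, "MTH252", "A"),
        (3, "MTH251", "B"), (3, "MTH252", "C"),
        (4, "MTH251", "C"),  # hypothetical student, not in the original table
    ],
)

# Inner self-join: only students with a row for BOTH courses survive.
inner_rows = conn.execute("""
    SELECT course1.student_id, course1.grade, course2.grade
    FROM student_course course1
    INNER JOIN student_course course2
            ON course1.student_id = course2.student_id
    WHERE course1.course = 'MTH251'
      AND course2.course = 'MTH252'
    ORDER BY course1.student_id
""").fetchall()

# Left self-join: every MTH251 row is kept; the MTH252 side may be NULL.
left_rows = conn.execute("""
    SELECT course1.student_id, course1.grade, course2.grade
    FROM student_course course1
    LEFT OUTER JOIN student_course course2
                 ON course1.student_id = course2.student_id
                AND course2.course = 'MTH252'
    WHERE course1.course = 'MTH251'
    ORDER BY course1.student_id
""").fetchall()

print(inner_rows)  # students 1 to 3
print(left_rows)   # students 1 to 4; student 4 has None for the MTH252 grade
```

Note that the course2.course = 'MTH252' filter sits in the ON clause of the left join. Moving it to the WHERE clause would discard the NULL-extended row and turn the query back into an inner join in effect.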
As noted by Nick.McDermaid, a self-join works exactly the same as joining two tables with different data.
It really matters. There are many ways to think about this conceptually. In a sense, joining means that you want one row instead of two rows wherever possible. Basically you take two tables and make one table out of them.
The best way I have found to understand the difference between inner, right/left, and full outer joins is with tables.
**FULL Outer:**
name   number
john   NULL
jamie  7
ann    10
NULL   11
NULL   12
Some rows are missing items because a full outer join keeps every row from both tables. In this case, whatever we chose as our join condition (that is, what comes after "ON"), John has a key value that does not match any row in the second table. And 11 and 12 are numbers in the second table whose key values don't match any name in the first table.
Inner means that if a row in either table has no matching row in the other table, we skip that row. So the table becomes
**INNER**
name number
jamie 7
ann 10
Left and right joins are the same thing from an abstract point of view: each one returns the full set of rows from one of the tables, while the other table contributes only the rows that have a partner. Left and right joins are outer joins, but only on one side, so each is essentially half of a full outer join.
**left/right:**
name   number
lee    NULL
john   NULL
jamie  7
ann    10

name   number
jamie  7
ann    10
NULL   15
NULL   29
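The four result shapes can be reproduced with a small sketch. It assumes SQLite via Python's sqlite3 module, and the key column k and its values are hypothetical, since the tables above don't show what the rows were joined on. The data follows the FULL Outer table (john, jamie, ann and 7, 10, 11, 12). Older SQLite builds lack RIGHT and FULL OUTER JOIN, so the sketch emulates the right join by swapping the tables in a LEFT JOIN and the full outer join with UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names   (k INTEGER, name TEXT);      -- k is a made-up join key
    CREATE TABLE numbers (k INTEGER, number INTEGER);
    INSERT INTO names   VALUES (1, 'john'), (2, 'jamie'), (3, 'ann');
    INSERT INTO numbers VALUES (2, 7), (3, 10), (4, 11), (5, 12);
""")

# INNER: only rows with a partner on both sides.
inner_rows = conn.execute("""
    SELECT n.name, m.number
    FROM names n INNER JOIN numbers m ON n.k = m.k
    ORDER BY n.k
""").fetchall()

# LEFT: every name, with NULL where no number matches.
left_rows = conn.execute("""
    SELECT n.name, m.number
    FROM names n LEFT JOIN numbers m ON n.k = m.k
    ORDER BY n.k
""").fetchall()

# RIGHT, emulated by swapping the tables: every number, NULL where no name matches.
right_rows = conn.execute("""
    SELECT n.name, m.number
    FROM numbers m LEFT JOIN names n ON n.k = m.k
    ORDER BY m.k
""").fetchall()

# FULL OUTER, emulated: the left join plus the numbers that found no name.
full_rows = conn.execute("""
    SELECT n.name, m.number
    FROM names n LEFT JOIN numbers m ON n.k = m.k
    UNION ALL
    SELECT n.name, m.number
    FROM numbers m LEFT JOIN names n ON n.k = m.k
    WHERE n.k IS NULL
""").fetchall()

for label, rows in [("inner", inner_rows), ("left", left_rows),
                    ("right", right_rows), ("full", full_rows)]:
    print(label, rows)
```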
In Explaining Self-Joins, take the inner join example that was given. But what if some bosses list an employee who cannot be found in the employee table, or even list a null value? And what if an employee in the employee table names a boss who is not in the boss table? Or maybe the employee doesn't have a boss at all? (This is realistic, since some people are self-employed.)
Then we have to decide what exactly we are trying to query. Do we need to include self-employed employees? If so, INNER JOIN is ruled out. Next we have to decide whether we want to include bosses that don't have any employees among the people in the database.
Realistically, I can imagine that we will be doing a left or right join.
First of all, a left join is an outer join.
This matters because the definitions of inner join and left join stay the same for a self-join. So let's say you have a typical Employee table with a manager column. Now, for the sake of argument, say that one employee's manager is not in the employee table. If you do the typical inner self-join, you will not get that employee's row. But with a left join you do get the row back.
Another use case for left join is finding records that are in the left table but not in the right one, using a where clause like where right_table.key is null. You can do the same thing with a self-join, which you cannot do with an inner join.
Here are some queries explaining the above scenario, and you can see additional things you can do with left join but not with inner join.
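The linked queries aren't reproduced here, so the following is only a sketch of the scenario described, assuming SQLite via Python's sqlite3 module. The employee table, its column names, and its four rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employee VALUES
        (1, 'alice', NULL),  -- no manager recorded
        (2, 'bob',   1),
        (3, 'carol', 1),
        (4, 'dave',  99);    -- manager 99 is not in the table
""")

# Inner self-join: alice and dave vanish because no manager row matches.
inner_rows = conn.execute("""
    SELECT e.name, m.name
    FROM employee e INNER JOIN employee m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

# Left self-join: every employee is kept; unmatched managers come back as NULL.
left_rows = conn.execute("""
    SELECT e.name, m.name
    FROM employee e LEFT JOIN employee m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

# Left join plus "right side IS NULL": employees whose manager_id points
# at a row that does not exist.
dangling_rows = conn.execute("""
    SELECT e.name, e.manager_id
    FROM employee e LEFT JOIN employee m ON e.manager_id = m.id
    WHERE m.id IS NULL
      AND e.manager_id IS NOT NULL
    ORDER BY e.id
""").fetchall()

print(inner_rows)
print(left_rows)
print(dangling_rows)
```

Dropping the e.manager_id IS NOT NULL filter would also return alice, since a NULL manager_id never matches anything either.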
LEFT (OUTER) JOIN ON, by definition, gives the rows that INNER JOIN ON gives plus the unmatched left rows extended with NULLs. So if every left row is matched, they give the same answer. In particular, if the ON condition is the equality of a non-NULL FK (foreign key) of the left table referencing a PK (primary key) or UNIQUE NOT NULL column set in the other table, then every left row has a match and they give the same answer. The same goes for RIGHT JOIN and the right table.
So in a LEFT self-JOIN, if every left row is matched, it gives the same answer as INNER. And in particular, if the ON condition is the equality of a non-NULL FK of the table referencing its own PK or UNIQUE NOT NULL column set, then every row has a match and the two joins give the same answer.
For example, if every employee has a manager, then in EMPLOYEE(e,...,m), m is NOT NULL and FOREIGN KEY (m) REFERENCES the PK (e), so a LEFT self-JOIN ON left.m = right.e gives the same result as INNER.
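A quick way to check this claim is a sketch like the one below, assuming SQLite via Python's sqlite3 module. The column names e and m follow EMPLOYEE(e,...,m); the table names, the sample rows, and the join_rows helper are hypothetical. The employee table enforces the NOT NULL foreign key (employee 1 manages themself so the constraint can hold), while loose_employee drops the constraint to act as a counterexample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    -- Every employee has a manager: m is NOT NULL and references the PK e.
    CREATE TABLE employee (
        e INTEGER PRIMARY KEY,
        m INTEGER NOT NULL REFERENCES employee (e)
    );
    INSERT INTO employee VALUES (1, 1), (2, 1), (3, 2);

    -- Same shape without the constraint: employee 1 has no manager.
    CREATE TABLE loose_employee (e INTEGER PRIMARY KEY, m INTEGER);
    INSERT INTO loose_employee VALUES (1, NULL), (2, 1);
""")


def join_rows(table, kind):
    """Self-join `table` ON l.m = r.e using the given join kind (INNER or LEFT)."""
    sql = (
        f"SELECT l.e, r.e FROM {table} l {kind} JOIN {table} r "
        "ON l.m = r.e ORDER BY l.e"
    )
    return conn.execute(sql).fetchall()


# With the NOT NULL FK, every left row is matched, so the results coincide.
same_with_fk = join_rows("employee", "INNER") == join_rows("employee", "LEFT")

# Without it, the unmatched row shows up only in the LEFT join.
loose_inner = join_rows("loose_employee", "INNER")
loose_left = join_rows("loose_employee", "LEFT")

print(same_with_fk)
print(loose_inner)
print(loose_left)
```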
PS When you have a hypothesis, you can look for counterexamples that would refute it. Almost any small random self-join would have disproved yours. Did you try? If you have a "feeling" that the hypothesis holds for some special cases, you can test those the same way.