Is there a more elegant way to write this SQL query?
I'm doing Stanfords' introduction to DB course and this is one of the homework assignments. My code does the job well, but I don't really like how I used the same SELECT-FROM-JOIN part twice:
SELECT name, grade
FROM Highschooler
WHERE
ID IN (
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
WHERE H1.grade = H2.grade
) AND
ID NOT IN (
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
WHERE H1.grade <> H2.grade
)
ORDER BY grade, name
This is the SQL schema for the two tables used in the code:
Highschooler(ID int, name text, grade int);
Friend(ID1 int, ID2 int);
I had to ask all high school students who only have friends in one class and not in other classes. Is there a way to somehow write the code below once and reuse it twice for two different WHERE = and <> clauses?
SELECT H1.ID
FROM Friend
JOIN Highschooler AS H1
ON Friend.ID1 = H1.ID
JOIN Highschooler AS H2
ON Friend.ID2 = H2.ID
EDIT: we need to provide the SQLite code.
source to share
This is an example of a "floating child" for a query WHERE EXISTS
:
SELECT name, grade
FROM Highschooler ME
WHERE EXISTS (
SELECT 1
FROM Friend F
JOIN Highschooler OTHER on F.ID2=OTHER.ID
WHERE F.ID1=ME.ID AND OTHER.Grade = ME.GRADE
)
AND NOT EXISTS (
SELECT 1
FROM Friend F
JOIN Highschooler OTHER on F.ID2=OTHER.ID
WHERE F.ID1=ME.ID AND OTHER.Grade <> ME.GRADE
)
Condition EXISTS
true
if it is SELECT
returned by one or more rows; otherwise it is false
. All you have to do is map the inner subquery to the outer one (part of F.ID1=ME.ID
) and add the rest of the constraints you need ( OTHER.Grade = ME.GRADE
or OTHER.Grade <> ME.GRADE
) to your query.
source to share
This is a typical question about human-related groups. When faced with a question like this, one approach is to use joins (looking at things in pairs). Often the best approach is to use aggregation to view the entire group at once.
The understanding here is that if you have a group of friends and they are all in the same class, then the minimum and maximum marks will be the same.
This hint may be enough for you to write your request. If so, stop here.
A query that returns what you want is much easier than what you were doing. You just need to look at your friends' ratings:
SELECT f.id1
FROM Friend f jJOIN
Highschooler fh
ON Friend.ID1 = fh.ID join
group by f.id1
having max(fh.grade) = min(fh.grade)
The clause having
makes sure everyone is the same (ignore NULL values).
EDIT:
This version answers the question: which high school students have friends, all of whom are in the same class. Your question is ambiguous. Perhaps you mean the friends and the original person are in the same class. If so, then you can do it with a little modification. One way is to change the sentence having
to:
having max(fh.grade) = min(fh.grade) and
max(fh.grade) = (select grade from Highschooler h where f.id1 = h.id1)
This checks that the friends and the original are in the same class.
source to share
Sometimes you can get a more natural query form when you include some filtering joins in given operations, such as UNION or MINUS / EXCEPT. Your request could be written like (pseudocode):
SELECT H.id
FROM Highschooler H
JOIN .... | has a friend
WHERE ... | in SAME grade
EXCEPT
SELECT H.id
FROM Highschooler H
JOIN .... | has a friend
WHERE ... | in OTHER grade
Some SQL machines use the "MINUS" keyword, some use "EXCEPT".
But note, very similar to UNION, this will execute both queries and then filter their results. This may have different performance than a single all-all-all query, but remember that it is not necessarily worse. Many times I find it even has better performance, as "excluding" one column at a time, especially a sorted one, is very fast
Also, if your DB engine allows, you can try to use View or CTE to shorten the original query, but I don't see much point in doing this other than aesthetics
source to share