Is there a more elegant way to write this SQL query?

I'm doing Stanfords' introduction to DB course and this is one of the homework assignments. My code does the job well, but I don't really like how I used the same SELECT-FROM-JOIN part twice:

SELECT name, grade
FROM Highschooler
WHERE
    ID IN (
        SELECT H1.ID
        FROM Friend
        JOIN Highschooler AS H1
            ON Friend.ID1 = H1.ID
        JOIN Highschooler AS H2
            ON Friend.ID2 = H2.ID
        WHERE H1.grade = H2.grade    
    ) AND
    ID NOT IN (
        SELECT H1.ID
        FROM Friend
        JOIN Highschooler AS H1
            ON Friend.ID1 = H1.ID
        JOIN Highschooler AS H2
            ON Friend.ID2 = H2.ID
        WHERE H1.grade <> H2.grade
    )
ORDER BY grade, name

      

This is the SQL schema for the two tables used in the code:

Highschooler(ID int, name text, grade int);
Friend(ID1 int, ID2 int);

      

I had to ask all high school students who only have friends in one class and not in other classes. Is there a way to somehow write the code below once and reuse it twice for two different WHERE = and <> clauses?

    SELECT H1.ID
    FROM Friend
    JOIN Highschooler AS H1
        ON Friend.ID1 = H1.ID
    JOIN Highschooler AS H2
        ON Friend.ID2 = H2.ID

      

EDIT: we need to provide the SQLite code.

+3


source to share


4 answers


This is an example of a "floating child" for a query WHERE EXISTS

:

SELECT name, grade
FROM Highschooler ME
WHERE EXISTS (
    SELECT 1
    FROM Friend F
    JOIN Highschooler OTHER on F.ID2=OTHER.ID
    WHERE F.ID1=ME.ID AND OTHER.Grade = ME.GRADE
)
AND NOT EXISTS (
    SELECT 1
    FROM Friend F
    JOIN Highschooler OTHER on F.ID2=OTHER.ID
    WHERE F.ID1=ME.ID AND OTHER.Grade <> ME.GRADE
)

      



Condition EXISTS

true

if it is SELECT

returned by one or more rows; otherwise it is false

. All you have to do is map the inner subquery to the outer one (part of F.ID1=ME.ID

) and add the rest of the constraints you need ( OTHER.Grade = ME.GRADE

or OTHER.Grade <> ME.GRADE

) to your query.

+4


source


This is a typical question about human-related groups. When faced with a question like this, one approach is to use joins (looking at things in pairs). Often the best approach is to use aggregation to view the entire group at once.

The understanding here is that if you have a group of friends and they are all in the same class, then the minimum and maximum marks will be the same.

This hint may be enough for you to write your request. If so, stop here.

A query that returns what you want is much easier than what you were doing. You just need to look at your friends' ratings:

SELECT f.id1
FROM Friend f jJOIN
     Highschooler fh
     ON Friend.ID1 = fh.ID join
group by f.id1
having max(fh.grade) = min(fh.grade)

      



The clause having

makes sure everyone is the same (ignore NULL values).

EDIT:

This version answers the question: which high school students have friends, all of whom are in the same class. Your question is ambiguous. Perhaps you mean the friends and the original person are in the same class. If so, then you can do it with a little modification. One way is to change the sentence having

to:

having max(fh.grade) = min(fh.grade) and
       max(fh.grade) = (select grade from Highschooler h where f.id1 = h.id1)

      

This checks that the friends and the original are in the same class.

+4


source


Sometimes you can get a more natural query form when you include some filtering joins in given operations, such as UNION or MINUS / EXCEPT. Your request could be written like (pseudocode):

  SELECT H.id
  FROM Highschooler H
  JOIN .... | has a friend
  WHERE ... | in SAME grade

EXCEPT

  SELECT H.id
  FROM Highschooler H
  JOIN .... | has a friend
  WHERE ... | in OTHER grade

      

Some SQL machines use the "MINUS" keyword, some use "EXCEPT".

But note, very similar to UNION, this will execute both queries and then filter their results. This may have different performance than a single all-all-all query, but remember that it is not necessarily worse. Many times I find it even has better performance, as "excluding" one column at a time, especially a sorted one, is very fast

Also, if your DB engine allows, you can try to use View or CTE to shorten the original query, but I don't see much point in doing this other than aesthetics

+1


source


Some databases support the minus keyword.

select whatever
from wherever
where id in
(select id
 from somewhere
 where something
 minus
 select id
 from somewhere
 where something else
 )

      

Other databases support the same concept, but with a keyword other than, not a minus.

0


source







All Articles