SQL Query compares data in different rows
I had this question yesterday in a job interview, and although it seemed pretty straight forward, I couldn't figure it out and it kept me awake all night thinking about it.
The system records data about the schoot league, there is a table of participants, a table of games and a join_games table.
members: member_id, name : pk(member_id)
member_games: game_id, member_id, score : pk(game_id, member_id)
games: game_id, location, date : pk(game_id)
members
1, mick
2, keith
3, charlie
member_games
1, 1, 50
1, 2, 60
2, 1, 45
2, 3, 105
3, 1, 30
3, 3, 120
game
1, london, 2012-12-01
2, richmond, 2012-12-02
3, leeds, 2012-12-03
How would you formulate the SQL query to find out the number of wins for member_id
= 1?
source to share
The following query will give you the number of wins for member_id 1.
select count(*) as number_of_wins
from member_games as mg1
where member_id = 1
and not exists (select 1
from member_games as mg2
where mg2.game_id = mg1.game_id
and mg2.member_id <> mg1.member_id
and mg2.score >= mg1.score);
The following query will provide you with a list of entrants with the corresponding number of wins.
select m.*,
(select count(*) from member_games as mg1
where member_id = m.member_id
and not exists (select 1
from member_games as mg2
where mg2.game_id = mg1.game_id
and mg2.member_id <> mg1.member_id
and mg2.score >= mg1.score)
) as number_of_wins
from members as m;
These requests do not treat first place ties as wins.
I created a MySQL SQL Fiddle Demo and a SQL Server SQL Fiddle Demo .
source to share
Query to find the number of wins for member_id = 1
,
SELECT COUNT(1) "No. of Wins"
FROM (SELECT game_id, member_id, score
FROM member_games b
WHERE score =
(SELECT max(score) from member_games WHERE game_id = b.game_id)) A
WHERE member_id = 1;
See this SQLFiddle
source to share
The key is to first group the member members and get the highest score, then join them back into the member_games table to get the member_id.
The point is, you need a left join to see that Member_id has won 0 games.
SELECT
member_games.member_id
,COUNT(BestScore.game_id)
FROM member_games
LEFT JOIN
(
SELECT game_id, MAX(score) AS HighestScore FROM member_games GROUP BY Game_ID
) BestScore ON member_games.Score = BestScore.HighestScore
AND member_games.game_id = BestScore.game_id
WHERE member_games.member_id = 1
GROUP BY member_games.member_id;
Here is the SQL Fiddle as MySQL
This solution counts links as gains, but should work on any SQL server. The Rank function is available in Microsoft SQL Server 2005 and later.
For the sake of completeness, here's a more complex query that doesn't account for linking how it wins:
SELECT
member_games.member_id
,COUNT(BestScore.game_id)
FROM member_games
LEFT JOIN
(
SELECT member_games.game_id, HighestScore
FROM member_games
LEFT JOIN
(
SELECT game_id, MAX(score) AS HighestScore FROM member_games GROUP BY Game_ID
) BestScore ON member_games.Score = BestScore.HighestScore
AND member_games.game_id = BestScore.game_id
GROUP BY game_id, HighestScore
HAVING count(1) = 1
) BestScore ON member_games.Score = BestScore.HighestScore
WHERE member_games.member_id = 1
GROUP BY member_games.member_id;
Links as losses on SQL Fiddle like MySQL
source to share
This should be done (SQL Server syntax)
SELECT member_id
,COUNT(*) Wins
FROM (
SELECT member_id
,RANK() OVER (PARTITION BY Game_Id ORDER BY score DESC) Position
FROM member_games
) Positions
WHERE Position=1
AND
member_id=1
GROUP BY member_id
RANK()
allows for the possibility of relationships that are possible (if unlikely) in Scrabble. ROW_NUMBER()
may offer a slight performance boost in games where a tie is not possible.
This method can be easily expanded to show all interference as
SELECT member_id
,Position
,COUNT(*) Games
FROM (
SELECT member_id
,RANK() OVER (PARTITION BY Game_Id ORDER BY score DESC) Position
FROM member_games
) Positions
WHERE member_id=1
GROUP BY member_id
,Position
See this SQL Fiddle
source to share