SELECT rows whose value is duplicated N times, without COUNT or GROUP BY
Let's say I have a table of person IDs (1-8) and role IDs (1-4), like so:
CREATE TABLE personRole (
PersonId int NOT NULL,
RoleId int NOT NULL
);
INSERT INTO personRole
VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 3),
(3, 3),
(4, 3),
(1, 4),
(5, 2),
(6, 1),
(7, 1),
(7, 4),
(8, 1),
(8, 2),
(8, 4)
;
My goal is to select the ID of every person who has three or more roles, specifically roles 1, 2 and 4. Here is my first solution:
SELECT PersonId FROM personRole
WHERE RoleID in (1,2,4)
GROUP BY PersonId
HAVING count(*) >= 3
But then I was told to do it without GROUP BY because it is slow, so I came up with this solution:
SELECT distinct PersonId
FROM
(
SELECT PersonId, count(*) over(partition by PersonId) AS pcount
FROM (SELECT * FROM personRole WHERE RoleID in (1,2,4)) AS A
) AS S
WHERE pcount >= 3
I have included both queries to show what I am trying to achieve. But now I have been told to do it without counting at all. So far I can find every row whose person ID appears more than once:
SELECT personId
FROM personRole AS a
WHERE EXISTS (
SELECT 1
FROM personRole AS a2
WHERE a2.PersonId = a.PersonId
AND a2.RoleID <> a.RoleID
);
But I get stuck trying to work out how to select them only if they appear 3 or more times. If I can do that, I suspect I can just combine it with:
SELECT PersonId FROM personRole
WHERE RoleID in (1,2,4)
That should give me a complete solution. Am I on the right track so far, or am I heading in the wrong direction?
Does the dictator's "no counting" rule cover every aggregate function? If not, you can always use SUM(1) instead of COUNT(*).
If it does, try a self-join:
select a.PersonId,
a.RoleId,
b.RoleId,
c.RoleId,
d.RoleId
from personRole a
left join personRole b
on a.PersonId = b.PersonId
and a.RoleId <> b.RoleId
left join personRole c
on a.PersonId = c.PersonId
and a.RoleId <> c.RoleId
and b.RoleId <> c.RoleId
left join personRole d
on a.PersonId = d.PersonId
and a.RoleId <> d.RoleId
and b.RoleId <> d.RoleId
and c.RoleId <> d.RoleId
order by a.PersonId, a.RoleId
;
+----------+--------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+--------+
| 1 | 1 | 4 | 2 | NULL |
| 1 | 1 | 2 | 4 | NULL |
| 1 | 2 | 4 | 1 | NULL |
| 1 | 2 | 1 | 4 | NULL |
| 1 | 4 | 2 | 1 | NULL |
| 1 | 4 | 1 | 2 | NULL |
| 2 | 1 | 3 | NULL | NULL |
| 2 | 3 | 1 | NULL | NULL |
| 3 | 3 | NULL | NULL | NULL |
| 4 | 3 | NULL | NULL | NULL |
| 5 | 2 | NULL | NULL | NULL |
| 6 | 1 | NULL | NULL | NULL |
| 7 | 1 | 4 | NULL | NULL |
| 7 | 4 | 1 | NULL | NULL |
| 8 | 1 | 2 | 4 | NULL |
| 8 | 1 | 4 | 2 | NULL |
| 8 | 2 | 1 | 4 | NULL |
| 8 | 2 | 4 | 1 | NULL |
| 8 | 4 | 2 | 1 | NULL |
| 8 | 4 | 1 | 2 | NULL |
+----------+--------+--------+--------+--------+
20 rows in set (0.00 sec)
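The six orderings per person in the output above come from comparing roles with <>. As a minimal sketch of the "three or more distinct roles, no counting" part of the question, the same self-join can use inner joins and strictly increasing RoleId values so that each role triple shows up once. This is not from the thread: it assumes SQLite driven through Python's standard sqlite3 module purely so the check is self-contained, and the names ROWS, AT_LEAST_THREE and people_with_three_roles are made up for the illustration.

```python
import sqlite3

# Sample rows copied from the question's INSERT statement.
ROWS = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 3), (4, 3), (1, 4), (5, 2),
        (6, 1), (7, 1), (7, 4), (8, 1), (8, 2), (8, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personRole (PersonId int NOT NULL, RoleId int NOT NULL)")
conn.executemany("INSERT INTO personRole VALUES (?, ?)", ROWS)

# Inner self-joins with a.RoleId < b.RoleId < c.RoleId: a row survives only
# if the person has three distinct roles, and each triple of roles appears
# once instead of in all six orderings. No COUNT, no GROUP BY.
AT_LEAST_THREE = """
SELECT DISTINCT a.PersonId
FROM personRole a
JOIN personRole b ON b.PersonId = a.PersonId AND b.RoleId > a.RoleId
JOIN personRole c ON c.PersonId = a.PersonId AND c.RoleId > b.RoleId
ORDER BY a.PersonId
"""

people_with_three_roles = [pid for (pid,) in conn.execute(AT_LEAST_THREE)]
print(people_with_three_roles)  # [1, 8]
```

DISTINCT is there because a person with four or more roles would otherwise be listed once per triple. For "N or more" with a larger N, this pattern needs one extra join per role, so it only stays readable for small N.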
Narrow the left-join query down with a WHERE clause that requires a value in c.RoleId, and use your magic numbers to cull the Cartesian product, like this:
select a.PersonId,
a.RoleId,
b.RoleId,
c.RoleId
from personRole a
left join personRole b
on a.PersonId = b.PersonId
left join personRole c
on a.PersonId = c.PersonId
where
b.RoleId <> a.RoleId
and b.RoleId <> c.RoleId
and c.RoleId <> a.RoleId
and c.RoleId <> b.RoleId
and a.RoleId = 1
and b.RoleId = 2
and c.RoleId = 4
order by a.PersonId, a.RoleId
;
+----------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+
| 1 | 1 | 2 | 4 |
| 8 | 1 | 2 | 4 |
+----------+--------+--------+--------+
2 rows in set (0.00 sec)
If you want it even more compact and you are only looking for this one case, you can drop the left joins and compare the values directly:
mysql> select a.PersonId,
-> a.RoleId,
-> b.RoleId,
-> c.RoleId
-> from personRole a,
-> personRole b,
-> personRole c
-> where
-> a.PersonId = b.PersonId
-> and a.PersonId = c.PersonId
-> and a.RoleId = 1
-> and b.RoleId = 2
-> and c.RoleId = 4
-> order by a.PersonId, a.RoleId
-> ;
+----------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+
| 1 | 1 | 2 | 4 |
| 8 | 1 | 2 | 4 |
+----------+--------+--------+--------+
2 rows in set (0.00 sec)
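Both filtered queries above return the same two rows because a WHERE condition on b.RoleId and c.RoleId discards the NULL-extended rows a LEFT JOIN would otherwise keep. The following is a small sketch to check that equivalence on the question's data. It assumes SQLite via Python's standard sqlite3 module rather than the MySQL client shown above, drops the redundant <> comparisons, and the names LEFT_JOIN_FILTERED, COMMA_JOIN, left_rows and comma_rows are hypothetical.

```python
import sqlite3

# Sample rows copied from the question's INSERT statement.
ROWS = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 3), (4, 3), (1, 4), (5, 2),
        (6, 1), (7, 1), (7, 4), (8, 1), (8, 2), (8, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personRole (PersonId int NOT NULL, RoleId int NOT NULL)")
conn.executemany("INSERT INTO personRole VALUES (?, ?)", ROWS)

# LEFT JOINs whose right-hand columns are then tested in WHERE: rows where
# b or c is NULL fail the equality tests, so nothing "left-only" survives.
LEFT_JOIN_FILTERED = """
SELECT a.PersonId, a.RoleId, b.RoleId, c.RoleId
FROM personRole a
LEFT JOIN personRole b ON a.PersonId = b.PersonId
LEFT JOIN personRole c ON a.PersonId = c.PersonId
WHERE a.RoleId = 1 AND b.RoleId = 2 AND c.RoleId = 4
ORDER BY a.PersonId
"""

# The compact comma-join form from the answer.
COMMA_JOIN = """
SELECT a.PersonId, a.RoleId, b.RoleId, c.RoleId
FROM personRole a, personRole b, personRole c
WHERE a.PersonId = b.PersonId AND a.PersonId = c.PersonId
  AND a.RoleId = 1 AND b.RoleId = 2 AND c.RoleId = 4
ORDER BY a.PersonId
"""

left_rows = conn.execute(LEFT_JOIN_FILTERED).fetchall()
comma_rows = conn.execute(COMMA_JOIN).fetchall()
print(left_rows == comma_rows)  # True
print(comma_rows)               # [(1, 1, 2, 4), (8, 1, 2, 4)]
```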
You can use self-joins, although I don't know whether that would be more efficient than your other solutions. This avoids aggregate functions entirely, since it seems you cannot use them.
select a.PersonId
from personRole a
join personRole b on a.PersonId = b.PersonId
and b.RoleId = 2
join personRole c on a.PersonId = c.PersonId
and c.RoleId = 4
where a.RoleId = 1
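One thing to watch with the join form is that the table has no unique constraint on (PersonId, RoleId), so a repeated row would make the same person appear more than once. As a hedged sketch of an alternative in the spirit of the EXISTS query from the question, the two extra roles can be tested with EXISTS instead of joined. This assumes SQLite via Python's standard sqlite3 module; the duplicate (8, 2) row is hypothetical and added only to show the difference, and JOIN_FORM, EXISTS_FORM, join_ids and exists_ids are made-up names.

```python
import sqlite3

# Sample rows from the question, plus one HYPOTHETICAL duplicate (8, 2)
# that is not in the original data.
ROWS = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 3), (4, 3), (1, 4), (5, 2),
        (6, 1), (7, 1), (7, 4), (8, 1), (8, 2), (8, 4), (8, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personRole (PersonId int NOT NULL, RoleId int NOT NULL)")
conn.executemany("INSERT INTO personRole VALUES (?, ?)", ROWS)

# The join form from the answer: one output row per matching (a, b, c) triple.
JOIN_FORM = """
SELECT a.PersonId
FROM personRole a
JOIN personRole b ON a.PersonId = b.PersonId AND b.RoleId = 2
JOIN personRole c ON a.PersonId = c.PersonId AND c.RoleId = 4
WHERE a.RoleId = 1
ORDER BY a.PersonId
"""

# EXISTS only asks whether a matching row is present, so repeated role-2 or
# role-4 rows do not multiply the output. A repeated role-1 row still would.
EXISTS_FORM = """
SELECT a.PersonId
FROM personRole a
WHERE a.RoleId = 1
  AND EXISTS (SELECT 1 FROM personRole b
              WHERE b.PersonId = a.PersonId AND b.RoleId = 2)
  AND EXISTS (SELECT 1 FROM personRole c
              WHERE c.PersonId = a.PersonId AND c.RoleId = 4)
ORDER BY a.PersonId
"""

join_ids = [pid for (pid,) in conn.execute(JOIN_FORM)]
exists_ids = [pid for (pid,) in conn.execute(EXISTS_FORM)]
print(join_ids)    # [1, 8, 8]
print(exists_ids)  # [1, 8]
```

On the question's data as posted, which has no repeated rows, both forms return persons 1 and 8 once each; adding DISTINCT to the join form would also remove the repeat.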