SELECT duplicate values with count N without COUNT or GROUP BY

Question

Let's say I have a table of person IDs (1-8) and person roles (1-4), like this:

CREATE TABLE personRole (
PersonId int NOT NULL,
RoleId int NOT NULL
);

INSERT INTO personRole
VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 3),
(3, 3),
(4, 3),
(1, 4),
(5, 2),
(6, 1),
(7, 1),
(7, 4),
(8, 1),
(8, 2),
(8, 4)
;

My goal is to select the ID of every person who has 3 or more roles, specifically roles 1, 2 and 4. Here is my first solution:

SELECT PersonId FROM personRole
WHERE RoleID in (1,2,4)
GROUP BY PersonId
HAVING count(*) >= 3

But then I was told to do it without GROUP BY because it is slow, so I came up with this solution:

SELECT distinct PersonId
FROM 
(
  SELECT PersonId, count(*) over(partition by PersonId) AS pcount
  FROM (SELECT * FROM personRole WHERE RoleID in (1,2,4)) AS A
) AS S
WHERE pcount >= 3

I have included both queries as examples of what I am trying to achieve. But now I was told to try it without counting. Currently I can find all rows that have a duplicate person ID:

 SELECT personId
 FROM personRole AS a
 WHERE EXISTS (
   SELECT 1
   FROM   personRole AS a2
   WHERE  a2.PersonId = a.PersonId
   AND    a2.RoleID <> a.RoleID
 );

But I am stuck on how to select only the IDs that are repeated 3 or more times. If I can do that, I suspect I can just filter it with:

SELECT PersonId FROM personRole
WHERE RoleID in (1,2,4)

To get a complete solution. Am I solving this correctly so far, or am I going in the wrong direction?

+3

mysql

Ryan Apr 25 '17 at 19:46

2 answers

You can use self-joins, although I don't know whether that would be any more efficient than your other solutions. This avoids aggregate functions entirely, since it seems you are not allowed to use them.

select a.PersonId
from personRole a
    join personRole b on a.PersonId = b.PersonId
        and b.RoleId = 2
    join personRole c on a.PersonId = c.PersonId
        and c.RoleId = 4
where a.RoleId = 1
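
The schema in the question declares no unique key on (PersonId, RoleId), so if the same pair can appear more than once (an assumption worth checking against the real data), the join above returns one row per matching combination. A minimal sketch of the same logic written with EXISTS, still without aggregates:

```sql
-- Sketch: same logic as the self-join, written with EXISTS.
-- DISTINCT only guards against repeated (PersonId, 1) rows; it is not an aggregate.
SELECT DISTINCT a.PersonId
FROM personRole AS a
WHERE a.RoleId = 1
  AND EXISTS (SELECT 1
              FROM personRole AS b
              WHERE b.PersonId = a.PersonId
                AND b.RoleId = 2)
  AND EXISTS (SELECT 1
              FROM personRole AS c
              WHERE c.PersonId = a.PersonId
                AND c.RoleId = 4);
```

With the sample data from the question this should return PersonId 1 and 8, the same people the join finds.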

+2

Jen R Apr 25 '17 at 20:14

Bryan newman · Accepted Answer · 2017-04-25T20:05:02+0000

Does whoever is dictating these rules count every aggregate function as off-limits under "without counting"? If not, you can always use SUM(1) instead of COUNT(*).

Otherwise, try a self-join.

select a.PersonId,
       a.RoleId,
       b.RoleId,
       c.RoleId,
       d.RoleId
from personRole a
left join personRole b
    on a.PersonId = b.PersonId
    and a.RoleId <> b.RoleId
left join personRole c
    on a.PersonId = c.PersonId
    and a.RoleId <> c.RoleId
    and b.RoleId <> c.RoleId
left join personRole d
    on a.PersonId = d.PersonId
    and a.RoleId <> d.RoleId
    and b.RoleId <> d.RoleId
    and c.RoleId <> d.RoleId
order by a.PersonId, a.RoleId
;

+----------+--------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+--------+
|        1 |      1 |      4 |      2 |   NULL |
|        1 |      1 |      2 |      4 |   NULL |
|        1 |      2 |      4 |      1 |   NULL |
|        1 |      2 |      1 |      4 |   NULL |
|        1 |      4 |      2 |      1 |   NULL |
|        1 |      4 |      1 |      2 |   NULL |
|        2 |      1 |      3 |   NULL |   NULL |
|        2 |      3 |      1 |   NULL |   NULL |
|        3 |      3 |   NULL |   NULL |   NULL |
|        4 |      3 |   NULL |   NULL |   NULL |
|        5 |      2 |   NULL |   NULL |   NULL |
|        6 |      1 |   NULL |   NULL |   NULL |
|        7 |      1 |      4 |   NULL |   NULL |
|        7 |      4 |      1 |   NULL |   NULL |
|        8 |      1 |      2 |      4 |   NULL |
|        8 |      1 |      4 |      2 |   NULL |
|        8 |      2 |      1 |      4 |   NULL |
|        8 |      2 |      4 |      1 |   NULL |
|        8 |      4 |      2 |      1 |   NULL |
|        8 |      4 |      1 |      2 |   NULL |
+----------+--------+--------+--------+--------+
20 rows in set (0.00 sec)
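
The six rows per person in the output above are the permutations of the same three roles, a side effect of comparing with <>. For the more general question of finding people with at least three distinct roles, one possible sketch (not part of the original answer) orders the comparisons with > so that each combination appears only once:

```sql
-- Sketch: persons with at least three distinct roles, no COUNT or GROUP BY.
-- Requiring a.RoleId < b.RoleId < c.RoleId keeps one row per combination
-- instead of all six permutations.
SELECT DISTINCT a.PersonId
FROM personRole AS a
JOIN personRole AS b
  ON b.PersonId = a.PersonId
 AND b.RoleId > a.RoleId
JOIN personRole AS c
  ON c.PersonId = a.PersonId
 AND c.RoleId > b.RoleId;
```

DISTINCT is still needed because a person with four or more roles would match several combinations. With the sample data, only persons 1 and 8 have three distinct roles, so those are the two IDs this should return.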

Restrict this with a WHERE clause that requires a value in c.RoleId, and use your specific role values (1, 2 and 4) to cull the Cartesian product, like this:

select a.PersonId, 
       a.RoleId, 
       b.RoleId, 
       c.RoleId
from personRole a
left join personRole b
    on a.PersonId = b.PersonId
left join personRole c
    on a.PersonId = c.PersonId
where 
    b.RoleId <> a.RoleId
    and b.RoleId <> c.RoleId
    and c.RoleId <> a.RoleId
    and c.RoleId <> b.RoleId
    and a.RoleId = 1
    and b.RoleId = 2
    and c.RoleId = 4
order by a.PersonId, a.RoleId
;

+----------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+
|        1 |      1 |      2 |      4 |
|        8 |      1 |      2 |      4 |
+----------+--------+--------+--------+
2 rows in set (0.00 sec)

If you want it to be even more compact and you are looking for just this one case, you can do away with the left joins and compare the values directly:

select a.PersonId,
       a.RoleId,
       b.RoleId,
       c.RoleId
from personRole a,
     personRole b,
     personRole c
where
    a.PersonId = b.PersonId
    and a.PersonId = c.PersonId
    and a.RoleId = 1
    and b.RoleId = 2
    and c.RoleId = 4
order by a.PersonId, a.RoleId
;

+----------+--------+--------+--------+
| PersonId | RoleId | RoleId | RoleId |
+----------+--------+--------+--------+
|        1 |      1 |      2 |      4 |
|        8 |      1 |      2 |      4 |
+----------+--------+--------+--------+
2 rows in set (0.00 sec)
