SQL - find all instances where two columns are the same

I have a simple table containing comments. Each comment comes from a user and belongs to a specific blog post.

id  |  user           |  post_id  |  comment
----------------------------------------------------------
0   | john@test.com   |  1001     |  great article
1   | bob@test.com    |  1001     |  nice post
2   | john@test.com   |  1002     |  I agree
3   | john@test.com   |  1001     |  thats cool
4   | bob@test.com    |  1002     |  thanks for sharing
5   | bob@test.com    |  1002     |  really helpful
6   | steve@test.com  |  1001     |  spam post about pills

      

I want to get all instances in which a user commented on the same post more than once (that is, rows sharing the same user and the same post_id). In this case, I would return:

id  |  user           |  post_id  |  comment
----------------------------------------------------------
0   | john@test.com   |  1001     |  great article
3   | john@test.com   |  1001     |  thats cool
4   | bob@test.com    |  1002     |  thanks for sharing
5   | bob@test.com    |  1002     |  really helpful

      

I thought DISTINCT was what I needed, but it just gives me unique rows.



3 answers


You can use GROUP BY and HAVING to find the pairs of user and post_id that have multiple entries, then join back to the table to get the full rows:



  SELECT a.*
  FROM table_name a
  JOIN (SELECT user, post_id
        FROM table_name
        GROUP BY user, post_id
        HAVING COUNT(id) > 1
        ) b
  ON a.user = b.user
  AND a.post_id = b.post_id
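
To check the query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name comments and the helper find_repeat_comments are made up for this illustration, and SQLite is only a stand-in for whatever database you actually use. The column name is quoted as "user" because USER is a reserved word in several dialects.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (0, "john@test.com", 1001, "great article"),
    (1, "bob@test.com", 1001, "nice post"),
    (2, "john@test.com", 1002, "I agree"),
    (3, "john@test.com", 1001, "thats cool"),
    (4, "bob@test.com", 1002, "thanks for sharing"),
    (5, "bob@test.com", 1002, "really helpful"),
    (6, "steve@test.com", 1001, "spam post about pills"),
]


def find_repeat_comments(rows):
    """Return full rows whose (user, post_id) pair occurs more than once."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; "user" is quoted to stay portable.
    conn.execute(
        'CREATE TABLE comments '
        '(id INTEGER, "user" TEXT, post_id INTEGER, comment TEXT)'
    )
    conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT a.id, a."user", a.post_id, a.comment
        FROM comments a
        JOIN (SELECT "user", post_id
              FROM comments
              GROUP BY "user", post_id
              HAVING COUNT(id) > 1) b
          ON a."user" = b."user"
         AND a.post_id = b.post_id
        ORDER BY a.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in find_repeat_comments(ROWS):
        print(row)
```

The subquery only identifies which (user, post_id) pairs repeat; the outer join is what brings back the id and comment columns that GROUP BY would otherwise collapse.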

      



DISTINCT removes duplicate rows, so you only get unique rows.

You can try using CROSS JOIN (available as of Hive 0.10, according to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins ):



SELECT mt.*
FROM MYTABLE mt
CROSS JOIN MYTABLE mt2
WHERE mt.user = mt2.user
AND mt.post_id = mt2.post_id
AND mt.id <> mt2.id  -- without this, every row matches itself and is returned

      

Performance may not be the best. If you want to sort the result, use SORT BY or ORDER BY.
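
One thing to watch with a self-join on user and post_id: every row trivially matches itself, so the join needs a condition such as mt.id <> mt2.id to keep only real duplicates. The sketch below shows the difference on the question's data. It runs on SQLite through Python's sqlite3 module rather than Hive, and the table name mytable, the column name usr, and the helper self_join_ids are assumptions made for this example. DISTINCT is used so that a user with three or more comments on one post is not listed several times.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (0, "john@test.com", 1001, "great article"),
    (1, "bob@test.com", 1001, "nice post"),
    (2, "john@test.com", 1002, "I agree"),
    (3, "john@test.com", 1001, "thats cool"),
    (4, "bob@test.com", 1002, "thanks for sharing"),
    (5, "bob@test.com", 1002, "really helpful"),
    (6, "steve@test.com", 1001, "spam post about pills"),
]


def self_join_ids(rows, exclude_self):
    """Return the ids selected by the self-join.

    exclude_self=True adds `mt.id <> mt2.id`, so a row must match
    a different row with the same usr and post_id.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; `usr` avoids quoting the keyword USER.
    conn.execute(
        "CREATE TABLE mytable "
        "(id INTEGER, usr TEXT, post_id INTEGER, comment TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    extra = "AND mt.id <> mt2.id" if exclude_self else ""
    query = f"""
        SELECT DISTINCT mt.id
        FROM mytable mt
        CROSS JOIN mytable mt2
        WHERE mt.usr = mt2.usr
          AND mt.post_id = mt2.post_id
          {extra}
        ORDER BY mt.id
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


if __name__ == "__main__":
    print("without id check:", self_join_ids(ROWS, exclude_self=False))
    print("with id check:   ", self_join_ids(ROWS, exclude_self=True))
```

Without the id check the query selects every row in the table, including the single comments from steve and the one-off comments from john and bob.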



-- Sample data from the question in a T-SQL table variable
-- (the column is named usr because USER is a reserved word in T-SQL)
DECLARE @MyTable TABLE (id int, usr varchar(50), post_id int, comment varchar(50))
INSERT @MyTable (id, usr, post_id, comment) VALUES (0,'john@test.com',1001,'great article')
INSERT @MyTable (id, usr, post_id, comment) VALUES (1,'bob@test.com',1001,'nice post')
INSERT @MyTable (id, usr, post_id, comment) VALUES (2,'john@test.com',1002,'I agree')
INSERT @MyTable (id, usr, post_id, comment) VALUES (3,'john@test.com',1001,'thats cool')
INSERT @MyTable (id, usr, post_id, comment) VALUES (4,'bob@test.com',1002,'thanks for sharing')
INSERT @MyTable (id, usr, post_id, comment) VALUES (5,'bob@test.com',1002,'really helpful')
INSERT @MyTable (id, usr, post_id, comment) VALUES (6,'steve@test.com',1001,'spam post about pills')

SELECT
    T1.id,
    T1.usr,
    T1.post_id,
    T1.comment
FROM
    @MyTable T1

    INNER JOIN @MyTable T2
    ON T1.usr = T2.usr AND T1.post_id = T2.post_id
GROUP BY
    T1.id,
    T1.usr,
    T1.post_id,
    T1.comment
HAVING
    Count(T2.id) > 1
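
The query above joins each row to all rows with the same usr and post_id (itself included), groups by the left-hand row, and keeps it only when it matched more than one row. As a rough way to see those match counts, here is a sketch on SQLite through Python's sqlite3 module instead of SQL Server; the table name comments and the helper rows_with_match_counts are invented for this example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (0, "john@test.com", 1001, "great article"),
    (1, "bob@test.com", 1001, "nice post"),
    (2, "john@test.com", 1002, "I agree"),
    (3, "john@test.com", 1001, "thats cool"),
    (4, "bob@test.com", 1002, "thanks for sharing"),
    (5, "bob@test.com", 1002, "really helpful"),
    (6, "steve@test.com", 1001, "spam post about pills"),
]


def rows_with_match_counts(rows):
    """Return (id, matches) for rows whose usr/post_id pair repeats.

    `matches` counts the rows sharing usr and post_id with the row,
    including the row itself, so a unique row has matches == 1.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the T-SQL table variable.
    conn.execute(
        "CREATE TABLE comments "
        "(id INTEGER, usr TEXT, post_id INTEGER, comment TEXT)"
    )
    conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT t1.id, COUNT(t2.id) AS matches
        FROM comments t1
        INNER JOIN comments t2
           ON t1.usr = t2.usr AND t1.post_id = t2.post_id
        GROUP BY t1.id, t1.usr, t1.post_id, t1.comment
        HAVING COUNT(t2.id) > 1
        ORDER BY t1.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row_id, matches in rows_with_match_counts(ROWS):
        print(row_id, matches)
```

Because the self-match is counted, the threshold is > 1 rather than > 0; each of the four returned rows here matches exactly two rows.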

      
