SQL: tell if columns are unique with respect to each other
Let's say I have a table:
Role
----
person_id company_id financial_year
How can I determine the following:
1. Whether each person_id occurs at most once per company_id per financial_year in this table.
2. If 1. is false, which person_id, company_id and financial_year combinations occur together more than once.
Edit 1: Edited to add financial_year col
Edit 2: The DBMS platform here is MySQL, although I don't expect it to require a lot of vendor-specific SQL to do this
For the first question, it's generally a good idea to group by the three columns and count, which gives you a result you can then filter if you want:
select
r.company_id, r.person_id, r.financial_year, count(r.person_id)
from
Role as r
group by
r.company_id, r.person_id, r.financial_year
For the second question, add a HAVING clause to the same query so that only the repeated combinations remain:
select
r.company_id, r.person_id, r.financial_year, count(r.person_id)
from
Role as r
group by
r.company_id, r.person_id, r.financial_year
having
count(r.person_id) > 1
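To see what the HAVING version returns, here is a small sketch against made-up data. The column types, the sample rows and the occurrences alias are assumptions for illustration; the question does not give them. Run it in a scratch database, since it creates a table named Role.

```sql
-- Hypothetical schema: the question names the columns but not their types.
CREATE TABLE Role (
    person_id      INT NOT NULL,
    company_id     INT NOT NULL,
    financial_year INT NOT NULL
);

-- Hypothetical sample rows.
INSERT INTO Role (person_id, company_id, financial_year) VALUES
    (1, 10, 2008),
    (1, 10, 2008),  -- same person, company and year again: a duplicate
    (1, 10, 2009),  -- same person and company, different year: fine
    (2, 10, 2008);  -- different person: fine

-- One row per combination that occurs more than once.
SELECT r.company_id, r.person_id, r.financial_year, COUNT(*) AS occurrences
FROM Role AS r
GROUP BY r.company_id, r.person_id, r.financial_year
HAVING COUNT(*) > 1;
-- Returns a single row: company_id 10, person_id 1, financial_year 2008, occurrences 2
```

An empty result from the last query means the answer to the first question is yes: every combination is unique.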
This should do what you want:
select l.person_id, l.company_id, l.financial_year, count(*)
from Role l
inner join Role r
on l.person_id = r.person_id
and l.company_id = r.company_id
and l.financial_year = r.financial_year
group by l.person_id, l.company_id, l.financial_year
having count(*) > 1
Note that this was written as T-SQL (MS SQL Server), but the only part that might be vendor-specific is the table alias syntax, since the rest is ANSI SQL. The aliases are l and r because LEFT and RIGHT are reserved words. The query returns a single row per repeated person / company / year combination, together with a count (the count was not mentioned in the question, but I know it can be useful sometimes). Because every row of a combination joins to every row of the same combination, including itself, that count is the square of the number of occurrences: 4 for a combination that appears twice, 9 for one that appears three times. A unique combination has a count of 1, which the HAVING clause filters out.
I think this will do it for # 1:
select count(*), count(distinct person_id, company_id, financial_year)
from role
(Edit: If the two counts differ, then the table contains more than one row for at least one combination of the three columns, which is what question #1 asks about. The difference between them is the number of surplus rows.)
casperOne's answer will do it for #2.
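The two counts can also be collapsed into a single yes/no value. The sketch below assumes the Role table from the question and MySQL, where a comparison evaluates to 1 or 0 and COUNT(DISTINCT ...) accepts several expressions; the aliases all_unique and d are made up. One caveat: COUNT(DISTINCT ...) skips rows in which any of the listed columns is NULL, so the second query counts the distinct combinations through a derived table instead, which keeps such rows.

```sql
-- 1 if every (person_id, company_id, financial_year) combination is unique, else 0.
-- Assumes none of the three columns contains NULL.
SELECT COUNT(*) = COUNT(DISTINCT person_id, company_id, financial_year) AS all_unique
FROM Role;

-- Variant for nullable columns: SELECT DISTINCT treats NULLs as equal,
-- so rows containing NULL are counted rather than skipped.
SELECT (SELECT COUNT(*) FROM Role) =
       (SELECT COUNT(*)
        FROM (SELECT DISTINCT person_id, company_id, financial_year
              FROM Role) AS d) AS all_unique;
```

Both queries return 1 for an empty table, since there is nothing to duplicate.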
Yes, in general, for duplicate detection,
Select [ColumnList you want to be unique]
From Table
Group By [SameColumn List]
Having Count(*) > 1
In your specific case
Select person_id, company_id, financial_year
From Role
Group By person_id, company_id, financial_year
Having Count(*) > 1
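or, for your question (1), whether each person_id occurs no more than once per company_id per financial_year in this table, count the repeated combinations; a result of 0 means (1) is true:
Select Count(*) As duplicated_combinations
From (Select person_id, company_id, financial_year
From Role
Group By person_id, company_id, financial_year
Having Count(*) > 1) As D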
and for (2), when (1) is false, which person_id, company_id and financial_year occur together more than once, written with a correlated subquery:
Select person_id, company_id, financial_year
From Role T
Where Exists
(Select 1 From Role
Where person_id = T.person_id
And company_id = T.company_id
And financial_year = T.financial_year
Having Count(*) > 1)
Group By person_id, company_id, financial_year