Compare every record of one table with sql server and return duplicates

I have a table as shown below. I wanted to get duplicate records. If the condition is if date2 and date4 have the same dates or dates within less than or equal to 10 days of each other, then the entries are duplicated. I have about 2000 records in the database. An example is shown here with examples. Date1 can be ignored. It may be the same date or different.

ID Number   Code Type   date1     Date2       date3        Date4     status   shortname     CP      deferred
1  EO2      C    TO   9/20/2000  9/1/2010   9/18/2010   9/1/2010     Archi    Ordinary  58.65586    0
2  EO2      C    TO   9/20/2000  9/5/2010   9/18/2010   9/5/2010     Archi    Ordinary  58.65586    0
3  EO2      C    TO   9/21/2000  9/10/2010  9/18/2010   9/10/2010    Archi    Ordinary  58.65586    0
4  EO2      C    TO   9/21/2000  9/24/2010  9/18/2010   9/24/2010    Archi    Ordinary  58.65586    0

      

I wrote below request:

select * from T a
join T b on a.ID = b.ID
where a.[Number] = b.[Number] and a.ID >1

      

Also, I linked this:

SELECT * FROM T a WHERE a.Id IN (SELECT b.Id FROM T b)
EXCEPT
SELECT * FROM T a

      

The problem is that I cannot find a way where each string can compare against each other with the date condition above. I should get a result similar to below as a duplicate:

Number  Code Type   date1     Date2       date3        Date4     status   shortname     CP      deferred
EO2     C    TO   9/20/2000  9/1/2010   9/18/2010   9/1/2010     Archi    Ordinary  58.65586    0
EO2     C    TO   9/20/2000  9/5/2010   9/18/2010   9/5/2010     Archi    Ordinary  58.65586    0
EO2     C    TO   9/21/2000  9/10/2010  9/18/2010   9/10/2010    Archi    Ordinary  58.65586    0

      

Please help. Thanks.

+3


source to share


2 answers


This is how I solved it. Thanks for the help.



select t1.Number,t1.Code,t1.Type,t1.date1,t1.Date2,t1.date3,t1.Date4,t1.stats,t1.shortn‌​ame,t1.CP, t1.deferred 
from T t1 join T t2 on 1=1 and t2.Number = t1.Number and t2.Code = t1.Code and t2.Type = t1.Type and t2.deferred = t1.eferred and t2.CP = t1.CP and t2.status = t1.status and abs(datediff(day,t1.Date2,t2.Date2)) <=10 and abs(datediff(day,t1.date3,t2.date3)) <=10 and abs(datediff(day,t1.Date4,t2.Date4)) <=10 group by t1.Number, t1.Code,t1.Type , t1.date1, t1.Date2,t1.date3,t1.Date4 ,t1.status,t1.shortname,t1.CP, t1.deferred having count(*) > 1

      

0


source


You can use join

to find other identical strings. A join

tries to match all rows from the right table based on a condition on

. For example:

select  *
from    YourTable t1
join    YourTable t2
on      t1.ID < t2.ID -- Must be different rows
                      -- Smaller than presents the duplicates once
                      -- Otherwise you'd get both 1,3 and 3,1
        and abs(datediff(day, t1.date2, t2.date2)) <= 10
        and abs(datediff(day, t1.date4, t2.date4)) <= 10
        and t1.Number = t2.Number
        and t1.Code = t2.Code
        and -- So on for every column that should be equal

      



To ignore a column, omit it from the condition on

.

+1


source







All Articles