Remove all but the minimum values based on two columns in a SQL Server table

How do I write a statement to accomplish the following?

Let's say the table has two columns (both are nvarchar) with the following data:

col1    col2
10000   10
10000   20
10001   10
10002   30
10002   40
10002   50

      

I would like to keep only the following data:

col1    col2
10000   10
10001   10
10002   30

      

That is, I want to remove the duplicate col1 entries (neither column is a primary key), keeping for each col1 value only the record with the minimum value in the second column.

How can I do this?

+2




3 answers


This should work for you:

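;WITH NotMin AS
(
    -- Attach the smallest Col2 of each Col1 group to every row.
    SELECT Col1, Col2, MIN(Col2) OVER (PARTITION BY Col1) AS TheMin
    FROM Table1
)
DELETE Table1
--SELECT *   -- use instead of the DELETE line to preview the rows first
FROM Table1
INNER JOIN NotMin
    ON Table1.Col1 = NotMin.Col1 AND Table1.Col2 = NotMin.Col2
    AND Table1.Col2 <> NotMin.TheMin;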

      



This uses a CTE (similar to a view, but cleaner) and the OVER clause as a shortcut that needs less code. I also included a commented-out SELECT so you can preview the affected rows before deleting them. This will work in SQL Server 2005/2008.
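To try the statement without touching real data, here is a minimal sketch that loads the question's sample rows into a temporary table and runs only the preview form of the query. The name #Table1 and the nvarchar(10) column lengths are assumptions made for illustration; they do not come from the question.

```sql
-- Assumption: #Table1 is a throwaway temp table standing in for Table1.
CREATE TABLE #Table1 (Col1 nvarchar(10), Col2 nvarchar(10));

-- Sample rows from the question (single-row INSERTs also run on SQL Server 2005).
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10000', N'10');
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10000', N'20');
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10001', N'10');
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10002', N'30');
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10002', N'40');
INSERT INTO #Table1 (Col1, Col2) VALUES (N'10002', N'50');

-- Preview: the same join as the DELETE, but selecting instead of deleting.
WITH NotMin AS
(
    SELECT Col1, Col2, MIN(Col2) OVER (PARTITION BY Col1) AS TheMin
    FROM #Table1
)
SELECT t.Col1, t.Col2
FROM #Table1 AS t
INNER JOIN NotMin
    ON t.Col1 = NotMin.Col1 AND t.Col2 = NotMin.Col2
    AND t.Col2 <> NotMin.TheMin;

DROP TABLE #Table1;
```

Against the sample data, the preview should list the rows (10000, 20), (10002, 40) and (10002, 50), in no guaranteed order. Note that because Col2 is nvarchar, MIN compares the values as strings: that gives the expected result here because all values have the same length, but N'9' would sort after N'10'. If every value is known to be numeric, wrapping the column as CAST(Col2 AS int) inside MIN and in the comparison is one possible adjustment.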

Thanks Eric

+4




Sorry, I misunderstood the question.


SELECT col1, MIN(col2) AS col2
FROM test
GROUP BY col1

      

This, of course, only returns the rows you want to keep. Assuming you cannot modify the table to add a unique ID, you would need to do something like this to delete the rest:




DELETE FROM test
WHERE col1 + '|' + col2 NOT IN
(SELECT col1 + '|' + MIN(col2)
FROM test
GROUP BY col1)

      

This should work, assuming the pipe character never appears in your data.
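The reason the separator matters can be seen in a small sketch with made-up values (the strings below are hypothetical, not from the question): two different column pairs can concatenate to the same string when the separator also occurs inside the data.

```sql
-- Hypothetical values: the pair (N'a|b', N'c') and the pair (N'a', N'b|c')
-- both concatenate to N'a|b|c', so the NOT IN test cannot tell them apart.
SELECT CASE
           WHEN N'a|b' + N'|' + N'c' = N'a' + N'|' + N'b|c'
               THEN 'collision'
           ELSE 'distinct'
       END AS result;
```

This returns 'collision'. A related caveat: with the default CONCAT_NULL_YIELDS_NULL setting, a NULL in either column makes the concatenated value NULL, and a NULL in the subquery result causes NOT IN to match no rows at all, so the approach is best suited to columns that contain no NULLs.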

0




Ideally, you would like to say:

DELETE
FROM tbl
WHERE (col1, col2) NOT IN (SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1)

      

Unfortunately, comparing a pair of columns like this is not allowed in T-SQL, but there is a proprietary double FROM extension (shown here with EXCEPT for clarity):

DELETE
FROM tbl
FROM tbl
INNER JOIN (
    SELECT col1, col2 FROM tbl
    EXCEPT
    SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1
) AS d
    ON tbl.col1 = d.col1 AND tbl.col2 = d.col2

      

A more general alternative:

DELETE
FROM tbl
WHERE col1 + '|' + col2 NOT IN (SELECT col1 + '|' + MIN(col2) FROM tbl GROUP BY col1)

      

Other workarounds are possible as well.
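One such workaround, sketched here as a suggestion rather than as part of the original answer, is a correlated EXISTS: a row is deleted whenever another row with the same col1 has a smaller col2. It assumes the same tbl(col1, col2) layout as the statements above; the alias name other is arbitrary.

```sql
-- Assumption: tbl(col1, col2) as in the statements above.
-- Delete every row for which a smaller col2 exists within the same col1 group.
DELETE FROM tbl
WHERE EXISTS (
    SELECT 1
    FROM tbl AS other
    WHERE other.col1 = tbl.col1
      AND other.col2 < tbl.col2
);
```

This avoids both the string concatenation and the separator assumption. As with MIN, the comparison on an nvarchar column is a string comparison, so it only matches numeric ordering when the stored values have equal length or are cast to a numeric type first.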

0








