R: remove duplicate values in different rows and columns

I found many pages about finding duplicate items in a list or duplicate rows in a dataframe. However, I want to search for duplicated elements throughout the dataframe. Let's take this as an example:

df
     coupon1    coupon2    coupon3
1         10         11         12
2         13         16         15
3         16         17         18
4         19         20         21
5         22         23         24
6         25         26         27

      

You will notice that df[2,2] and df[3,1] have the same element (16). When I run

duplicated(df)

it returns six "FALSE" values because no entire row is duplicated, only a single element. How do I check for any duplicate values in the entire dataframe? I would like to know that a duplicate exists and also to know its value (and the same if there are multiple duplicates).


2 answers


This will find duplicates across the whole data frame, but it searches column by column. So (3,1) will still be FALSE, since it is the first occurrence of 16 in the data frame.

m <- matrix(duplicated(unlist(df)), ncol=ncol(df))
#      [,1]  [,2]  [,3]
#[1,] FALSE FALSE FALSE
#[2,] FALSE  TRUE FALSE
#[3,] FALSE FALSE FALSE
#[4,] FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE
#[6,] FALSE FALSE FALSE

      



Then you can use it however you want, for example:

df[m]
#[1] 16
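
If you also want to know where every occurrence sits (including the first one, which `duplicated()` leaves as FALSE), the same idea can be extended. This is only a minimal sketch: the names `vals`, `dup_values`, `m_all` and `dup_cells` are made up for illustration, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Example data from the question
df <- data.frame(
  coupon1 = c(10, 13, 16, 19, 22, 25),
  coupon2 = c(11, 16, 17, 20, 23, 26),
  coupon3 = c(12, 15, 18, 21, 24, 27)
)

# Flatten the data frame column by column
vals <- unlist(df, use.names = FALSE)

# Quick existence check: anyDuplicated() returns 0 when nothing repeats
has_dups <- anyDuplicated(vals) > 0

# Every distinct value that occurs more than once
dup_values <- unique(vals[duplicated(vals)])

# Flag ALL occurrences of those values, not only the later ones
m_all <- matrix(vals %in% dup_values, nrow = nrow(df))

# Row/column positions of the flagged cells (column-major order)
pos <- which(m_all, arr.ind = TRUE)
dup_cells <- data.frame(
  row    = unname(pos[, "row"]),
  column = names(df)[pos[, "col"]],
  value  = vals[m_all]
)

dup_values
#[1] 16
dup_cells   # two rows: (3, coupon1, 16) and (2, coupon2, 16)
```

With several repeated values, `dup_values` lists each of them once and `dup_cells` has one row per occurrence.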

      

Alternatively, using stack() (here yourdf is your data frame):

which(duplicated(stack(yourdf)[,1]))
#[1] 8
stack(yourdf)[,1][which(duplicated(stack(yourdf)[,1]))]
#[1] 16

      


