R: remove duplicate values in different rows and columns
I found many pages about finding duplicate items in a list or duplicate rows in a dataframe. However, I want to search for duplicated elements throughout the dataframe. Let's take this as an example:
df
  coupon1 coupon2 coupon3
1      10      11      12
2      13      16      15
3      16      17      18
4      19      20      21
5      22      23      24
6      25      26      27
You will notice that df[2,2] and df[3,1] contain the same element (16). When I run
duplicated(df)
it returns six "FALSE" values, because no entire row is duplicated, only a single element. How do I check for duplicate values anywhere in the data frame? I would like to know that a duplicate exists and also what its value is (and the same if there are multiple duplicates).
This finds duplicates across the whole data frame, but it searches column by column. So (3,1) will still be FALSE, since it is the first occurrence of the value 16 in the data frame.
m <- matrix(duplicated(unlist(df)), ncol=ncol(df))
#      [,1]  [,2]  [,3]
#[1,] FALSE FALSE FALSE
#[2,] FALSE  TRUE FALSE
#[3,] FALSE FALSE FALSE
#[4,] FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE
#[6,] FALSE FALSE FALSE
Then you can use it however you want, for example:
df[m]
#[1] 16
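If you also want the first occurrence flagged and the positions of every repeated value, one possible extension is to combine duplicated() with its fromLast = TRUE variant. This is only a sketch built on the example data from the question; the names v, m_all, dup_values and dup_pos are made up for illustration.

```r
# Example data from the question
df <- data.frame(coupon1 = c(10, 13, 16, 19, 22, 25),
                 coupon2 = c(11, 16, 17, 20, 23, 26),
                 coupon3 = c(12, 15, 18, 21, 24, 27))

v <- unlist(df)  # flatten the data frame column by column

# TRUE for every occurrence of a repeated value, including the first one
m_all <- matrix(duplicated(v) | duplicated(v, fromLast = TRUE),
                nrow = nrow(df), dimnames = list(NULL, names(df)))

dup_values <- unique(v[duplicated(v)])   # distinct values that occur more than once
dup_pos <- which(m_all, arr.ind = TRUE)  # row/column index of each occurrence
```

Here dup_values answers "which values are duplicated", and dup_pos gives one row per occurrence, so both (3,1) and (2,2) show up for the value 16.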