R: how to remove specific lines in data.frame

Question

> data = data.frame(a = c(100, -99, 322, 155, 256), b = c(23, 11, 25, 25, -999))
> data
    a    b
1 100   23
2 -99   11
3 322   25
4 155   25
5 256 -999

For a data.frame like this, I would like to remove any row that contains -99 or -999. My resulting data.frame should therefore consist only of rows 1, 3 and 4.

I was thinking about writing a loop for this, but I hope there is an easier way. (If my data.frame had columns a through z, the loop method would be very clunky.) My loop would probably look something like this:

i = 1
for(i in 1:nrow(data)){
  if(data$a[i] < 0){
    data = data[-i,]
  }else if(data$b[i] < 0){
    data = data[-i,]
  }else data = data
}

+3

r subset

Adrian, 08 Jul. at 22:10

3 answers

> data[rowSums(data == -99 | data == -999) == 0, ]
    a  b
1 100 23
3 322 25
4 155 25

Both the "==" and "|" (OR) operators act on data frames as if they were matrices, returning a logical matrix of the same dimensions, so rowSums can count the matches in each row.
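
To see why this works, it may help to look at the intermediate logical matrix. The following is a minimal sketch using the data from the question; the names `hits` and `clean` are introduced here purely for illustration.

```r
# Data from the question
data <- data.frame(a = c(100, -99, 322, 155, 256),
                   b = c(23, 11, 25, 25, -999))

# Element-wise comparison yields a logical matrix with the same shape as data
hits <- data == -99 | data == -999

# rowSums() treats TRUE as 1, so this counts the flagged cells in each row
# (rows 2 and 5 each get a count of 1, the others 0)
rowSums(hits)

# Keep only the rows that contain no flagged cells
clean <- data[rowSums(hits) == 0, ]
```

Printing `hits` on its own shows exactly which cells matched, which can be handy when checking a larger data frame.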

+6

42-, 08 Jul. '15 at 22:21

As @rawr's comment suggests, it probably makes sense to handle this during import. However, if you already have the data loaded, you can do this:

na.omit(replace(data, sapply(data,`%in%`,c(-99,-999)), NA))
#    a  b
#1 100 23
#3 322 25
#4 155 25
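
For the import-time route mentioned in the comment, one possibility is the `na.strings` argument of `read.csv`. This is only a sketch under the assumption that the data comes from a CSV file; the inline string `raw_csv` is a hypothetical stand-in for that file.

```r
# Hypothetical CSV content standing in for a file on disk
raw_csv <- "a,b
100,23
-99,11
322,25
155,25
256,-999"

# Treat the sentinel values as missing while reading
data <- read.csv(text = raw_csv, na.strings = c("-99", "-999"))

# Drop every row that contains at least one NA
clean <- na.omit(data)
```

With a real file, the `text = raw_csv` argument would be replaced by the file path.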

+1

thelatemail, 09 Jul. '15 at 1:44

joran · Accepted Answer · 2015-07-08T22:16:58+0000

Perhaps this:

> ind <- Reduce(`|`, lapply(data, function(x) x %in% c(-99, -999)))
> data[!ind, ]
    a  b
1 100 23
3 322 25
4 155 25
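
If the same filtering is needed in several places, the approach above could be wrapped in a small function. This is a sketch only; `drop_rows_with` is a hypothetical helper name, not part of base R.

```r
# Hypothetical helper: drop rows in which any column holds one of `values`
drop_rows_with <- function(df, values) {
  # One logical vector per column, TRUE where that column holds a flagged value
  per_column <- lapply(df, function(x) x %in% values)
  # OR the vectors together: TRUE for rows with a flagged value in any column
  flagged <- Reduce(`|`, per_column)
  # drop = FALSE keeps the result a data frame even with a single column
  df[!flagged, , drop = FALSE]
}

data <- data.frame(a = c(100, -99, 322, 155, 256),
                   b = c(23, 11, 25, 25, -999))
clean <- drop_rows_with(data, c(-99, -999))
```

Because `lapply` walks over every column, this works unchanged however many columns the data frame has.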
