Subset of data without using column names

Question


I am wondering if there is a better way to do this or if I may run into some unexpected problems. I need a subset from a dataframe, but I don't want to use the column names. I need to do this by specifying the column number.

data <- data.frame(col1= c(50, 20, NA, 100, 50), 
                   col2= c(NA, 25, 125, 50, NA),
                   col3= c(NA, 100, 15, 55, 25),
                   col4= c(NA, 30, 125, 100, NA),
                   col5= c(80, 25, 75, 40, NA))

Suppose I want to subset the data frame and keep only the rows that have three consecutive NAs (columns 2 through 4) followed by a non-NA value in column 5. The best thing I can think of without using column names is:

sub <- data[(which(is.na(data[2]) & 
                   is.na(data[3]) & 
                   is.na(data[4]) & 
                   !is.na(data[5]))), ]

Does anyone see any problems with this or know a better way? I'm worried about using subsets within subsets, even though everything works as it should.

+3

r subset

jtdoud 28 Aug 14 at 18:19


1 answer

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2014-08-28T18:25:35+0000

If you want to condense your code a bit, you can do something like:

> data[rowSums(is.na(data[2:4])) == 3 & !is.na(data[5]), ]
  col1 col2 col3 col4 col5
1   50   NA   NA   NA   80
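
If the column positions may change, the same idea can be written with the positions held in variables. This is only a sketch: the names `na_cols`, `val_col`, and `keep` are made up for illustration and are not part of the original question. It also drops `which()`, since `is.na()` never returns `NA` itself and the logical vector can index rows directly.

```r
# Same example data as in the question.
data <- data.frame(col1 = c(50, 20, NA, 100, 50),
                   col2 = c(NA, 25, 125, 50, NA),
                   col3 = c(NA, 100, 15, 55, 25),
                   col4 = c(NA, 30, 125, 100, NA),
                   col5 = c(80, 25, 75, 40, NA))

# Hypothetical names: positions of the columns that must all be NA,
# and the position of the column that must hold a value.
na_cols <- 2:4
val_col <- 5

# TRUE for rows where every column in na_cols is NA
# and the column at val_col is not NA.
keep <- rowSums(is.na(data[na_cols])) == length(na_cols) &
  !is.na(data[[val_col]])

# drop = FALSE keeps the result a data frame even if data had one column.
sub <- data[keep, , drop = FALSE]
```

Using `data[[val_col]]` rather than `data[val_col]` gives a plain vector, so `keep` is an ordinary logical vector instead of a one-column matrix.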
