Replace NA rows with the previous row in R
I was wondering if anyone has a quick and dirty solution to the following problem: I have a matrix with NA rows and I would like to replace the NA rows with the previous row (unless it is also an NA row).
Assume the first row is not an all-NA row.
Thanks!
Adapted from an answer to this question: Idiomatic way to copy cell values "down" in an R vector
f <- function(x) {
idx <- !apply(is.na(x), 1, all)
x[idx,][cumsum(idx),]
}
x <- data.frame(a=c(1, 2, NA, 3, NA, NA), b=c(4, 5, NA, 6, NA, 7))
> x
a b
1 1 4
2 2 5
3 NA NA
4 3 6
5 NA NA
6 NA 7
> f(x)
a b
1 1 4
2 2 5
2.1 2 5
4 3 6
4.1 3 6
6 NA 7
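Since the question asks about a matrix, here is a minimal sketch of the same cumsum idea that keeps the original row names and also tolerates leading all-NA rows. The helper name fill_na_rows and the sample matrix m are made up for illustration; they are not part of the answer above.

```r
# Hypothetical helper: fill all-NA rows from the most recent row that has data.
fill_na_rows <- function(m) {
  keep <- rowSums(!is.na(m)) > 0   # TRUE for rows with at least one non-NA value
  src  <- cumsum(keep)             # position (among kept rows) of the latest kept row; 0 = none yet
  has_prev <- src > 0              # leading all-NA rows have nothing to copy from
  out <- m
  out[has_prev, ] <- m[keep, , drop = FALSE][src[has_prev], , drop = FALSE]
  out
}

# Same values as the example data frame, stored as a matrix.
m <- cbind(a = c(1, 2, NA, 3, NA, NA), b = c(4, 5, NA, 6, NA, 7))
fill_na_rows(m)
```

Rows 3 and 5 are filled from rows 2 and 4. Row 6 is left alone because it is not entirely NA.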
This tries to account for cases where you have two all-NA rows in a row.
#create a data set like you discuss (in the future please do this yourself)
set.seed(14)
x <- matrix(rnorm(10), nrow=2)
y <- rep(NA, 5)
v <- do.call(rbind.data.frame, sample(list(x, x, y), 10, TRUE))
One approach:
NArows <- which(apply(v, 1, function(x) all(is.na(x)))) #find all NAs
notNA <- which(!seq_len(nrow(v)) %in% NArows) #find non NA rows
rep.row <- sapply(NArows, function(x) tail(notNA[x > notNA], 1)) #replacement rows
v[NArows, ] <- v[rep.row, ] #assign
v #view
It won't work if your first row is all NA.
If m is your matrix, this is a quick and dirty solution:
sapply(2:nrow(m),function(i){ if(is.na(m[i,1])) {m[i,] <<- m[(i-1),] } })
Note that it uses the ugly (and dangerous) <<- operator.
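If you would rather avoid <<-, the same row-by-row idea can be wrapped in a function that modifies a local copy and returns it. This is only a sketch: the name fill_down and the test matrix are hypothetical, and it checks the whole row for NA rather than just the first column.

```r
# Hypothetical helper: copy the previous row into each all-NA row, top to bottom.
fill_down <- function(m) {
  for (i in seq_len(nrow(m))[-1]) {      # start at row 2; empty loop if fewer than 2 rows
    if (all(is.na(m[i, ]))) {
      m[i, ] <- m[i - 1, ]               # previous row is already filled, so runs of NA rows work
    }
  }
  m                                      # return the modified copy; the caller's object is untouched
}

m <- matrix(c(1, NA, NA, 4, 10, NA, NA, 40), ncol = 2)
m <- fill_down(m)
m
```

Because the loop runs in order, two consecutive all-NA rows both end up with the last row that had data.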
You can always use a loop, assuming row 1 is not NA, as stated:
fill = data.frame(x=c(1,NA,3,4,5))
# loop over rows: length() of a data frame counts columns, so use nrow()
for (i in 2:nrow(fill)){
if(is.na(fill[i,1])){ fill[i,1] = fill[(i-1),1]}
}
Matthew's example:
x <- data.frame(a=c(1, 2, NA, 3, NA, NA), b=c(4, 5, NA, 6, NA, 7))
na.rows <- which( apply( x , 1, function(z) (all(is.na(z)) ) ) )
x[na.rows , ] <- x[na.rows-1, ]
x
#---
a b
1 1 4
2 2 5
3 2 5
4 3 6
5 3 6
6 NA 7
Obviously, a first row that is all NA will cause problems.
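One more caveat, as far as I can tell: na.rows - 1 only looks one row back, so with two consecutive all-NA rows the second one is filled from a row that was itself NA. A small check with made-up data (not from the question):

```r
# Made-up data with two consecutive all-NA rows (rows 2 and 3).
x <- data.frame(a = c(1, NA, NA, 4), b = c(5, NA, NA, 8))
na.rows <- which(apply(x, 1, function(z) all(is.na(z))))
x[na.rows, ] <- x[na.rows - 1, ]   # the right-hand side is read before any row is overwritten
x
all(is.na(x[3, ]))                 # row 3 was copied from the original (all-NA) row 2
```

Row 2 picks up the values of row 1, but row 3 is still all NA. The cumsum-based f() earlier in this thread handles that case.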
Here's a simple, and arguably the conceptually simplest, one-liner:
x <- data.frame(a=c(1, 2, NA, 3, NA, NA), b=c(4, 5, NA, 6, NA, 7))
a b
1 1 4
2 2 5
3 NA NA
4 3 6
5 NA NA
6 NA 7
x1<-t(sapply(1:nrow(x),function(y) ifelse(is.na(x[y,]),x[y-1,],x[y,])))
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 2 5
[4,] 3 6
[5,] 3 6
[6,] NA 7
To get the column names back, just use colnames(x1) <- colnames(x)