# Split a single data column in R at NA values

I have a large dataset that I want to split into separate blocks. Right now, these blocks are separated by NA, but how do I split them? Sample set:

```
df <- matrix(c(1, 2, 3, 4, NA, 6, 7, 8, NA, 10, 11, 12), ncol = 1, byrow = TRUE)
```

gives us

```
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]   NA
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]   NA
[10,]   10
[11,]   11
[12,]   12
```

I would like these three blocks to be stored in separate variables, like so:

```
a
     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
b
     [,1]
[1,]    6
[2,]    7
[3,]    8
c
     [,1]
[1,]   10
[2,]   11
[3,]   12
```
Does that make sense? Thanks.

I wasn't sure whether you meant a true matrix or a data.frame by "dataset". Here's an example with a data.frame; a matrix would work similarly.

```
df <- data.frame(a = c(1, 2, 3, 4, NA, 6, 7, 8, NA, 10, 11, 12))
gg <- ifelse(is.na(df$a), NA, cumsum(is.na(df$a)))
split(df, gg)
```

We use `gg` as a new grouping variable that counts how many NAs we have seen so far, so each section between NAs gets its own group number. We also set `gg` to NA on the NA rows themselves, so that `split()` omits them. Finally, `split()` with this new categorical variable does what we want.

```
$`0`
  a
1 1
2 2
3 3
4 4

$`1`
  a
6 6
7 7
8 8

$`2`
    a
10 10
11 11
12 12
```
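
To see why the grouping works, it can help to look at the intermediate vectors. This is only a sketch on the same sample data; the names `is_gap`, `run_id` and `pieces` are illustrative and not part of the answer above.

```r
df <- data.frame(a = c(1, 2, 3, 4, NA, 6, 7, 8, NA, 10, 11, 12))

is_gap <- is.na(df$a)                  # TRUE on the two separator rows
run_id <- cumsum(is_gap)               # 0 0 0 0 1 1 1 1 2 2 2 2
gg     <- ifelse(is_gap, NA, run_id)   # 0 0 0 0 NA 1 1 1 NA 2 2 2

# split() ignores rows whose grouping value is NA,
# so the separator rows do not end up in any piece
pieces <- split(df, gg)
sapply(pieces, nrow)                   # group sizes: 4, 3, 3
```

Without the `ifelse()` step, each NA row would stay attached to the block that follows it, because `cumsum()` has already been incremented on that row.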

A one-line solution using `split` and `cumsum` after removing missing values:

```
split(df[!is.na(df)], cumsum(is.na(df))[!is.na(df)])
$`0`
[1] 1 2 3 4

$`1`
[1] 6 7 8

$`2`
[1] 10 11 12
```
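
The question asked for one-column matrices stored in separate variables, while `split()` returns a list of plain vectors. One possible way to bridge that gap is sketched below. It assumes the pieces should be called `a`, `b`, `c` as in the question; `m`, `keep` and `pieces` are illustrative names.

```r
m <- matrix(c(1, 2, 3, 4, NA, 6, 7, 8, NA, 10, 11, 12), ncol = 1)

keep   <- !is.na(m[, 1])                          # rows that hold data
pieces <- split(m[keep, 1], cumsum(!keep)[keep])  # same idea as the one-liner

# Turn each numeric vector back into a one-column matrix
pieces <- lapply(pieces, matrix, ncol = 1)

# Rename the groups a, b, c, ... (letters covers up to 26 blocks)
names(pieces) <- letters[seq_along(pieces)]

pieces$a   # 4 x 1 matrix holding 1, 2, 3, 4

# Optional: copy the list elements into the current environment as a, b, c
list2env(pieces, envir = environment())
```

Keeping the pieces in the list is often more convenient than creating separate variables, since the number of blocks then does not need to be known in advance.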