Extract rows of data frame with specific conditions

Question

I have a data frame with two variables, Var1 and Var2, and 9 rows of data.

Original Data Frame:

                 Var1       Var2 
    sigma1       11          1
    alpha1       12          5
    pi1          13          3
    sigma2       14          4
    alpha2       21          9
    pi2          34          6
    sigma3       55          12
    alpha3       18          9
    pi3          19          10

I want to separate the alpha, sigma and pi observations and make each group a new data frame.

Ideal format afterwards:

    Data Frame 1:

        sigma1       11          1
        sigma2       14          4
        sigma3       55          12

    Data Frame 2:

        alpha1       12          5
        alpha2       21          9
        alpha3       18          9

    Data Frame 3:

        pi1          13          3
        pi2          34          6
        pi3          19          10

+3

r dataframe subset

jester 08 jul. '15 at 3:41

3 answers

I would make a grouping variable from the first letter and use it with split:

df <- read.table(header=TRUE, text='
    group      Var1       Var2 
    sigma1       11          1
    alpha1       12          5
    pi1          13          3
    sigma2       14          4
    alpha2       21          9
    pi2          34          6
    sigma3       55          12
    alpha3       18          9
    pi3          19          10
    ')

split(df, substr(df$group, 1, 1))

This gives:

> split(df, substr(df$group, 1, 1))
$a
   group Var1 Var2
2 alpha1   12    5
5 alpha2   21    9
8 alpha3   18    9

$p
  group Var1 Var2
3   pi1   13    3
6   pi2   34    6
9   pi3   19   10

$s
   group Var1 Var2
1 sigma1   11    1
4 sigma2   14    4
7 sigma3   55   12
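
Since split returns a named list, each piece can be pulled out by name. The following is only a sketch under the assumption that df is the data frame read in above; the names pieces, sigma_df and by_prefix are illustrative, and stripping trailing digits with sub() is a swapped-in variation (not part of the answer) that names the pieces by full prefix rather than by first letter.

```r
df <- read.table(header = TRUE, text = '
    group      Var1       Var2
    sigma1       11          1
    alpha1       12          5
    pi1          13          3
    sigma2       14          4
    alpha2       21          9
    pi2          34          6
    sigma3       55          12
    alpha3       18          9
    pi3          19          10
    ')

# Group by first letter, as in the answer; the result is a named list.
pieces <- split(df, substr(df$group, 1, 1))
sigma_df <- pieces$s            # equivalently pieces[["s"]]

# Variation (assumption): drop trailing digits so the list names are
# the full prefixes instead of single letters.
by_prefix <- split(df, sub("[0-9]+$", "", df$group))
names(by_prefix)
# [1] "alpha" "pi"    "sigma"
```

Keeping the pieces in one list makes it easy to loop over them later, for example with lapply(by_prefix, nrow).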

+2

Neal Fultz 08 jul. 15 at 5:03

If you are converting to data.table, you can do something like this (I am calling your first column letter):

library(data.table)
DT <- as.data.table(DF)
#keep the rows whose letter column starts with "sigma"
DT[grep('^sigma', DT[, letter])]

Then you can do the same for alpha and pi.
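
In the question the labels appear to be row names rather than a column, so a letter column has to be created first. A possible sketch, assuming a reasonably recent data.table in which keep.rownames accepts a column name; the column name prefix and the objects sigma_dt and groups are illustrative, not from the answer.

```r
library(data.table)

DF <- read.table(text = "Var1 Var2
sigma1 11 1
alpha1 12 5
pi1 13 3
sigma2 14 4
alpha2 21 9
pi2 34 6
sigma3 55 12
alpha3 18 9
pi3 19 10")

# Copy the row names into a column called letter during conversion.
DT <- as.data.table(DF, keep.rownames = "letter")

# One group, filtered as in the answer.
sigma_dt <- DT[grepl("^sigma", letter)]

# All groups at once: add a prefix column (trailing digits removed)
# and split on it, which returns a named list of data.tables.
DT[, prefix := sub("[0-9]+$", "", letter)]
groups <- split(DT, by = "prefix")
```

The split(DT, by = ...) form avoids repeating the grep call once per group.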

0

Chris watson 08 jul. 15 at 3:56

zx8754 · Accepted Answer · 2015-07-08T07:44:55+0000

We can use eval(parse()) to create dynamic variables. Try this example:

#dummy data
df <- read.table(text="Var1       Var2 
sigma1       11          1
alpha1       12          5
pi1          13          3
sigma2       14          4
alpha2       21          9
pi2          34          6
sigma3       55          12
alpha3       18          9
pi3          19          10")

#strip the last character of each rowname to get the unique prefixes
myNames <- unique(gsub(".$", "", rownames(df)))
myNames
#[1] "sigma" "alpha" "pi" 

#split to 3 data.frames
for(i in myNames)
  eval(parse(text=paste0("df_",i," <- df[ grepl('",i,"',rownames(df)),]")))

#check output
ls()
# [1] "df"       "df_alpha" "df_pi"    "df_sigma" "i"        "myNames" 
df_alpha
#        Var1 Var2
# alpha1   12    5
# alpha2   21    9
# alpha3   18    9

EDIT: As suggested by @NealFultz, we can use assign() instead to improve code readability:

for(i in myNames)
  assign(paste0("df_",i),df[ grepl(i,rownames(df)),])
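
If free-standing variables are not strictly required, the same pieces can be kept in a single named list, which avoids both eval(parse()) and assign(). This is only a sketch of that alternative, not part of the answer: df_list is an illustrative name, and the sub("[0-9]+$", ...) pattern and the "^" anchor are added assumptions so that multi-digit suffixes and overlapping prefixes would still be handled.

```r
df <- read.table(text = "Var1 Var2
sigma1 11 1
alpha1 12 5
pi1 13 3
sigma2 14 4
alpha2 21 9
pi2 34 6
sigma3 55 12
alpha3 18 9
pi3 19 10")

# Unique prefixes, with all trailing digits removed.
myNames <- unique(sub("[0-9]+$", "", rownames(df)))

# One data frame per prefix, collected in a named list.
df_list <- lapply(myNames, function(p) {
  df[grepl(paste0("^", p), rownames(df)), ]
})
names(df_list) <- myNames

df_list$alpha
#        Var1 Var2
# alpha1   12    5
# alpha2   21    9
# alpha3   18    9
```

Should separate variables still be wanted afterwards, list2env(df_list, envir = globalenv()) would create one object per list element.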
