How to properly split the values of a character column

I have a data frame with a column that holds composite information. I would like to split column a into two columns, "a" and "d", where "a" holds only the numeric IDs (898, 3467, 234, 222) and "d" holds the corresponding text that follows the underscore.

Data:

a<-c("898_Me","3467_You or ", "234_Hi-hi", "222_what")
b<-c(1,8,3,8)
c<-c(2,4,6,2)
df<-data.frame(a,b,c)

      

What I have tried so far:

a<-str(df$a)

a<-strsplit(df$a, split)

      

But that just doesn't work with my regex skills.

The required output table can be of the form:

                       a    d        b   c
                      898   Me       1   2
                      3467  You or   8   4
                      234   Hi-hi    3   6
                      222   what     8   2

      

+3




3 answers


library(tidyr)

a<-c("898_Me","3467_You or ", "234_Hi-hi", "222_what")

b<-c(1,8,3,8)

c<-c(2,4,6,2)

df <-data.frame(a,b,c)

final_df <- separate(df , a , c("a" , "d") , sep = "_")

#    a       d b c
#1  898      Me 1 2
#2 3467 You or  8 4
#3  234   Hi-hi 3 6
#4  222    what 8 2

final_df$d

# [1] "Me"      "You or " "Hi-hi"   "what"  
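As the output above shows, both new columns come back as character vectors and "You or " keeps its trailing space. A minimal sketch of one way to tidy that up, assuming numeric IDs are wanted; it relies on the convert argument of separate() and on base R's trimws():

```r
library(tidyr)

# Example data from the question
df <- data.frame(
  a = c("898_Me", "3467_You or ", "234_Hi-hi", "222_what"),
  b = c(1, 8, 3, 8),
  c = c(2, 4, 6, 2)
)

# convert = TRUE asks separate() to run type conversion on the new columns,
# so the IDs come back as integers instead of character strings
final_df <- separate(df, a, c("a", "d"), sep = "_", convert = TRUE)

# Remove the trailing space left over in "You or "
final_df$d <- trimws(final_df$d)

str(final_df)
```

Whether to trim the labels depends on whether the trailing whitespace is meaningful in the real data.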

      



+4




strsplit is the right function, but you need to pass it the character to split on:

do.call(rbind, strsplit(as.character(df$a), "_"))
#      [,1]   [,2]     
# [1,] "898"  "Me"     
# [2,] "3467" "You or "
# [3,] "234"  "Hi-hi"  
# [4,] "222"  "what"   

      



Or

library(stringi)
stri_split_fixed(df$a, "_", simplify = TRUE)
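Both calls return a character matrix rather than a data frame. A minimal base R sketch of how the pieces could be put back next to b and c, assuming every value contains exactly one underscore (the names parts and out are made up for illustration):

```r
# Example data from the question
df <- data.frame(
  a = c("898_Me", "3467_You or ", "234_Hi-hi", "222_what"),
  b = c(1, 8, 3, 8),
  c = c(2, 4, 6, 2),
  stringsAsFactors = FALSE
)

# Split on every literal underscore; with exactly one underscore per value
# each element has two pieces, so rbind() yields a two-column matrix
parts <- do.call(rbind, strsplit(df$a, "_", fixed = TRUE))

# Assemble the requested layout: numeric ID, label, then the other columns
out <- data.frame(
  a = as.numeric(parts[, 1]),
  d = trimws(parts[, 2]),   # drop the trailing space in "You or "
  b = df$b,
  c = df$c,
  stringsAsFactors = FALSE
)

out
```

If some labels may themselves contain an underscore, the pieces will have different lengths and this row-binding approach would need adjusting.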

      

+2




For your example, here is a solution in base R:

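# Keep only the digits for the ID column
df$a2 <- gsub("[^0-9]", "", df$a)
# Drop the leading digits and the underscore to get the label
df$d <- sub("^[0-9]+_", "", df$a)

      

This gives:

> df
             a b c   a2       d
1       898_Me 1 2  898      Me
2 3467_You or  8 4 3467 You or 
3    234_Hi-hi 3 6  234   Hi-hi
4     222_what 8 2  222    what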

      

Not elegant, but it retains the original data and is easy to apply.
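One caveat with stripping characters by class is that digits inside a label would be affected too. A hedged sketch of a variant that uses one anchored pattern with two capture groups, so only the leading ID and the first underscore are treated as separators; the value "12_abc_3" is a made-up edge case, not part of the original data:

```r
# Values from the question plus one hypothetical edge case
x <- c("898_Me", "3467_You or ", "234_Hi-hi", "222_what", "12_abc_3")

# Group 1: leading digits; group 2: everything after the first underscore
pattern <- "^([0-9]+)_(.*)$"

id    <- as.numeric(sub(pattern, "\\1", x))
label <- sub(pattern, "\\2", x)

data.frame(id, label)
```

Values that do not match the pattern are returned unchanged by sub(), so as.numeric() would produce NA with a warning for them; checking for NA in id is one way to spot malformed rows.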

0

