Extracting all values between () and up to a % sign

How can I extract only the number between the parentheses () and before the %?

df <- data.frame(X = paste0('(',runif(3,0,1), '%)'))


                     X
1 (0.746698269620538%)
2 (0.104987640399486%)
3 (0.864544949028641%)

      

For example, I would like to have a DF like this:

                  X
1 0.746698269620538
2 0.104987640399486
3 0.864544949028641

      



2 answers


We can use sub to match ( (escaped with \\ because it is a metacharacter) at the start (^) of the string, followed by zero or more digits or dots ([0-9.]*) captured as a group ((...)), then % and any other characters (.*), and replace the whole match with the backreference (\\1) to the captured group:

df$X <- as.numeric(sub("^\\(([0-9.]*)%.*", "\\1", df$X))
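To check the pattern on values that do not change between runs, here is a minimal sketch that applies the same sub call to the three strings from the question. The names x and num are made up for illustration; only sub and as.numeric come from the answer.

```r
# Fixed inputs copied from the question, so the result is reproducible
x <- c("(0.746698269620538%)", "(0.104987640399486%)", "(0.864544949028641%)")

# Same pattern as above: keep only the digits and dots between "(" and "%"
num <- as.numeric(sub("^\\(([0-9.]*)%.*", "\\1", x))

# Print with enough digits to see the full values
print(num, digits = 15)
```

Because the whole string is matched and replaced by the captured group, the parentheses and the % sign are dropped before as.numeric runs.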

      




If the values may also contain non-numeric characters, capture everything up to the % instead (the result then stays a character vector):

sub("^\\(([^%]*)%.*", "\\1", df$X)
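As a rough illustration of the [^%]* variant, the sketch below uses two hypothetical inputs that are not in the question. The names x and inner are assumptions for the example.

```r
# Hypothetical values that contain more than digits and dots
x <- c("(12.5 approx%)", "(n/a%)")

# [^%]* captures every character up to the first "%"
inner <- sub("^\\(([^%]*)%.*", "\\1", x)
print(inner)

# The result is a character vector; converting it to numeric would
# give NA for entries that are not plain numbers
print(is.character(inner))
```

This is why the as.numeric wrapper is left off in this version: it only makes sense when the captured text is known to be a number.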

      



Use substr, since you know you need to drop the first character and the last two characters:



> df <- data.frame(X = paste0('(',runif(3,0,1), '%)'))
> df
                      X
1  (0.393457352882251%)
2 (0.0288733830675483%)
3  (0.289543839870021%)
> df$X <- as.numeric(substr(df$X, 2, nchar(as.character(df$X)) - 2))
> df
           X
1 0.39345735
2 0.02887338
3 0.28954384
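The substr approach depends on the exact layout "(" + number + "%)". A small sketch of how the two answers differ when that layout is broken, assuming a hypothetical value with a trailing space (x, by_position and by_pattern are illustrative names):

```r
# The second value has a trailing space, which is not in the question's data
x <- c("(0.5%)", "(0.25%) ")

# Positional extraction: drop the first character and the last two
by_position <- substr(x, 2, nchar(x) - 2)

# Pattern-based extraction: keep what lies between "(" and "%"
by_pattern <- sub("^\\(([0-9.]*)%.*", "\\1", x)

print(by_position)
print(by_pattern)
```

With a fixed, known format both approaches give the same result; the pattern-based one is only more forgiving when extra characters follow the % sign.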

      
