Extracting all values between () and up to a % sign
How can I extract only the number between the parentheses () and before the %?
df <- data.frame(X = paste0('(',runif(3,0,1), '%)'))
X
1 (0.746698269620538%)
2 (0.104987640399486%)
3 (0.864544949028641%)
For example, I would like to have a DF like this:
X
1 0.746698269620538
2 0.104987640399486
3 0.864544949028641
We can use sub to match a ( (escaped as \\( because it is a metacharacter) at the start (^) of the string, followed by zero or more digits or dots ([0-9.]*) captured as a group ((...)), then % and any remaining characters (.*), and replace the whole match with the backreference (\\1) to the captured group:
df$X <- as.numeric(sub("^\\(([0-9.]*)%.*", "\\1", df$X))
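Because the example data comes from runif(), the result changes on every run. As a minimal sketch, the same pattern can be checked against a fixed vector (the input strings and the names x and out below are made up for illustration):

```r
# Fixed input so the result is reproducible (the original data uses runif()).
x <- c("(0.75%)", "(0.10%)", "(12.5%)")

# Keep only the captured group: the digits and dots between "(" and "%".
out <- as.numeric(sub("^\\(([0-9.]*)%.*", "\\1", x))
out
```

Note that sub() returns a string unchanged when the pattern does not match, so as.numeric() would then yield NA with a warning for that element.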
If the value between the parentheses can also contain non-numeric characters, capture everything up to the % instead:
sub("^\\(([^%]*)%.*", "\\1", df$X)
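A small sketch of what this more permissive pattern returns, using hypothetical inputs that are not from the question. The result stays a character vector, and converting it with as.numeric() only works for the elements that actually look like numbers:

```r
# Hypothetical inputs with non-numeric content inside the parentheses.
x <- c("(~0.75%)", "(n/a%)", "(1e-3%)")

# [^%]* captures every character up to (but not including) the first "%".
inner <- sub("^\\(([^%]*)%.*", "\\1", x)
inner

# Elements that are not valid numbers become NA; the warning is suppressed here.
nums <- suppressWarnings(as.numeric(inner))
nums
```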
Use substr, since you know you need to omit the first character and the last two characters:
> df <- data.frame(X = paste0('(',runif(3,0,1), '%)'))
> df
X
1 (0.393457352882251%)
2 (0.0288733830675483%)
3 (0.289543839870021%)
> df$X <- as.numeric(substr(df$X, 2, nchar(as.character(df$X)) - 2))
> df
X
1 0.39345735
2 0.02887338
3 0.28954384
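The shorter values in the printed data frame are a display effect: R prints about 7 significant digits by default, but the stored numbers keep the digits that were parsed. A minimal sketch with two of the strings above as fixed input (the names x and vals are illustrative):

```r
# Two of the strings from the transcript above, as fixed input.
x <- c("(0.393457352882251%)", "(0.0288733830675483%)")

# Drop the leading "(" and the trailing "%)" by position.
vals <- as.numeric(substr(x, 2, nchar(x) - 2))

# Ask for more significant digits to see the digits hidden by default printing.
format(vals, digits = 15)
```

This positional approach assumes every string has exactly one leading and two trailing characters to remove. If that is not guaranteed, the regex approach is the safer choice.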