How to replace only spaces between numbers with dots
I have a factor variable with several levels indicating people's wealth. Unfortunately, the thousands in the numbers are separated by spaces:
> levels(bron$vermogen)
[1] "negatief" "0 tot 5 000 euro" "5 000 tot 10 000 euro"
[4] "10 000 tot 20 000 euro" "20 000 tot 50 000 euro" "50 000 tot 100 000 euro"
[7] "100 000 tot 200 000 euro" "200 000 tot 500 000 euro" "500 000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"
I want to replace those spaces with dots while keeping the spaces between numbers and words. I can do this one level at a time, for example:
bron$vermogen <- gsub("5 000 tot 10 000 euro", "5.000 tot 10.000 euro", bron$vermogen)
Using this method, I have to repeat this procedure 8 times. How can I do this more efficiently?
A dput of the levels:
c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro", "10 000 tot 20 000 euro", "20 000 tot 50 000 euro", "50 000 tot 100 000 euro", "100 000 tot 200 000 euro", "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro", "1 miljoen euro en meer")
3 answers
For example, capture the digit on either side of the space and put both back with a dot between them:
gsub('([0-9]) ([0-9])','\\1.\\2',bron$vermogen)
[1] "negatief" "0 tot 5.000 euro" "5.000 tot 10.000 euro"
[4] "10.000 tot 20.000 euro" "20.000 tot 50.000 euro" "50.000 tot 100.000 euro"
[7] "100.000 tot 200.000 euro" "200.000 tot 500.000 euro" "500.000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"
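Since the question describes bron$vermogen as a factor, one option is to apply the same pattern to the level labels only, so the column stays a factor instead of being turned into a character vector. A minimal sketch, assuming a stand-in data frame built from the dput in the question (the real bron is not shown, so the data frame here is hypothetical):

```r
# Levels copied from the dput output in the question.
lv <- c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro",
        "10 000 tot 20 000 euro", "20 000 tot 50 000 euro",
        "50 000 tot 100 000 euro", "100 000 tot 200 000 euro",
        "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro",
        "1 miljoen euro en meer")

# Hypothetical stand-in for the real data: one factor column, one row per level.
bron <- data.frame(vermogen = factor(lv, levels = lv))

# Rewrite only the level labels; the underlying integer codes are untouched,
# so the column remains a factor with the same level order.
levels(bron$vermogen) <- gsub("([0-9]) ([0-9])", "\\1.\\2", levels(bron$vermogen))

levels(bron$vermogen)
```

Assigning to levels() changes each label once, however many rows the data has, whereas assigning the gsub() result back to bron$vermogen would replace the factor with a character vector.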
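You can match just the space and replace it with a dot. Here \K drops the digit matched so far from the match, and the lookahead (?=\d) requires a digit after the space without consuming it: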
gsub("\\d\\K (?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief" "0 tot 5.000 euro"
# [3] "5.000 tot 10.000 euro" "10.000 tot 20.000 euro"
# [5] "20.000 tot 50.000 euro" "50.000 tot 100.000 euro"
# [7] "100.000 tot 200.000 euro" "200.000 tot 500.000 euro"
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"
Another, similar option is to use a lookbehind and a lookahead:
gsub("(?<=\\d)\\s(?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief" "0 tot 5.000 euro" "5.000 tot 10.000 euro" "10.000 tot 20.000 euro"
# [5] "20.000 tot 50.000 euro" "50.000 tot 100.000 euro" "100.000 tot 200.000 euro" "200.000 tot 500.000 euro"
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"
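To see how the three patterns in this thread relate, the sketch below runs all of them on the levels from the question and then on a made-up string of single digits. The variable names and the "1 2 3" input are only for illustration and do not come from the question.

```r
# Levels copied from the dput output in the question.
lv <- c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro",
        "10 000 tot 20 000 euro", "20 000 tot 50 000 euro",
        "50 000 tot 100 000 euro", "100 000 tot 200 000 euro",
        "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro",
        "1 miljoen euro en meer")

# The three patterns suggested in the answers.
capture    <- gsub("([0-9]) ([0-9])", "\\1.\\2", lv)
keep_out   <- gsub("\\d\\K (?=\\d)", ".", lv, perl = TRUE)
lookaround <- gsub("(?<=\\d)\\s(?=\\d)", ".", lv, perl = TRUE)

# On the question's data all three give the same result.
same_on_levels <- identical(capture, keep_out) && identical(keep_out, lookaround)

# Made-up edge case: single digits separated by single spaces.
x <- "1 2 3"

# The capture-group pattern consumes the "2" as part of the first match,
# so the second space has no unmatched digit left in front of it.
edge_capture <- gsub("([0-9]) ([0-9])", "\\1.\\2", x)

# The lookaround pattern matches only the space itself, so both spaces qualify.
edge_lookaround <- gsub("(?<=\\d)\\s(?=\\d)", ".", x, perl = TRUE)
```

Here same_on_levels is TRUE, edge_capture is "1.2 3" and edge_lookaround is "1.2.3". Thousands groups always have three digits, so the difference does not affect the data in the question. Note also that \s matches any whitespace character (a tab, for instance), while the other two patterns match a literal space only.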