How to replace only spaces between numbers with dots

I have a factor variable with several levels indicating people's wealth. Unfortunately, the thousands separators in the numbers are spaces:

> levels(bron$vermogen)
 [1] "negatief"                   "0 tot 5 000 euro"           "5 000 tot 10 000 euro"     
 [4] "10 000 tot 20 000 euro"     "20 000 tot 50 000 euro"     "50 000 tot 100 000 euro"   
 [7] "100 000 tot 200 000 euro"   "200 000 tot 500 000 euro"   "500 000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"    

      

I want to replace those spaces with dots while keeping the spaces between numbers and words. I can do this one level at a time, for example:

bron$vermogen <- gsub("5 000 tot 10 000 euro", "5.000 tot 10.000 euro", bron$vermogen)

      

Using this method, I have to repeat this procedure 8 times. How can I do this more efficiently?

A dput of the levels:

c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro", "10 000 tot 20 000 euro", "20 000 tot 50 000 euro", "50 000 tot 100 000 euro", "100 000 tot 200 000 euro", "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro", "1 miljoen euro en meer")

      



3 answers


For example, capture the digit on each side of the space and put a dot between the two captured digits:

gsub("([0-9]) ([0-9])", "\\1.\\2", bron$vermogen)

 [1] "negatief"                   "0 tot 5.000 euro"           "5.000 tot 10.000 euro"     
 [4] "10.000 tot 20.000 euro"     "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"   
 [7] "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro"   "500.000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"   
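Note that gsub() returns a character vector, so assigning its result back to bron$vermogen turns the factor into a character column. If the variable should stay a factor, one option is to rewrite only the levels. A minimal sketch, assuming a small stand-in factor named vermogen (hypothetical, built from a few of the levels in the question):

```r
# Stand-in for bron$vermogen (hypothetical sample built from the question's levels)
lv <- c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro",
        "500 000 tot 1 miljoen euro")
vermogen <- factor(c(lv[2], lv[3], lv[2], lv[1]), levels = lv)

# Rewrite only the level labels; the underlying integer codes are untouched
levels(vermogen) <- gsub("([0-9]) ([0-9])", "\\1.\\2", levels(vermogen))

is.factor(vermogen)  # still a factor
levels(vermogen)     # labels now use dots as thousands separators
```

Because only the labels change, the order of the levels stays as it was.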

      



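You can match only the space and replace it with a dot. Here \K drops the preceding digit from the match and (?=\d) is a lookahead, so the digits themselves are left untouched: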



gsub("\\d\\K (?=\\d)", ".", bron$vermogen, perl = TRUE)

 # [1] "negatief"                   "0 tot 5.000 euro"          
 # [3] "5.000 tot 10.000 euro"      "10.000 tot 20.000 euro"   
 # [5] "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"  
 # [7] "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro" 
 # [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"

      



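Another similar option is to use a lookbehind and a lookahead, so that only the whitespace between two digits is matched. Note that \s matches any whitespace character, not just a space: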

gsub("(?<=\\d)\\s(?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief"                   "0 tot 5.000 euro"           "5.000 tot 10.000 euro"      "10.000 tot 20.000 euro"    
# [5] "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"    "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro"  
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"    
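The patterns in these answers give the same result on the levels in the question, but they are not equivalent in general. The capture-group version consumes the digit after the space, so that digit cannot also serve as the left-hand digit of the next match. The lookaround version consumes only the space. A small sketch of the difference, using a made-up input string that is not from the question:

```r
x <- "1 2 3"  # made-up input: single digits separated by spaces

# Capture groups: "1 2" is consumed as one match, so " 3" has no digit left before it
capture <- gsub("([0-9]) ([0-9])", "\\1.\\2", x)

# Lookbehind/lookahead: only the space is consumed, so every space is replaced
lookaround <- gsub("(?<=\\d) (?=\\d)", ".", x, perl = TRUE)

capture     # "1.2 3"
lookaround  # "1.2.3"
```

With three-digit groups such as "1 000 000" both approaches agree, because consecutive spaces never share a neighbouring digit.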

      
