How to separate thousands with a space in R
I would like to format a number so that the thousands are separated by a space.
What I have tried:
library(magrittr)
addSpaceSep <- function(x) {
  x %>%
    as.character %>%
    strsplit(split = NULL) %>%
    unlist %>%
    rev %>%
    split(ceiling(seq_along(.) / 3)) %>%
    lapply(paste, collapse = "") %>%
    paste(collapse = " ") %>%
    strsplit(split = NULL) %>%
    unlist %>%
    rev %>%
    paste(collapse = "")
}
> sapply(c(1, 12, 123, 1234, 12345, 123456, 123456, 1234567), addSpaceSep)
[1] "1" "12" "123" "1 234" "12 345" "123 456" "123 456"
[8] "1 234 567"
> sapply(c(1, 10, 100, 1000, 10000, 100000, 1000000), addSpaceSep)
[1] "1" "10" "100" "1 000" "10 000" "1e +05" "1e +06"
I feel bad about having written this makeshift function, but since I have not mastered regular expressions, it is the only way I have found. And of course it does not work when the number gets converted to scientific notation.
This seems like a much better fit for format() than for wrestling with a regular expression. format() exists precisely for formatting numbers:
format(c(1, 12, 123, 1234, 12345, 123456, 123456, 1234567), big.mark=" ", trim=TRUE)
# [1] "1" "12" "123" "1 234" "12 345" "123 456"
# [7] "123 456" "1 234 567"
format(c(1, 10, 100, 1000, 10000, 100000, 1000000), big.mark=" ", scientific=FALSE, trim=TRUE)
# [1] "1" "10" "100" "1 000" "10 000" "100 000"
# [7] "1 000 000"
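As a minimal sketch, the two calls above could be folded into one small helper. The name add_space_sep is made up for illustration; it only passes along the format() arguments already shown, and it has only been reasoned through for whole numbers.

```r
# Hypothetical helper (the name is illustrative, not part of base R):
# formats numbers with a space as the thousands separator and
# never falls back to scientific notation.
add_space_sep <- function(x) {
  format(x, big.mark = " ", scientific = FALSE, trim = TRUE)
}

add_space_sep(c(1, 1234, 100000, 1234567))
# [1] "1"         "1 234"     "100 000"   "1 234 567"
```

Keeping scientific = FALSE inside the helper means that large round values such as 100000 come out as "100 000" rather than in exponent form.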
I agree with the other answers that using other tools (for example format) is the best approach. But if you really want to use a regex and substitution, here is an approach that works using a Perl-style lookahead.
> test <- c(1, 12, 123, 1234, 12345, 123456, 1234567, 12345678)
>
> gsub('(\\d)(?=(\\d{3})+(\\D|$))', '\\1 ',
+ as.character(test), perl=TRUE)
[1] "1" "12" "123" "1 234"
[5] "12 345" "123 456" "1 234 567" "12 345 678"
Basically, it looks for a digit that is followed by one or more groups of three digits (themselves followed by a non-digit or the end of the string) and replaces that digit with itself plus a space. The lookahead part does not appear in the replacement because it is not part of the match; it is only a condition on the match.
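Note that as.character() turns large round values into strings such as "1e+05", which the pattern cannot handle. One possible workaround, sketched here under the assumption that the inputs are whole numbers, is to build the digit strings with format(..., scientific = FALSE) before substituting. The variable names big, digits and spaced are illustrative only.

```r
# Sketch: same lookahead regex, but the strings are built with format()
# so that large values are not rendered in scientific notation.
big <- c(1000, 100000, 1000000)

# "1000" "100000" "1000000" instead of "1000" "1e+05" "1e+06"
digits <- format(big, scientific = FALSE, trim = TRUE)

spaced <- gsub("(\\d)(?=(\\d{3})+(\\D|$))", "\\1 ", digits, perl = TRUE)
spaced
# [1] "1 000"     "100 000"   "1 000 000"
```

With a fractional part the pattern would also insert spaces after the decimal point (for example "1234.5678" becomes "1 234.5 678"), so this sketch is only meant for integers.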