Capturing phone numbers with a Ruby regex
I am trying to capture Spanish phone numbers, which can take the following forms:
- 123456789
- 123 45 67 89
- 123.45.67.89
- 123-45-67-89
I am using this regex in Ruby:
text.match(/([6][0-9]+\s?\-?\.?[0-9]*\s?\-?\.?[0-9]*\s?\-?\.?[0-9]*)/)
The problem is that it also captures other numbers in the text. Specifically, I would like to capture every nine-digit number that starts with 6, whose digits may be separated by spaces, dashes, or periods, and that is not surrounded by other digits (sometimes I have long references like ref: 3453459680934983).
Any hint?
Many thanks!
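A pattern for the characters that can appear in these numbers (digits, spaces, periods, and dashes) is simple, although it matches any run of them: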
/[\d .-]+/
http://rubular.com/r/hSj7okaji3
You can make this more specific by requiring a leading 6 and looking for digits and separators at specific positions:
/6(?:\d{8}|\d{2}[ .-](?:\d{2}[ .-]){2}\d{2})/
http://rubular.com/r/HkSp8qk0ph
For example:
strings = [
  'foo 623456789 bar',
  'foo 123456789 bar',
  'foo 623 45 67 89 bar',
  'foo 123 45 67 89 bar',
  'foo 623.45.67.89 bar',
  'foo 123.45.67.89 bar',
  'foo 623-45-67-89 bar',
  'foo 123-45-67-89 bar',
]
found_pns = strings.select{ |s| s[/6(?:\d{8}|\d{2}[ .-](?:\d{2}[ .-]){2}\d{2})/] }
# => ["foo 623456789 bar",
# "foo 623 45 67 89 bar",
# "foo 623.45.67.89 bar",
# "foo 623-45-67-89 bar"]
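One thing the pattern above does not handle is the "not surrounded by other numbers" requirement: it would still match nine digits inside a long reference such as 3453459680934983. A minimal sketch of one way to guard against that, assuming Ruby 1.9 or later (where lookbehind is supported), is to wrap the pattern in negative lookarounds. The variable names `phone_re` and `samples` are mine, and the lookarounds are an addition, not part of the pattern above:

```ruby
# Same pattern as above, with two additions (my assumption about what is wanted):
#   (?<!\d)  the leading 6 must not be preceded by a digit
#   (?!\d)   the last digit must not be followed by a digit
phone_re = /(?<!\d)6(?:\d{8}|\d{2}[ .-](?:\d{2}[ .-]){2}\d{2})(?!\d)/

samples = [
  'call 623456789 now',     # plain nine-digit number
  'ref: 3453459680934983',  # long reference from the question
  'call 623-45-67-89 now',  # separated form
  'id 6234567890',          # ten digits, so not a phone number
]

# String#[] with a regex returns the matched text, or nil when nothing matches.
matches = samples.map { |s| s[phone_re] }
# => ["623456789", nil, "623-45-67-89", nil]
```

Note that, like the original, this still accepts mixed separators such as "623 45.67-89". Whether that is acceptable depends on your data.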
Once you have the numbers, generally you should normalize them before storing them in the database:
found_pns.map{ |s| s[/6(?:\d{8}|\d{2}[ .-](?:\d{2}[ .-]){2}\d{2})/].tr(' .-', '') }
# => ["623456789", "623456789", "623456789", "623456789"]
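If the goal is to pull every number out of a longer text rather than test one string at a time, the find and normalize steps can be combined with `String#scan`. This is only a sketch: `extract_phone_numbers` is a hypothetical helper name, and its default pattern adds digit lookarounds (my addition) so that nine digits inside a longer number are skipped. It uses `String#delete` instead of `tr`, which does the same job here:

```ruby
# Hypothetical helper: returns every phone number found in `text`,
# normalized to nine bare digits.
# The default pattern is the one from this answer plus (?<!\d) and (?!\d),
# an assumption on my part about how long references should be rejected.
def extract_phone_numbers(text, pattern = /(?<!\d)6(?:\d{8}|\d{2}[ .-](?:\d{2}[ .-]){2}\d{2})(?!\d)/)
  # The pattern has no capturing groups, so scan yields whole matches as strings.
  # A trailing "-" in the delete argument is a literal dash, not a range.
  text.scan(pattern).map { |raw| raw.delete(' .-') }
end

text = 'Call 623 45 67 89 or 698.76.54.32 (ref: 3453459680934983).'
numbers = extract_phone_numbers(text)
# => ["623456789", "698765432"]
```

Storing the bare digits keeps comparisons and uniqueness checks simple, and leaves the display format as a separate concern.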
Once you've done that, format them as needed when you're ready to display them:
pn = "623456789".match(/(?<n1>\d{3})(?<n2>\d{2})(?<n3>\d{2})(?<n4>\d{2})/)
# => #<MatchData "623456789" n1:"623" n2:"45" n3:"67" n4:"89">
(I'm using named capture above, but that's just to illustrate how the values are retrieved.)
"%s-%s-%s-%s" % [*pn.captures] # => "623-45-67-89"
or
pn.captures.join('-') # => "623-45-67-89"