How to extract part of a string in Ruby?
I have a line
line = "start on Saturday 1st April 07:30:37 2017"
and I want to extract
"Sat Apr 1 07:30:37 2017"
I tried this ...
line = "start running at Sat April 1 07:30:37 2017"
if line =~ /start running at/
  line.split("start running at ").last
end
... but is there any other way to do this?
This is a way to extract from an arbitrary string a substring that represents the time in a given format. I assumed there is at most one such substring in the string.
require 'time'
R = /
  (?:#{Date::ABBR_DAYNAMES.join('|')})\s
             # match day name abbreviation in non-capture group, space
  (?:#{Date::MONTHNAMES[1,12].join('|')})\s
             # match month name in non-capture group, space
  \d{1,2}\s  # match one or two digits, space
  \d{2}:     # match two digits, colon
  \d{2}:     # match two digits, colon
  \d{2}\s    # match two digits, space
  \d{4}      # match 4 digits
  (?!\d)     # do not match digit (negative lookahead)
  /x         # free-spacing regex definition mode
# /
# (?:Sun|Mon|Tue|Wed|Thu|Fri|Sat)\s
# (?:January|February|March|...|November|December)\s
# \d{1,2}\s
# \d{2}:
# \d{2}:
# \d{2}\s
# \d{4}
# (?!\d)
# /x
def extract_time(str)
  s = str[R]
  return nil if s.nil?
  (DateTime.strptime(s, "%a %B %e %H:%M:%S %Y") rescue nil) ? s : nil
end
str = "start eating breakfast at Sat April 1 07:30:37 2017"
extract_time(str)
#=> "Sat April 1 07:30:37 2017"
str = "go back to sleep at Cat April 1 07:30:37 2017"
extract_time(str)
#=> nil
Alternatively, if there is a match with R but DateTime.strptime raises an exception (the value of s is not a valid time for the given time format), an exception could be raised to inform the user.
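A minimal sketch of that alternative follows. The names TIME_RE and extract_time! are hypothetical and not part of the answer above; the pattern is rebuilt in compact form so the snippet runs on its own, and it assumes ArgumentError is an acceptable exception to surface.

```ruby
require 'date'

# Compact copy of the pattern above (hypothetical name, same matching rules).
TIME_RE = /
  (?:#{Date::ABBR_DAYNAMES.join('|')})\s
  (?:#{Date::MONTHNAMES[1, 12].join('|')})\s
  \d{1,2}\s\d{2}:\d{2}:\d{2}\s\d{4}(?!\d)
/x

# Hypothetical variant of extract_time: returns nil when nothing matches,
# but raises when the matched text is not a valid date and time.
def extract_time!(str)
  s = str[TIME_RE]
  return nil if s.nil?
  begin
    DateTime.strptime(s, "%a %B %e %H:%M:%S %Y")
  rescue ArgumentError
    raise ArgumentError, "matched #{s.inspect}, but it is not a valid time"
  end
  s
end

extract_time!("start eating breakfast at Sat April 1 07:30:37 2017")
#=> "Sat April 1 07:30:37 2017"
```

With this version a string such as "Sat April 31 07:30:37 2017" matches the pattern but raises, because April has no 31st day, so the caller can tell "no timestamp found" apart from "timestamp found but invalid".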
Here's another (slightly faster, as it turns out) option using String#partition:
# returns an empty string if there is no match, whereas split(...).last returns the whole line unchanged
line.partition('start running at ').last
I was wondering how this performs against regex matching, so here's a quick test with 1 million executions each:
line.sub(/start running at (.*)/, '\1')
# => @real=1.7465
line.partition('start running at ').last
# => @real=0.712406
# => this is faster, but you'd need to be calling this quite a bit for it to make a significant difference
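For anyone who wants to repeat the comparison, here is one way it could be set up with the standard Benchmark module. This is a sketch, not the exact script used for the numbers above; the iteration count is lowered to keep it quick, and the timings will differ by machine and Ruby version.

```ruby
require 'benchmark'

line = "start running at Sat April 1 07:30:37 2017"
n = 100_000 # the timings above used 1 million; raise this to reproduce them

# Time the regex substitution approach.
regex_time = Benchmark.realtime do
  n.times { line.sub(/start running at (.*)/, '\1') }
end

# Time the plain-string partition approach.
partition_time = Benchmark.realtime do
  n.times { line.partition('start running at ').last }
end

puts format("sub:       %.4f s", regex_time)
puts format("partition: %.4f s", partition_time)
```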
Bonus: it also makes it very easy to serve a more general case. For example, if you have lines starting with "start running at" and others that start with "stop running at", then something like line.partition(' at ').last will serve both (and actually execute a little faster).
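A short illustration of the general case; the "stop running at" line and the line without a timestamp are made-up inputs.

```ruby
lines = [
  "start running at Sat April 1 07:30:37 2017",
  "stop running at Sat April 1 08:15:02 2017", # made-up example line
  "no timestamp here"                          # made-up line without ' at '
]

# partition splits at the first ' at ' and returns [before, separator, after];
# when the separator is absent, the last element is an empty string.
times = lines.map { |l| l.partition(' at ').last }
#=> ["Sat April 1 07:30:37 2017", "Sat April 1 08:15:02 2017", ""]
```

Note that partition cuts at the first occurrence of ' at ', so this only works if the text before the timestamp does not contain ' at ' itself.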
The shortest one will be line["Sat April 1 07:30:37 2017"], which returns the string "Sat April 1 07:30:37 2017" if it is present and nil if not. The [] notation on String is shorthand for getting a substring, and it accepts either another string or a regular expression. See https://ruby-doc.org/core-2.2.0/String.html#method-i-5B-5D
If the string is not known in advance, you can also use this shorthand with a regex, as Carey suggested:
line[/start running at (.*)/, 1]
If you want to be sure that the retrieved date is valid, you need the regex from his answer, but you can still use this method.
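To make the different forms of [] concrete, here is a small sketch. The variable names and the named-capture variant are additions for illustration only.

```ruby
line = "start running at Sat April 1 07:30:37 2017"

# Literal substring: returns the substring itself, or nil if it is absent.
literal = line["Sat April 1 07:30:37 2017"]
#=> "Sat April 1 07:30:37 2017"

# Regex plus capture index: returns the first capture group.
capture = line[/start running at (.*)/, 1]
#=> "Sat April 1 07:30:37 2017"

# No match: returns nil rather than raising.
missing = line[/stop running at (.*)/, 1]
#=> nil

# Regex plus capture name: the same idea with a named group.
named = line[/start running at (?<time>.*)/, "time"]
#=> "Sat April 1 07:30:37 2017"
```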