How to extract part of a string in Ruby?

I have a string

line = "start running at Sat April 1 07:30:37 2017"

and I want to extract

"Sat April 1 07:30:37 2017"

I tried this ...

line = "start running at Sat April 1 07:30:37 2017"
if (line =~ /start running at/)
   line.split("start running at ").last
end

      

... but is there any other way to do this?

+3




6 answers


This is a way to extract from an arbitrary string a substring that represents the time in a given format. I assumed there is at most one such substring in the string.

require 'time'

R = /
    (?:#{Date::ABBR_DAYNAMES.join('|')})\s
              # match day name abbreviation in non-capture group, space
    (?:#{Date::MONTHNAMES[1,12].join('|')})\s
              # match month name in non-capture group, space
    \d{1,2}\s # match one or two digits, space
    \d{2}:    # match two digits, colon
    \d{2}:    # match two digits, colon
    \d{2}\s   # match two digits, space
    \d{4}     # match 4 digits
    (?!\d)    # do not match digit (negative lookahead)
    /x        # free-spacing regex def mode
  # /
  #  (?:Sun|Mon|Tue|Wed|Thu|Fri|Sat)\s
  #   (?:January|February|March|...|November|December)\s
  # \d{1,2}\s
  # \d{2}:
  # \d{2}:
  # \d{2}\s
  # \d{4}
  # (?!\d)
  # /x 

      



def extract_time(str)
  s = str[R]
  return nil if s.nil?
  (DateTime.strptime(s, "%a %B %e %H:%M:%S %Y") rescue nil) ? s : nil
end

str = "start eating breakfast at Sat April 1 07:30:37 2017"
extract_time(str)
  #=> "Sat April 1 07:30:37 2017" 

str = "go back to sleep at Cat April 1 07:30:37 2017"
extract_time(str)
  #=> nil

      

Alternatively, if there is a match with R but DateTime.strptime raises an exception (meaning the value of s is not a valid time for the given format), one could raise an exception to inform the user.
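A minimal sketch of that alternative follows. The error class InvalidTimeError, the constant names TIME_RE and TIME_FORMAT, and the method name extract_time! are hypothetical and introduced only for illustration; the pattern is the same as R, rebuilt so the snippet runs on its own.

```ruby
require 'date'

# Hypothetical error class, introduced only for this illustration.
class InvalidTimeError < ArgumentError; end

# Same pattern as R, repeated here so the snippet is self-contained.
TIME_RE = /
  (?:#{Date::ABBR_DAYNAMES.join('|')})\s
  (?:#{Date::MONTHNAMES[1, 12].join('|')})\s
  \d{1,2}\s\d{2}:\d{2}:\d{2}\s\d{4}(?!\d)
/x

TIME_FORMAT = "%a %B %e %H:%M:%S %Y"

def extract_time!(str)
  s = str[TIME_RE]
  return nil if s.nil?            # nothing in the string looks like a time
  begin
    DateTime.strptime(s, TIME_FORMAT)
  rescue ArgumentError            # Date::Error is an ArgumentError subclass
    raise InvalidTimeError, "#{s.inspect} matches the pattern but is not a valid time"
  end
  s
end

extract_time!("start running at Sat April 1 07:30:37 2017")
  #=> "Sat April 1 07:30:37 2017"
```

With this version a string such as "Sat April 31 07:30:37 2017" matches the pattern but raises InvalidTimeError, because April has no 31st day, while a string with no match at all still returns nil.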

+4




Try:



line.sub(/start running at (.*)/, '\1')

      

+3




The standard way to do this is with regular expressions:

if md = line.match(/start running at (.*)/)
  md[1]
end

      

But you don't need regular expressions; plain string operations work too:

prefix = 'start running at '
if line.start_with?(prefix)
  line[prefix.size..-1]
end
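On Ruby 2.5 or later, String#delete_prefix can replace the manual slice. The sketch below assumes that Ruby version; the helper name strip_prefix is hypothetical. Note that delete_prefix alone returns an unchanged copy when the prefix is absent, so the start_with? guard is kept to get nil for non-matching lines.

```ruby
# Hypothetical helper: returns the text after the prefix, or nil when the
# line does not start with the prefix. Requires Ruby 2.5+ for delete_prefix.
def strip_prefix(line, prefix)
  line.start_with?(prefix) ? line.delete_prefix(prefix) : nil
end

line   = "start running at Sat April 1 07:30:37 2017"
prefix = "start running at "

strip_prefix(line, prefix)
  #=> "Sat April 1 07:30:37 2017"
strip_prefix("something else entirely", prefix)
  #=> nil
```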

      

+2




Here's another (slightly faster, as it turns out) option using String#partition:

# returns an empty string if there is no match, whereas split(...).last would return the whole line
line.partition('start running at ').last

      

I was wondering how this compares with regex matching, so here's a quick test with 1 million executions each:

line.sub(/start running at (.*)/, '\1')
# => @real=1.7465

line.partition('start running at ').last
# => @real=0.712406
# => this is faster, but you'd need to be calling this quite a bit for it to make a significant difference
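A comparison like this can be reproduced with the standard Benchmark module. The following is only a sketch of one way to time the two calls, not necessarily how the numbers above were obtained; the variable names are illustrative, and the iteration count is reduced so it finishes quickly.

```ruby
require 'benchmark'

line = "start running at Sat April 1 07:30:37 2017"
n    = 100_000  # raise to 1_000_000 to mirror the run quoted above

# Wall-clock time for the regex-based substitution.
sub_time = Benchmark.realtime do
  n.times { line.sub(/start running at (.*)/, '\1') }
end

# Wall-clock time for the plain-string partition.
partition_time = Benchmark.realtime do
  n.times { line.partition('start running at ').last }
end

puts format("sub:       %.4fs", sub_time)
puts format("partition: %.4fs", partition_time)
```

Absolute timings depend on the machine and Ruby version, so only the relative difference is meaningful.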

      


Bonus: it also makes it very easy to handle a more general case, for example if you have some lines starting with "start running at" and others starting with "stop running at". Then something like line.partition(' at ').last will handle both (and actually runs a little faster).
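As a quick illustration of the general case, here is a sketch with made-up input: the "stop running at" line and the line without a timestamp are hypothetical examples, not taken from the question.

```ruby
lines = [
  "start running at Sat April 1 07:30:37 2017",
  "stop running at Sat April 1 08:15:02 2017",   # hypothetical second line
  "no timestamp here"                            # hypothetical non-matching line
]

# partition splits on the FIRST occurrence of ' at ' and returns
# [before, separator, after]; .last is "" when the separator is absent.
results = lines.map { |l| l.partition(' at ').last }
  #=> ["Sat April 1 07:30:37 2017", "Sat April 1 08:15:02 2017", ""]
```

Because partition splits at the first occurrence, this only works when the text before the timestamp does not itself contain ' at '.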

+1




The shortest one will be line["Sat April 1 07:30:37 2017"], which will return your string "Sat April 1 07:30:37 2017" if present and nil if not. The [] notation in String is shorthand for getting a substring from a string, and can be used with another string or a regular expression. See https://ruby-doc.org/core-2.2.0/String.html#method-i-5B-5D

If the string is not known in advance, you can also use this shorthand with a regex, as Carey suggested:

line[/start running at (.*)/, 1]

      

If you want to be sure that the retrieved date is valid, you need the regex from his answer, but you can still use this method.
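For reference, here is a short sketch of the different forms String#[] accepts, using the line from the question. The variable names and the capture group name stamp are arbitrary choices for this example.

```ruby
line = "start running at Sat April 1 07:30:37 2017"

# Literal substring: returns it if present, nil otherwise.
literal = line["Sat April 1 07:30:37 2017"]   #=> "Sat April 1 07:30:37 2017"
missing = line["Sun April 2"]                 #=> nil

# Regex only: returns the whole match.
clock = line[/\d{2}:\d{2}:\d{2}/]             #=> "07:30:37"

# Regex plus capture index: returns that capture group.
by_index = line[/start running at (.*)/, 1]   #=> "Sat April 1 07:30:37 2017"

# Regex plus capture name: returns the named group.
by_name = line[/at (?<stamp>.*)/, "stamp"]    #=> "Sat April 1 07:30:37 2017"
```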

0




And one more alternative:

puts $1 if line =~ /start running at (.*)/

      

0








