How to (easily) split a string in Ruby with length as well as delimiter

I need to split a string in Ruby that has the following format:

 [{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]

      

i.e. it is a generated JavaScript array. Unfortunately the list is long, and I would like to break it at the commas that separate the array elements once a line reaches a certain length, so that it is comfortable to edit in a code editor while each element stays intact. For example, the line above with a split width of 15 would look like this:

 [{a:1,b:2,c:3,d:4},
 {a:5,b:6,c:7,d:8},
 {a:9,b:10,c:11,d:12},
 {a:13,b:14,c:15,d:16}]

      

and with a width of 32 the text will look like this:

 [{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},
 {a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]

      

Apart from the classic "brute force" approach (loop through the characters, track the running length, and break at the separator between } and { once the length exceeds the limit), is there a more "Rubyish" solution to the problem?

Edit: A naive approach is attached. It is definitely not Rubyish, as I don't have a very strong Ruby background:

def split(what, length)
  result = []
  current = +''                            # line being built (unfrozen string)
  what.to_s.each_char do |c|
    current << c
    # Break after the "," that follows a "}", once the line exceeds length
    if c == ',' && current[-2] == '}' && current.length > length
      result << current
      current = +''
    end
  end
  result << current unless current.empty?  # keep the remaining tail
  result.join("\n")
end

      



3 answers


You can use a regex with:

  • a non-greedy repetition of at least width-2 characters

  • followed by }

  • followed by , or ]


data = "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16},{a:17,b:18,c:19,d:20}]"

def split_data_with_min_width(text, width)
  pattern = /
    (                 # capturing group for split
      .{#{width-2},}? # at least width-2 characters, but not more than needed
      \}              # closing curly brace
      [,\]]           # a comma or a closing bracket
    )
    /x                # free spacing mode
  text.split(pattern).reject(&:empty?).join("\n")
end

puts split_data_with_min_width(data, 15)
# [{a:1,b:2,c:3,d:4},
# {a:5,b:6,c:7,d:8},
# {a:9,b:10,c:11,d:12},
# {a:13,b:14,c:15,d:16},
# {a:17,b:18,c:19,d:20}]

puts split_data_with_min_width(data, 32)
# [{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},
# {a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16},
# {a:17,b:18,c:19,d:20}]

      

The method uses split with a capture group instead of scan, because the last part of the line may not be long enough to match, and scan would silently drop it:

"abcde".scan(/../)
# ["ab", "cd"]
"abcde".split(/(..)/).reject(&:empty?)
# ["ab", "cd", "e"]
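
To see the same effect on the array string itself, here is a small sketch. It assumes a five-element input like the one in the output comments above and a width of 32 (so the pattern asks for at least 30 characters before the closing brace); the variable names are only illustrative.

```ruby
data = "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16},{a:17,b:18,c:19,d:20}]"

# Same shape as the pattern above, for width = 32 (width - 2 = 30)
pattern = /.{30,}?\}[,\]]/

# scan keeps only what the pattern matches, so the short tail disappears
scanned = data.scan(pattern)

# split with a capture group returns the matches AND the unmatched leftovers
split_up = data.split(/(#{pattern})/).reject(&:empty?)

puts scanned.length   # 2
puts split_up.length  # 3
puts split_up.last    # {a:17,b:18,c:19,d:20}]
```

The last element is only 22 characters long, so it can never satisfy the 30-character minimum; split returns it as the text left over after the final match.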

      



Code

def doit(str, min_size)   
  r = /
      (?:                # begin non-capture group                
        .{#{min_size},}? # match at least min_size characters, non-greedily
        (?=\{)           # match '{' in a positive lookahead
        |                # or
        .+\z             # match one or more characters followed by end of string
      )                  # close non-capture group
      /x                 # free-spacing regex definition mode
  str.scan(r)
end

      



Examples

str = "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]"

doit(str, 18) # same for 2 <= min_size <= 18
  #=> ["[{a:1,b:2,c:3,d:4},",
  #    "{a:5,b:6,c:7,d:8},",
  #    "{a:9,b:10,c:11,d:12},",
  #    "{a:13,b:14,c:15,d:16}]"] 
doit(str, 19)
  #=> ["[{a:1,b:2,c:3,d:4},",
  #    "{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},",
  #    "{a:13,b:14,c:15,d:16}]"]
doit(str, 20)
  #=> ["[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},",
  #    "{a:9,b:10,c:11,d:12},",
  #    "{a:13,b:14,c:15,d:16}]"] 
doit(str, 21)
  #=> ["[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},",
  #    "{a:9,b:10,c:11,d:12},",
  #    "{a:13,b:14,c:15,d:16}]"] 
doit(str, 22) # same for 23 <= min_size <= 37
  #=> ["[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},",
  #    "{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]"]
doit(str, 38) # same for 39 <= min_size <= 58
  #=> ["[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},",
  #    "{a:13,b:14,c:15,d:16}]"] 
doit(str, 59) # same for min_size > 59
  #=> ["[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]"] 
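
The rule used above (start a new line at the first element boundary once the current line holds at least min_size characters) can also be sketched without interpolating a number into the regex. The helper name wrap_elements is hypothetical and not part of the answer; for the sample string and the min_size values listed above it should return the same arrays as doit, though it behaves differently for min_size below 2.

```ruby
# Hypothetical helper: group elements into lines of at least min_size characters.
def wrap_elements(str, min_size)
  # Zero-width split: cut between "}," and "{" without consuming any characters
  pieces = str.split(/(?<=\},)(?=\{)/)
  pieces.each_with_object(['']) do |piece, lines|
    # Open a new line once the current one has reached min_size characters
    lines << '' if lines.last.length >= min_size
    lines[-1] += piece
  end
end

str = "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]"

# The question asks for text rather than an array, so join the lines
puts wrap_elements(str, 22).join("\n")
```

Because the split is zero-width, joining the returned lines with an empty string gives back the original input unchanged.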

      



Like this?

2.3.1 :007 > a
 => "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]" 
2.3.1 :008 > q =  a.gsub("},", "},\n")
 => "[{a:1,b:2,c:3,d:4},\n{a:5,b:6,c:7,d:8},\n{a:9,b:10,c:11,d:12},\n{a:13,b:14,c:15,d:16}]" 
2.3.1 :009 > puts q
[{a:1,b:2,c:3,d:4},
{a:5,b:6,c:7,d:8},
{a:9,b:10,c:11,d:12},
{a:13,b:14,c:15,d:16}]
 => nil 
2.3.1 :010 > 
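
The gsub call above breaks after every element and ignores the requested width. A width-aware variant could look like the sketch below. The helper name break_after_elements is hypothetical, and the threshold is an assumption: a break is inserted only when at least min_width characters precede the closing brace on the current line.

```ruby
# Hypothetical helper: add a newline after "}," only when the text before
# the closing brace on the current line is at least min_width characters long.
def break_after_elements(text, min_width)
  # Lazy quantifier: stop at the first "}," (followed by "{") that is far enough along
  text.gsub(/.{#{min_width},}?\},(?=\{)/) { |line| "#{line}\n" }
end

a = "[{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},{a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]"

puts break_after_elements(a, 15)
# [{a:1,b:2,c:3,d:4},
# {a:5,b:6,c:7,d:8},
# {a:9,b:10,c:11,d:12},
# {a:13,b:14,c:15,d:16}]

puts break_after_elements(a, 32)
# [{a:1,b:2,c:3,d:4},{a:5,b:6,c:7,d:8},
# {a:9,b:10,c:11,d:12},{a:13,b:14,c:15,d:16}]
```

Text that never matches, such as a short final element, is left in place by gsub, so nothing is dropped.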

      
