Split by space keeping the string inside curly braces

str = "cmd -opt1 { a b c  d e f g h } -opt2" 

      

I need the output like this:

[ 'cmd', '-opt1', '{ a b c  d e f g h }', '-opt2' ]  

      



4 answers


In this situation, don't try to split; use re.findall:

>>> import re
>>> re.findall(r'{[^}]*}|\S+', 'cmd -opt1 { a b c  d e f g h } -opt2')
['cmd', '-opt1', '{ a b c  d e f g h }', '-opt2']
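The order of the two alternatives matters: the brace group is tried first, and only if it fails does the pattern fall back to a run of non-whitespace characters. A minimal sketch of the same pattern in verbose mode, wrapped in a helper (the names `TOKEN` and `split_keep_braces` are hypothetical, chosen for illustration):

```python
import re

# Same pattern as in the answer, written in verbose mode so that each
# alternative can carry a comment. The braces are escaped for readability.
TOKEN = re.compile(r"""
    \{[^}]*\}   # a brace group: an opening brace, any non-closing chars, a closing brace
  | \S+         # otherwise, a run of non-whitespace characters
""", re.VERBOSE)

def split_keep_braces(text):
    """Return whitespace-separated tokens, keeping each {...} group whole."""
    return TOKEN.findall(text)

print(split_keep_braces("cmd -opt1 { a b c  d e f g h } -opt2"))
# ['cmd', '-opt1', '{ a b c  d e f g h }', '-opt2']
```

This only handles a single level of braces, as in the question.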

      

If you need to handle nested curly braces, the re module is not enough; you need the third-party regex module, which supports recursion.

>>> import regex
>>> regex.findall(r'[^{}\s]+|{(?:[^{}]+|(?R))*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
['cmd', '-opt1', '{ a b {c d} e f}', '-opt2']

      



Here (?R) refers to the whole pattern itself.

Or this one, which is better, since the inner loop is unrolled rather than written as an alternation:

regex.findall(r'[^{}\s]+|{[^{}]*+(?:(?R)[^{}]*)*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
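If installing the regex module is not an option, nested braces can also be handled without any regular expression by tracking the brace depth while scanning the string. This is a different technique from the recursive patterns above, shown only as a rough sketch; the function name `split_nested` is hypothetical, and it is not guaranteed to agree with the regex patterns on every input (for example, unbalanced braces or braces glued to other text).

```python
def split_nested(text):
    """Split on whitespace, keeping balanced {...} groups (possibly nested) whole."""
    tokens, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        if ch.isspace() and depth == 0:
            # Whitespace outside any brace group ends the current token.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens

print(split_nested("cmd -opt1 { a b {c d} e f} -opt2"))
# ['cmd', '-opt1', '{ a b {c d} e f}', '-opt2']
```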

      



\s+(?![^{]*})

      

You can split on this pattern. See the demo:



https://regex101.com/r/jV9oV2/6
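In Python, the pattern can be passed to re.split. The negative lookahead rejects any whitespace that is followed by a closing brace with no opening brace in between, that is, whitespace inside a brace group. A minimal sketch (the name `SEPARATOR` is hypothetical):

```python
import re

# Whitespace that is NOT followed by "...}" without an intervening "{".
SEPARATOR = re.compile(r"\s+(?![^{]*})")

parts = SEPARATOR.split("cmd -opt1 { a b c  d e f g h } -opt2")
print(parts)
# ['cmd', '-opt1', '{ a b c  d e f g h }', '-opt2']
```

Like the first re.findall pattern, this assumes the braces are not nested. Note also that re.split returns empty strings at the edges if the input starts or ends with whitespace.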



Take a look at the argparse module, as I am assuming that you are writing code to parse the arguments of your program. Usually these arguments are stored in sys.argv, so you don't even have to worry about splitting the command line. If you insist on using the command line, you can convert the argument string to an argument list using str.split.

import argparse

parser = argparse.ArgumentParser(description='whatever cmd does.')
parser.add_argument('--opt1', metavar='N', type=int, nargs='+', default=[],
                    help='integers')

# Reads sys.argv[1:] when called without arguments.
options = parser.parse_args()

for n in options.opt1:
    print(n)  # do something with n
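To try the parser on a command-line string instead of sys.argv, parse_args accepts an explicit list of arguments. A small sketch under the same assumptions as the answer (an integer-valued --opt1 option); keep in mind that plain str.split also breaks a brace group apart, so this only helps when individual arguments contain no spaces.

```python
import argparse

parser = argparse.ArgumentParser(description="whatever cmd does.")
parser.add_argument("--opt1", metavar="N", type=int, nargs="+", default=[],
                    help="integers")

# The explicit list stands in for sys.argv[1:].
options = parser.parse_args("--opt1 1 2 3".split())
print(options.opt1)
# [1, 2, 3]
```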

      



Just split on { and }, then split the individual pieces on whitespace:

>>> s = "cmd -opt1 { a b c d e f g h } -opt2"
>>> a, b = s.split("{")
>>> c, d = b.split("}")
>>> a.split() + ["{{{0}}}".format(c)] + d.split()
['cmd', '-opt1', '{ a b c d e f g h }', '-opt2']
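The two-name unpacking above raises ValueError unless the string contains exactly one { and one }. A possible generalization to any number of non-nested groups uses str.partition in a loop; this is a sketch, not part of the original answer, and the name `split_on_braces` is hypothetical.

```python
def split_on_braces(text):
    """Split on whitespace, keeping each non-nested {...} group whole."""
    tokens = []
    while text:
        # Everything before the next "{" is split on whitespace as usual.
        before, brace, rest = text.partition("{")
        tokens.extend(before.split())
        if not brace:
            break
        # Everything up to the next "}" is kept as one token.
        inside, close, text = rest.partition("}")
        tokens.append("{" + inside + close)
    return tokens

print(split_on_braces("cmd -opt1 { a b c  d e f g h } -opt2"))
# ['cmd', '-opt1', '{ a b c  d e f g h }', '-opt2']
```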

      
