Split by space keeping the string inside curly braces
In this situation, don't try to split; use re.findall instead:
>>> import re
>>> re.findall(r'{[^}]*}|\S+', 'cmd -opt1 { a b c d e f g h } -opt2')
['cmd', '-opt1', '{ a b c d e f g h }', '-opt2']
If you need to deal with nested curly braces, the re module is not enough; you need the third-party regex module, which supports recursive patterns:
>>> import regex
>>> regex.findall(r'[^{}\s]+|{(?:[^{}]+|(?R))*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
['cmd', '-opt1', '{ a b {c d} e f}', '-opt2']
Here (?R) refers to the whole pattern itself.
Or this variant, which unrolls the inner alternation and should therefore backtrack less:
>>> regex.findall(r'[^{}\s]+|{[^{}]*+(?:(?R)[^{}]*)*+}', 'cmd -opt1 { a b {c d} e f} -opt2')
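If installing the regex module is not an option, the same nested case can also be handled without regular expressions by tracking brace depth by hand. The following is only a minimal sketch using the standard library; the function name split_keep_braces is hypothetical, and it assumes braces are balanced. Unlike the patterns above, it keeps a brace group glued to any non-space text directly next to it (for example, a{b} stays one token).

```python
def split_keep_braces(text):
    """Split on whitespace, but keep {...} groups (nested or not) in one piece.

    Hypothetical helper for illustration; assumes balanced braces.
    """
    tokens = []
    current = []   # characters of the token being built
    depth = 0      # how many braces are currently open
    for ch in text:
        if ch == '{':
            depth += 1
            current.append(ch)
        elif ch == '}':
            depth = max(depth - 1, 0)  # tolerate a stray closing brace
            current.append(ch)
        elif ch.isspace() and depth == 0:
            # Whitespace outside any braces ends the current token.
            if current:
                tokens.append(''.join(current))
                current = []
        else:
            # Ordinary characters, and whitespace inside braces, are kept.
            current.append(ch)
    if current:
        tokens.append(''.join(current))
    return tokens


print(split_keep_braces('cmd -opt1 { a b {c d} e f} -opt2'))
# ['cmd', '-opt1', '{ a b {c d} e f}', '-opt2']
```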
Take a look at the argparse module, as I am assuming that you are writing code to parse the arguments of your program. These arguments are normally available in sys.argv, already split into a list, so you don't even have to worry about splitting the command line. If you insist on starting from a command-line string, you can convert it to an argument list with str.split and pass that list to parser.parse_args.
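import argparse

parser = argparse.ArgumentParser(description='whatever cmd does.')
parser.add_argument('--opt1', metavar='N', type=int, nargs='+',
                    help='integers')
# With no arguments, parse_args() reads sys.argv[1:].
options = parser.parse_args()
# options.opt1 is None when --opt1 is not given, hence the "or []".
for n in options.opt1 or []:
    print(n)  # do something with n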
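As a rough illustration of the str.split route, parse_args accepts an explicit list of arguments instead of reading sys.argv. The command_line string below is made up for the example. Note that str.split knows nothing about braces or quotes, so this only works for simple, whitespace-separated options.

```python
import argparse

parser = argparse.ArgumentParser(description='whatever cmd does.')
parser.add_argument('--opt1', metavar='N', type=int, nargs='+',
                    help='integers')

# Hypothetical argument string, split on whitespace into a list.
command_line = '--opt1 1 2 3'
options = parser.parse_args(command_line.split())

print(options.opt1)
# [1, 2, 3]
```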