Regex matches end of line
I'm looking for a Bash regex to pull the db names out of the commands below. However, the order of the arguments is not guaranteed. For some reason, I cannot get it to work completely.
What am I still missing?
regex="--db (.*)($| --)"
[[ $@ =~ $regex ]]
DB_NAMES="${BASH_REMATCH[1]}"
# These are example lines
somecommand --db myDB --conf /var/home # should get "myDB"
somecommand --db myDB anotherDB manymoreDB --conf /home # should get "myDB anotherDB manymoreDB"
somecommand --db myDB # should get "myDB"
somecommand --db myDB anotherDB # should get "myDB anotherDB"
Any suggestions for the regex?
The problem is that bash uses a regex flavor that doesn't include the non-greedy repetition operators (*?, +?). Since * is greedy, and there is no way to tell it not to be greedy, the first parenthesized subexpression, (.*), matches everything up to the end of the line.
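To see the greediness in action, here is a minimal sketch that runs the regex from the question against one of the example lines. The variable names args and greedy_capture are only illustrative.

```bash
#!/usr/bin/env bash
# The regex from the question, stored in a variable so the space and
# parentheses need no escaping.
regex='--db (.*)($| --)'
args='somecommand --db myDB anotherDB --conf /home'

greedy_capture=''
if [[ $args =~ $regex ]]; then
    # (.*) takes as much as it can, so the "$" alternative wins over " --".
    greedy_capture=${BASH_REMATCH[1]}
fi
printf 'captured: "%s"\n' "$greedy_capture"
# prints: captured: "myDB anotherDB --conf /home"
```

The capture runs past --conf because the longest possible match is the one that ends at the end of the line.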
You can get around this if you know that the values you want to capture do not contain a specific character: replace . with a character class that excludes that character.
For example, if the values after --db do not contain a dash (-), you can use this regex:
regex='--db ([^-]*)($| --)'
It matches all the examples posted in the question.
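As a quick check, the sketch below wraps that regex in a small function and runs it over the four example lines. The function name extract_db_names is hypothetical, and the sketch assumes the arguments can safely be joined with single spaces.

```bash
#!/usr/bin/env bash
# extract_db_names is a hypothetical helper, not part of the question.
extract_db_names() {
    local regex='--db ([^-]*)($| --)'
    # "$*" joins all arguments into one space-separated string.
    if [[ "$*" =~ $regex ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_db_names somecommand --db myDB --conf /var/home
extract_db_names somecommand --db myDB anotherDB manymoreDB --conf /home
extract_db_names somecommand --db myDB
extract_db_names somecommand --db myDB anotherDB
```

Keep in mind the stated assumption: a name that itself contains a dash, such as my-db, will not match this pattern at all.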
The following works:
regex="--db[[:space:]]([[:alnum:][:space:]]+)([[:space:]]--|$)"
[[ "$@" =~ $regex ]]
There were two issues:
- Character classes such as [[:space:]] should be used to represent spaces.
- (.*) is greedy and will match through to your last -- literal. Since bash does not support non-greedy matching, we match with [[:alnum:][:space:]] instead, which ensures we stop at the next --.
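Here is a small sketch of how the captured text might then be split into individual names. The variable names line and db_names are illustrative, and the sketch assumes the names are purely alphanumeric, since that is all the character class allows.

```bash
#!/usr/bin/env bash
regex='--db[[:space:]]([[:alnum:][:space:]]+)([[:space:]]--|$)'
line='somecommand --db myDB anotherDB manymoreDB --conf /home'

db_names=()
if [[ $line =~ $regex ]]; then
    # Split the captured text on whitespace into an array.
    read -r -a db_names <<< "${BASH_REMATCH[1]}"
fi
printf 'found %d name(s): %s\n' "${#db_names[@]}" "${db_names[*]}"
# prints: found 3 name(s): myDB anotherDB manymoreDB
```

Names containing characters outside [[:alnum:]], such as an underscore or a dash, would need those characters added to the bracket expression.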
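By default, a regex quantifier matches as much as possible (it is greedy); use a non-greedy (lazy) quantifier instead. You can also put the -- alternative first so that the engine tries it first. Note that lazy quantifiers are a PCRE-style feature; the POSIX extended regular expressions behind bash's =~ do not define them, so this pattern needs a different engine.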
--db[[:space:]](.*?)([[:space:]]--|$)
If you don't want to capture the --, you can use a non-capturing group:
--db[[:space:]](.*?)(?:[[:space:]]--|$)
                     ^^ Notice the ?:
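If you need the "stop at the first --" behaviour inside bash itself, one possible workaround is to skip the regex and use parameter expansion, which offers both shortest and longest pattern removal. This is a different technique from the lazy regex above, shown only as a sketch; the function name db_names_lazy is hypothetical, and it assumes the arguments are joined with single spaces.

```bash
#!/usr/bin/env bash
# db_names_lazy is a hypothetical helper that imitates a lazy match
# without using a regex at all.
db_names_lazy() {
    local args="$*"
    # No "--db " present: nothing to extract.
    [[ $args == *"--db "* ]] || return 1
    # Remove the shortest prefix that ends in "--db ".
    local rest=${args#*"--db "}
    # Remove the longest suffix that starts with " --",
    # i.e. cut at the first option that follows the names.
    printf '%s\n' "${rest%%" --"*}"
}

db_names_lazy somecommand --db myDB anotherDB manymoreDB --conf /home
db_names_lazy somecommand --db my-db --conf /home
```

Because only " --" ends the capture, this variant also accepts names that contain a single dash.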