What does this regex do?

I am working on converting a program from Perl to Java. I ran into this line:

my ($title) = ($info{$host} =~ /^\s*\(([^\)]+)\)\s*$/);

      

I am not very good at regex, but from what I can tell it matches the value of $info{$host} against the regex ^\s*\(([^\)]+)\)\s*$ and assigns the match to $title.

My problem is that I don't know what the regex is doing and what it will match. Any help would be appreciated.

Thanks.

+2




4 answers


The regular expression matches a string containing exactly one pair of matching parentheses (actually, one opening and one closing parenthesis, with any number of additional opening parentheses allowed in between).

The string may begin and end with whitespace characters, but no others. Inside the parentheses there can be arbitrary characters (at least one), as long as none of them is a closing parenthesis.

The following lines should match it:



 (abc)
 (()
   (ab)

      

By the way, you can use the regex as-is in Java (after escaping the backslashes) using the Pattern class.
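
For the Java side, here is a minimal sketch of how the Perl line might be translated with java.util.regex. The class name TitleExtractor and the method extractTitle are made up for illustration, and returning null when nothing matches is an assumption standing in for Perl leaving $title undefined.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TitleExtractor {
    // Same pattern as the Perl original; every backslash is doubled
    // because it sits inside a Java string literal.
    private static final Pattern TITLE =
            Pattern.compile("^\\s*\\(([^\\)]+)\\)\\s*$");

    // Returns the text captured by group 1, or null when the input
    // does not match (an assumed stand-in for Perl's undef).
    static String extractTitle(String info) {
        if (info == null) {
            return null;
        }
        Matcher m = TITLE.matcher(info);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractTitle("  (some stuff)  "));     // some stuff
        System.out.println(extractTitle(" (()"));                 // (
        System.out.println(extractTitle("(some stuff) asadsad")); // null
    }
}
```

Compiling the Pattern once in a static field avoids recompiling it for every host entry, which is probably what you want if this runs in a loop.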

+4




It will match optional whitespace, then a left paren, then some text not including a right paren, then a right paren, and then more optional whitespace.

Matches:

      (some stuff)  

      



Fails to match:

 (some stuff

     some stuff)

   (some stuff)  asadsad
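
If you want to check these cases from Java, something like the following sketch should do. The class name MatchCheck and the helper matches are hypothetical names chosen for this example; the pattern is the one from the question with its backslashes doubled.

```java
import java.util.regex.Pattern;

class MatchCheck {
    private static final Pattern P =
            Pattern.compile("^\\s*\\(([^\\)]+)\\)\\s*$");

    // True when the whole string is optional whitespace, a parenthesized
    // run of characters other than ')', and optional whitespace.
    static boolean matches(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "      (some stuff)  ",     // matches
            " (some stuff",             // no closing paren
            "     some stuff)",         // no opening paren
            "   (some stuff)  asadsad"  // trailing non-whitespace
        };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + matches(s));
        }
    }
}
```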

      

+4




OK, step by step:

/ - opens the regex

^ - beginning of line

\s* - zero or more whitespace characters

\( - a literal ( character

( - start of a capture group

[^\)]+ - one or more characters that are not )

) - end of the capture group

\) - a literal ) character

\s* - zero or more whitespace characters

$ - end of line

/ - closes the regex

So, as far as I can tell, we are looking for strings like "(abc)" or "(^)": a pair of parentheses around at least one character that is not a closing parenthesis.

+1




my ($title) = ($info{$host} =~ /^\s*\(([^\)]+)\)\s*$/);

      

First, m// in list context returns the captured matches. my ($title) puts the right-hand side in list context. Second, $info{$host} is matched against the following pattern:

/^ \s* \( ( [^\)]+) \) \s* $/x

      

Yes, I used the x flag so I could insert some spaces. ^\s* skips any leading whitespace. Then we have an escaped parenthesis (so no capture group is created), then a capture group containing [^\)]+. This character class is better written as [^)], because the right parenthesis is not special inside a character class; it means anything but a right parenthesis.

If an opening parenthesis is followed by one or more characters other than a closing parenthesis and then by a closing parenthesis, and the whole thing is optionally surrounded by whitespace on each side, that sequence of characters is captured and placed in $title.
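
Java has a rough counterpart to Perl's x flag in Pattern.COMMENTS, so the spaced-out pattern can be carried over more or less as written. This is only a sketch: the class name CommentedPattern and the method extractTitle are invented for the example, and returning null on a failed match is my assumption for how to represent an undefined $title.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentedPattern {
    // With Pattern.COMMENTS, whitespace in the pattern is ignored and
    // '#' starts a comment that runs to the end of the line.
    private static final Pattern TITLE = Pattern.compile(
            "^ \\s*       # leading whitespace\n"
          + "\\(          # a literal opening parenthesis\n"
          + "( [^\\)]+ )  # group 1: one or more non-closing-paren characters\n"
          + "\\)          # a literal closing parenthesis\n"
          + "\\s* $       # trailing whitespace, then end of input\n",
            Pattern.COMMENTS);

    // Returns group 1 on a match, otherwise null (assumed stand-in for undef).
    static String extractTitle(String info) {
        Matcher m = TITLE.matcher(info);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractTitle(" (abc) ")); // abc
        System.out.println(extractTitle("abc)"));    // null
    }
}
```

One difference worth keeping in mind: in Java's comments mode, whitespace inside a character class is ignored as well, so it is safer not to put spaces inside the brackets.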

0

