PowerShell regex vs. "other" regex: what is different?

I have a PowerShell script that should match the following regex:

---\n(0[1-9]|1[0-2][\/](0[1-9]|[12]\d|3[01])[\/]\d{2}[\s\S]+?)-----

      

The line to match is the following snippet of the log file:

-------------------------------------------------- ----------------------------- 
10/26/16 11:41:26 - Process (15925376.4) User (mqm) Program (amqzmuc0)
                    Host (aixmq1) Installation (Installation1)
                    VRMF (8.0.0.4) QMgr (ecs.queue.manager)
                    AMQ6287: WebSphere MQ V8.0.0.4 (p800-004-151017).

EXPLANATION: WebSphere MQ system information: Host Info: -
AIX 7.1 (MQ AIX 64-bit) Installation: - / usr / mqm (Installation1)
Version: - 8.0.0.4 (p800-004-151017) ACTION: None.
-------------------------------------------------- ----------------------------- 
10/26/16 11:41:26 - Process (15925376.4) User (mqm) Program (amqzmuc0)
                    Host (aixmq1) Installation (Installation1)
                    VRMF (8.0.0.4) QMgr (ecs.queue.manager)
                    AMQ6287: WebSphere MQ V8.0.0.4 (p800-004-151017).

EXPLANATION: FFF WebSphere MQ system information: Host Info: -
AIX 7.1 (MQ AIX 64-bit) Installation: - / usr / mqm (Installation1)
Version: - 8.0.0.4 (p800-004-151017) ACTION: None.
-------------------------------------------------- -----------------------------

Using this regex in perl and regexr.com, it matches two sections from this snippet of the log file perfectly.

Now I have implemented the same regex in PowerShell, and it does not return any matches unless I remove the hyphens preceding the \n. If I replace these hyphens with a group containing only a hyphen, it works as well.

To understand what is going on, I need to know why the matching behavior in PowerShell is so different. Why does it stop matching as soon as there are hyphens at the beginning?

The following .NET regex tester shows the same behavior as PowerShell:

http://regexstorm.net/tester

Could someone please explain why the matching behavior in PowerShell differs from that of perl / regexr.com?

This is the PowerShell code snippet I'm using to match this regex:

$matches = ([regex]::matches($sInput, "---\n(0[1-9]|1[0-2][\/](0[1-9]|[12]\d|3[01])[\/]\d{2}[\s\S]+?)\n-") | %{$_.value});

      



1 answer


On Windows, line endings are (usually) CRLF (two characters: a carriage return followed by a line feed), whereas on Unix-based operating systems (pretty much everything but Windows) a line ending is just LF. The escape sequence \n refers to LF. To match CR, use \r.

So I think that if your input contains CRLF, then -\n won't match it, because a CR sits between the hyphen and the LF. A bare \n will still match, because it does not care that the preceding character is a CR.
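To make the point above concrete, here is a minimal sketch. The sample string is made up for illustration and only assumes that the input uses CRLF line endings, as suspected.

```powershell
# Hypothetical sample: a dashed separator followed by a CRLF line ending.
$crlfText = "---`r`n10/26/16 11:41:26"

# '---\n' needs an LF immediately after the last hyphen, but a CR sits in between.
$strictCount = [regex]::Matches($crlfText, '---\n').Count

# A bare '\n' places no requirement on the character before it.
$bareCount = [regex]::Matches($crlfText, '\n').Count

"strict: $strictCount, bare: $bareCount"   # strict: 0, bare: 1
```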

The websites you use for testing may or may not normalize line endings, and may match for that reason, while a .NET tester may do the opposite.
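If it is unclear which line endings a given input really contains, one way to check is to count them directly. The function name Get-LineEndingSummary below is a hypothetical helper invented for this sketch, not an existing cmdlet.

```powershell
# Hypothetical helper (not a built-in cmdlet): counts CRLF and lone-LF line endings.
function Get-LineEndingSummary {
    param([string]$Text)
    # CR immediately followed by LF.
    $crlf = [regex]::Matches($Text, '\r\n').Count
    # LF that is not preceded by CR (uses a .NET negative lookbehind).
    $lf = [regex]::Matches($Text, '(?<!\r)\n').Count
    [pscustomobject]@{ CRLF = $crlf; LF = $lf }
}

$windowsStyle = Get-LineEndingSummary "a`r`nb`r`nc"   # CRLF = 2, LF = 0
$unixStyle    = Get-LineEndingSummary "a`nb`nc"       # CRLF = 0, LF = 2
```

Running the same check on the real $sInput should show whether CRLF is actually present before the regex is changed.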



For reference, when I need to match line breaks in a regex, I use \r?\n (an optional CR followed by LF) so that I catch both types of line endings.

So, in your example, you should be able to change the start of your regex from ---\n to ---\r?\n and make it work, if I am right about your particular problem.
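As a rough sketch of that change, the snippet below runs the regex from the question and the \r?\n variant against a shortened, made-up version of the log. It assumes the log is joined with CRLF, and the shortened log lines are illustrative only.

```powershell
# Shortened, hypothetical stand-in for the log from the question, joined with CRLF.
$sep = '-' * 20
$lines = @(
    $sep,
    '10/26/16 11:41:26 - Process (15925376.4) User (mqm) Program (amqzmuc0)',
    'EXPLANATION: WebSphere MQ system information',
    $sep,
    '10/26/16 11:41:26 - Process (15925376.4) User (mqm) Program (amqzmuc0)',
    'EXPLANATION: FFF WebSphere MQ system information',
    $sep
)
$sInput = $lines -join "`r`n"

# The pattern from the question, and the same pattern with an optional CR before the first LF.
$original = '---\n(0[1-9]|1[0-2][\/](0[1-9]|[12]\d|3[01])[\/]\d{2}[\s\S]+?)\n-'
$tolerant = '---\r?\n(0[1-9]|1[0-2][\/](0[1-9]|[12]\d|3[01])[\/]\d{2}[\s\S]+?)\n-'

$originalCount = [regex]::Matches($sInput, $original).Count   # 0 with CRLF input
$tolerantCount = [regex]::Matches($sInput, $tolerant).Count   # 2 with CRLF input
```

The trailing \n- needs no change, because there the LF is followed directly by a hyphen regardless of whether a CR precedes it.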
