Why does \\. equal \. in preg_replace?

In an answer to this fantastic question, the following regex is used in a preg_replace call (from that answer's auto_version function):

'{\\.([^./]+)$}'

      

The ultimate goal of this regex is to extract the file extension from the given filename. However, I am confused as to why the beginning of this regex works. Namely:

Why does \\. match in the same way as \. in the regex?

Shouldn't the first match (a) one literal backslash followed by (b) any character, and the second match one literal period? The rules for single-quoted strings say that \\ yields a literal backslash.

Consider this simple example:

$regex1 = '{\.([^./]+)$}';  // Variant 1 (one backslash)
$regex2 = '{\\.([^./]+)$}'; // Variant 2 (two backslashes)

$subject1 = '/css/foobar.css';   // Regular path
$subject2 = '/css/foobar\\.css'; // Literal backslash before period

echo "<pre>\n";
echo "Subject 1: $subject1\n";
echo "Subject 2: $subject2\n\n";

echo "Regex 1: $regex1\n";
echo "Regex 2: $regex2\n\n";

// Test Variant 1
echo preg_replace($regex1, "-test.\$1", $subject1) . "\n";
echo preg_replace($regex1, "-test.\$1", $subject2) . "\n\n";

// Test Variant 2
echo preg_replace($regex2, "-test.\$1", $subject1) . "\n";
echo preg_replace($regex2, "-test.\$1", $subject2) . "\n\n";
echo "</pre>\n";

      

Output:

Subject 1: /css/foobar.css
Subject 2: /css/foobar\.css

Regex 1: {\.([^./]+)$}  <-- Output matches regex 2
Regex 2: {\.([^./]+)$}  <-- Output matches regex 1

/css/foobar-test.css
/css/foobar\-test.css

/css/foobar-test.css
/css/foobar\-test.css

      

Long story short: why does \\. give the same results in a preg_replace call as \.?

+3




2 answers


Note that two levels of escaping are at work: PHP sees \\. and says "OK, this is really \.". Then the regex engine sees \. and says "OK, this means a literal dot".



If you remove the first backslash, PHP sees \. and says "this is a backslash followed by some other character, not a single quote or backslash as per the spec, so it stays \.". The regex engine again sees \. and gives the same result as above.
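The two steps described above can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names are illustrative, and the four-backslash pattern is added only as a contrast to show what the question expected \\. to do.

```php
<?php
// Single-quoted literals: PHP treats only \\ and \' as escape sequences.
$oneBackslash   = '{\.([^./]+)$}';  // \. is not an escape, so the backslash is kept
$twoBackslashes = '{\\.([^./]+)$}'; // \\ collapses to one backslash

// After PHP has parsed the literals, both variables hold the same string.
$identical = ($oneBackslash === $twoBackslashes);

// For the regex engine to see "literal backslash, then any character",
// the single-quoted literal needs four backslashes before the dot.
$fourBackslashes = '{\\\\.([^./]+)$}';

$subject = '/css/foobar\\.css'; // holds /css/foobar\.css

$dotOnly      = preg_replace($twoBackslashes, '-test.$1', $subject);
$backslashToo = preg_replace($fourBackslashes, '-test.$1', $subject);

var_dump($identical);      // bool(true)
echo $dotOnly, "\n";       // /css/foobar\-test.css
echo $backslashToo, "\n";  // /css/foobar-test.css
```

With two backslashes only the dot and extension are replaced, so the backslash in the subject survives; with four, the backslash is part of the match and is replaced as well.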

+11




In addition to John's perfectly correct answer:



Please also consider the difference between the two kinds of quotes (" vs '). With ', you cannot write control characters (such as a newline) as escape sequences. With " you can, using escape sequences of the form \?, where ? can be different things (e.g. \n, \t, etc.). So, if you want a real \ in a double-quoted string, you need to escape the backslash by writing \\. In single quotes this is only required when the backslash comes directly before a single quote, before another backslash, or at the end of the string.
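As a rough illustration of the quoting difference, here is a small sketch; the variable names are made up for this example, and the comparisons rely only on PHP's documented string escape rules.

```php
<?php
$single = 'a\nb'; // backslash and n stay as two separate characters
$double = "a\nb"; // \n becomes a single newline character

$singleLength = strlen($single);
$doubleLength = strlen($double);

// Both literals below hold exactly one backslash.
$sameBackslash = ("\\" === '\\');

// An unrecognised escape such as \. keeps its backslash in both quote styles,
// which is why the regex from the question behaves the same either way.
$sameDot = ("\." === '\.');

var_dump($singleLength);  // int(4)
var_dump($doubleLength);  // int(3)
var_dump($sameBackslash); // bool(true)
var_dump($sameDot);       // bool(true)
```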

0

