REGEXP returns false for special characters
I am not very good at regular expressions and hope someone can explain this to me. I found the following in code I am debugging, and I cannot work out why it fails in this scenario.
I know that \p{L} matches one code point in the letter category, and that 0-9 is numeric.
$regExp = /^\s*
(?P([0-2]?[1-9]|[12]0|3[01]))\s+
(?P\p{L}+?)\s+
(?P[12]\d{3})\s*$/i;
$value = '12 Février 2015' ;
$matches = array();
$match = preg_match($regExp, $value, $matches);
More info: I came up with the following:
$match = preg_match("/^\s*(?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+(?P<monthNameFull>\p{L}+?)\s+(?P<yearFull>[12]\d{3})\s*$/i", "18 Février 2015");
var_dump($match); // prints int(0)
But if the value is 18 February 2015, it prints int(1). Why is this so? I expected it to return 1 for both values, because \p{L} should accept Unicode letters.
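// u: treat pattern and subject as UTF-8; s: dot also matches newlines; D: $ matches only at the very end
// Groups: d = day of month, m = month name, y = four-digit year
$regExp = '/^\s*(?P<d>([0-2]?[1-9]|[12]0|3[01]))\s+(?P<m>\p{L}+?)\s+(?P<y>[12]\d{3})\s*$/usD';
$value = '12 Février 2015';
$matches = array();
$match = preg_match($regExp, $value, $matches);
var_dump($matches);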
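A named group always has to be written as (?P<name>...); a bare (?P( is a pattern error. For Unicode strings you also need the u modifier, and usD is a handy set of flags: u makes PCRE treat both the pattern and the subject as UTF-8, s lets the dot match newlines, and D makes $ match only at the very end of the subject. The u is the one that matters here. Without it the subject is read byte by byte, the é in Février is two bytes in UTF-8 (C3 A9), and the second byte is not a letter, so \p{L}+ stops short and the match fails. usD is easy to remember as the US dollar.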
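To check the effect of the u modifier on its own, here is a minimal sketch that runs the pattern from the question twice, once with /i and once with /iu. It assumes PHP 7 or later (for the "\u{...}" escape, which guarantees the test string is UTF-8 regardless of how the file is saved); the variable names $body, $withoutU and $withU are only for illustration.

```php
<?php
// Pattern body from the question; only the modifiers differ between the two runs.
$body = '^\s*(?P<monthDay>([0-2]?[1-9]|[12]0|3[01]))\s+'
      . '(?P<monthNameFull>\p{L}+?)\s+'
      . '(?P<yearFull>[12]\d{3})\s*$';

// "18 Février 2015" with the é written as an explicit code point (UTF-8 encoded).
$value = "18 F\u{00E9}vrier 2015";

// Without u: the subject is matched byte by byte.
$withoutU = preg_match('/' . $body . '/i', $value);

// With u: pattern and subject are interpreted as UTF-8.
$matches = array();
$withU = preg_match('/' . $body . '/iu', $value, $matches);

var_dump($withoutU);                   // int(0)
var_dump($withU);                      // int(1)
var_dump($matches['monthNameFull']);   // the captured month name
echo bin2hex("\u{00E9}"), "\n";        // c3a9: the two bytes that make up é
```

The s and D modifiers are left out here on purpose, since the pattern contains no dot and the subject has no trailing newline; adding them should not change the result for this input.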