PHP Regex find all capitalized words in a string
I am using a PHP regex to find all capitalized words in a string:
$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";
preg_match_all('/\b([A-Z-][\p{L}\pL]+)\b/', $string, $matches);
var_dump($matches);
Output:
array(2) {
[0]=>
array(2) {
[0]=>
string(8) "YDRŪNAS"
[1]=>
string(10) "PAVARDENIS"
}
[1]=>
array(2) {
[0]=>
string(8) "YDRŪNAS"
[1]=>
string(10) "PAVARDENIS"
}
}
Question: why does the symbol 'Ž' disappear? How should I change the regular expression so that UTF-8 characters are not removed?
1 answer
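Basically, you need to use the u modifier when working with Unicode strings. Without it, PCRE treats the subject as single bytes, and [A-Z-] only covers ASCII, so the two-byte Ž cannot match at the start of the word. The regex can also be simplified using the [:upper:] character class, as with the u modifier it will match all uppercase Unicode characters.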
Like this:
$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";
preg_match_all("/[[:upper:]]+/u", $string, $matches);
var_dump($matches);
Output:
array(1) {
[0]=>
array(2) {
[0]=>
string(10) "ŽYDRŪNAS"
[1]=>
string(10) "PAVARDENIS"
}
}
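Note that [[:upper:]]+ only picks up runs of uppercase letters. If the goal is words that merely start with a capital letter, a sketch along these lines might work. The function name findCapitalizedWords is made up for illustration, and the pattern assumes that a "word" is a run of Unicode letters (\p{L}):

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumption: a "word" is a run of Unicode letters (\p{L}).
function findCapitalizedWords(string $text): array
{
    // (?<!\p{L})  the previous character must not be a letter (word start)
    // \p{Lu}      one uppercase letter in any script
    // \p{L}*      the rest of the word
    // u modifier  read pattern and subject as UTF-8 instead of single bytes
    preg_match_all('/(?<!\p{L})\p{Lu}\p{L}*/u', $text, $matches);

    return $matches[0];
}

$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";
var_dump(findCapitalizedWords($string)); // "ŽYDRŪNAS" and "PAVARDENIS"
```

For mixed-case input such as "Jonas met ŽYDRŪNAS Pavardenis in Šiauliai", this variant should return Jonas, ŽYDRŪNAS, Pavardenis and Šiauliai, while the [[:upper:]]+ pattern would return only the capital letters themselves for the mixed-case words.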