PHP Regex find all capitalized words in a string

I am using this PHP regex to find all capitalized words in a string:

$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

preg_match_all('/\b([A-Z-][\p{L}\pL]+)\b/', $string, $matches);

var_dump($matches);

      

Output:

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(8) "YDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
  [1]=>
  array(2) {
    [0]=>
    string(8) "YDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
}

      

Question: why does the character 'Ž' disappear from the first match?

How can I change the regex so that UTF-8 characters are not dropped?



1 answer


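You need the u modifier when working with Unicode strings. Without it, PCRE matches the UTF-8 input byte by byte: Ž is stored as two bytes, neither of which is in [A-Z], so the match only starts at Y. Adding u to the original pattern is not enough on its own, because [A-Z] still covers only ASCII letters.

The regex can also be simplified with the POSIX character class [[:upper:]], which in PHP matches all uppercase Unicode letters once the u modifier is set.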

Like this:

$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

preg_match_all("/[[:upper:]]+/u", $string, $matches);
var_dump($matches);

      



Output:

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(10) "ŽYDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
}
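
Note that [[:upper:]]+ matches runs of uppercase letters, so it would also return the "T" of a word like "Test". If whole words are wanted, the Unicode property \p{Lu} is another option. The following is a minimal sketch, not part of the original answer: it assumes UTF-8 source and input and PHP 7 or later, the variable names and the $sample string are made up for illustration, and it uses lookarounds instead of \b so that word edges are decided by \p{L} alone.

```php
<?php
// Assumes this file and all input strings are UTF-8 encoded.
$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

// U+017D (Ž) takes two bytes in UTF-8, which is why a byte-oriented
// pattern without the u modifier cannot treat it as one letter.
$bytes = bin2hex("\u{017D}"); // "c5bd"

// Words consisting only of uppercase letters, at least two letters long.
// The lookarounds reject matches that touch another letter on either side.
preg_match_all('/(?<!\p{L})\p{Lu}{2,}(?!\p{L})/u', $string, $allCaps);
// $allCaps[0] is ["ŽYDRŪNAS", "PAVARDENIS"]

// Words that start with an uppercase letter, whatever follows.
// $sample is a hypothetical input used only for this illustration.
$sample = "Žydrūnas met JONAS and petras";
preg_match_all('/(?<!\p{L})\p{Lu}\p{L}*/u', $sample, $capitalized);
// $capitalized[0] is ["Žydrūnas", "JONAS"]

var_dump($bytes, $allCaps[0], $capitalized[0]);
```

Which of the two patterns fits depends on whether "capitalized" means fully uppercase or only an uppercase first letter; the question's sample data is consistent with either reading.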

      
