Java 8 regex: match a string in any language

I am trying to use a Java 8 regex to match a string in any language if it contains letters, digits, and . or -:

String s = "בלה בלה";
String pattern= "^[\\p{L}\\p{Digit}_.-]*$";
return s.matches(pattern);

      

What am I missing? This code returns false for a valid Hebrew string.



1 answer


The sample string contains a space, which your character class does not allow. You can add whitespace to your pattern and use \w instead of \p{L}\p{Digit}_ when the Pattern.UNICODE_CHARACTER_CLASS flag is enabled:

String s = "בלה בלה";
String pattern= "(?U)[\\w\\s.-]*";
System.out.println(s.matches(pattern));
// => true
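
For reference, the same check can probably also be written with the flag passed to Pattern.compile() instead of the inline (?U) modifier. This is only a minimal sketch: the class name UnicodeFlagDemo and the method isAllowed are made up for illustration, and the Hebrew sample is written with Unicode escapes so the file does not depend on the source encoding.

```java
import java.util.regex.Pattern;

public class UnicodeFlagDemo {
    // Same character class as above; the flag is passed to compile()
    // instead of being embedded as (?U). Compiled once and reused.
    private static final Pattern ALLOWED =
            Pattern.compile("[\\w\\s.-]*", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: true if the whole string consists of
    // word characters, whitespace, dots and hyphens.
    public static boolean isAllowed(String s) {
        return ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Two Hebrew words separated by a space
        String hebrew = "\u05D1\u05DC\u05D4 \u05D1\u05DC\u05D4";
        System.out.println(isAllowed(hebrew)); // true
        System.out.println(isAllowed("abc!")); // false: '!' is not in the class
    }
}
```

Precompiling the pattern avoids recompiling the regex on every call, which is what String#matches() does internally.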

      

See Java demo



Since the pattern is used with String#matches(), which requires the whole string to match, the anchors ^ and $ are not needed. If you plan to use the pattern with Matcher#find(), wrap it in anchors, as in your original code ("^(?U)[\\w\\s.-]*$").
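
The difference between the two methods can be sketched roughly as follows. The class and method names (AnchorDemo, fullMatch, findUnanchored, findAnchored) are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    private static final Pattern UNANCHORED = Pattern.compile("(?U)[\\w\\s.-]*");
    private static final Pattern ANCHORED = Pattern.compile("^(?U)[\\w\\s.-]*$");

    // matches() must consume the entire input, so no anchors are required.
    public static boolean fullMatch(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    // find() only looks for a matching substring; without anchors the
    // pattern can match a part of the input (or even an empty string).
    public static boolean findUnanchored(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // With ^ and $ the match found by find() has to span the whole input.
    public static boolean findAnchored(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String bad = "abc!def"; // '!' is not allowed by the character class
        System.out.println(fullMatch(bad));      // false
        System.out.println(findUnanchored(bad)); // true: matches "abc" only
        System.out.println(findAnchored(bad));   // false
    }
}
```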

Pattern details:

  • (?U) - an inline modifier for the Pattern.UNICODE_CHARACTER_CLASS flag, which makes the shorthand character classes Unicode-aware (you can check what \w matches in this mode)
  • [\\w\\s.-]* - zero or more of:
    • \w - word characters (letters, digits, _ and some others)
    • \s - whitespace
    • . - a dot (no need to escape it inside a character class)
    • - - a hyphen (no need to escape it, as it is at the end of the character class)
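
To see the effect of (?U) on \w, a small experiment along these lines may help. It is a sketch only: the class name WordClassDemo and the method isWordChar are invented for this example, and the sample characters are arbitrary picks.

```java
public class WordClassDemo {
    // Hypothetical helper: tests a one-character string against \w,
    // with or without the inline UNICODE_CHARACTER_CLASS modifier.
    public static boolean isWordChar(String ch, boolean unicode) {
        return ch.matches(unicode ? "(?U)\\w" : "\\w");
    }

    public static void main(String[] args) {
        // Latin letter, underscore, Hebrew letter bet,
        // Arabic-Indic digit three, hyphen
        String[] samples = {"a", "_", "\u05D1", "\u0663", "-"};
        for (String ch : samples) {
            System.out.println(ch
                    + " default=" + isWordChar(ch, false)
                    + " unicode=" + isWordChar(ch, true));
        }
        // Without the flag, \w is limited to [a-zA-Z_0-9], so the Hebrew
        // letter and the Arabic-Indic digit only match in Unicode mode.
        // The hyphen matches in neither mode, which is why it is listed
        // separately in the character class.
    }
}
```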