PHP - Multibyte Safe Regular Expression Support
PHP supports regular expressions in three ways:
- POSIX ERE, removed as of PHP 7
- PCRE, which is the main engine but is not always multibyte-safe
- Multibyte String, which is not enabled by default
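To illustrate what "not always multibyte-safe" means here, a minimal sketch comparing PCRE in byte mode, PCRE with the `u` modifier, and Multibyte String. The variable names are mine, and the Multibyte String branch only runs if that extension happens to be loaded:

```php
<?php
// "é" encoded as UTF-8 is two bytes: 0xC3 0xA9.
$e = "\xC3\xA9";

// Without the u modifier, "." matches a single byte, so a pattern that
// expects exactly one character fails on a two-byte character.
$byteMode = preg_match('/^.$/', $e);

// With the u modifier, pattern and subject are treated as UTF-8.
$utf8Mode = preg_match('/^.$/u', $e);

// Multibyte String counterpart; skipped when mbstring is not available.
$mbMode = null;
if (function_exists('mb_ereg')) {
    mb_regex_encoding('UTF-8');
    $mbMode = mb_ereg('^.$', $e);
}

var_dump($byteMode, $utf8Mode, $mbMode);
```

In byte mode the pattern does not match, while with `u` it does, which is the whole difference between the two PCRE modes for this input.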
The Internet today is Unicode, and so is PHP since 5.6, for the sake of i18n. While PHP itself is known to be terribly poor at Unicode support, Intl provides access to the ICU library.
To avoid the long wait for UString, and to avoid redundancy (and memory overhead) while doing it right, I prefer Intl and leave out iconv and Multibyte String, along with DateTime, and rewrite most of the SBCS string functions to be multibyte-safe. Some problems arise in this process:
- Formatting large numbers for a locale is problematic on 32-bit platforms (such as a NAS) when the database offers storage for 64-bit numbers. This can be solved by handling the numbers as strings via BCMath.
- Inside the Intl wrapper there is no support for ICU's regular expression functions; the Unicode-enabled build of PCRE is the alternative.
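For the first problem above, a rough sketch of the string-based approach. `group_digits()` is a hypothetical helper of my own, not part of BCMath or Intl, and the separator is hard-coded rather than taken from a locale; the BCMath call is guarded because that extension may not be installed:

```php
<?php
// Hypothetical helper: groups the digits of a decimal integer given as a string.
function group_digits(string $digits, string $sep = ','): string
{
    $sign = '';
    if ($digits !== '' && $digits[0] === '-') {
        $sign = '-';
        $digits = substr($digits, 1);
    }
    // Insert the separator at every position followed by a multiple of
    // three digits up to the end of the string.
    return $sign . preg_replace('/\B(?=(\d{3})+$)/', $sep, $digits);
}

if (extension_loaded('bcmath')) {
    // The operands stay strings, so the native integer size does not matter.
    $sum = bcadd('9223372036854775807', '1');
    echo group_digits($sum), "\n";
}
```

Because the value never becomes a native integer, the same code behaves identically on 32-bit and 64-bit builds.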
To use PCRE with Unicode syntax, PHP's built-in PCRE has to be compiled and configured with Unicode support. On some systems it is not configured with Unicode; adding (*UTF8) before the expression overrides the configuration.
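A small sketch of how I would probe this at runtime. `pcre_has_unicode()` is a hypothetical helper name, and I make no claim about what the `(*UTF8)` variant returns on a build without Unicode support, which is why the result is only dumped:

```php
<?php
// Hypothetical helper: reports whether the linked PCRE accepts the u modifier
// and Unicode properties. preg_match() returns false (with a warning,
// suppressed here) when the pattern cannot be compiled.
function pcre_has_unicode(): bool
{
    return @preg_match('/\pL/u', 'a') === 1;
}

$e = "\xC3\xA9"; // "é" in UTF-8

// (*UTF8) at the very start of the pattern requests UTF-8 mode from inside
// the pattern instead of through the u modifier.
$viaVerb = @preg_match('/(*UTF8)^.$/', $e);

var_dump(pcre_has_unicode(), $viaVerb);
```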
- Am I missing a way to work with ICU regex functions from PHP?
- Are there any other errors to consider for Unicode PCRE?
- Am I missing a reason why I should use Multibyte String?