PATINDEX with a range of letters excludes diacritics (accented characters)
I'm trying to figure out how to use patindex to search for a range of literal characters, but excluding accented characters. If I do a direct search, using the standard (insensitive) set works fine. However, when I search for a range of letters it will match an accented character
SELECT
IIF('Ú' = 'U' COLLATE Latin1_General_CI_AI, 'Match', 'No') AS MatchInsensitive,
IIF('Ú' = 'U' COLLATE Latin1_General_CI_AS, 'Match', 'No') AS MatchSensitive,
PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AI) AS PIInsensitive,
PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AS) AS PISensitive
You will get the following results:
MatchInsensitive MatchSensitive PIInsensitive PISensitive
---------------- -------------- ------------- -----------
Match No 1 1
What I am really trying to do is determine the position of the accented characters in the string, so I was really looking PATINDEX('%[^A-Z0-9 ]%')
.
If I have the following request, I expect a result of 2 SELECT PATINDEX('%[^A-Z0-9 ]%', 'médico')
, but I get 0.
source to share
You can use binary sort for example. Latin1_General_100_BIN2
...
select patindex('%[^a-zA-Z0-9 ]%', 'médico' collate Latin1_General_100_BIN2)
rextester: http://rextester.com/ZICLN98474
returns 2
source to share