PATINDEX with a range of letters excludes diacritics (accented characters)

I'm trying to figure out how to use patindex to search for a range of literal characters, but excluding accented characters. If I do a direct search, using the standard (insensitive) set works fine. However, when I search for a range of letters it will match an accented character

SELECT
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AI, 'Match', 'No') AS MatchInsensitive,
    IIF('Ú' = 'U' COLLATE Latin1_General_CI_AS, 'Match', 'No') AS MatchSensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AI)      AS PIInsensitive,
    PATINDEX('%[A-Z]%', 'Ú' COLLATE Latin1_General_CI_AS)      AS PISensitive

      

You will get the following results:

MatchInsensitive MatchSensitive PIInsensitive PISensitive
---------------- -------------- ------------- -----------
Match            No             1             1

      

What I am really trying to do is determine the position of the accented characters in the string, so I was really looking PATINDEX('%[^A-Z0-9 ]%')

.

If I have the following request, I expect a result of 2 SELECT PATINDEX('%[^A-Z0-9 ]%', 'médico')

, but I get 0.

+3


source to share


1 answer


You can use binary sort for example. Latin1_General_100_BIN2

...

select patindex('%[^a-zA-Z0-9 ]%', 'médico' collate Latin1_General_100_BIN2)

      



rextester: http://rextester.com/ZICLN98474

returns 2

+3


source







All Articles