MySQL diacritic-insensitive search?

How can I make a search diacritic-insensitive?

For example, this Persian string with diacritics

هواى بر آفتاب بارز

does not match the same string with the diacritics removed in MySQL:

هواى بر آفتاب بارز

Is there a way to tell MySQL to ignore the diacritics, or do I need to remove all the diacritics from my fields manually?

+2




5 answers


This is a bit like the case-insensitivity problem.

SELECT * FROM blah WHERE UPPER(foo) = "THOMAS"

      



Just strip the diacritics from both strings before comparing them.
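A minimal Python sketch of that idea, done on the application side before the values reach MySQL. The names `HARAKAT`, `strip_harakat` and `same_ignoring_harakat` are hypothetical, and the set of marks is an assumed, non-exhaustive list of common Arabic short-vowel signs; extend it for your own data.

```python
# Code points of the Arabic marks to ignore (an assumed, non-exhaustive set):
# fathatan, dammatan, kasratan, fatha, damma, kasra, shadda, sukun.
HARAKAT = "\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652"

# str.translate deletes every character whose code point maps to None.
_STRIP_TABLE = dict.fromkeys(map(ord, HARAKAT))


def strip_harakat(text: str) -> str:
    """Return text with the marks listed in HARAKAT removed."""
    return text.translate(_STRIP_TABLE)


def same_ignoring_harakat(a: str, b: str) -> bool:
    """Compare two strings after stripping the marks from both sides."""
    return strip_harakat(a) == strip_harakat(b)
```

One possible use is to store the stripped text in an extra column when a row is written and to strip the search term the same way, so the query itself can stay a plain comparison.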

+1




I am using utf8 (utf8_general_ci), and searching Arabic text without diacritics does not work: either the comparison is not diacritic-insensitive, or it is but does not work correctly.

I looked at a character with and without diacritics using HEX, and it looks like MySQL treats them as two different characters.
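The same check can be sketched outside the database. This Python snippet (the letter BEH and the FATHA mark are example choices, not taken from the post) prints the UTF-8 bytes of a letter with and without a diacritic. The mark is a separate combining code point appended after the letter, so the two strings differ in length as well as content.

```python
bare = "\u0628"          # ARABIC LETTER BEH
marked = "\u0628\u064E"  # BEH followed by the combining mark ARABIC FATHA

# Encode to UTF-8 and show the bytes as uppercase hex, like MySQL's HEX().
bare_hex = bare.encode("utf-8").hex().upper()
marked_hex = marked.encode("utf-8").hex().upper()

print(bare_hex)    # D8A8
print(marked_hex)  # D8A8D98E
```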

I am thinking about using HEX and REPLACE (a lot of replacements) to search for words while filtering out the diacritics.



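My solution for a diacritic-insensitive search of Arabic words:

-- ? is bound to the search term without diacritics. The column is compared
-- as hex, so the term is hex-encoded too (assumes a utf8 connection).
SELECT arabic_word FROM Word
WHERE
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(HEX(REPLACE(
arabic_word, '-', '')), 'D98E', ''), 'D98B', ''), 'D98F', ''), 'D98C',
''), 'D991', ''), 'D992', ''), 'D990', ''), 'D98D', '') LIKE CONCAT('%', HEX(?), '%')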

      

The hex values are the diacritics that we want to filter out. It's ugly, but I haven't found another answer.
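To see which characters those hex values stand for, here is a small Python sketch that decodes each one from UTF-8 and looks up its Unicode name. The names `HEX_VALUES`, `describe` and `decoded` are hypothetical; the hex strings are the ones used in the query.

```python
import unicodedata

# The UTF-8 byte sequences (as hex) that the query removes.
HEX_VALUES = ["D98E", "D98B", "D98F", "D98C", "D991", "D992", "D990", "D98D"]


def describe(hex_value: str) -> tuple[str, str]:
    """Decode one hex-encoded UTF-8 character to (code point, Unicode name)."""
    ch = bytes.fromhex(hex_value).decode("utf-8")
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


decoded = {h: describe(h) for h in HEX_VALUES}

for hex_value, (code_point, name) in decoded.items():
    print(hex_value, code_point, name)
```

All eight decode to combining marks in the range U+064B to U+0652 (for example, D98E is U+064E ARABIC FATHA), so other marks in your data would need their own entries.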

+2




Have you already read through the MySQL Character Set Support documentation to see whether it answers your question? How comparisons work (collations) is especially worth understanding.

My guess is that using utf8_general_ci might do the right thing for you.

0




Running

SET NAMES 'utf8'

before executing the query usually does the trick for Latin searches. I'm not sure if this works for Arabic as well.

0




The cleanest solution I have come to is:

SELECT arabic_word 
FROM Word
WHERE ( arabic_word REGEXP '{$search}' OR SOUNDEX( arabic_word ) = SOUNDEX( '{$search}' ) );

      

I have not tested the cost of the SOUNDEX function. I guess this is feasible for small tables, but not for large datasets.

0

