In Django with PostgreSQL 9.6, how do I sort without regard to case and accents?

I would like the behavior to be equivalent to using utf8_unicode_ci in MySQL. So if I have these rows (default sort order with PostgreSQL):

  • Barn
  • Bubble
  • Bœuf
  • beef
  • boulette
  • bémol

I would like them to be sorted like this (as with utf8_unicode_ci in MySQL):

  • Barn
  • beef
  • bémol
  • Bœuf
  • boulette
  • Bubble

This kind of sort is case-insensitive and accent-insensitive, and ligatures are expanded into multiple characters.
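To make the target ordering concrete, here is a minimal Python sketch of a sort key with those three properties. It runs outside the database, so it is not a solution by itself. The helper name sort_key and the LIGATURES table are illustrative assumptions, not part of Django or PostgreSQL, and the result only approximates what MySQL's utf8_unicode_ci does.

```python
import unicodedata

# Hypothetical table of ligatures to expand; NFKD decomposition does not
# split these, so they are handled explicitly. Extend as needed.
LIGATURES = {"œ": "oe", "æ": "ae"}

def sort_key(text):
    # Case-fold so that "BARN" and "barn" produce the same key.
    folded = text.casefold()
    # Expand ligatures into multiple characters.
    expanded = "".join(LIGATURES.get(ch, ch) for ch in folded)
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Sample words, including an accented word and a ligature.
words = ["Barn", "Bubble", "Bœuf", "beef", "boulette", "bémol"]
ordered = sorted(words, key=sort_key)
print(ordered)
```

A sort-only column, if one were added, would hold something like the output of sort_key for each row.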

I know about unaccent and lower in PostgreSQL, but I have no idea how to use them from Django.

Possible solutions with Django / Postgresql:

  • Add a new sort-only column with normalized data (lowercased, unaccented).
  • Add an index (like in this answer), but I'm not sure how it would work with Django.

I don't think full-text search or trigrams would help me here, because I am not necessarily doing a text search; I just need a good sort order.

Ideally, queries should be fast, so using a different indexed column looks like a good avenue. But I would like a solution that is easy to maintain and that I don't have to implement for every existing text column in my DB. Is there a best practice for this?

+3


2 answers


This is not related to Django itself; the PostgreSQL lc_collate setting determines it. I would suggest checking its value:

SHOW lc_collate;

      

The right thing to do is to fix this configuration. Don't forget to look at the related settings (lc_ctype, etc.) as well.



But if you cannot create another database with the required setting, try specifying COLLATE in the ORDER BY clause, as in the following test case:

CREATE TEMPORARY TABLE table1 (column1 TEXT); 

INSERT INTO table1 VALUES('Barn'),
('beef'),
('bémol'),
('Bœuf'),
('boulette'),
('Bubble');

SELECT * FROM table1 ORDER BY column1 COLLATE "en_US"; --Gives the expected order
SELECT * FROM table1 ORDER BY column1 COLLATE "C"; --Gives "wrong" order  (in your case)

      

Keep in mind that PostgreSQL uses the locales of the operating system. This test case was run on CentOS 7. More details here and here.

+3


I did it like this. First, you need to enable the "unaccent" extension in your PostgreSQL database:

CREATE EXTENSION unaccent;



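from django.db.models import Func

def get_value_ci(field):
    # Wrap the field in UNACCENT(LOWER(...)) so ordering ignores case and accents.
    return Func(field, function='LOWER', template='UNACCENT(%(function)s(%(expressions)s))')

YourModel.objects.order_by(get_value_ci('name_of_your_field'))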

      

and it works ;)
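For anyone wondering what expression this produces: as far as I understand, Django fills the Func template using Python %-style substitution. The plain-Python sketch below mimics only that substitution step, without Django; the render helper and the "name" column are hypothetical, and real Django output also includes table qualification.

```python
# Same template string as in the Func call above.
template = "UNACCENT(%(function)s(%(expressions)s))"

def render(column_sql):
    # column_sql stands in for the already-quoted column reference
    # that the ORM would normally generate.
    return template % {"function": "LOWER", "expressions": column_sql}

order_by_sql = render('"name"')
print(order_by_sql)  # UNACCENT(LOWER("name"))
```

So the query ends up ordering by the lowercased, unaccented value rather than by the raw column.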

0

