How many multibyte characters can fit into a `TEXT` column?

According to the documentation (emphasis mine):

TEXT [(M)] [CHARACTER SET charset_name] [COLLATE collation_name]

A TEXT column with a maximum length of 65,535 (2^16 - 1) characters. The effective maximum length is less if the value contains multibyte characters. Each TEXT value is stored using a 2-byte prefix that indicates the number of bytes in the value.

Would it be more accurate to say that a TEXT column can store 65,535 bytes? What is the specific impact of multibyte characters in a TEXT column?

Here's the source of my confusion:

In MySQL 5, CHAR and VARCHAR fields were changed so that they count characters instead of bytes (for example, you can put "你好, 世界!" in a VARCHAR(6)). Did TEXT fields get the same treatment, or do they still count bytes?

1 answer


As far as I know, a UTF-8 character is at most 32 bits (4 bytes) long.

Edit: in MySQL, utf8 uses at most 3 bytes per character; utf8mb4 uses at most 4 bytes.

So the worst case with only the largest characters:

utf8: 65535 / 3 = 21845
utf8mb4: 65535 / 4 = 16383.75 ≈ 16383
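The division above can be reproduced with a minimal Python sketch. It assumes that the 65,535 limit is a byte limit and that every character has the same fixed width; the names `TEXT_MAX_BYTES` and `max_chars` are made up for illustration and are not part of MySQL.

```python
# Largest length that a 2-byte length prefix can record (assumed to be in bytes).
TEXT_MAX_BYTES = 2**16 - 1  # 65,535


def max_chars(bytes_per_char: int, limit: int = TEXT_MAX_BYTES) -> int:
    """Whole characters of a fixed byte width that fit into `limit` bytes."""
    # Floor division: a partially stored character does not count.
    return limit // bytes_per_char


# Worst case per character set: every character uses the maximum width.
for charset, width in (("utf8", 3), ("utf8mb4", 4)):
    print(f"{charset}: {max_chars(width)} characters at {width} bytes each")
```

A value that mixes 1-byte and multibyte characters would land somewhere between these worst-case figures and 65,535.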

      

fooobar.com/questions/100747 / ...

Edit2:

I tested this locally with 10.1.21-MariaDB. UTF-8 test characters:

1-byte: a
2-byte: ö
3-byte: 好
4-byte: 𠜎

utf8: 21845 @3-Byte (好)
utf8mb4: 16386  @4-Byte (𠜎)
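The byte widths of the four test characters can be checked outside the database. This is only a sketch using Python's standard UTF-8 encoder, not the MariaDB test itself; the helper name `utf8_width` is hypothetical, and the characters are normalized to NFC on the assumption that "ö" is meant as a single precomposed code point.

```python
import unicodedata

# The four test characters, normalized so that "ö" is one code point (assumption).
samples = [unicodedata.normalize("NFC", ch) for ch in ("a", "ö", "好", "𠜎")]


def utf8_width(ch: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


widths = [utf8_width(ch) for ch in samples]
for ch, width in zip(samples, widths):
    print(f"{ch}: {width} byte(s)")

# 21845 copies of the 3-byte character come to exactly 65,535 bytes.
total_bytes = utf8_width("好" * 21845)
print(total_bytes)
```

Note that MySQL's utf8 character set (as opposed to utf8mb4) cannot store the 4-byte character at all, which is why each character set was tested with its own widest character.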

      

Screenshot of the local test: http://i.imgur.com/5dmRteL.png
