Working with Chinese Characters in C
In C, a string is represented by an array of characters, and on most 32-bit processors a character takes one byte, or eight bits. So a string is an array of one-byte elements.
Since extended characters like Chinese and Japanese take up more than 8 bits, I am a little confused about this.
For example, I tested that I can define an array of Chinese characters in the same way as an array of English letters, using the same array syntax. So my question is:
Is there a mechanism that bridges the gap between ordinary 8-bit characters and characters wider than 8 bits, so that both can be treated the same way, as in the example I mentioned above?
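I would suggest using the UTF-8 encoding. It lets ordinary ASCII characters (bytes <= 127) be used exactly as usual, and every other character is stored as a sequence of two, three, or four bytes, each of which is >= 128. The first byte of such a sequence tells you how many bytes follow. That is why a Chinese string fits in the same kind of array as an English one: the array still holds single bytes, and one Chinese character simply occupies several of them. You can also use libiconv for related problems, such as converting between encodings: http://www.gnu.org/software/libiconv/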
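
To make the byte ranges above concrete, here is a minimal sketch, assuming the string is valid UTF-8. The helper names utf8_count and utf8_seq_len are made up for this illustration and are not part of the standard library or of libiconv. The first counts characters by skipping continuation bytes (those of the form 10xxxxxx), and the second reads the sequence length from the lead byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count Unicode code points in a UTF-8 string.
 * Every byte that is NOT a continuation byte (10xxxxxx) starts a new
 * character. Assumes the input is valid UTF-8; no validation is done. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Hypothetical helper: number of bytes in the sequence that starts with
 * the given lead byte, or 0 if the byte cannot start a sequence
 * (for example, a continuation byte). */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 0;
}
```

For example, with the bytes written out explicitly as "Hi \xE4\xB8\xAD\xE6\x96\x87" (the two Chinese characters 中文 after "Hi "), strlen reports 9 because it counts bytes, while utf8_count reports 5 because each Chinese character takes three bytes. Writing the bytes as escapes avoids depending on how your compiler and editor encode the source file.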