Convert C strings from local encoding to UTF-8

I am writing a small application that reads text from the console and stores it in a classic char * string.
However, I need to pass it to a library that only accepts UTF-8 encoded strings. Since the Windows console uses the local encoding, I need to convert the text from that encoding to UTF-8.
If I'm not mistaken, I can use MultiByteToWideChar(..) to convert to UTF-16 and then WideCharToMultiByte(..) to convert to UTF-8.

However, I am wondering whether there is a way to convert directly from the local encoding to UTF-8 without using any external libraries. Converting to wchar just to be able to convert back to char (UTF-8, but still) seems strange to me.

+2




2 answers


Converting from UTF-16 to UTF-8 is a purely mechanical process, but converting from a local encoding to UTF-16 or UTF-8 involves some large, specialized lookup tables. The C runtime just turns around and calls WideCharToMultiByte and MultiByteToWideChar for non-trivial cases.
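
To illustrate what "purely mechanical" means here, the following is a minimal sketch of a UTF-16 to UTF-8 encoder that needs only bit arithmetic and no lookup tables. The function name utf16_to_utf8 and its buffer contract are hypothetical, chosen for this example; replacing unpaired surrogates with U+FFFD is one possible policy, not something the answer prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes `len` UTF-16 code units from `in` as UTF-8.
 * `out` must have room for 3 * len + 1 bytes (worst case plus terminator).
 * Returns the number of bytes written, not counting the trailing NUL.
 * Unpaired surrogates are replaced with U+FFFD (an assumed policy). */
size_t utf16_to_utf8(const uint16_t *in, size_t len, char *out)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: combine with the following low surrogate. */
            if (i + 1 < len && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                i++;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* low surrogate without a preceding high one */
        }

        if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
            out[n++] = (char)cp;
        } else if (cp < 0x800) {               /* 2 bytes: 110xxxxx 10xxxxxx */
            out[n++] = (char)(0xC0 | (cp >> 6));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             /* 3 bytes */
            out[n++] = (char)(0xE0 | (cp >> 12));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        } else {                               /* 4 bytes */
            out[n++] = (char)(0xF0 | (cp >> 18));
            out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

The local-encoding half of the conversion has no such formula, since each code page needs its own mapping table, which is why that step is normally left to the operating system or a library.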

As for using UTF-16 as an intermediate step: as far as I know, there is no way around it, sorry.



Since you are already linking to an external library to get file input, you might as well link to the same library to get WideCharToMultiByte and MultiByteToWideChar.

Using the C runtime would make your code recompilable for other operating systems (in theory), but it also adds a layer of overhead between you and the library that does all the real work in this case: kernel32.dll.
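
For reference, the two-step route through UTF-16 using the kernel32 functions directly might look roughly like the sketch below. The helper name local_to_utf8 is hypothetical, and which code page to pass is an assumption left to the caller: CP_ACP (the system ANSI code page) or the value returned by GetConsoleCP() are the likely candidates for console input. The #ifdef guard is only there so the file still compiles on other platforms.

```c
#ifdef _WIN32  /* Windows-only sketch; compiles to nothing elsewhere */
#include <assert.h>
#include <stdlib.h>
#include <windows.h>

/* Hypothetical helper: converts a NUL-terminated string in `code_page`
 * to a newly malloc'd UTF-8 string. Returns NULL on failure.
 * The caller frees the result with free(). */
char *local_to_utf8(UINT code_page, const char *local)
{
    assert(local != NULL);

    /* Step 1: local encoding -> UTF-16. Passing -1 as the length means
     * "NUL-terminated"; the returned count then includes the terminator. */
    int wlen = MultiByteToWideChar(code_page, 0, local, -1, NULL, 0);
    if (wlen == 0)
        return NULL;

    wchar_t *wide = malloc((size_t)wlen * sizeof *wide);
    if (wide == NULL)
        return NULL;

    if (MultiByteToWideChar(code_page, 0, local, -1, wide, wlen) == 0) {
        free(wide);
        return NULL;
    }

    /* Step 2: UTF-16 -> UTF-8. For CP_UTF8 the last two arguments
     * (default character and its flag) must be NULL. */
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide, -1, NULL, 0, NULL, NULL);
    char *utf8 = (ulen > 0) ? malloc((size_t)ulen) : NULL;

    if (utf8 != NULL &&
        WideCharToMultiByte(CP_UTF8, 0, wide, -1, utf8, ulen, NULL, NULL) == 0) {
        free(utf8);
        utf8 = NULL;
    }

    free(wide);
    return utf8;
}
#endif /* _WIN32 */
```

A call such as local_to_utf8(GetConsoleCP(), line) would then produce the string to hand to the UTF-8-only library. Each function is called twice on purpose: the first call with a NULL output buffer only reports the required size.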

+4




The POSIX world loves libiconv for this. It converts to and from almost every encoding using char *.
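
A minimal sketch of the iconv approach follows, using the standard iconv_open / iconv / iconv_close calls. The helper name to_utf8 is hypothetical, and the output buffer size (four bytes per input byte) is an assumed upper bound that should hold for common legacy encodings; if it ever proves too small, this sketch simply fails and returns NULL rather than growing the buffer.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts NUL-terminated `in` from the encoding named
 * `from_enc` (e.g. "ISO-8859-1") to a newly malloc'd UTF-8 string.
 * Returns NULL if the encoding is unknown or the input is invalid. */
char *to_utf8(const char *from_enc, const char *in)
{
    assert(from_enc != NULL && in != NULL);

    /* Note the argument order: target encoding first, source second. */
    iconv_t cd = iconv_open("UTF-8", from_enc);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(in);
    size_t out_size = in_left * 4 + 1;  /* assumed generous upper bound */
    char *out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    /* iconv takes char ** for the input but does not modify the bytes. */
    char *in_ptr = (char *)in;
    char *out_ptr = out;
    size_t out_left = out_size - 1;     /* keep one byte for the NUL */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1) {             /* invalid input or buffer too small */
        free(out);
        return NULL;
    }

    *out_ptr = '\0';
    return out;
}
```

For example, to_utf8("ISO-8859-1", input) converts Latin-1 text. On glibc and musl iconv is part of the C library; on some other systems the program has to be linked with -liconv.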



+4

