What text encoding does IFilter return?

I am curious about the encoding of the text extracted with IFilter.

IFilter::GetText() returns a WCHAR* buffer, but what if the file is ASCII encoded? What about a different Unicode encoding, such as UTF-8 or UTF-16?

As I see it, either the IFilter converts the extracted text to a single encoding (if so, which one?), or it does not, in which case how do I find out which encoding was used?



1 answer


The output text is UTF-16 (everything on Windows that uses WCHAR is UTF-16). IFilter does not let you query the encoding of the input data; if you need that, you will have to parse the file yourself.
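If the extracted text needs to be stored or sent as UTF-8, the UTF-16 buffer has to be converted after the call. On Windows the usual route is WideCharToMultiByte with CP_UTF8. The sketch below shows the same conversion done by hand in portable C++, assuming the buffer is treated as 16-bit code units. The function name utf16_to_utf8 is only illustrative and is not part of the IFilter API.

```cpp
#include <cstddef>
#include <string>

// Illustrative helper (not part of IFilter): converts UTF-16 code units,
// such as the WCHAR buffer filled by IFilter::GetText() on Windows, to UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const char16_t* text, std::size_t length) {
    std::string out;
    out.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        char32_t cp = text[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < length && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            // Low surrogate with no preceding high surrogate.
            cp = 0xFFFD;
        }

        // Encode the code point as 1 to 4 UTF-8 bytes.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With a real filter, you would pass the buffer and the character count that GetText() reports; a cast from WCHAR* to const char16_t* is assumed to be acceptable here because both are 16-bit on Windows.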









