What text encoding does IFilter return?
I am curious about the encoding of the text extracted with IFilter. IFilter::GetText() returns its output as WCHAR*, but what if the file is ASCII encoded? What about a Unicode encoding such as UTF-8 or UTF-16?
As I see it, either the IFilter converts the extracted text to a single encoding (if so, which one?), or it does not, in which case how do I know which encoding I got?
+3
user1612927
1 answer
The output text is UTF-16 (everything on Windows that uses WCHAR is UTF-16). It is not possible to query the encoding of the input data; if you need it, you will have to parse the data yourself.
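One possible way to "parse the data yourself" is to look at the first bytes of the original file for a byte order mark (BOM). The following is only a minimal sketch under that assumption: the names BomEncoding and sniff_bom are hypothetical and not part of the IFilter API, and a file without a BOM (plain ASCII, or UTF-8 saved without one) cannot be identified this way.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not part of any Windows API.
enum class BomEncoding { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// Inspect the leading bytes of a file's raw contents for a byte order mark.
// Returns BomEncoding::None when no BOM is present, which says nothing about
// the actual encoding (it could be ASCII, BOM-less UTF-8, a legacy code page...).
BomEncoding sniff_bom(const std::string& bytes) {
    const std::size_t n = bytes.size();
    auto b = [&bytes](std::size_t i) {
        return static_cast<unsigned char>(bytes[i]);
    };

    // UTF-32 is checked first because the UTF-32 LE BOM (FF FE 00 00)
    // begins with the UTF-16 LE BOM (FF FE). A UTF-16 LE file whose first
    // character is U+0000 would be misreported; this sketch accepts that.
    if (n >= 4 && b(0) == 0xFF && b(1) == 0xFE && b(2) == 0x00 && b(3) == 0x00)
        return BomEncoding::Utf32LE;
    if (n >= 4 && b(0) == 0x00 && b(1) == 0x00 && b(2) == 0xFE && b(3) == 0xFF)
        return BomEncoding::Utf32BE;
    if (n >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return BomEncoding::Utf8;
    if (n >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return BomEncoding::Utf16LE;
    if (n >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return BomEncoding::Utf16BE;
    return BomEncoding::None;
}
```

This check runs on the raw file bytes, independently of the filter. The text handed back by IFilter::GetText() is UTF-16 either way.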
+2
Remy Lebeau