Read resource file with Unicode characters
I have a file in the /res/raw folder (R.raw.test) with the following content:
This is Tésêt
I want to read it into a string. My current code:
    public static String readRawTextFile(Context ctx, int resId) {
        InputStream inputStream = ctx.getResources().openRawResource(resId);
        InputStreamReader inputreader;
        try {
            inputreader = new InputStreamReader(inputStream, "UTF-8");
        } catch (UnsupportedEncodingException e1) {
            e1.printStackTrace();
            return null;
        }
        BufferedReader buffreader = new BufferedReader(inputreader);
        String line;
        StringBuilder text = new StringBuilder();
        try {
            while ((line = buffreader.readLine()) != null) {
                text.append(line);
                text.append('\n');
            }
        } catch (IOException e) {
            return null;
        }
        return text.toString();
    }
But the returned string is:
This is T s t
How can I solve this? Thanks.
Your code seems to be OK. A string can also come out looking like that if you view it in something that is not UTF-8 aware. I ran your code from groovyConsole, which is Unicode aware, and it displays the UTF-8 line fine.
First of all, you need to determine the encoding of the file in /res/raw.
On UNIX you can run the following command:
    file /res/raw
Then put the correct encoding in:
    inputreader = new InputStreamReader(inputStream, "UTF-8");
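To illustrate why the charset passed to InputStreamReader matters, here is a minimal sketch in plain Java 8+ (not Android specific). The class name CharsetReadDemo and the helper readAll are hypothetical, and the sample bytes assume, purely for illustration, that the file was saved as ISO-8859-1. Decoding those bytes as UTF-8 turns the accented letters into the replacement character U+FFFD, which may be what shows up as gaps in the returned string.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetReadDemo {

    // Hypothetical helper: decodes the whole stream using the given charset.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Assumption for illustration: the file content was saved as ISO-8859-1.
        byte[] bytes = "This is T\u00e9s\u00eat".getBytes(StandardCharsets.ISO_8859_1);

        // Same bytes, decoded with the matching and with a mismatched charset.
        String right = readAll(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
        String wrong = readAll(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);

        System.out.println(right);
        // The mismatched decode replaces each accented letter with U+FFFD.
        System.out.println(wrong.contains("\uFFFD"));
    }
}
```

In the Android code from the question, the same idea would mean swapping "UTF-8" for whatever encoding the file actually uses.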
Hi, I would try something like this:
    StringBuilder str = new StringBuilder();
    File file = new File("c:\\some_file.txt");
    FileInputStream is = new FileInputStream(file);
    // Decode the bytes as UTF-8 while reading
    Reader reader = new InputStreamReader(is, "UTF-8");
    while (true) {
        int ch = reader.read();
        if (ch < 0) {
            break; // end of stream
        }
        str.append((char) ch);
    }
    reader.close(); // also closes the underlying FileInputStream
    String myString = str.toString();
If you want to write, just use OutputStreamWriter with FileOutputStream and set the correct encoding. It works like a charm.
Hope this helps :-)
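For the writing side mentioned in the answer above, a minimal sketch in plain Java 8+ could look like the following. The class Utf8WriteDemo and its methods are hypothetical names, and the temporary file is only there to keep the example self-contained. It writes text through an OutputStreamWriter wrapped around a FileOutputStream, then reads it back with the same charset.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8WriteDemo {

    // Writes the text to the file, encoding it as UTF-8.
    public static void writeText(File file, String text) {
        try (Writer writer = new OutputStreamWriter(
                 new FileOutputStream(file), StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back character by character, decoding it as UTF-8.
    public static String readText(File file) {
        StringBuilder str = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                 new FileInputStream(file), StandardCharsets.UTF_8)) {
            int ch;
            while ((ch = reader.read()) >= 0) {
                str.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return str.toString();
    }

    // Hypothetical convenience method: write to a temp file and read it back.
    public static String roundTrip(String text) {
        try {
            File tmp = File.createTempFile("utf8-demo", ".txt");
            tmp.deleteOnExit();
            writeText(tmp, text);
            return readText(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "This is T\u00e9s\u00eat";
        System.out.println(original.equals(roundTrip(original)));
    }
}
```

The important part is that the writer and the reader agree on the charset; which charset that is matters less than using the same one on both sides.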
I had a file that also gave me a result similar to "This is T s t", and for me setting the charsetName to UTF-16 did the trick.
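If it is unclear whether a file is UTF-8 or UTF-16, one rough heuristic is to look for a UTF-16 byte order mark at the start of the data. The sketch below is only an illustration under that assumption: BomSniffDemo and guessCharset are hypothetical names, files without a BOM simply fall back to UTF-8 here, and a real file may need a more careful check.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffDemo {

    // Returns UTF-16 if the data starts with a UTF-16 byte order mark
    // (FE FF or FF FE); otherwise falls back to UTF-8 (an assumption).
    public static Charset guessCharset(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if ((b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE)) {
                // Java's "UTF-16" decoder uses the BOM to pick the byte order.
                return StandardCharsets.UTF_16;
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        String original = "This is T\u00e9s\u00eat";

        // Java's "UTF-16" encoder writes a big-endian BOM first.
        byte[] utf16Bytes = original.getBytes(StandardCharsets.UTF_16);
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        String fromUtf16 = new String(utf16Bytes, guessCharset(utf16Bytes));
        String fromUtf8 = new String(utf8Bytes, guessCharset(utf8Bytes));

        System.out.println(original.equals(fromUtf16));
        System.out.println(original.equals(fromUtf8));
    }
}
```

The detected charset could then be passed to InputStreamReader in place of the hard-coded "UTF-8".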