Read and write a file with Windows-1252

I am trying to write a file containing some German characters to disk and read it back using the Windows-1252 encoding. I don't understand why, but my output looks like this:

<title>W�hrend und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner f�r O-T�ne</title>

<p>Die Themen im �berblick</p>

      

Any thoughts? Here is my code. You will need spring-core and commons-io to get it running.

private static void write(String fileName, Charset charset) throws IOException {
    String html = "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
                  "<head>" +
                  "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=windows-1252\">" +
                  "<title>Während und im Anschluss an die Exkursion stehen Ihnen die Ansprechpartner für O-Töne</title>" +
                  "</head>" +
                  "<body>" +
                  "<p>Die Themen im Überblick</p>" +
                  "</body>" +
                  "</html>";

    byte[] bytes = html.getBytes(charset);
    FileOutputStream outputStream = new FileOutputStream(fileName);
    OutputStreamWriter writer = new OutputStreamWriter(outputStream, charset);
    IOUtils.write(bytes, writer);
    writer.close();
    outputStream.close();
}

private static void read(String file, Charset windowsCharset) throws IOException {
    ClassPathResource pathResource = new ClassPathResource(file);
    String string = IOUtils.toString(pathResource.getInputStream(), windowsCharset);
    System.out.println(string);
}

public static void main(String[] args) throws IOException {
    Charset windowsCharset = Charset.forName("windows-1252");
    String file = "test.txt";
    write(file, windowsCharset);
    read(file, windowsCharset);
}

      



2 answers


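Your write method is wrong: it uses a Writer to write bytes. A Writer is meant for characters or strings. IOUtils.write(byte[], Writer) first decodes the bytes into characters using the platform default charset, and the writer then encodes them again, so the text is converted twice.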

You have already encoded the string into bytes with the line

byte[] bytes = html.getBytes(charset);

      



These bytes can simply be written to the output stream:

IOUtils.write(bytes, outputStream);

      

This makes the writer unnecessary (remove it), and you then get the correct output.
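For reference, here is a minimal sketch of the whole write path without a Writer, using only the JDK instead of commons-io. The class name `WriteBytesDirectly` and the `roundTrip` helper are made up for illustration, and the test string uses Unicode escapes so that the encoding of the source file does not influence the result.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteBytesDirectly {

    // Encode the text exactly once, then hand the raw bytes to the stream.
    static void write(String fileName, String text, Charset charset) throws IOException {
        byte[] bytes = text.getBytes(charset);
        try (OutputStream out = new FileOutputStream(fileName)) {
            out.write(bytes);
        }
    }

    // Hypothetical helper: write to a temp file and read it back with the same charset.
    static String roundTrip(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("charset-demo", ".html");
            try {
                write(tmp.toString(), text, charset);
                return new String(Files.readAllBytes(tmp), charset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset windows1252 = Charset.forName("windows-1252");
        // U+00DC is the capital U umlaut, written as an escape on purpose.
        String text = "<p>Die Themen im \u00DCberblick</p>";
        System.out.println(text.equals(roundTrip(text, windows1252)));
    }
}
```

If the string read back equals the original, the bytes on disk are a clean Windows-1252 encoding, and anything still garbled afterwards comes from a later step (reading or displaying).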



First make sure the compiler and the editor use the same encoding. You can verify this by trying the (ugly) \uXXXX escape; both of these should give the same string:

während
w\u00E4hrend
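One way to run that check, sketched here with a hypothetical `codePoints` helper, is to print the code points the compiler actually put into the string. The example passes the escaped form; swap in the literal as typed in your editor and compare the two outputs. If the literal does not show U+00E4 for the umlaut, the compiler and the editor disagree about the source encoding.

```java
public class EscapeCheck {

    // Lists the code points of a string, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Replace the argument with the literal typed in your editor to compare.
        System.out.println(codePoints("w\u00E4hrend"));
    }
}
```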

      

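Then take the charset name in the meta tag from the charset you actually write with, encode the string once and write the bytes directly:

    // Inside the html string: use the name of the Charset in use.
    "<meta http-equiv='Content-Type' content='text/html; charset="
    + charset.name() + "' />" +
    // ... remainder of the html string unchanged ...

    // Encode once and write the bytes; no Writer involved.
    byte[] bytes = html.getBytes(charset);
    Files.write(Paths.get(fileName), bytes);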

      

Also check that the file on disk really is in Windows-1252. A programmer's editor such as Notepad++ or jEdit lets you inspect and switch encodings.
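If no such editor is at hand, looking at the raw bytes works too. The sketch below (class and method names are made up for illustration) shows what to expect: in Windows-1252 the letter U+00E4 is the single byte E4, whereas in UTF-8 it is the two bytes C3 A4.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingBytes {

    // Formats bytes as two-digit upper-case hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00E4"; // a with diaeresis
        System.out.println(hex(umlaut.getBytes(Charset.forName("windows-1252")))); // E4
        System.out.println(hex(umlaut.getBytes(StandardCharsets.UTF_8)));          // C3 A4
    }
}
```

To inspect a real file, pass the result of Files.readAllBytes(Paths.get(fileName)) to hex and look at the bytes where the umlauts should be.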
