How to properly decode accented characters for display

Question

The raw source text file contains the line:

Caf&eacute (Should be Café)

The text file is UTF-8 encoded.

The output will go to, say, another text file, so it is not necessarily meant for a web page.

What C# method can I use to output the correct form, Café?

Seemingly a common problem?

+3

c# encoding special-characters decoding

Fixer 26 Mar 12 at 16:27

5 answers

It is HTML encoded text. It needs to be decoded:

string decoded = HttpUtility.HtmlDecode(text);

UPDATE: the French character "é" has the HTML code "&eacute;" (with a trailing semicolon), so you need to correct your input string.
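
A minimal sketch of why the trailing semicolon matters. It uses System.Net.WebUtility.HtmlDecode instead of HttpUtility.HtmlDecode only so that it builds without a reference to System.Web; that swap and the EntityDemo class name are assumptions made for illustration, not part of the answer above.

```csharp
using System;
using System.Net;

public static class EntityDemo
{
    // Thin wrapper so both inputs go through the same call.
    public static string Decode(string text)
    {
        return WebUtility.HtmlDecode(text);
    }

    public static void Main()
    {
        // Well-formed entity: the reference ends with ';' and is decoded to é.
        Console.WriteLine(Decode("Caf&eacute;"));

        // Input as in the question: no ';', so the decoder does not
        // recognise an entity and the text comes back unchanged.
        Console.WriteLine(Decode("Caf&eacute"));
    }
}
```

If the input file cannot be corrected at the source, the missing semicolons would have to be repaired before decoding.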

+2

Sergey Berezovskiy 26 Mar at 16:44

You should use SecurityElement.Escape when working with XML files. HtmlEncode will encode many additional entities that are not required. XML only requires you to escape >, <, &, " and ', which is what SecurityElement.Escape does.

When reading a file through an XML parser, this transformation is done for you by the parser; you do not need to "decode" it.

EDIT: Of course, this is only useful when writing XML files.
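
A rough sketch of that round trip, assuming a hypothetical <item> element and a hypothetical XmlEscapeDemo class: SecurityElement.Escape handles the reserved characters on the way out, and an XML parser (XElement here, chosen for brevity) reverses the escaping on the way in.

```csharp
using System;
using System.Security;
using System.Xml.Linq;

public static class XmlEscapeDemo
{
    // Escapes only the five characters XML reserves: < > & " '
    public static string EscapeForXml(string text)
    {
        return SecurityElement.Escape(text);
    }

    // Wraps the escaped text in a hypothetical <item> element and lets
    // the parser undo the escaping when the value is read.
    public static string ReadBack(string escaped)
    {
        return XElement.Parse("<item>" + escaped + "</item>").Value;
    }

    public static void Main()
    {
        string original = "Café & \"Bar\" <open>";
        string escaped = EscapeForXml(original);

        // The é is left alone; only the reserved characters are replaced.
        Console.WriteLine(escaped);

        // The parser hands back the original text without a manual decode step.
        Console.WriteLine(ReadBack(escaped));
    }
}
```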

+2

Matthew 26 Mar 12 at 16:55

I think this works:

// Requires: using System.Text;
string utf8String = "Your string";

Encoding utf8 = Encoding.UTF8;
Encoding unicode = Encoding.Unicode;

// Encode the string as UTF-8 bytes.
byte[] utf8Bytes = utf8.GetBytes(utf8String);

// Re-encode the UTF-8 bytes as UTF-16 (Encoding.Unicode).
byte[] unicodeBytes = Encoding.Convert(utf8, unicode, utf8Bytes);

// Decode the UTF-16 bytes back into characters.
char[] uniChars = new char[unicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length)];
unicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, uniChars, 0);

string unicodeString = new string(uniChars);

0

Cronan 26 Mar 12 at 16:36

Use HttpUtility.HtmlDecode. Example:

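using System;
using System.Web;       // HttpUtility (requires a reference to System.Web)
using System.Xml.Linq;  // XDocument, XElement

class Program
{
    static void Main()
    {
        // Decode the entity first, then store the plain text in the element.
        XDocument doc = new XDocument(new XElement("test",
            HttpUtility.HtmlDecode("caf&eacute;")));

        Console.WriteLine(doc);
        Console.ReadKey();
    }
}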

0

code4life 26 Mar at 16:53

LB · Accepted Answer · 2012-03-26T16:37:38+0000

Have you tried System.Web.HttpUtility.HtmlDecode("Caf&eacute;")? It returns 538M results.
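
For the file-to-file case described in the question, one possible sketch follows. The FileDecodeDemo class, the DecodeFile helper, and the temporary paths are hypothetical, and System.Net.WebUtility.HtmlDecode stands in for HttpUtility.HtmlDecode so that no System.Web reference is needed.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class FileDecodeDemo
{
    // Reads a UTF-8 text file, decodes HTML entities, and writes the
    // result to another UTF-8 text file.
    public static void DecodeFile(string inputPath, string outputPath)
    {
        string raw = File.ReadAllText(inputPath, Encoding.UTF8);
        string decoded = WebUtility.HtmlDecode(raw);
        File.WriteAllText(outputPath, decoded, Encoding.UTF8);
    }

    public static void Main()
    {
        // Temporary files stand in for the real input and output paths.
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        File.WriteAllText(input, "Caf&eacute;", Encoding.UTF8);

        DecodeFile(input, output);
        Console.WriteLine(File.ReadAllText(output, Encoding.UTF8));
    }
}
```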
