How do browsers choose the character encoding for URL encoding?

Question


I am testing how Firefox encodes characters in URLs, but the results confuse me.

HTML code:

<html lang="zh_CN">
<head>
<title>some Chinese character</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<img src="http://localhost/xxx" />
</body>
</html>

Here xxx stands for some Chinese characters. These characters must be percent-encoded as %xx for HTTP transport.

First, I saved the file as UTF-8 and opened it in Firefox. The img tag sends a request, and the "xxx" characters in it are UTF-8 encoded.

(file saved as UTF-8, charset=utf-8: the browser encodes the URL as UTF-8)

Then I changed the meta tag to <meta http-equiv="Content-Type" content="text/html; charset=gbk">, but nothing changed.

(file saved as UTF-8, charset=gbk: the browser encodes the URL as UTF-8)

Second, I saved the file as ANSI, which is probably GBK or GB2312.

With charset=gbk, the URL is still encoded as UTF-8.

(file saved as GBK, charset=gbk: the browser encodes the URL as UTF-8)

But with charset=utf-8, the characters in the URL are GBK encoded. In addition, the other Chinese characters on the page, e.g. the title, are not displayed correctly.

(file saved as GBK, charset=utf-8: the browser encodes the URL as GBK)
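To make the difference between the cases above concrete, here is a minimal sketch of how the same path percent-encodes under UTF-8 and under GBK. It only illustrates the byte-level difference and does not reproduce Firefox's internals. The string "中文" is a hypothetical stand-in for xxx, which the question does not spell out.

```python
from urllib.parse import quote, unquote

# Hypothetical stand-in for the "xxx" path in the question.
path = "中文"

# Percent-encode the UTF-8 octets of the path.
utf8_form = quote(path, encoding="utf-8")

# Percent-encode the GBK octets of the same path.
gbk_form = quote(path, encoding="gbk")

print(utf8_form)  # %E4%B8%AD%E6%96%87
print(gbk_form)   # %D6%D0%CE%C4

# A server can only recover the text if it decodes with the
# encoding that was used to produce the request.
recovered = unquote(gbk_form, encoding="gbk")
garbled = unquote(gbk_form, encoding="utf-8", errors="replace")
```

Looking at the %xx sequences in the request log this way is one way to tell which encoding the browser actually used.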

How can I control this browser behavior?


html http browser encoding

HUA Di, Dec 22 '12 at 8:21


1 answer

Esailija · Accepted Answer · 2012-12-22T08:33:03+0000

UTF-8 is the standard for URL encoding. If you physically encode the original file in GBK but declare utf-8 in the content type, you are simply lying to the browser, and you will get inconsistent or broken results.
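As a rough illustration of why a mismatched declaration breaks things, the sketch below takes GBK bytes and reads them as UTF-8, which is roughly what a browser is asked to do when a GBK file claims to be utf-8. The sample text "中文" is an assumption made for illustration. Real browsers apply their own error handling, so this is not an exact model of Firefox.

```python
# Hypothetical sample text standing in for the page's Chinese content.
text = "中文"

# The bytes actually stored on disk when the file is saved as GBK.
gbk_bytes = text.encode("gbk")

# Reading those bytes with the declared (wrong) encoding: invalid
# sequences are turned into U+FFFD replacement characters.
misread = gbk_bytes.decode("utf-8", errors="replace")

# Reading them with the encoding that was really used.
correct = gbk_bytes.decode("gbk")

print(repr(misread))
print(correct)
```

This also matches the symptom in the question: the title is displayed incorrectly in the one case where the file's real encoding and the declared charset disagree.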

From RFC 3986, section 2.5:

When a new URI scheme defines a component that represents textual data consisting of characters from the Universal Character Set [UCS], the data should first be encoded as octets according to the UTF-8 character encoding [STD63]; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded. For example, the character A would be represented as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be represented as "%C3%80", and the character KATAKANA LETTER A would be represented as "%E3%82%A2".
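The two steps in the quoted passage (encode to UTF-8 octets, then percent-encode every octet outside the unreserved set) can be sketched in a few lines. The function name percent_encode_utf8 is made up for this example. In practice the standard library's urllib.parse.quote does the same job.

```python
from urllib.parse import quote

# The unreserved set: ASCII letters, digits, and "-", ".", "_", "~".
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode_utf8(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then percent-encode."""
    out = []
    # Step 1: turn the characters into UTF-8 octets.
    for octet in text.encode("utf-8"):
        ch = chr(octet)
        if ch in UNRESERVED:
            # Step 2a: unreserved octets are kept as-is.
            out.append(ch)
        else:
            # Step 2b: every other octet becomes %XX (uppercase hex).
            out.append(f"%{octet:02X}")
    return "".join(out)


print(percent_encode_utf8("A"))       # A
print(percent_encode_utf8("\u00C0"))  # %C3%80  (LATIN CAPITAL LETTER A WITH GRAVE)
print(percent_encode_utf8("\u30A2"))  # %E3%82%A2  (KATAKANA LETTER A)

# The standard library gives the same result for these inputs.
same_as_stdlib = all(
    percent_encode_utf8(s) == quote(s, safe="")
    for s in ("A", "\u00C0", "\u30A2")
)
```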
