What do the security implications of the default character set in mysqli_real_escape_string() mean?

The PHP documentation for mysqli_real_escape_string() says:

Caution Security: default character set

The character set must be set either at the server level or using the mysqli_set_charset() API function for it to affect mysqli_real_escape_string().

Source: mysqli_real_escape_string

The following link about character set mentions that

The character set must be understood and defined as it affects every action and includes security implications.

Source: Character Set

Why is it necessary to set a character set for security and what security implications does it include? Can anyone explain the concept of these lines?

Thank you in advance


2 answers


How SQL queries are parsed depends on the character set of the connection. If you ran this query:

$value = chr(0xE0) . chr(0x5C);
mysql_query("SELECT '$value'");

      

then, if the connection character set were Latin-1, MySQL would see the invalid query:

SELECT 'à\'

      

whereas if the character set was Shift-JIS, the byte sequence 0xE0,0x5C would be interpreted as a double-byte character:

SELECT '濬'

      

Now add string literal escaping for security:

$value = mysql_real_escape_string($value);
mysql_query("SELECT '$value'");

      

Now, if you have correctly set the connection character set to Shift-JIS with mysql_set_charset, MySQL still sees:



SELECT '濬'

      

But if you haven't set the connection character set, and MySQL's default character set is Shift-JIS while PHP's default character set is ASCII, PHP doesn't know that the trailing 0x5C byte is part of a double-byte sequence. It escapes that byte, believing it is generating this valid output:

SELECT 'à\\'

      

while MySQL reads it using Shift-JIS as:

SELECT '濬\'

      

The closing ' has now been escaped with a backslash, which leaves the string literal open. The next ' character in the query will end the string, leaving whatever follows it as raw SQL content. If an attacker can inject there, the query is vulnerable.
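The byte-level mismatch described above can be reproduced without a database. The following is a minimal sketch, not the actual MySQL client code: `escape_bytewise()` and `sjis_split()` are hypothetical helpers written for illustration, and the lead-byte ranges used for Shift-JIS are a simplification.

```php
<?php
// Hypothetical helper: escapes backslash and single quote one byte at a time,
// the way an escaper that assumes an ASCII-compatible charset would.
function escape_bytewise(string $s): string
{
    return strtr($s, ["\\" => "\\\\", "'" => "\\'"]);
}

// Hypothetical helper: splits a byte string into characters the way a
// Shift-JIS reader would, returning each character as hex.
// Simplified assumption: lead bytes are 0x81-0x9F and 0xE0-0xFC.
function sjis_split(string $s): array
{
    $chars = [];
    $n = strlen($s);
    for ($i = 0; $i < $n; ) {
        $b = ord($s[$i]);
        $isLead = ($b >= 0x81 && $b <= 0x9F) || ($b >= 0xE0 && $b <= 0xFC);
        $len = ($isLead && $i + 1 < $n) ? 2 : 1;
        $chars[] = bin2hex(substr($s, $i, $len));
        $i += $len;
    }
    return $chars;
}

$value   = chr(0xE0) . chr(0x5C);      // the two bytes from the example
$escaped = escape_bytewise($value);    // the escaper doubles the 0x5C byte
$query   = "SELECT '" . $escaped . "'";

echo bin2hex($escaped), "\n";                                  // e05c5c
echo implode(' ', sjis_split($escaped)), "\n";                 // e05c 5c
echo implode(' ', array_slice(sjis_split($query), -2)), "\n";  // 5c 27
```

Read as Shift-JIS, the first two bytes form one character, so the query ends in a lone backslash (5c) followed by the quote (27): an escaped quote rather than a closing one.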

This issue only applies to a few East Asian encodings, such as Shift-JIS, where multibyte sequences can contain bytes that are themselves valid ASCII characters such as the backslash. If the mismatched encodings both treat low bytes as always being ASCII (that is, both are strict ASCII supersets, as in the more common mismatch of Latin-1 versus UTF-8), no such confusion is possible.

Fortunately, servers that default to these encodings are unusual, so this is a rare problem in practice. But if you have to use mysql_real_escape_string, you have to do it right. (It's best to avoid this altogether by using parameterized queries.)
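As a rough sketch of "doing it right" with the mysqli API from the question: set the connection character set through the API before escaping, so the client library and the server agree on the encoding. The host, credentials, database, table, and function names below are hypothetical, and utf8mb4 is only an assumed choice of encoding.

```php
<?php
// Hypothetical connection settings; replace with real ones.
function connect_with_charset(): mysqli
{
    $db = new mysqli('localhost', 'app_user', 'app_password', 'app_db');

    // set_charset() informs both the server and the client library.
    // A plain "SET NAMES ..." query changes only the server side, so the
    // escaping function would still assume the old encoding.
    if (!$db->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}

// Escapes $name using the charset configured on the connection.
function find_user_escaped(mysqli $db, string $name)
{
    $safe = $db->real_escape_string($name);
    return $db->query("SELECT id, name FROM users WHERE name = '$safe'");
}
```

Nothing here opens a connection until `connect_with_charset()` is called.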



If you want to protect your applications from SQL injection, you should use prepared statements rather than escaping your input. (Don't let MySQL or PDO emulate them, either; use real prepared statements if you can!)
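A minimal sketch of what this advice might look like in practice, with one mysqli variant and one PDO variant. The table, credentials, and function names are hypothetical; `get_result()` assumes the mysqlnd driver is available.

```php
<?php
// mysqli: the value travels separately from the SQL text.
function find_user_mysqli(mysqli $db, string $name): array
{
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = ?');
    $stmt->bind_param('s', $name);
    $stmt->execute();
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}

// PDO: emulation is switched off so the server prepares the statement.
// Hypothetical DSN and credentials.
function connect_pdo(): PDO
{
    return new PDO(
        'mysql:host=localhost;dbname=app_db;charset=utf8mb4',
        'app_user',
        'app_password',
        [
            PDO::ATTR_EMULATE_PREPARES => false,
            PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
        ]
    );
}

function find_user_pdo(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

With emulation left on, PDO builds the final query string on the client, which brings the character set question back into play.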

Only in situations where you cannot use prepared statements (dynamically generated queries, LIMIT) should escaping be considered. In these particular cases, make sure that you do not make mysqli_real_escape_string() bypassable through incorrectly configured character sets. (A StackOverflow answer from ircmaxell explains the problem better than I ever could.)
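For the dynamic parts mentioned here, one common alternative to escaping is to validate the input against fixed rules. The sketch below assumes a hypothetical helper `build_order_limit()`, hypothetical column names, and an arbitrary cap of 100 rows.

```php
<?php
// Hypothetical helper: builds the tail of a query from untrusted input
// by allowlisting the column and forcing the limit to a bounded integer.
function build_order_limit(string $sortColumn, $limit): string
{
    $allowedColumns = ['id', 'name', 'created_at'];

    // Anything not on the allowlist falls back to a safe default.
    if (!in_array($sortColumn, $allowedColumns, true)) {
        $sortColumn = 'id';
    }

    // The cast discards any non-numeric tail; the clamp bounds the value.
    $limit = max(1, min(100, (int) $limit));

    return " ORDER BY `$sortColumn` LIMIT $limit";
}

echo build_order_limit('name', '25'), "\n";
echo build_order_limit('name; DROP TABLE users', '10 OR 1=1'), "\n";
```

Because only allowlisted identifiers and integers reach the SQL text, the connection character set plays no part in the safety of this fragment.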



WordPress recently had an issue where multibyte characters could bypass their SQL escaping strategy, and the security team fixed it under the guise of Emoji support.

If you use mysql_real_escape_string() or mysqli_real_escape_string(), you are playing with fire. Be careful not to get burned.
