Is there an SQL injection possibility even when using mysql_real_escape_string()
function?
Consider this sample situation. SQL is constructed in PHP like this:
$login = mysql_real_escape_string(GetFromPost('login'));
$password = mysql_real_escape_string(GetFromPost('password'));
$sql = "SELECT * FROM table WHERE login='$login' AND password='$password'";
I have heard numerous people say to me that code like that is still dangerous and possible to hack even with mysql_real_escape_string()
function used. But I cannot think of any possible exploit?
Classic injections like this:
aaa' OR 1=1 --
do not work.
Do you know of any possible injection that would get through the PHP code above?
Well, there's nothing really that can pass through that, other than
%
wildcard. It could be dangerous if you were usingLIKE
statement as attacker could put just%
as login if you don't filter that out, and would have to just bruteforce a password of any of your users. People often suggest using prepared statements to make it 100% safe, as data can't interfere with the query itself that way. But for such simple queries it probably would be more efficient to do something like$login = preg_replace('/[^a-zA-Z0-9_]/', '', $login);
Consider the following query:
mysql_real_escape_string()
will not protect you against this. The fact that you use single quotes (' '
) around your variables inside your query is what protects you against this. The following is also an option:The short answer is yes, yes there is a way to get around
mysql_real_escape_string()
.For Very OBSCURE EDGE CASES!!!
The long answer isn't so easy. It's based off an attack demonstrated here.
The Attack
So, let's start off by showing the attack...
In certain circumstances, that will return more than 1 row. Let's dissect what's going on here:
Selecting a Character Set
For this attack to work, we need the encoding that the server's expecting on the connection both to encode
'
as in ASCII i.e.0x27
and to have some character whose final byte is an ASCII\
i.e.0x5c
. As it turns out, there are 5 such encodings supported in MySQL 5.6 by default:big5
,cp932
,gb2312
,gbk
andsjis
. We'll selectgbk
here.Now, it's very important to note the use of
SET NAMES
here. This sets the character set ON THE SERVER. If we used the call to the C API functionmysql_set_charset()
, we'd be fine (on MySQL releases since 2006). But more on why in a minute...The Payload
The payload we're going to use for this injection starts with the byte sequence
0xbf27
. Ingbk
, that's an invalid multibyte character; inlatin1
, it's the string¿'
. Note that inlatin1
andgbk
,0x27
on its own is a literal'
character.We have chosen this payload because, if we called
addslashes()
on it, we'd insert an ASCII\
i.e.0x5c
, before the'
character. So we'd wind up with0xbf5c27
, which ingbk
is a two character sequence:0xbf5c
followed by0x27
. Or in other words, a valid character followed by an unescaped'
. But we're not usingaddslashes()
. So on to the next step...mysql_real_escape_string()
The C API call to
mysql_real_escape_string()
differs fromaddslashes()
in that it knows the connection character set. So it can perform the escaping properly for the character set that the server is expecting. However, up to this point, the client thinks that we're still usinglatin1
for the connection, because we never told it otherwise. We did tell the server we're usinggbk
, but the client still thinks it'slatin1
.Therefore the call to
mysql_real_escape_string()
inserts the backslash, and we have a free hanging'
character in our "escaped" content! In fact, if we were to look at$var
in thegbk
character set, we'd see:Which is exactly what the attack requires.
The Query
This part is just a formality, but here's the rendered query:
Congratulations, you just successfully attacked a program using
mysql_real_escape_string()
...The Bad
It gets worse.
PDO
defaults to emulating prepared statements with MySQL. That means that on the client side, it basically does a sprintf throughmysql_real_escape_string()
(in the C library), which means the following will result in a successful injection:Now, it's worth noting that you can prevent this by disabling emulated prepared statements:
This will usually result in a true prepared statement (i.e. the data being sent over in a separate packet from the query). However, be aware that PDO will silently fallback to emulating statements that MySQL can't prepare natively: those that it can are listed in the manual, but beware to select the appropriate server version).
The Ugly
I said at the very beginning that we could have prevented all of this if we had used
mysql_set_charset('gbk')
instead ofSET NAMES gbk
. And that's true provided you are using a MySQL release since 2006.If you're using an earlier MySQL release, then a bug in
mysql_real_escape_string()
meant that invalid multibyte characters such as those in our payload were treated as single bytes for escaping purposes even if the client had been correctly informed of the connection encoding and so this attack would still succeed. The bug was fixed in MySQL 4.1.20, 5.0.22 and 5.1.11.But the worst part is that
PDO
didn't expose the C API formysql_set_charset()
until 5.3.6, so in prior versions it cannot prevent this attack for every possible command! It's now exposed as a DSN parameter.The Saving Grace
As we said at the outset, for this attack to work the database connection must be encoded using a vulnerable character set.
utf8mb4
is not vulnerable and yet can support every Unicode character: so you could elect to use that instead—but it has only been available since MySQL 5.5.3. An alternative isutf8
, which is also not vulnerable and can support the whole of the Unicode Basic Multilingual Plane.Alternatively, you can enable the
NO_BACKSLASH_ESCAPES
SQL mode, which (amongst other things) alters the operation ofmysql_real_escape_string()
. With this mode enabled,0x27
will be replaced with0x2727
rather than0x5c27
and thus the escaping process cannot create valid characters in any of the vulnerable encodings where they did not exist previously (i.e.0xbf27
is still0xbf27
etc.)—so the server will still reject the string as invalid. However, see @eggyal's answer for a different vulnerability that can arise from using this SQL mode.Safe Examples
The following examples are safe:
Because the server's expecting
utf8
...Because we've properly set the character set so the client and the server match.
Because we've turned off emulated prepared statements.
Because we've set the character set properly.
Because MySQLi does true prepared statements all the time.
Wrapping Up
If you:
mysql_set_charset()
/$mysqli->set_charset()
/ PDO's DSN charset parameter (in PHP ≥ 5.3.6)OR
utf8
/latin1
/ascii
/ etc)You're 100% safe.
Otherwise, you're vulnerable even though you're using
mysql_real_escape_string()
...This is another, (perhaps less?) obscure EDGE CASE!!!
In homage to @ircmaxell's excellent answer (really, this is supposed to be flattery and not plagiarism!), I will adopt his format:
The Attack
Starting off with a demonstration...
This will return all records from the
test
table. A dissection:Selecting an SQL Mode
As documented under String Literals:
If the server's SQL mode includes
NO_BACKSLASH_ESCAPES
, then the third of these options—which is the usual approach adopted bymysql_real_escape_string()
—is not available: one of the first two options must be used instead. Note that the effect of the fourth bullet is that one must necessarily know the character that will be used to quote the literal in order to avoid munging one's data.The Payload
The payload initiates this injection quite literally with the
"
character. No particular encoding. No special characters. No weird bytes.mysql_real_escape_string()
Fortunately,
mysql_real_escape_string()
does check the SQL mode and adjust its behaviour accordingly. Seelibmysql.c
:Thus a different underlying function,
escape_quotes_for_mysql()
, is invoked if theNO_BACKSLASH_ESCAPES
SQL mode is in use. As mentioned above, such a function needs to know which character will be used to quote the literal in order to repeat it without causing the other quotation character from being repeated literally.However, this function arbitrarily assumes that the string will be quoted using the single-quote
'
character. Seecharset.c
:So, it leaves double-quote
"
characters untouched (and doubles all single-quote'
characters) irrespective of the actual character that is used to quote the literal! In our case$var
remains exactly the same as the argument that was provided tomysql_real_escape_string()
—it's as though no escaping has taken place at all.The Query
Something of a formality, the rendered query is:
As my learned friend put it: congratulations, you just successfully attacked a program using
mysql_real_escape_string()
...The Bad
mysql_set_charset()
cannot help, as this has nothing to do with character sets; nor canmysqli::real_escape_string()
, since that's just a different wrapper around this same function.The problem, if not already obvious, is that the call to
mysql_real_escape_string()
cannot know with which character the literal will be quoted, as that's left to the developer to decide at a later time. So, inNO_BACKSLASH_ESCAPES
mode, there is literally no way that this function can safely escape every input for use with arbitrary quoting (at least, not without doubling characters that do not require doubling and thus munging your data).The Ugly
It gets worse.
NO_BACKSLASH_ESCAPES
may not be all that uncommon in the wild owing to the necessity of its use for compatibility with standard SQL (e.g. see section 5.3 of the SQL-92 specification, namely the<quote symbol> ::= <quote><quote>
grammar production and lack of any special meaning given to backslash). Furthermore, its use was explicitly recommended as a workaround to the (long since fixed) bug that ircmaxell's post describes. Who knows, some DBAs might even configure it to be on by default as means of discouraging use of incorrect escaping methods likeaddslashes()
.Also, the SQL mode of a new connection is set by the server according to its configuration (which a
SUPER
user can change at any time); thus, to be certain of the server's behaviour, you must always explicitly specify your desired mode after connecting.The Saving Grace
So long as you always explicitly set the SQL mode not to include
NO_BACKSLASH_ESCAPES
, or quote MySQL string literals using the single-quote character, this bug cannot rear its ugly head: respectivelyescape_quotes_for_mysql()
will not be used, or its assumption about which quote characters require repeating will be correct.For this reason, I recommend that anyone using
NO_BACKSLASH_ESCAPES
also enablesANSI_QUOTES
mode, as it will force habitual use of single-quoted string literals. Note that this does not prevent SQL injection in the event that double-quoted literals happen to be used—it merely reduces the likelihood of that happening (because normal, non-malicious queries would fail).In PDO, both its equivalent function
PDO::quote()
and its prepared statement emulator call uponmysql_handle_quoter()
—which does exactly this: it ensures that the escaped literal is quoted in single-quotes, so you can be certain that PDO is always immune from this bug.As of MySQL v5.7.6, this bug has been fixed. See change log:
Safe Examples
Taken together with the bug explained by ircmaxell, the following examples are entirely safe (assuming that one is either using MySQL later than 4.1.20, 5.0.22, 5.1.11; or that one is not using a GBK/Big5 connection encoding):
...because we've explicitly selected an SQL mode that doesn't include
NO_BACKSLASH_ESCAPES
....because we're quoting our string literal with single-quotes.
...because PDO prepared statements are immune from this vulnerability (and ircmaxell's too, provided either that you're using PHP≥5.3.6 and the character set has been correctly set in the DSN; or that prepared statement emulation has been disabled).
...because PDO's
quote()
function not only escapes the literal, but also quotes it (in single-quote'
characters); note that to avoid ircmaxell's bug in this case, you must be using PHP≥5.3.6 and have correctly set the character set in the DSN....because MySQLi prepared statements are safe.
Wrapping Up
Thus, if you:
OR
OR
in addition to employing one of the solutions in ircmaxell's summary, use at least one of:
NO_BACKSLASH_ESCAPES
...then you should be completely safe (vulnerabilities outside the scope of string escaping aside).