I'm writing some unit tests to ensure my code isn't vulnerable to SQL injection under various charsets.
According to this answer, you can create a vulnerability by injecting \xbf\x27 using one of the following charsets: big5, cp932, gb2312, gbk and sjis.
This is because if your escaper is not configured correctly, it will see the 0x27 and try to escape it such that it becomes \xbf\x5c\x27. However, \xbf\x5c is actually one character in these charsets, thus the quote (0x27) is left unescaped.
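To make the byte-level mechanics concrete, here is a minimal PHP sketch. It assumes the mbstring extension is loaded and uses addslashes() purely as a stand-in for a charset-unaware escaper (that choice is an assumption for illustration, not something the linked answer prescribes). gbk is used because \xbf\x5c forms a valid character there.

```php
<?php
// Assumption: mbstring is available; addslashes() stands in for a
// byte-oriented escaper that knows nothing about the connection charset.
$payload = "\xbf\x27";            // 0xbf followed by a single quote
$escaped = addslashes($payload);  // inserts 0x5c in front of 0x27

echo bin2hex($escaped), "\n";                  // bf5c27

// Read as gbk, 0xbf 0x5c is one two-byte character, so the string is
// two characters long and the quote stands alone at character offset 1.
echo mb_strlen($escaped, 'gbk'), "\n";         // 2
echo mb_strpos($escaped, "'", 0, 'gbk'), "\n"; // 1
```

The backslash the escaper added has been absorbed into the preceding character, which is exactly why the quote ends up unescaped from the point of view of a gbk-aware parser.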
As I've discovered through testing, however, this is not entirely true. It works for big5, gb2312 and gbk, but neither 0xbf27 nor 0xbf5c is a valid character in sjis or cp932.
Both
mb_strpos("abc\xbf\x27def","'",0,'sjis')
and
mb_strpos("abc\xbf\x27def","'",0,'cp932')
return 4, i.e., PHP does not see \xbf\x27 as a single character. The same call returns false for big5, gb2312 and gbk.
Also, this:
mb_strlen("\xbf\x5c",'sjis')
returns 2 (it returns 1 for gbk).
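One way to see why sjis and cp932 behave differently here: in Shift_JIS-style encodings, bytes in the range 0xA1 to 0xDF are single-byte half-width katakana, so 0xbf is a complete character on its own rather than the first byte of a two-byte character. A small sketch, assuming the mbstring extension is available:

```php
<?php
foreach (['sjis', 'cp932'] as $charset) {
    // 0xbf on its own is a valid string in these charsets...
    $valid = mb_check_encoding("\xbf", $charset);

    // ...and it converts to U+FF7F (half-width katakana SO),
    // which is ef bd bf in UTF-8.
    $utf8 = bin2hex(mb_convert_encoding("\xbf", 'UTF-8', $charset));

    printf("%s: valid=%s utf8=%s\n", $charset, var_export($valid, true), $utf8);
}
// sjis: valid=true utf8=efbdbf
// cp932: valid=true utf8=efbdbf
```

Since 0xbf never consumes the byte after it in these two charsets, the following 0x27 or 0x5c is always read as a separate character, which matches the results above.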
So, the question is: is there another character sequence that makes sjis and cp932 vulnerable to SQL injection, or are they actually not vulnerable at all? Or is PHP lying, am I completely mistaken, and will MySQL interpret this totally differently?
The devil is in the details... let's start with how the answer in question describes the list of vulnerable character sets:
This gives us some context: 0xbf5c is used as an example for gbk, not as the universal character to use for all 5 character sets. It just so happens that the same byte sequence is also a valid character under big5 and gb2312.
At this point, your question becomes as easy as this:
To be fair, most of the Google searches I tried for these character sets don't give any useful results. But I did find this CP932.TXT file, in which if you search for '5c ' (with the space there), you'll jump to this line:
And we have a winner! :)
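The byte pair this points at is 0x81 0x5c (0x815c). As a rough check of what that implies, the sketch below repeats the escaping experiment with 0x81 as the leading byte instead of 0xbf. It assumes the mbstring extension and again uses addslashes() as a hypothetical stand-in for a charset-unaware escaper. It does not touch a database, so it only shows how the bytes are grouped into characters, not how MySQL itself would parse them.

```php
<?php
// Assumption: addslashes() models an escaper that works on raw bytes.
$payload = "\x81\x27";            // 0x81 followed by a single quote
$escaped = addslashes($payload);  // becomes 0x81 0x5c 0x27

foreach (['sjis', 'cp932'] as $charset) {
    // 0x81 0x5c is one two-byte character in these charsets, so the
    // escaped string is two characters and the quote sits at offset 1.
    printf(
        "%s: hex=%s length=%d quote_at=%d\n",
        $charset,
        bin2hex($escaped),
        mb_strlen($escaped, $charset),
        mb_strpos($escaped, "'", 0, $charset)
    );
}
// sjis: hex=815c27 length=2 quote_at=1
// cp932: hex=815c27 length=2 quote_at=1
```

This is the same effect \xbf\x27 has under gbk: the inserted backslash becomes the trailing byte of a multibyte character, leaving the quote bare.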
Some Oracle document confirms that 0x815c is the same character for both cp932 and sjis, and PHP recognizes it too:
Here's a PoC script for the attack: