可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I've the following method which allows me to protect MySQL entities:
public function Tick($string)
{
$string = explode('.', str_replace('`', '', $string));
foreach ($string as $key => $value)
{
if ($value != '*')
{
$string[$key] = '`' . trim($value) . '`';
}
}
return implode('.', $string);
}
This works fairly well for the use that I make of it.
It protects database, table, field names and even the * operator, however now I also want it to protect function calls, ie:
AVG(database.employees.salary)
Should become:
AVG(`database`.`employees`.`salary`) and not `AVG(database`.`employees`.`salary)`
How should I go about this? Should I use regular expressions?
Also, how can I support more advanced stuff, from:
MAX(AVG(database.table.field1), MAX(database.table.field2))
To:
MAX(AVG(`database`.`table`.`field1`), MAX(`database`.`table`.`field2`))
Please keep in mind that I want to keep this method as simple/fast as possible, since it pretty much iterates over all the entity names in my database.
回答1:
Using the test case ndp gave I created a regex to do the hard work for you. The following regex will replace all word boundaries around words that are not followed by an opening parenthesis.
\b(\w+)\b(?!\()
The Tick() functionality would then be implemented in PHP as follows:
function Tick($string)
{
return preg_replace( '/\b(\w+)\b(?!\()/', '`\1`', $string );
}
回答2:
If this is quoting parts of an SQL statement, and they have only complexity that you descibe, a RegEx is a great approach. On the other hand, if you need to do this to full SQL statements, or simply more complicated components of statements (such as "MAX(AVG(val),MAX(val2))"), you will need to tokenize or parse the string and have a more sophisticated understanding of it to do this quoting accurately.
Given the regular expression approach, you may find it easier to break the function name out as one step, and then use your current code to quote the database/table/column names. This can be done in one RE, but it will be tricker to get right.
Either way, I'd highly recommend writing a few unit test cases. In fact, this is an ideal situation for this approach: it's easy to write the tests, you have some existing cases that work (which you don't want to break), and you have just one more case to add.
Your test can start as simply as:
assert '`ticked`' == Tick('ticked');
assert '`table`.`ticked`' == Tick('table.ticked');
assert 'db`.`table`.`ticked`' == Tick('db.table.ticked');
And then add:
assert 'FN(`ticked`)' == Tick('FN(ticked)');
etc.
回答3:
It's generally a bad idea to pass the whole SQL to the function. That way, you'll always find a case when it doesn't work, unless you fully parse the SQL syntax.
Put the ticks to the names on some previous abstraction level, which makes up the SQL.
回答4:
Before you explode your string on periods, check if the last character is a parenthesis. If so, this call is a function.
<?php
$string = str_replace('`', '', $string)
$function = "";
if (substr($string,-1) == ")") {
// Strip off function call first
$opening = strpos($string, "(");
$function = substr($string, 0, $opening+1);
$string = substr($string, $opening+1, -1);
}
// Do your existing parsing to $string
if ($function == "") {
// Put function back on string
$string = $function . $string . ")";
}
?>
If you need to cover more advanced situations, like using nested functions, or multiple functions in sequence in one "$string" variable, this would become a much more advanced function, and you'd best ask yourself why these elements aren't being properly ticked in the first place, and not need any further parsing.
EDIT: Updating for nested functions, as per original post edit
To have the above function deal with multiple nested functions, you likely need something that will 'unwrap' your nested functions. I haven't tested this, but the following function might get you on the right track.
<?php
function unwrap($str) {
$pos = strpos($str, "(");
if ($pos === false) return $str; // There's no function call here
$last_close = 0;
$cur_offset = 0; // Start at the beginning
while ($cur_offset <= strlen($str)) {
$first_close = strpos($str, ")", $offset); // Find first deep function
$pos = strrpos($str, "(", $first_close-1); // Find associated opening
if ($pos > $last_close) {
// This function is entirely after the previous function
$ticked = Tick(substr($str, $pos+1, $first_close-$pos)); // Tick the string inside
$str = substr($str, 0, $pos)."{".$ticked."}".substr($str,$first_close); // Replace parenthesis by curly braces temporarily
$first_close += strlen($ticked)-($first_close-$pos); // Shift parenthesis location due to new ticks being added
} else {
// This function wraps other functions; don't tick it
$str = substr($str, 0, $pos)."{".substr($str,$pos+1, $first_close-$pos)."}".substr($str,$first_close);
}
$last_close = $first_close;
$offset = $first_close+1;
}
// Replace the curly braces with parenthesis again
$str = str_replace(array("{","}"), array("(",")"), $str);
}
回答5:
If you are adding the function calls in your code, as opposed to passing them in through a string-only interface, you can replace the string parsing with type checking:
function Tick($value) {
if (is_object($value)) {
$result = $value->value;
} else {
$result = '`'.str_replace(array('`', '.'), array('', '`.`'), $value).'`';
}
return $result;
}
class SqlFunction {
var $value;
function SqlFunction($function, $params) {
$sane = implode(', ', array_map('Tick', $params));
$this->value = "$function($sane)";
}
}
function Maximum($column) {
return new SqlFunction('MAX', array($column));
}
function Avg($column) {
return new SqlFunction('AVG', array($column));
}
function Greatest() {
$params = func_get_args();
return new SqlFunction('GREATEST', $params);
}
$cases = array(
"'simple'" => Tick('simple'),
"'table.field'" => Tick('table.field'),
"'table.*'" => Tick('table.*'),
"'evil`hack'" => Tick('evil`hack'),
"Avg('database.table.field')" => Tick(Avg('database.table.field')),
"Greatest(Avg('table.field1'), Maximum('table.field2'))" => Tick(Greatest(Avg('table.field1'), Maximum('table.field2'))),
);
echo "<table>";
foreach ($cases as $case => $result) {
echo "<tr><td>$case</td><td>$result</td></tr>";
}
echo "</table>";
This avoids any possible SQL injection while remaining legible to future readers of your code.
回答6:
You could use preg_replace_callback()
in conjunction with your Tick()
method to skip at least one level of parens:
public function tick($str)
{
return preg_replace_callback('/[^()]*/', array($this, '_tick_replace_callback'), $str);
}
protected function _tick_replace_callback($str) {
$string = explode('.', str_replace('`', '', $string));
foreach ($string as $key => $value)
{
if ($value != '*')
{
$string[$key] = '`' . trim($value) . '`';
}
}
return implode('.', $string);
}
回答7:
Are you generating the SQL Query or is it being passed to you? If you generating the query I wouldn't pass the whole query string just the parms/values you want to wrap in the backticks or what ever else you need.
EXAMPLE:
function addTick($var) {
return '`' . $var . '`';
}
$condition = addTick($condition);
$SQL = 'SELECT' . $what . '
FROM ' . $table . '
WHERE ' . $condition . ' = ' . $constraint;
This is just a mock but you get the idea that you can pass or loop through your code and build the query string rather than parsing the query string and adding your backticks.