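Introduction
First, my general problem: I want to replace question marks in a string, but only when they are not inside quotes. I found a similar answer on SO and started testing its code. Unfortunately, that code does not take escaped quotes into account.
For example: $string = 'hello="is it me your are looking for\\"?" AND test=?';
I have adapted the regular expression and code from an answer to the question "How to replace words outside double and single quotes", reproduced here to make my question easier to read: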
<?php
function str_replace_outside_quotes($replace,$with,$string){
    $result = "";
    $outside = preg_split('/("[^"]*"|\'[^\']*\')/',$string,-1,PREG_SPLIT_DELIM_CAPTURE);
    while ($outside)
        $result .= str_replace($replace,$with,array_shift($outside)).array_shift($outside);
    return $result;
}
?>
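Actual question
So I have tried to adjust the pattern so that it matches anything that is not a quote (") as well as quotes that are escaped (\"):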
<?php
$pattern = '/("(\\"|[^"])*"' . '|' . "'[^']*')/";
// when parsed/echoed by PHP the pattern evaluates to
// /("(\"|[^"])*"|'[^']*')/
?>
But this does not work as I had hoped.
My test string is: hello="is it me your are looking for\"?" AND test=?
I am getting the following matches:
array
0 => string 'hello=' (length=6)
1 => string '"is it me your are looking for\"?"' (length=34)
2 => string '?' (length=1)
3 => string ' AND test=?' (length=11)
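Match index 2 should not exist. That question mark should be considered part of match index 1 only, not repeated separately.
Once this is resolved, the same fix should also be applied to the other side of the main alternation, for single quotes/apostrophes (').
After that, the complete function should output: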
echo str_replace_outside_quotes('?', '%s', 'hello="is it me your are looking for\\"?" AND test=?');
// hello="is it me your are looking for\"?" AND test=%s
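I hope this makes sense and that I have provided enough information to answer the question. If not, I will happily provide whatever you need.
Debug code
My current (complete) code sample is also on codepad for forking: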
function str_replace_outside_quotes($replace, $with, $string){
    $result = '';
    var_dump($string);
    $pattern = '/("(\\"|[^"])*"' . '|' . "'[^']*')/";
    var_dump($pattern);
    $outside = preg_split($pattern, $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    var_dump($outside);
    while ($outside) {
        $result .= str_replace($replace, $with, array_shift($outside)) . array_shift($outside);
    }
    return $result;
}
echo str_replace_outside_quotes('?', '%s', 'hello="is it me your are looking for\\"?" AND test=?');
Sample input and expected output
In: hello="is it me your are looking for\\"?" AND test=? AND hello='is it me your are looking for\\'?' AND test=? hello="is it me your are looking for\\"?" AND test=?' AND hello='is it me your are looking for\\'?' AND test=?
Out: hello="is it me your are looking for\\"?" AND test=%s AND hello='is it me your are looking for\\'?' AND test=%s hello="is it me your are looking for\\"?" AND test=%s AND hello='is it me your are looking for\\'?' AND test=%s
In: my_var = ? AND var_test = "phoned?" AND story = 'he said \'where is it?!?\''
Out: my_var = %s AND var_test = "phoned?" AND story = 'he said \'where is it?!?\''
Answer 1:
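The following tested script first checks that a given string is valid, meaning it consists solely of single-quoted, double-quoted and unquoted chunks. The $re_valid regex performs this validation. If the string is valid, the script then parses it one chunk at a time using preg_replace_callback() and the $re_parse regex. The callback function processes the unquoted chunks with preg_replace() and returns all quoted chunks unaltered. The only tricky part of the logic is passing the $replace and $with argument values from the main function to the callback function. (Note that procedural PHP code makes passing these variables from the main function to the callback a bit awkward.) Here is the script: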
<?php // test.php Rev:20121113_1500
function str_replace_outside_quotes($replace, $with, $string){
    $re_valid = '/
        # Validate string having embedded quoted substrings.
        ^                           # Anchor to start of string.
        (?:                         # Zero or more string chunks.
          "[^"\\\\]*(?:\\\\.[^"\\\\]*)*"        # Either a double quoted chunk,
        | \'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'    # or a single quoted chunk,
        | [^\'"\\\\]+               # or an unquoted chunk (no escapes).
        )*                          # Zero or more string chunks.
        \z                          # Anchor to end of string.
        /sx';
    if (!preg_match($re_valid, $string)) // Exit if string is invalid.
        exit("Error! String not valid.");
    $re_parse = '/
        # Match one chunk of a valid string having embedded quoted substrings.
        (                           # Either $1: Quoted chunk.
          "[^"\\\\]*(?:\\\\.[^"\\\\]*)*"        # Either a double quoted chunk,
        | \'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'    # or a single quoted chunk.
        )                           # End $1: Quoted chunk.
        | ([^\'"\\\\]+)             # or $2: an unquoted chunk (no escapes).
        /sx';
    _cb(null, $replace, $with); // Pass args to callback func.
    return preg_replace_callback($re_parse, '_cb', $string);
}
function _cb($matches, $replace = null, $with = null) {
    // Only set local static vars on first call.
    static $_replace, $_with;
    if (!isset($matches)) {
        $_replace = $replace;
        $_with = $with;
        return; // First call is done.
    }
    // Return quoted string chunks (in group $1) unaltered.
    if ($matches[1]) return $matches[1];
    // Process only unquoted chunks (in group $2).
    return preg_replace('/'. preg_quote($_replace, '/') .'/',
        $_with, $matches[2]);
}
$data = file_get_contents('testdata.txt');
$output = str_replace_outside_quotes('?', '%s', $data);
file_put_contents('testdata_out.txt', $output);
?>
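The static-variable hand-off in the script above is one way to get $replace and $with into the callback. As a hedged alternative sketch, assuming PHP 5.3 or later, an anonymous function can capture both values directly with `use`. The function name below is hypothetical, the validation step is left out for brevity, and str_replace() stands in for the preg_replace()/preg_quote() pair:

```php
<?php
// Hypothetical name; the chunk regex mirrors $re_parse from the script above.
function str_replace_outside_quotes_closure($replace, $with, $string) {
    $re_parse = '/
        (                                       # $1: quoted chunk,
          "[^"\\\\]*(?:\\\\.[^"\\\\]*)*"        #   double quoted
        | \'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'    #   or single quoted.
        )
        | ([^\'"\\\\]+)                         # $2: unquoted chunk.
        /sx';
    return preg_replace_callback(
        $re_parse,
        // The closure captures $replace and $with, so no static state is needed.
        function ($matches) use ($replace, $with) {
            // Quoted chunks (group 1) are returned untouched.
            if ($matches[1] !== '') {
                return $matches[1];
            }
            // Only unquoted chunks (group 2) are processed.
            return str_replace($replace, $with, $matches[2]);
        },
        $string
    );
}

echo str_replace_outside_quotes_closure(
    '?', '%s', 'hello="is it me your are looking for\\"?" AND test=?'
), "\n";
// hello="is it me your are looking for\"?" AND test=%s
```

Because this sketch skips the $re_valid check, any character that neither alternative matches (a stray backslash or an unbalanced quote) is simply left in place.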
Answer 2:
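» Code updated to address all issues raised in the comments; it now works properly «
With $s as the input, $p as the phrase string and $v as the replacement variable, use preg_replace as follows: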
$r = '/\G((?:(?:[^\x5C"\']|\x5C(?!["\'])|\x5C["\'])*?(?:\'(?:[^\x5C\']|\x5C(?!\')' .
'|\x5C\')*\')*(?:"(?:[^\x5C"]|\x5C(?!")|\x5C")*")*)*?)' . preg_quote($p, '/') . '/';
$s = preg_match($r, $s) ? preg_replace($r, "$1" . $v, $s) : $s;
See this demo.
Note: in the regex, \x5C represents the \ character.
Answer 3:
This regex matches valid quoted strings, which means it is aware of escaped quotes.
^("[^\"\\]*(?:\\.[^\"\\]*)*(?![^\\]\\)")|('[^\'\\]*(?:\\.[^\'\\]*)*(?![^\\]\\)')$
Ready for use in PHP:
$pattern = '/^((?:"([^"\\\\]*(?:\\\\.[^"\\\\]*)*(?![^\\\\]\\\\))")|(?:\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*(?![^\\\\]\\\\))\'))$/';
Adapted for str_replace_outside_quotes():
$pattern = '/((?:"(?:[^"\\\\]*(?:\\\\.[^"\\\\]*)*(?![^\\\\]\\\\))")|(?:\'(?:[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*(?![^\\\\]\\\\))\'))/';
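As a quick way to try the anchored pattern (the first of the two PHP patterns above), here is a minimal sketch that wraps it in a boolean check. The helper name is hypothetical and not part of the answer; the pattern itself is copied unchanged:

```php
<?php
// Hypothetical helper: true when $candidate is exactly one well-formed
// single- or double-quoted string, with backslash escapes allowed inside.
function is_valid_quoted_string($candidate) {
    $pattern = '/^((?:"([^"\\\\]*(?:\\\\.[^"\\\\]*)*(?![^\\\\]\\\\))")|(?:\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*(?![^\\\\]\\\\))\'))$/';
    return preg_match($pattern, $candidate) === 1;
}

var_dump(is_valid_quoted_string('"is it me \\"?"'));  // escaped double quote inside: true
var_dump(is_valid_quoted_string("'it\\'s'"));         // escaped single quote inside: true
var_dump(is_valid_quoted_string('"broken" tail"'));   // unescaped quote in the middle: false
```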
Answer 4:
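Edit: changed answer. This one does not use a regex for the parsing (only the replacement itself is a regex now; I thought it would be better to use preg_replace rather than str_replace, but you can change that):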
function replace_special($what, $with, $str) {
    $res = '';
    $currPos = 0;
    $doWork = true;
    while (true) {
        $doWork = false; //pessimistic approach
        $pos = get_quote_pos($str, $currPos, $quoteType);
        if ($pos !== false) {
            $posEnd = get_specific_quote_pos($str, $quoteType, $pos + 1);
            if ($posEnd !== false) {
                $doWork = $posEnd !== strlen($str) - 1; //do not break if not end of string reached
                $res .= preg_replace($what, $with,
                    substr($str, $currPos, $pos - $currPos));
                $res .= substr($str, $pos, $posEnd - $pos + 1);
                $currPos = $posEnd + 1;
            }
        }
        if (!$doWork) {
            $res .= preg_replace($what, $with,
                substr($str, $currPos, strlen($str) - $currPos + 1));
            break;
        }
    }
    return $res;
}
function get_quote_pos($str, $currPos, &$type) {
    $pos1 = get_specific_quote_pos($str, '"', $currPos);
    $pos2 = get_specific_quote_pos($str, "'", $currPos);
    if ($pos1 !== false) {
        if ($pos2 !== false && $pos1 > $pos2) {
            $type = "'";
            return $pos2;
        }
        $type = '"';
        return $pos1;
    }
    else if ($pos2 !== false) {
        $type = "'";
        return $pos2;
    }
    return false;
}
function get_specific_quote_pos($str, $type, $currPos) {
    $pos = $currPos - 1; //because $fromPos = $pos + 1 and initial $fromPos must be currPos
    do {
        $fromPos = $pos + 1;
        $pos = strpos($str, $type, $fromPos);
    }
    //iterate again if quote is escaped!
    while ($pos !== false && $pos > $currPos && $str[$pos-1] == '\\');
    return $pos;
}
Example:
$str = 'hello ? ="is it me your are looking for\\"?" AND mist="???" WHERE test=? AND dzo=?';
echo replace_special('/\?/', '#', $str);
returns
hello # ="is it me your are looking for\"?" AND mist="???" WHERE test=# AND dzo=#
----
--old answer (I am leaving it here because it does solve part of the problem, though not all of it)
<?php
function str_replace_outside_quotes($replace, $with, $string){
    $result = '';
    var_dump($string);
    $pattern = '/(?<!\\\\)"/';
    $outside = preg_split($pattern, $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    var_dump($outside);
    for ($i = 0; $i < count($outside); ++$i) {
        $replaced = str_replace($replace, $with, $outside[$i]);
        if ($i != 0 && $i != count($outside) - 1) { //first and last are not inside quote
            $replaced = '"'.$replaced.'"';
        }
        $result .= $replaced;
    }
    return $result;
}
echo str_replace_outside_quotes('?', '%s', 'hello="is it me your are looking for\\"?" AND test=?');
Answer 5:
As @ridgerunner mentioned in the comments on the question, there is another possible regex solution:
function str_replace_outside_quotes($replace, $with, $string){
    $result = '';
    $pattern = '/("[^"\\\\]*(?:\\\\.[^"\\\\]*)*")' // hunt down unescaped double quotes
             . "|('[^'\\\\]*(?:\\\\.[^'\\\\]*)*')/s"; // or single quotes
    $outside = array_filter(preg_split($pattern, $string, -1, PREG_SPLIT_DELIM_CAPTURE));
    while ($outside) {
        $result .= str_replace($replace, $with, array_shift($outside)) // outside quotes
                 . array_shift($outside); // inside quotes
    }
    return $result;
}
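Note the use of array_filter to remove some matches that come back empty from the regex and would otherwise break the alternating nature of this function.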
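One caveat, offered as an observation rather than something stated in the answer: array_filter() without a callback drops every falsy element, so it also removes a legitimately empty outside piece (for example when the string begins with a quoted chunk) and any piece that is the literal string "0". A minimal sketch that sidesteps the filler entries instead, by wrapping both alternatives in one capturing group so that preg_split() returns exactly one delimiter per match; the function name is hypothetical:

```php
<?php
// Hypothetical variant of the function above: one capturing group around both
// alternatives, so preg_split() yields outside, quoted, outside, quoted, ...
function str_replace_outside_quotes_single_group($replace, $with, $string) {
    $pattern = '/("[^"\\\\]*(?:\\\\.[^"\\\\]*)*"'       // double-quoted chunk
             . "|'[^'\\\\]*(?:\\\\.[^'\\\\]*)*')/s";    // or single-quoted chunk
    $parts = preg_split($pattern, $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    $result = '';
    foreach ($parts as $index => $part) {
        // Even indexes lie outside quotes; odd indexes are the quoted chunks.
        $result .= ($index % 2 === 0) ? str_replace($replace, $with, $part) : $part;
    }
    return $result;
}

// The input starts with a quoted chunk, so the first outside piece is empty.
echo str_replace_outside_quotes_single_group(
    '?', '%s', '"a?" = ? AND b = \'c\\\'?\''
), "\n";
// "a?" = %s AND b = 'c\'?'
```

Relying on index parity rather than filtering keeps empty pieces in place, which is what preserves the outside/inside alternation.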
I also quickly knocked up a method that does not use a regex. It works, but I am sure there are optimisations that could be made.
function str_replace_outside_quotes($replace, $with, $string){
    $string = str_split($string);
    $accumulation = '';
    $current_unquoted_string = null;
    $inside_quote = false;
    $quotes = array("'", '"');
    foreach($string as $char) {
        if ($char == $inside_quote && "\\" != substr($accumulation, -1)) {
            $inside_quote = false;
        } else if(false === $inside_quote && in_array($char, $quotes)) {
            $inside_quote = $char;
        }
        if(false === $inside_quote) {
            $current_unquoted_string .= $char;
        } else {
            if(null !== $current_unquoted_string) {
                $accumulation .= str_replace($replace, $with, $current_unquoted_string);
                $current_unquoted_string = null;
            }
            $accumulation .= $char;
        }
    }
    if(null !== $current_unquoted_string) {
        $accumulation .= str_replace($replace, $with, $current_unquoted_string);
        $current_unquoted_string = null;
    }
    return $accumulation;
}
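In my benchmarks it took more than twice as long as the regex method. As the string length increases, the regex option's resource usage does not go up by much, whereas the method above grows linearly with the length of the text fed to it.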
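The benchmark itself is not shown, so the following is only a sketch of how such a comparison could be timed, not the harness actually used. The helper name, the sample clause and the repeat counts are all assumptions, and the built-in str_replace is a placeholder for whichever implementation is being measured:

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of $fn,
// which is invoked as $fn('?', '%s', $input).
function time_replacement(callable $fn, $input, $iterations = 100) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn('?', '%s', $input);
    }
    return (microtime(true) - $start) / $iterations;
}

// Inputs of growing length, built by repeating one sample clause.
$sample = 'hello="is it me your are looking for\\"?" AND test=? ';
foreach (array(10, 100, 1000) as $repeat) {
    $input = str_repeat($sample, $repeat);
    // 'str_replace' is a stand-in; pass the name of the function under test.
    printf("%7d bytes: %.8f s per call\n",
        strlen($input), time_replacement('str_replace', $input));
}
```

Running each candidate over the same growing inputs makes it possible to see how its time per call scales with string length.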
Source: How can I adapt my regex to allow for escaped quotes?