Regex for quoted string with escaping quotes

Published 2019-05-09 07:16

How do I get the substring " It's big \"problem " using a regular expression?

s = ' function(){  return " It\'s big \"problem  ";  }';

Answer 1:

/"(?:[^"\\]|\\.)*"/

Works in The Regex Coach and PCRE Workbench.

Example of a test in JavaScript:

  var s = ' function(){ return " Is big \\"problem\\", \\no? "; }';
  var m = s.match(/"(?:[^"\\]|\\.)*"/);
  if (m != null)
      alert(m);
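
To see the pattern pick up several literals at once, here is a minimal sketch that adds the global flag. The helper name findQuoted and the sample text are made up for illustration. Note that backslashes are doubled in the JavaScript literal so that they actually reach the regex.

```javascript
// Pattern from Answer 1, with the global flag so every match is returned.
const quoted = /"(?:[^"\\]|\\.)*"/g;

// Hypothetical helper: return every double-quoted substring in `text`.
function findQuoted(text) {
  return text.match(quoted) || [];
}

// The text seen by the regex is:
//   f(" It's big \"problem  ", "second") // "third\\"
const source = 'f(" It\'s big \\"problem  ", "second") // "third\\\\"';
const found = findQuoted(source);
console.log(found);
```

The third literal ends in an escaped backslash, which is the case that trips up patterns treating every \" as an escaped quote.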

Answer 2:

This one comes from nanorc.sample, available in many Linux distros. It is used for syntax highlighting of C-style strings:

\"(\\.|[^\"])*\"

Answer 3:

As provided by ePharaoh, the answer is

/"([^"\\]*(\\.[^"\\]*)*)"/

To have the above apply to either single-quoted or double-quoted strings, use

/"([^"\\]*(\\.[^"\\]*)*)"|\'([^\'\\]*(\\.[^\'\\]*)*)\'/
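
As a rough illustration of how the capture groups could be used, the sketch below runs the combined pattern with the global flag and collects the text between the quotes. The helper name quotedContents and the sample text are assumptions, not part of the answer.

```javascript
// Answer 3's pattern for both quote styles, plus the global flag.
const anyQuoted = /"([^"\\]*(\\.[^"\\]*)*)"|'([^'\\]*(\\.[^'\\]*)*)'/g;

// Hypothetical helper: return the text between the quotes for every match.
function quotedContents(text) {
  const contents = [];
  for (const m of text.matchAll(anyQuoted)) {
    // Group 1 holds a double-quoted body, group 3 a single-quoted one.
    contents.push(m[1] !== undefined ? m[1] : m[3]);
  }
  return contents;
}

// The text seen by the regex is:  a = "x \" y"; b = 'it\'s'; c = "";
const sample = 'a = "x \\" y"; b = \'it\\\'s\'; c = "";';
const contents = quotedContents(sample);
console.log(contents);
```

The escapes are returned as written (backslash included); unescaping them would be a separate step.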

Answer 4:

"(?:\\"|.)*?"

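Alternating \" and . passes over escaped quotes, while the lazy quantifier *? ensures that you don't go past the end of the quoted string. Works with the .NET Framework RE classes.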
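
A small sketch comparing this lazy pattern with the one from Answer 1, run in JavaScript rather than .NET (so it only shows how the pattern itself behaves). The sample strings are made up. On an ordinary escaped quote both agree; when the literal ends in an escaped backslash, the lazy pattern reads the final \" as an escaped quote and runs on to the next quote.

```javascript
const lazy = /"(?:\\"|.)*?"/;       // Answer 4
const strict = /"(?:[^"\\]|\\.)*"/; // Answer 1, for comparison

const simple = 'say "a \\"b\\" c" now'; // say "a \"b\" c" now
const tricky = '"a\\\\" + "b"';         // "a\\" + "b"

const lazySimple = simple.match(lazy)[0];
const lazyTricky = tricky.match(lazy)[0];
const strictTricky = tricky.match(strict)[0];

console.log(lazySimple);   // the whole literal, escaped quotes included
console.log(lazyTricky);   // overshoots into the next literal
console.log(strictTricky); // stops at the real closing quote
```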

Answer 5:

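Most of the solutions provided here use alternative repetition paths, i.e. (A|B)*.

You may encounter stack overflows on large inputs, since some pattern compilers implement this using recursion.

Java, for example: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6337993

Something like this: "(?:[^"\\]*(?:\\.)?)*" , or the one provided by Guy Bedford, will reduce the number of parsing steps and avoid most stack overflows.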
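
A minimal sketch of that pattern on a longer input, run in JavaScript rather than Java, so it says nothing about the Java bug itself. It only shows that the pattern matches a 10,000-character literal in which the outer loop runs once per escape instead of once per character. The input is synthetic and properly terminated; behavior on unterminated input is not examined here.

```javascript
// Pattern from Answer 5: plain runs are consumed in one step by [^"\\]*.
const fewerSteps = /"(?:[^"\\]*(?:\\.)?)*"/;

// Build a long, terminated literal: 2,500 copies of  ab\"  between quotes.
const body = 'ab\\"'.repeat(2500);
const longInput = 'x = "' + body + '";';

const longMatch = longInput.match(fewerSteps);
console.log(longMatch !== null ? longMatch[0].length : 'no match');
```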

Answer 6:

/"(?:[^"\\]++|\\.)*+"/

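Taken straight from man perlre on a Linux system with Perl 5.22.0 installed. As an optimization, this regex uses the "possessive" form of both + and * to prevent backtracking, since it is known beforehand that a string without a closing quote won't match in any case.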
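
JavaScript's RegExp has no possessive quantifiers, so the pattern above cannot be used there as written. A common workaround is to capture the run inside a lookahead and consume it with a backreference, which keeps the engine from giving the run back piece by piece. The sketch below applies that trick only to the inner ++; the outer *+ is left as an ordinary greedy *, so it is an approximation and not the Perl pattern.

```javascript
// (?=([^"\\]+))\1 captures the longest plain run inside a lookahead and then
// consumes it via the backreference, so the run is taken as a whole.
const atomicLike = /"(?:(?=([^"\\]+))\1|\\.)*"/;

// The text seen by the regex is:  say "a \"b\" c" end
const ok = 'say "a \\"b\\" c" end'.match(atomicLike);
console.log(ok[0]);

// An unterminated literal: there is no closing quote, so nothing should match.
const unterminated = '"' + 'a'.repeat(5000);
const matchedUnterminated = atomicLike.test(unterminated);
console.log(matchedUnterminated);
```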

Answer 7:

This one works perfectly on PCRE and does not fail with a stack overflow.

"(.*?[^\\])??((\\\\)+)?+"

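Explanation:

1. Every quoted string starts with the character: " ;
2. It may contain any number of any characters: .*? {lazy match}; ending with a non-escape character [^\\] ;
3. Statement (2) is lazy(!) optional, because the string can be empty (""). So: (.*?[^\\])??
4. Finally, every quoted string ends with the character ("), but it can be preceded by an even number of escape signs, i.e. pairs (\\\\)+ ; and it is greedy(!) optional: ((\\\\)+)?+ {greedy matching}, because the string can be empty or have no trailing pairs!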

Answer 8:

/(["\']).*?(?<!\\)(\\\\)*\1/is

Should work with any quoted string.
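
The same pattern can be tried in JavaScript on an engine that supports ES2018 lookbehind and the s (dotAll) flag. This is only a sketch on one made-up input, with the global flag added to collect all matches. It is not a claim that the pattern handles every combination of backslashes.

```javascript
// Answer 8's pattern as a JavaScript literal; g is added to find all matches.
const anyQuote = /(["']).*?(?<!\\)(\\\\)*\1/gis;

// The text seen by the regex is (with a real newline inside the second literal):
//   a "one \"two\"" b 'it\'s
//   fine' c
const text = 'a "one \\"two\\"" b \'it\\\'s\nfine\' c';
const quotedParts = text.match(anyQuote) || [];
console.log(quotedParts);
```

Because of the s flag, the dot also matches the newline inside the single-quoted literal.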

Answer 9:

Here is one that works with both " and ', and you can easily add others at the start.

("|')(?:\\\1|[^\1])*?\1

It uses the backreference (\1) to match exactly what is in the first group (" or ').

http://www.regular-expressions.info/backref.html
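
One caveat, as far as I know: in most flavors, JavaScript included, \1 inside a character class is not a backreference, so [^\1] probably does not mean "anything except the opening quote". A variant that expresses the same idea with a negative lookahead is sketched below. It is an adaptation, not the pattern from this answer, and the sample text is made up.

```javascript
// (?!\1)[^\\] means: one character that is neither the opening quote nor a
// backslash. \\. handles any escaped character, including an escaped quote.
const sameQuote = /("|')(?:\\.|(?!\1)[^\\])*?\1/g;

// The text seen by the regex is:  x = "it's"; y = 'say \'hi\' "now"';
const input = 'x = "it\'s"; y = \'say \\\'hi\\\' "now"\';';
const literals = input.match(sameQuote) || [];
console.log(literals);
```

Each literal may contain the other kind of quote unescaped, since only the quote that opened it can close it.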

Answer 10:

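One must remember that regexps aren't a silver bullet for everything string-y. Some things are simpler to do with a cursor and linear, manual seeking. A CFL would do the trick pretty trivially, but there aren't many CFL implementations (afaik).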
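
A minimal sketch of what such a manual, linear scan could look like for double-quoted literals. The function name scanQuoted is made up, and the rules are assumptions: inside a literal a backslash escapes the next character, and an unterminated literal is dropped.

```javascript
// Hypothetical helper: scan `text` once, left to right, and collect every
// double-quoted literal, treating a backslash as "skip the next character".
function scanQuoted(text) {
  const results = [];
  let start = -1; // index of the opening quote, or -1 when outside a literal

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (start === -1) {
      if (ch === '"') start = i; // a literal begins here
    } else if (ch === '\\') {
      i++; // skip the escaped character, whatever it is
    } else if (ch === '"') {
      results.push(text.slice(start, i + 1)); // include both quotes
      start = -1;
    }
  }
  return results; // an unterminated literal is silently dropped
}

const scanned = scanQuoted(' function(){ return " It\'s big \\"problem  "; }');
console.log(scanned);
```

Each character is visited at most once and there is no backtracking, so stack depth is not a concern here.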

Answer 11:

A more extensive version of https://stackoverflow.com/a/10786066/1794894

/"([^"\\]{50,}(\\.[^"\\]*)*)"|\'[^\'\\]{50,}(\\.[^\'\\]*)*\'|“[^”\\]{50,}(\\.[^“\\]*)*”/

This version also contains:

A minimum quote length of 50
An extra type of quotes (opening “ and closing ”)

Answer 12:

Messed around at regexpal and ended up with this regex: (Don't ask me how it works, I barely understand it even though I wrote it, lol)

"(([^"\\]?(\\\\)?)|(\\")+)+"

Answer 13:

If it is searched from the beginning, maybe this can work?

\"((\\\")|[^\\])*\"

Answer 14:

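An option that has not been touched on before is:

1. Reverse the string.
2. Perform the matching on the reversed string.
3. Re-reverse the matched strings.

This has the added bonus of being able to correctly match escaped opening quotes.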

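Let's say you had the following string: String \"this "should" NOT match\" and "this \"should\" match". Here, \"this "should" NOT match\" should not be matched and "should" should be. On top of that, this \"should\" match should be matched and \"should\" should not.

First, an example.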

// The input string.
const myString = 'String \\"this "should" NOT match\\" and "this \\"should\\" match"';

// The RegExp.
const regExp = new RegExp(
    // Match close
    '([\'"])(?!(?:[\\\\]{2})*[\\\\](?![\\\\]))' +
    '((?:' +
        // Match escaped close quote
        '(?:\\1(?=(?:[\\\\]{2})*[\\\\](?![\\\\])))|' +
        // Match everything thats not the close quote
        '(?:(?!\\1).)' +
    '){0,})' +
    // Match open
    '(\\1)(?!(?:[\\\\]{2})*[\\\\](?![\\\\]))',
    'g'
);

// Reverse the matched strings.
const matches = myString
    // Reverse the string.
    .split('').reverse().join('')
    // '"hctam "\dluohs"\ siht" dna "\hctam TON "dluohs" siht"\ gnirtS'

    // Match the quoted
    .match(regExp)
    // ['"hctam "\dluohs"\ siht"', '"dluohs"']

    // Reverse the matches
    .map(x => x.split('').reverse().join(''))
    // ['"this \"should\" match"', '"should"']

    // Re order the matches
    .reverse();
    // ['"should"', '"this \"should\" match"']

Okay, now to explain the RegExp. It can easily be broken into three pieces, as follows:

# Part 1
(['"])         # Match a closing quotation mark " or '
(?!            # As long as it's not followed by
  (?:[\\]{2})* # A pair of escape characters
  [\\]         # and a single escape
  (?![\\])     # As long as that's not followed by an escape
)
# Part 2
((?:          # Match inside the quotes
(?:           # Match option 1:
  \1          # Match the closing quote
  (?=         # As long as it's followed by
    (?:\\\\)* # A pair of escape characters
    \\        # and a single escape
    (?![\\])  # As long as that's not followed by an escape
  )
)|            # OR
(?:           # Match option 2:
  (?!\1).     # Any character that isn't the closing quote
)
)*)           # Match the group 0 or more times
# Part 3
(\1)           # Match an open quotation mark that is the same as the closing one
(?!            # As long as it's not followed by
  (?:[\\]{2})* # A pair of escape characters
  [\\]         # and a single escape
  (?![\\])     # As long as that's not followed by an escape
)

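This is probably a lot clearer in image form, generated using Jex's Regulex.

Image on GitHub (JavaScript Regular Expression Visualizer). Sorry, I don't have a high enough reputation to include images, so it's just a link for now.

Here is a gist of an example function using this concept that's a little more advanced: https://gist.github.com/scagood/bd99371c072d49a4fee29d193252f5fc#file-matchquotes-js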

Answer 15:

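I faced a similar problem while trying to remove quoted strings that might interfere with the parsing of some files.

I ended up with a two-step solution that beats any convoluted regex you can come up with: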

 line = line.replace("\\\"","\'"); // Replace escaped quotes with something easier to handle
 line = line.replaceAll("\"([^\"]*)\"","\"x\""); // Simple is beautiful

Easier to read and probably more efficient.
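
For readers following along in JavaScript, the same two steps might look roughly like this. The function name maskQuoted is made up for illustration. One assumption worth flagging: a literal ending in an escaped backslash (\\ right before the closing quote) would confuse step 1, so this only suits input where that cannot occur.

```javascript
// Hypothetical helper mirroring the two Java calls above.
function maskQuoted(line) {
  return line
    .replace(/\\"/g, "'")        // step 1: turn every \" into '
    .replace(/"[^"]*"/g, '"x"'); // step 2: replace each quoted run with "x"
}

// The text seen by the function is:  print("a \"b\" c", "d");
const masked = maskQuoted('print("a \\"b\\" c", "d");');
console.log(masked);
```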

Source: Regex for quoted string with escaping quotes