可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
In Javascript I have defined a regular expression and now a user is typing in a string. I want to tell him if his string still could match the RegExp if he continues typing or if he's already on the wrong way. For instance:
var re = /a*b/;
"a".isPrefixOf( re ); // true
"x".isPrefixOf( re ); // false
How could an implementation of isPrefixOf
look like?
Update: Thanks for your answers, making the regex prefix-proof, as suggested by brad, seems to be a good workaround. But I'm still trying to find a general solution.
Maybe this way: We create a new regex with the user input followed by .*
. This regex describes all words that the user still may enter. If the intersection of this created regex and the original regex is empty then the user is already on the wrong way. If it's not, he's doing fine. For instance:
var re = /a*b/;
var sInput = "a";
var reInput = new RegExp( sInput + ".*" );
reIntersection = re.intersect( reInput );
reIntersection.isEmpty(); // false
intersect()
returns a new regex that accepts only word which both re
and reInput
would accept. The function doesn't exist yet but we can implement it using look-ahead:
RegExp.prototype.intersect = function( pattern2 ) {
return new RegExp( '(?=' + this.source + ')' + pattern2.source );
}
What remains open is the isEmpty()
function. How could we check, if a Javascript regex matches any word or if it's empty?
回答1:
People seem to be splitting evenly on how they interpret this question, so I'll demonstrate the concept with a Java example.
import java.util.regex.*;
public class Test
{
public static void main(String[] args) throws Exception
{
tryMatch("^a*b+$", "a", "ab", "abc");
}
public static void tryMatch(String regex, String... targets)
{
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher("");
System.out.printf("%nregex: %s%n", regex);
System.out.printf("target | matches() | hitEnd()%n");
for (String str : targets)
{
m.reset(str);
System.out.printf("%-6s | %-9B | %-9B%n",
str, m.matches(), m.hitEnd());
}
}
}
output:
regex: ^a*b+$
target | matches() | hitEnd()
a | FALSE | TRUE
ab | TRUE | TRUE
abc | FALSE | FALSE
Target string "a" doesn't match because the regex requires at least one b
, but it could be the prefix of a successful match, so hitEnd()
returns true
. String "ab" has all that's required for a match, but it would also match if we added more b
's to the end, so hitEnd()
still returns true
. With "abc" the match attempt fails before it reaches the end of the target string, so the regex couldn't match any string that starts with "abc".
As far as I know, Javascript doesn't have anything like Java's hitEnd()
method, but it might be possible to fake it. If anyone knows how, it'll be that Flagrant Badass, Steven Levithan.
回答2:
I think your best bet here is to make your Regex prefix-proof. For the example you gave, /a*b/
, I think you could probably use /a*b?/.test(userinput)
. For more complex patterns this could get increasingly difficult, but I still think it can be done by nesting each subexpression in a series of optional quantifiers (?
). For instance:
/a*bcd*e/
The prefix regex could be:
/a*(b(c(d*e?)?)?)?/
Its a little messy, but will solve your problem rather well I think.
回答3:
Very interesting question. In my quick search, I didn't find anything predefined (not even in Perl) that solves this problem.
EDIT: Ouch, it seems Java has something similar called hitEnd() -- see Alan M's answer. What hitEnd() does is say that the result of match() (either true or false) might be modified by additional input. The book 'Mastering Regular Expressions" says it's not very reliable though (not sure why, page 392 not available in google books).
Depending on what features of regular expressions you use, a quick hack like writing some sort of prefixes of your regexp:
e.g. for a+a*b+c your prefixes will be:
a+
a+a*
a+a*b+
a+a*b+c
and try to match any of them with your string might work. This quick hack is made difficult if you use the choice operator, if you use the range operator {n,m} or back-references.
That being said, I think the good solution is to slightly modify the matching algorithm.
The matching algorithm normally employed is a backtracking algorithm (which works well in practice, even if worst case behavior is exponential). This algorithm successfully terminates whenever it has reached the end of the regexp (even if not the entire string was consumed). What you need to do is to modify the termination condition such that it also terminates successfully when it has consumed all of the input.
That being said, you'd probably have to actually implement the algorithm in JavaScript. Hopefully this will become part of libraries such as Jquery.
For more references and theory on the algorithm, check this article out:
http://swtch.com/~rsc/regexp/regexp1.html
(even if it makes a case against the backtracking algorithm and suggests a FA based algorithm (but the FA cannot handle back-references)).
回答4:
First you define your regular expression as:
var re = new RegExp(/^(regexp here)$/);
on the onKeypress event, you check the regexp like this:
text.match(regexp) - where the text is the string entered.
Is this clear?
回答5:
One way of doing this could be to hook to the onKeyUp event of a text box and .test the text against the regular expression.
My assumption is of course that you want to do a regular expression matching.
I'm not sure if this is exactly what you need, in fact your code:
"a".isPrefixOf( re ); // true
will never match since it's required to also have a subsequent "b" character (you may want to modify the regular expression).
For instance, this code will test against any string matching this format:
a-n(n)-b
Here is the code, save it as a page and load it in your browser:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="it">
<body>
<input type="text" size="20" id="txtData" onkeyup="showResult()" />
<div id="dvResult" />
</body>
</html>
<script type="text/javascript">
//<![CDATA[
theRegExp = /^a\-\d{1,2}\-b$/;
function isPrefixOf( aText, aRegExp )
{
return aRegExp.test( aText );
}
function showResult()
{
res = document.getElementById( "dvResult" );
res.innerHTML = isPrefixOf( document.getElementById( "txtData" ).value, theRegExp ) ? "Correct" : "Bad input";
}
//]]>
</script>