Yup, you read that right. I need something that is capable of generating random text from a regular expression. So the text should be random, but be matched by the regular expression. It seems it doesn't exist, but I could be wrong.
Just an example: that library would be capable of taking '[ab]*c' as input, and generate samples such as:
abc
abbbc
bac
etc.
Update: I created something myself: Xeger. Check out http://code.google.com/p/xeger/.
Here is a Python implementation of a module like that: http://www.mail-archive.com/python-list@python.org/msg125198.html It should be portable to Java.
Here are a few implementations of such a beast, but none of them are in Java (and all but the closed-source Microsoft one are very limited in their regexp feature support).
Based on Wilfred Springer's solution together with http://www.brics.dk/~amoeller/automaton/, I built another generator. It does not use recursion. It takes as input the pattern (regular expression), a minimum string length, and a maximum string length. The result is an accepted string whose length is between the minimum and the maximum. It also allows some of the XML "shorthand character classes". I use this for an XML sample generator that builds valid strings for facets.
I just created a library for doing this a minute ago. It's hosted here: http://code.google.com/p/xeger/. Carefully read the instructions before using it. (Especially the one referring to downloading another required library.) ;-)
This is the way you use it:
I am not aware of such a library. If you're interested in writing one yourself, then these are probably the steps you'll need to take:
Write a parser for regular expressions (you may want to start out with a restricted class of regexes).
Use the result to construct an NFA.
(Optional) Convert the NFA to a DFA.
Randomly traverse the resulting automaton from the start state to any accepting state, recording the character emitted by each transition along the way.
The result is a word which is accepted by the original regex. For more, see e.g. Converting a Regular Expression into a Deterministic Finite Automaton.
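To make the last step concrete, here is a minimal sketch of the random traversal only. It assumes the parsing and NFA/DFA construction have already been done, so the DFA for '[ab]*c' is built by hand. All names (RegexWalkSketch, buildDfa, generate) are made up for illustration and are not the API of Xeger or any other library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: random walk over a hand-built DFA for [ab]*c.
class RegexWalkSketch {
    static final int START = 0;   // start state
    static final int ACCEPT = 1;  // the only accepting state

    // dfa.get(state) maps an emitted character to the next state.
    static Map<Integer, Map<Character, Integer>> buildDfa() {
        Map<Integer, Map<Character, Integer>> dfa = new HashMap<>();
        Map<Character, Integer> fromStart = new HashMap<>();
        fromStart.put('a', START);   // [ab]* loops on the start state
        fromStart.put('b', START);
        fromStart.put('c', ACCEPT);  // the trailing c moves to the accepting state
        dfa.put(START, fromStart);
        dfa.put(ACCEPT, new HashMap<>()); // no outgoing transitions
        return dfa;
    }

    // Pick a random outgoing transition until the accepting state is reached,
    // appending the character of every transition taken.
    static String generate(Map<Integer, Map<Character, Integer>> dfa, Random rnd) {
        StringBuilder out = new StringBuilder();
        int state = START;
        while (state != ACCEPT) {
            List<Character> choices = new ArrayList<>(dfa.get(state).keySet());
            char c = choices.get(rnd.nextInt(choices.size()));
            out.append(c);
            state = dfa.get(state).get(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        Map<Integer, Map<Character, Integer>> dfa = buildDfa();
        for (int i = 0; i < 5; i++) {
            System.out.println(generate(dfa, rnd));
        }
    }
}
```

This particular automaton is easy because its accepting state has no outgoing transitions. A general version would also have to decide when to stop at an accepting state that can still continue, and avoid wandering into states from which no accepting state is reachable.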