I just found a bug in some code I didn't write and I'm a bit surprised:
Pattern pattern = Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");
Matcher matcher = pattern.matcher(s);
This code fails badly on the input data we get: it is meant to find dates in the 17.01.2011 format, but it also returns things like 10396/2011 and then crashes because that can't be parsed as a date. That really isn't the point of this question, though ; ) What I wonder is:
isn't one of the points of Pattern.compile to be a speed optimization (by pre-compiling regexps)?
shouldn't all "static" patterns always be compiled into static Pattern fields?
There are so many examples, all around the web, where the same pattern is always recompiled using Pattern.compile that I begin to wonder if I'm seeing things or not.
Isn't (assuming that the string is static and hence not dynamically constructed):
static Pattern pattern = Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");
always preferable over a non-static pattern reference?
Static Patterns would remain in memory as long as the class is loaded.
If you are worried about memory and want a throw-away Pattern that you use once in a while and that can get garbage collected when you are finished with it, then you can use a non-static Pattern.
The point of pre-compiling a Pattern is to only do it once. Pre-compiled patterns stored in static fields should be fine. (Unlike Matchers, which aren't thread-safe and therefore shouldn't really be stored in fields at all, static or not.)
The only caveat with compiling patterns in static initializers is that if the pattern doesn't compile and the static initializer throws an exception, the source of the error can be quite annoying to track down. It's a minor maintainability problem but it might be worth mentioning.
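As a minimal sketch of the advice above: the Pattern lives in a static final field and a fresh Matcher is created on every call. The class name DateFinder and the method findFirstDate are hypothetical and exist only for illustration; the regex uses escaped dots so that only literal dots separate the date parts.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not taken from the original code.
final class DateFinder {
    // Compiled once, when the class is initialized. Pattern instances are
    // immutable and safe to share between threads.
    private static final Pattern DATE =
            Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");

    private DateFinder() {}

    static Optional<String> findFirstDate(String s) {
        // Matcher is stateful and not thread-safe, so it stays a local
        // variable instead of being stored in a field.
        Matcher m = DATE.matcher(s);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}
```

A call such as DateFinder.findFirstDate("due 17.01.2011") would reuse the compiled pattern each time, while each caller gets its own Matcher.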
First, the bug in the pattern is that an unescaped dot (.) matches any character, not just a literal dot. If you want to match a literal dot, you have to escape it in the regex:
Pattern pattern = Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");
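To illustrate the difference, here is a small sketch that runs both versions of the pattern against the two inputs mentioned in the question. The class name DotEscapeDemo and the helper found are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the unescaped and escaped patterns.
final class DotEscapeDemo {
    // Unescaped dots: each "." matches any single character.
    static final Pattern LOOSE = Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");
    // Escaped dots: each "\\." matches only a literal dot.
    static final Pattern STRICT = Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");

    private DotEscapeDemo() {}

    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(found(LOOSE, "17.01.2011"));   // true
        System.out.println(found(STRICT, "17.01.2011"));  // true
        System.out.println(found(LOOSE, "10396/2011"));   // true: "3" and "/" are consumed by the dots
        System.out.println(found(STRICT, "10396/2011"));  // false
    }
}
```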
Second, Pattern.compile() is a relatively expensive method. It is generally recommended to initialize static patterns (I mean patterns that do not change and are not generated on the fly) only once. One popular way to achieve this is to put the Pattern.compile() call into a static initializer.
You can also use another approach, for example the singleton pattern or a framework that creates singleton objects (like Spring).
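One possible reading of the singleton suggestion, sketched here as an assumption rather than as what the answer had in mind, is the initialization-on-demand holder idiom: the pattern is compiled lazily, on first use, and still only once. The names LazyDatePattern, Holder and get are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical lazily initialized singleton holder for a compiled Pattern.
final class LazyDatePattern {
    private LazyDatePattern() {}

    // The nested class is initialized only when get() is first called,
    // so the regex is not compiled until it is actually needed.
    private static final class Holder {
        static final Pattern INSTANCE =
                Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");
    }

    static Pattern get() {
        return Holder.INSTANCE;
    }
}
```

Compared with a plain static field, this mainly delays the compilation cost; for a single small pattern the difference is unlikely to matter.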
Yes, compiling the Pattern on each use is wasteful, and defining it statically would result in better performance. See this SO thread for a similar discussion.
It is a classical time vs. memory trade-off. If you are compiling a Pattern only once, don't stick it in a static field. If you measured that compiling Patterns is slow, pre-compile it and put it in a static field.