Java gravel of the day: String.matches() is anchored

jdavidb on 2009-01-28T16:40:58

In Java, String.matches() conveniently takes a regular expression, not just an ordinary String. However, I just found the following:

"2".matches("2");  // returns true; good
"2007".matches("2");  // returns false -- huh???

It turns out matches() is anchored: it implicitly acts as if the regular expression were wrapped in ^ and $. So the regular expression I supply to matches() has to consume the entire string, from beginning to end, or it doesn't match. Sigh.

"2007".matches("2.*");  // returns true -- sigh
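For an unanchored search, one option is to go through Pattern and Matcher and call find() instead of padding the regex with .* on both sides. A minimal sketch; the class name UnanchoredSearch and the helper containsMatch are made up for illustration and are not part of the JDK:

```java
import java.util.regex.Pattern;

class UnanchoredSearch {
    // Hypothetical helper (not a JDK method): true if the regex matches
    // anywhere inside the input, rather than the whole input.
    static boolean containsMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Whole-string match: "2" does not consume all of "2007".
        System.out.println("2007".matches("2"));        // prints false
        // Substring search: "2" is found at the start of "2007".
        System.out.println(containsMatch("2007", "2")); // prints true
    }
}
```

If the same regex is used repeatedly, compiling the Pattern once and reusing it avoids recompiling on every call.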

This of course is not specified in the documentation anywhere that I can see. At least, not in the Javadocs for the String.matches() method.


It is specified…

dakkar on 2009-01-28T18:29:06

although you have to really search for it.

"foo".matches("bar") is documented as equivalent to Pattern.matches("bar", "foo"), which in turn is equivalent to Pattern.compile("bar").matcher("foo").matches(); the Pattern.matcher method returns a Matcher object, whose documentation says «The matches method attempts to match the entire input sequence against the pattern» and «The find method scans the input sequence looking for the next subsequence that matches the pattern».
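The chain of equivalences can be checked directly. This is only a sketch: the class name MatchesChain and the three wrapper methods are invented for the example, while the calls inside them are the standard java.util.regex ones named above.

```java
import java.util.regex.Pattern;

class MatchesChain {
    // The three spellings described in the Javadocs; note that
    // Pattern.matches takes the regex first and the input second.
    static boolean viaString(String input, String regex) {
        return input.matches(regex);
    }

    static boolean viaPattern(String input, String regex) {
        return Pattern.matches(regex, input);
    }

    static boolean viaMatcher(String input, String regex) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // All three require the regex to cover the entire input.
        System.out.println(viaString("2007", "2"));    // prints false
        System.out.println(viaPattern("2007", "2"));   // prints false
        System.out.println(viaMatcher("2007", "2"));   // prints false
        System.out.println(viaMatcher("2007", "2.*")); // prints true
    }
}
```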

No, it's not clear, it's not easy to find, but it is there… Python, on the other hand, has a section in the Library Reference clearly titled «Matching vs Searching» where it explains the same thing.

Agreed

ChrisDolan on 2009-01-29T01:32:28

As dakkar said, it IS documented, but just barely. This same problem has bitten me about once every 2 months for the last two years (since switching to Java 1.5). You'd think I'd learn...

If only they made matchesAll() and matchesAny() instead of matches().

Re:Agreed

david.romano on 2009-01-30T06:13:07

This same problem has bitten me about once every 2 months for the last two years (since switching to Java 1.5).

I remember this biting me a few years ago when I was doing text parsing and tagging, before Java 1.5. The project I was working on constantly reminded me of this difference from Perl because I actually had to translate the "prototype" code from C-styled Perl to Java for performance reasons (it was never benchmarked, actually). Sigh.