Java finally catches up to Perl (and Python, Tcl)

pudge on 2002-02-14T23:07:28

rjray writes "Over at /. they're reporting that Sun has finally released the first official version of their Java 2 SDK version 1.4. Read the release notes here. Java finally has native support for regular expressions, one of the first things I found lacking when I took a shot at Java programming some time ago. The regexes are even referred to as 'Perl-like' in at least one place, maybe more!"


Yeah, but

djberg96 on 2002-02-14T23:30:03

...it's still dirt slow

...and overly verbose

Not very Perl-like....

VSarkiss on 2002-02-15T00:08:15

The example program takes 92 lines to implement a file grep. It implements the pattern .*\r?\n, and if you wonder why that's bad, read Ovid's Death to Dot Star post over at PerlMonks. (I'm presuming the Java regex engine backtracks, like other regex engines I know of.)

Re:Not very Perl-like....

Ovid on 2002-02-15T00:59:17

Hmmm... now I'm rather curious as to what's going on here. Here's the regex code to match a line:

// Pattern used to parse lines
private static Pattern linePattern
= Pattern.compile(".*\r?\n");

I don't think VSarkiss' criticism of the dot star is appropriate in this case, as the Java documentation states that the dot does not match a line terminator. However, they appear to have goofed up the line terminator! What about a bare \r on Macs? From what I can tell from their docs, the carriage return/newline in the regex is superfluous. They could at least get their regexes straight if they post sample code. They have a multiline mode, which appears to be what they were actually looking for and would be more robust than the listed solution.
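
To illustrate the multiline suggestion, here is a minimal sketch, not the code from Sun's sample. The class name LineMatcher and the helper lines() are made up for illustration. It relies on two documented behaviours of java.util.regex: without DOTALL the dot does not match a line terminator, and with Pattern.MULTILINE the anchors ^ and $ match at line boundaries, so \n, \r\n and a bare \r are all handled without spelling them out in the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineMatcher {
    // MULTILINE makes ^ and $ match at line boundaries; '.' still
    // excludes line terminators because DOTALL is not set.
    private static final Pattern LINE =
        Pattern.compile("^.*$", Pattern.MULTILINE);

    // Hypothetical helper: returns each line without its terminator.
    static List<String> lines(CharSequence text) {
        List<String> out = new ArrayList<String>();
        Matcher m = LINE.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Mixes Unix (\n), DOS (\r\n) and old Mac (\r) line endings.
        System.out.println(lines("one\ntwo\r\nthree\rfour"));
        // prints [one, two, three, four]
    }
}
```

Unlike the .*\r?\n pattern, this also picks up a final line that has no terminator at all.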

Re:Not very Perl-like....

Desmodromic on 2002-02-15T05:58:23

Argh. Don't have to convince me that Java is a big fat slug, but let's be fair: The first 43 lines of that are comments. A far cry from Perl, but only half as far ...

Re:Not very Perl-like....

VSarkiss on 2002-02-15T21:19:40

I only counted the code, not the comments. With the comments it's 135 lines.

Severely b0rken

Matts on 2002-02-15T07:08:21

I believe there's something broken about Java's split() function from the regexp class. If you do split(/=/, "foo=bar=20", 2) in Java, then you get two return values, as you would expect. What you might not expect is for those return values to be "foo" and "bar". I know this was the case in the betas, but I haven't checked whether they fixed this or not. I'd love it if someone could confirm that.

Re:Severely b0rken

manu4ever on 2002-02-15T13:40:58

This was fixed between beta 2 and beta 3. E.g.:

import java.util.regex.*;

public class Regexp {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("=");
        // Limit of 2: split on the first "=" only
        String[] theStrings = p.split("foo=bar=20", 2);

        for (int i = 0; i < theStrings.length; i++) {
            System.out.println(i + " : " + theStrings[i]);
        }
    }
}

Gives

0 : foo
1 : bar=20

as expected on 1.4.0 beta3 (and final release), but

0 : foo
1 : bar

on the beta 2.

Re:Severely b0rken

jdavidb on 2002-02-15T18:48:18

I don't understand your example: split(/=/, "foo=bar=20", 2). I presume the "2" means "only give me two return values," because you say we'd expect to get two return values. Without the 2 I'd expect to get a list of {foo bar 20}; with the 2 I'd expect to get what you say I wouldn't expect: {foo bar}. Huh?

Never mind...

jdavidb on 2002-02-15T18:49:53

Oh, wow! I just learned something about Perl.

I've rarely used the final parameter to split, so I didn't know.

Re:Never mind...

Matts on 2002-02-15T18:57:14

That final parameter is *incredibly* useful. Imagine parsing email headers, or cookies, or config files, or... the list goes on. Glad to hear they fixed it though.
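
As a rough sketch of the header-parsing case in Java 1.4 terms, the snippet below uses String.split(String regex, int limit), which follows the same limit semantics as Pattern.split. The class name HeaderSplit, the helper parseHeader() and the sample header line are all invented for illustration.

```java
public class HeaderSplit {
    // Hypothetical helper: splits "Name: value" at the first colon only,
    // so colons inside the value are left alone.
    static String[] parseHeader(String line) {
        return line.split(":\\s*", 2);
    }

    public static void main(String[] args) {
        String[] parts = parseHeader("Subject: Re: Java 1.4 released");
        System.out.println(parts[0] + " => " + parts[1]);
        // prints Subject => Re: Java 1.4 released
    }
}
```

Without the limit, the same call would also split at the second colon and break the value apart.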

Regexps are not everything

Shlomi Fish on 2002-02-17T08:13:28

Perl has other features besides Regexps, which I like very much. Stuff like nested data-structures, dynamic typing, functions as first-order values, closures, eval, multiple-inheritance, etc. etc.

I don't think Java has all that, or should. That's why I still prefer Perl for most purposes.

Perl like regexes for Java

erikharrison on 2002-02-18T00:20:10

Yes, the new Java regex package is almost identical to Perl regexes. The description is here, including a description of the differences, most of which seem to be insignificant (although I like having character classes be an allowable part of a character class).
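
For anyone who hasn't seen the nested character classes, here is a small sketch of the two forms documented for java.util.regex.Pattern: a union such as [a-d[m-p]] and an intersection such as [a-z&&[^aeiou]]. The class name NestedClasses, the helper names and the test words are made up for illustration.

```java
import java.util.regex.Pattern;

public class NestedClasses {
    // Union: the nested class adds its members to the outer class.
    static final Pattern UNION = Pattern.compile("[a-d[m-p]]+");

    // Intersection: lowercase letters that are not vowels.
    static final Pattern CONSONANTS = Pattern.compile("[a-z&&[^aeiou]]+");

    // Hypothetical helpers: true if the whole string matches.
    static boolean inUnion(String s) {
        return UNION.matcher(s).matches();
    }

    static boolean allConsonants(String s) {
        return CONSONANTS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(inUnion("damp"));         // true
        System.out.println(inUnion("java"));         // false: 'j' is outside both ranges
        System.out.println(allConsonants("rhythm")); // true
        System.out.println(allConsonants("perl"));   // false: contains 'e'
    }
}
```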

separate java library offers regexp too

jeppe on 2002-02-20T15:09:47

The Jakarta subproject ORO offers Perl5-compatible regexps. I never really stress-tested it, but it did what I needed it to do (which was basic sanity testing on email addresses).