Why Tcl is Better than Perl

pudge on 2000-07-03T16:04:42

There is a piece about Why Tcl is better than Perl on the Tcl Developer Xchange site, hosted by parent company Scriptics, erm, Ajuba.

The piece offers many insights into why Tcl is better, noting that Perl is rarely extended; it can be used in many fewer situations and for many fewer purposes than Tcl; Perl is based on 8-bit characters; Perl is write-only; the commercial tools available for Perl development are not complete and run only under Windows.

Note: remember that this site of ours is for civilized discussion.


regexes

pudge on 2000-07-03T20:51:33

The new Tcl regular expression package has all of the advanced features that Perl has, plus Unicode support which Perl is lacking.

I am trying to think how the statement could be more wrong. The Tcl regex package does not have all the advanced features Perl has in its regex engine. I don't know when this thing was written, but I doubt very much it supports all the perl 5.005 features, let alone the 5.6 features. This isn't necessarily a bad thing, but it is still a wrong thing.

As to lack of Unicode support, well, that's just plain false. Maybe this is just an old document, I don't know. It certainly should be dated.

Re:regexes

technik on 2000-07-03T21:45:20

I remember this from a while ago. The bottom of the document includes (c)1998-2000 and was last updated March 9, 2000. The last update was probably s/Scriptics/Ajuba/

I think we can take this for what it is- marketing.

Re:regexes

Abigail on 2000-07-03T23:28:44

The Tcl regex package does not have all the advanced features Perl has in its regex engine.

From the last section of the document:

Tcl has most but not all of the advanced features of Perl's regular expressions and it also has some of its own unique features.

As to lack of Unicode support, well, that's just plain false.

Really? Then Dominus must be lying in his summary when he writes:

Now here's a dirty secret: Overloading the regex engine this way is difficult, and hasn't been done yet. Regex matching ignores the UTF8 flag in its target. Instead, it uses the old method that was abandoned: if it was compiled with use utf8 in scope, it assumes that its argument is in UTF8 format, and if not, it assumes its argument is a byte string.

Maybe this is just an old document, I don't know. It certainly should be dated.

Again, from the article:

Please make sure that this page does not become yet another out of date comparison. Make the effort to keep up to date with changes in Perl and update the comparison if necessary. It would be useful if you could indicate at the top of the page which versions you are comparing and when.
Paul Duffin, June 8 02:17:57, 2000

-- Abigail

Re:regexes

pudge on 2000-07-04T01:55:28

From the last section of the document

That was from a reader comment, not from the article.

>As to lack of Unicode support, well, that's just plain false.

Really? Then Dominus must be lying in his summary when he writes

Huh? Unicode regexes are supported, as Dominus' quote says. Apparently support could be better, but that still makes the statement patently false.

And also the last thing you mention is not from the article, but from a reader comment.

Of course, a lot more is demonstrably false about the article. Leaving out the misleading and irrelevant statements (and those having to do with Unicode that may just be out of date), the patently false include gems like if you come back to a Perl script after a couple of months, you probably won't be able to understand it anymore (whether or not someone else can read your code is another story); the commercial (development) tools available for Perl are less complete and run only under Windows (last I checked, Unix/Solaris/etc. was more complete than TclPro AND is still commercial); though it is possible in principle to extend Perl, it is rarely done in practice (perhaps it is done too often!); Tcl can be used in many more situations, for many more purposes, than Perl (heh!).

BTW, I don't understand what "I think we can take this for what it is- marketing" means, technik. Perhaps that is because I have for years realized that I have no idea what "marketing" is, and I am not sure it even exists. I think perhaps the concept is a fiction.

Pointless lies

jericson on 2000-07-04T06:24:59

Misconceptions

If you explore the Tcl Web sites, you'll find various claims about why Tcl is superior to Perl. Many of these claims are either incorrect or out-of-date, including all of the following:

They would say that, wouldn't they.

jns on 2000-07-04T09:43:15

They are a commercial company selling products which are based on Tcl - in the modern marketing mind they have to rubbish the competition ...

/J\

Re:They would say that, wouldn't they.

pudge on 2000-07-04T12:25:49

Again, I don't know what "marketing" is, though.

Re:They would say that, wouldn't they.

Silver on 2000-07-04T20:12:04

Again, I don't know what "marketing" is, though.

Perhaps this will help you understand, grasshopper:

At the bank for which I used to work, there was a robbery. This is not unusual, since there are a lot of stupid people out there, except that it was unusually stupid... a daylight robbery, a downtown bank branch which was glass on all four walls (giving cameras and drive-through customers a good view of the guy before the dye pack blew up in his face), etc.

It was sufficiently entertaining that it made the news on all four local stations.

During the filming, one of the camera crews panned past a poster advertising the bank's CD rate. During the 5pm news, the phone center received several calls about opening CD's, including one who specifically mentioned having seen the rate on a newscast.

The marketing director was thrilled, because for $5000 the bank got, in his mind, a prime-time TV commercial on every channel.

That is marketing.

Re:regexes

Abigail on 2000-07-04T21:34:20

Unicode regexes are supported, as Dominus' quote says.

Uhm, no. Regexes don't check whether the string they match against are UTF8 strings or not. They follow the abandoned "all or nothing" strategy.

#!/opt/perl/bin/perl -w

use strict;

sub one_char {$_ [0] =~ /^.$/;}

{  use utf8;
   my $str1 = v1024;   # Unicode ch ar 1024.
   my $str2 = v32;     # Unico de char   32.
   print "Matched 1\n" if one_char $str1;
   print "Matched 2\n" if one_char $str2;
}

__END__

This prints:
Matched 2
which clearly indicates that Perl regexes don't support Unicode. Or, at least, they support it enough to not complain, yet they produce bogus results - which is worse than complaining and refusing to continue.

-- Abigail

PS, why does use.perl.org introduce line breaks in my code? Now there's a bogus space in "ch ar", and in "Unico de".

Re:Pointless lies

alleria on 2000-07-05T05:47:56

Hmm, I guess your post got cut off, but there are lots:
not binary clean.
everything is a string. no numerics whatsoever.
no standardized OO implementation.
not as many prefab modules (a la CPAN)

are just some of the first that floated to the top of my head.

At this point, Perl can do everything that Python and Tcl can, and more. IMO. I personally think that it's also superior to C/C++ for all applications programming as well, but then again, I'm crazy. ;p

Re:regexes

pudge on 2000-07-05T13:07:34

>Unicode regexes are supported, as Dominus' quote says.

Uhm, no. Regexes don't check whether the string they match against are UTF8 strings or not.

And this does not appear in any way to make my statement above false. It mitigates the statement, it clarifies it, it makes it look worse than originally one might have thought, but it does not make it false.

However, your code is curious. Does it fail because v1024 is UTF-16 and not UTF-8?

As to the "line breaks": they are inserted spaces. When a line of text does not have any spaces in it for x chars, we insert one, so that malicious users can't fuX0r a browser by putting in very long lines of text. This is old code in Slash; if there is a better way to deal with it, I'm all ears. One thing you could do is put in an arbitrary space or linebreak between a pair of &nbsp;'s. Something I could possibly do is in the regex to find long lines of text, count &nbsp;'s as whitespace.
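
For illustration only, here is a minimal sketch of the kind of filter described above. It is not the actual Slash code: the sub name break_long_runs and the default limit of 40 are assumptions, since the post only says "x chars".

```perl
use strict;
use warnings;

# Hypothetical helper, not the real Slash code: insert a space after every
# $max consecutive non-whitespace characters, so that no unbroken run in the
# result is longer than $max.
sub break_long_runs {
    my ($text, $max) = @_;
    $max ||= 40;    # assumed default; the thread does not give the real limit

    # Match $max non-space characters that are followed by yet another
    # non-space character, and put a space after them.
    $text =~ s/(\S{$max})(?=\S)/$1 /g;
    return $text;
}

print break_long_runs("a" x 10, 4), "\n";    # prints "aaaa aaaa aa"
```

A filter like this cannot tell code from prose, which is how a comment such as "Unicode char" inside a long whitespace-free run can end up with a space in the middle of a word.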

Re:regexes

Abigail on 2000-07-06T22:19:53

And this does not appear in any way to make my statement above false. It mitigates the statement, it clarifies it, it makes it look worse than originally one might have thought, but it does not make it false.

I wouldn't call the fact that regexes ignore the fact that a string is in UTF-8 format "supporting Unicode". It makes the term "support" meaningless. The criticism on this point is correct.

However, your code is curious. Does it fail because v1024 is UTF-16 and not UTF-8?

That is a question showing you don't understand Unicode. Unicode is a mapping from numbers to characters. v1024 is a string consisting of one character, the 1024th Unicode character. UTF-16 and UTF-8 are 2 of several ways of encoding characters. UTF-16 isn't supported by Perl; if v1024 resulted in a UTF-16 encoding, you would have found a serious bug. In the code I gave, use utf8; is in effect, so v1024 results in a UTF-8 string. Being Unicode, it is a one character string. But due to the encoding, it is a multi-byte string. The regex machine does not understand that.

-- Abigail
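
The character-versus-encoding distinction can be checked directly. This is a minimal sketch that assumes a Perl with the core Encode module (5.8 or later), which is newer than the Perl discussed in this thread. The variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character: Unicode code point 1024 (U+0400).
my $char = chr(1024);

# The same character encoded as UTF-8, as a plain byte string.
my $bytes = encode('UTF-8', $char);

printf "characters: %d\n", length $char;     # 1
printf "bytes:      %d\n", length $bytes;    # 2 (0xD0 0x80)
```

A regex engine that sees only the bytes has two things to match against /^.$/, which is why the single-character test fails for v1024.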

Style

grufolone on 2000-07-07T08:22:30

I have used Tcl for a while, mostly for exploiting Tk, and what I say is that in my opinion Tcl code is by far less "neat" and "clean" than Perl code could be. Like nearly any other programming language, it makes you type a lot of conceptually unnecessary details; in this sense Perl is a different animal, and that's why I love it.

Re:regexes

pudge on 2000-07-08T13:47:01

It does not make the term "support" meaningless. Sorry, it just doesn't. If the regex engine supports UTF-8 when the utf8 pragma is in effect, then UTF-8 is de facto supported, though not supported in the way you want it to be.

As to my not understanding Unicode: I understand the basics of how it works. I understand encodings. I do not know specifics of Unicode, or the specifics of the various Unicode encodings. I did not know if v1024 is a proper character in UTF-8, because if it were, and the perl docs were true, then it would be supported by the regex engine. You say it is; so I believe it is. So why do the Perl docs say that this should work, and yet, according to you, it doesn't?

Re:regexes

Abigail on 2000-07-08T19:49:47

I did not know if v1024 is a proper character in UTF-8, because if it were, and the perl docs were true, then it would be supported by the regex engine.

But, as I am telling you, and showing you with code, the regex machine does not properly support Unicode. It follows the old model, abandoned by the rest of Perl. Only where the old and new model happen to coincide it "works", but that's purely by accident.

Old model: everything is assumed to be in Unicode if, and only if, the utf8 pragma is in effect.
New model: something is in Unicode if and only if the appropriate flag in the SV is set.

So why do the Perl docs say that this should work

They do? man perlunicode starts with

WARNING: The implementation of Unicode support in Perl is incomplete.
and one paragraph later:
Regular Expressions
The existing regular expression compiler does not produce polymorphic opcodes. This means that the determination on whether to match Unicode characters is made when the pattern is compiled, based on whether the pattern contains Unicode characters, and not when the matching happens at run time. This needs to be changed to adaptively match Unicode if the string to be matched is Unicode.

If you know of parts of the documentation that say regexes properly support Unicode, please use perlbug so the documentation can be rectified.

-- Abigail
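
As a sketch of what the "new model" looks like once the regex engine follows it: the code below assumes a later Perl (5.8.1 or newer), where utf8::is_utf8 exposes the per-string flag and matching is decided by the target string, not by the pragma. It says nothing about how 5.6 behaves. The sub one_char mirrors the earlier test.

```perl
use strict;
use warnings;

# Same test as earlier in the thread: is the argument exactly one character?
sub one_char { return $_[0] =~ /^.$/ ? 1 : 0 }

my $wide   = chr(1024);    # does not fit in one byte, so the SV's UTF-8 flag is set
my $narrow = chr(32);      # fits in one byte, flag stays off

printf "flag:  wide=%d narrow=%d\n",
    (utf8::is_utf8($wide) ? 1 : 0), (utf8::is_utf8($narrow) ? 1 : 0);
printf "match: wide=%d narrow=%d\n",
    one_char($wide), one_char($narrow);
```

Note that no use utf8; appears anywhere: under the new model the pragma only describes the encoding of the source file, and each string carries its own flag.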

Re:regexes

pudge on 2000-07-09T01:24:28

From perldoc utf8:

In the absence of inputs marked as UTF-8, regular expressions within the
scope of this pragma will default to using character semantics instead
of byte semantics.

Perhaps I am wrong, but that tells me that regexes will be forced to treat text as utf8 if the utf8 pragma is in effect.

Re:regexes

Abigail on 2000-07-09T04:58:25

Perhaps I am wrong, but that tells me that regexes will be forced to treat text as utf8 if the utf8 pragma is in effect.

Exactly my point. Considering what the rest of Perl does, that is outdated, and hence wrong. It might give the right answer by accident. But it will often give the wrong answer.

-- Abigail

Re:regexes

pudge on 2000-07-09T12:13:22

But ... if this were true -- that it will be forced to be treated as utf8 -- then why did it not treat v1024 as utf8?

Oh wait, I think I am being a moron. This is because the regex is outside the pragma's scope. Sigh. It does do exactly what I thought, but your code failed because I can't read; I'm sorry for wasting your time about that whole thing.

Now, as to being "wrong": should it be different? Yes, that would be nice. But the fact remains that it does work, even though it is a kludge. UTF-8 IS supported in regexes, with this workaround. It is not ideal. It is not the way it should be. UTF-8 is supported, just not as well as it should be.

web/db toolkit for tcl, but not for Perl

markjugg on 2000-09-04T15:32:00

I'm a heavy perl programmer myself, but I have to say, there's one plus I see to tcl. This is not a merit of the language itself, but what's been done with it: There's an excellent web/db toolkit written in tcl called the ACS, or Arsdigita Community System. This toolkit supports a huge amount of web/db functionality (a recent version included over 3000 scripts). Something like this could certainly be done in Perl, but there's not one like this available now that I know of. There are smaller projects, but nothing of this scale.