The Subversion repository is finally back, and I'm committing the things we've been hacking on over the past two days. luqui and nothingmuch hacked on reifying the PIL AST as Perl 6 objects (so we get true macro support for free), and on a codegen written in Perl 6 that takes PIL and emits Perl 5. Stevan worked on a Perl 5 metamodel that runs Perl 6 roles and classes from Perl 5 space, so we can use Perl 5 as our host VM (in addition to Parrot and Haskell). nothingmuch further joked about a PIL codegen that targets Forth (but it turns out he was entirely serious). Patrick implemented builtin rules for PGE, and put support for qualified and inheritable+lexical grammars/subrules in place. Ingy worked on his new Perldoc framework, which provides a unified, comment-driven, Haddock/Javadoc-like, metadata- and macro-friendly way to generate Kwid and POD and to translate between them. lwall continued to hack through mad segfaults on PDD/MAD_SKILLS, refactoring Perl 5's toke.c to be more informative.
Allison arrived today and we debated the implementation strategy for nearly an hour. The Pugs Hackathon work has focused on having the Perl 6 compiler written in Perl 6: it translates the parse tree into PIL, then from PIL to either a Parrot syntax tree that emits PIR, or to Perl 5, Mono or even JavaScript. However, that means that to compile a newer version of the Perl 6 compiler, we need to have an older version of Perl 6 around, first via the non-Parrot side of Pugs, and then in the form of a Perl6.pbc runtime. This is just like how the Mono C# compiler is written in C#, how GCC itself is written in C, or how PyPy is written in Python.
Allison would instead prefer the production Perl 6 compiler and related tools to be written entirely in PIR or other non-Perl6 Parrot languages, so that we can compile any version of Perl 6 without having access to previous versions of Perl 6. She suspects that a parser/compiler/emitter written in PIR would be easier to write and maintain than the same toolchain written in Perl 6. The parser and emitter tools would then be reusable by other languages (e.g. PHP) that want to target Parrot, because PHP folks would rather use a compiler suite written in PIR than one written in either Perl 6 or PHP. That ties Perl 6 to PIR, but one can presumably link libparrot into Mono or the JVM to run the PIR code via their foreign call interfaces. This is similar to how the Perl 5 runtime is written in XS/C, and just like a (hypothetical) Mono-targeting compiler written in the .NET/IL high-level assembly language.
Thus, we have two possible implementation strategies that will evolve separately. People will, well, use whichever one actually works. :-)
Now... revelation time!
To use a Perl 5 module, put perl5: in front of the module name:
    use perl5:DBI;
Extending this metaphor, to use a Python module:
    use python:Zope;
The unary splat * takes an aggregate, or a reference to an aggregate, and flattens it out on the invocation list. Unary splat on a hash argument flattens it out as pairs for named bindings; splat on a scalar dereferences it to find an array/hash reference; for code and non-reference-to-aggregate scalars it is a no-op.
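As a rough analogy only (this is Python, not Perl 6), Python's own call-site unpacking behaves much like the splat described above: a list flattens into the positional argument list and a dict flattens into name/value pairs for named binding. The function name `describe` is hypothetical and exists only for this sketch.

```python
def describe(*positional, **named):
    """Report how the arguments arrived after flattening."""
    return list(positional), dict(named)


items = [1, 2, 3]
options = {"verbose": True}

# With the unpacking operators, the list is flattened into positional
# arguments and the dict into named arguments.
flat = describe(*items, **options)

# Without them, the list arrives as one single positional argument.
unflattened = describe(items)

print(flat)
print(unflattened)
```

Unlike the Perl 6 splat, Python does not dereference scalars or treat non-aggregates as a no-op; unpacking a non-iterable raises an error instead.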
&prefix:<int> now always means the same thing as &int. In the symbol table it is all stored in the "prefix" category; &int is just a short name for looking it up. It is just sugar, so you can't rebind it differently.
Any | Object | Item | ...pretty much everything else goes here... | Pair | Junction | int | str | num
    multi sub foo (3) { ... }
    multi sub foo (2..10) { ... }
really means:
    multi sub foo ($x where { $_ ~~ 3 }) { ... }
    multi sub foo ($x where { $_ ~~ 2..10 }) { ... }
which compiles to two different long names:
    # use introspection to get the constraints
    &foo
    &foo
Dispatch then really means this, which occurs after the type-based MMD tie-breaking phase:
    given $x {
        when 3     { &foo.goto }
        when 2..10 { &foo.goto }
    }
In the type-based phase, any duplicate in MMD is rejected as ambiguous; but in the value-based phase, the first conforming one wins.
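The two phases can be modelled with a small Python sketch. This is an illustration under stated assumptions, not how Pugs implements MMD: the names `dispatch` and `AmbiguousDispatch` are hypothetical, candidates are plain tuples, "closest type" is approximated by position in the argument's method resolution order, and an unconstrained candidate is treated as a fallback.

```python
class AmbiguousDispatch(Exception):
    """Raised when the type-based phase finds duplicate signatures."""


def dispatch(candidates, arg):
    """candidates: list of (param_type, constraint_or_None, function),
    given in declaration order."""
    # Phase 1: type-based. Keep only the candidates whose parameter type
    # is the closest match in the argument's method resolution order.
    mro = type(arg).__mro__
    typed = [c for c in candidates if c[0] in mro]
    if not typed:
        raise TypeError("no candidate accepts this argument")
    best = min(mro.index(c[0]) for c in typed)
    tied = [c for c in typed if mro.index(c[0]) == best]

    # Two unconstrained candidates of the same type are true duplicates.
    if sum(1 for c in tied if c[1] is None) > 1:
        raise AmbiguousDispatch("duplicate type signatures")

    # Phase 2: value-based. Try constraints in declaration order;
    # the first conforming candidate wins.
    for _, constraint, function in tied:
        if constraint is not None and constraint(arg):
            return function(arg)

    # Assumption of this sketch: fall back to the unconstrained candidate.
    for _, constraint, function in tied:
        if constraint is None:
            return function(arg)
    raise TypeError("no candidate's constraint matched")


# Mirrors: multi sub foo (3) {...}; multi sub foo (2..10) {...}
candidates = [
    (int, lambda x: x == 3, lambda x: "foo(3)"),
    (int, lambda x: 2 <= x <= 10, lambda x: "foo(2..10)"),
]

print(dispatch(candidates, 3))  # both constraints match; the first one wins
print(dispatch(candidates, 7))
```

Note that 3 satisfies both constraints, yet there is no ambiguity error: ordering alone decides the value-based phase.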
sub, class and module always trump hash dereferences:
    sub{...}
    module{...}
    class{...}
The do form now takes a single statement (that may still be a block); it turns the statement into an expression, evaluating it immediately when the left-hand side demands a value.
    my $val = do use CGI;   # same as
    my $val = BEGIN { use CGI };

    # This assigns 4 to $foo
    my $foo = do given 3 { when 3 { 4 } };
but is desugared into a do given block, eliminating the need to return $_ explicitly. So these two forms are equivalent:
    my $foo = Cls.new but { .attr = 1; };
    my $foo = do given Cls.new { .attr = 1; $_; };
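The shape of that desugaring (run a block against a topic, then hand back the topic) can be sketched in Python. This is only an analogy: the helper `but`, the class `Cls` and the function `set_attr` are hypothetical names for this sketch, and it says nothing about any other behaviour `but` may have in Perl 6.

```python
class Cls:
    def __init__(self):
        self.attr = 0


def but(topic, block):
    """Run block with topic as its argument, then return the topic itself,
    so the block never has to return it explicitly."""
    block(topic)
    return topic


def set_attr(obj):
    obj.attr = 1  # plays the role of `.attr = 1` in the Perl 6 example


foo = but(Cls(), set_attr)
print(foo.attr)
```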
An anonymous class can is and does any class/roles; it will be composed at compile time into an anonymous (but unique) class, the same way an anonymous closure remembers its original place of definition:
    role Foo {...}
    class Boo is Baz {...}
    (class{ does Foo; is Boo }).new(1);
is Foo and does Bar declarations inside a class body are always lifted up as class traits and executed at class composition time.
OUR::, and its symbol table form %OUR::, contains the symbols in your current package namespace.
trusts is lexical: it controls accessor generation for all my, our and has forms in its scope. Inside a class body, my $.x and our $.x always generate public accessors on the spot:
    class Foo {
        trusts Bar;   # sees the accessor methods in the scope
        {
            # some inner scope
            my $.x;
            # This creates accessors by inserting these two lines
            #   die "Duplicate accessor" if %OUR::<&x>;
            #   our &x := method () is rw { $.x };
            my $:y;
            # This creates accessors by inserting these two lines
            #   my &:y := method () is rw { $:y };
            #   $?SELF.trust_access.push(&:y); # adds to Bar -- $?SELF is class object
        }
        trusts Baz;   # this is always a no-op because there's nothing below
    }
    class Pie is Foo { }
    Pie.x = 5;    # lvalue method - writes back to lexical $.x
    say Bar.x;    # 5 - shared by Bar and Pie
The twigils . and : control the generated accessor's scope (our and my respectively). The scope of the variable itself is orthogonal to the accessor.
In parameter lists, the default is constant trait on parameter variables does not really act on their containers; it creates a transparent container on top of an existing container. It is "transparent" because it autoderefs for everything except the write-type STORE/PUSH/SPLICE methods; all read methods like FETCH are passed to the underlying container. Even .ref and .does calls are passed through to the underlying container, but the .tied call does get you the wrapped implementation object.
The upshot is that these are now errors:
    sub foo ($x) is rw { $x }
    my $a;
    foo($a) = 4;   # runtime error - assign to constant

    sub constref ($x) { \$x }
    my $a;
    my $r = constref($a);
    $$r = 4;       # runtime error - assign to constant
To get a normal reference, use the is rw trait on the parameters.
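The idea of a transparent read-only container layered over an existing one can be sketched in Python. This is an analogy under assumptions, not the Perl 6 container protocol: `ConstantView` is a hypothetical class, Python's `__getitem__`/`__setitem__` stand in for FETCH/STORE, and `append` stands in for PUSH.

```python
class ConstantView:
    """Read-only wrapper: reads pass through, writes are refused."""

    def __init__(self, target):
        self._target = target

    # Read-type operations are forwarded to the underlying container.
    def __getitem__(self, index):
        return self._target[index]

    def __len__(self):
        return len(self._target)

    def __iter__(self):
        return iter(self._target)

    # Write-type operations are rejected at run time.
    def __setitem__(self, index, value):
        raise TypeError("assign to constant")

    def append(self, value):
        raise TypeError("assign to constant")


data = [1, 2, 3]
view = ConstantView(data)

try:
    view[0] = 99  # refused: the view only blocks writes made through it
except TypeError as err:
    print("refused:", err)

data[0] = 99      # the underlying container itself is still writable
print(view[0])    # the view sees the change, since it holds no copy
```

The design point is that the wrapper holds no copy of the data. Only the write path is intercepted, which is why a reference taken through the wrapper is still constant.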
Larry is probably the only man on earth who can even think about changing toke.c without breaking half the test suite.