Perl 6 Design Minutes for 26 March 2008

chromatic on 2008-03-27T21:07:03

The Perl 6 design team met by phone on 26 March 2008. Larry, Allison, Patrick, Will, Jerry, Jesse, Nicholas, Richard, and chromatic attended.

Patrick:

  • Perl 6 is going pretty well
  • mainly catching up on mail
  • finding out where people are and what they're doing
  • I had a lot of mail
  • pleased with the progress on Rakudo
  • people are adding more features and looking at it
  • still need to look at the new PCT tutorial
  • my plan is to continue reviewing the new changes to Rakudo
  • going to get back on track writing about what's going on

Larry:

  • no spec work this week
  • lots of conversations on p6l about various parsing issues
  • relationships of unaries with indexes

Patrick:

  • is it safe for me to ignore that thread?

Larry:

  • for now
  • whatever we come up with for the operator precedence parser in STD will be the standard way of doing it
  • mostly fighting bugs in YAML and losing
  • giving up on YAML for lexer storage
  • it does not like to deal with Unicode

Jesse:

  • is this a particular implementation of YAML?

Larry:

  • all of them
  • I'm using my own formatted file at the moment
  • fighting bugs in tagged regular expressions
  • it loves to coredump for no apparent reason, if you have too many alternatives
  • in addition to the difficulties with null patterns and null character sets
  • it always coredumps with those
  • if I have 28 or so statement control alternatives, it'll run if I delete all but two or three of them
  • apparently a forward or back pointer is stored in a byte, or something else is stored badly
  • may have to dive into the C code to fix that
  • also finding a few Perl 5 bugs
  • I really, really exercise lexical scoping within subroutines in my output
  • I nest maps very deeply and expect lexical variables within those blocks to be kept straight
  • something is leaking somewhere
  • mostly working around those problems

Jerry:

  • Summer of Code stuff now
  • a few applications are coming in
  • two look good for Parrot and Rakudo
  • need to advertise for more
  • also had a contact from someone who wants to port OpenGL to Parrot
  • not Geoff Broadwell
  • seems like a very serious approach
  • pushing him to apply for funding from TPF
  • also trying to keep on top of mail and catch up
  • haven't had any coding time
  • doing a lot more managing and answering questions
  • to some extent, that's fine
  • there are more people involved in the project

Allison:

  • my brain is full of the strings PDD at the moment
  • some substantial changes from Simon's original draft
  • he had a good perspective
  • I'm looking at overall architecture changes
  • still supporting what we need to support
  • just in a different way

c:

  • started some Perl 6-related arguments online; it's been a while
  • made a first patch that gives PIR profiling
  • it's not a great approach
  • it gives some visibility though, and I've found a few places for optimization
  • found a ten-percent speedup in PGE in some cases
  • Tcl spends most of its time parsing and re-parsing
  • also going to go through the bug tracker again and see if we can clear out more stuff

Jesse:

  • are you still thinking about applying Warnocked Perl 6/Rakudo patches?

c:

  • unless Jerry or Patrick yell

Patrick:

  • if you reply to them, I'll take a look at them
  • I had architectural concerns about some of them
  • don't want people to cargo-cult things if we check them in
  • but I'll respond to them if you find them and bring them up
  • not all are in RT, some were just on the list

Richard:

  • spent a couple of days at EclipseCon
  • trying to get Perl 5 as a supported language within Eclipse
  • working on a spec, and then we'll shop that around
  • next week is a day trip to New York on a potential sponsorship call
  • could be significant

Will:

  • ripping out deprecated items
  • hope to get everything we've deprecated out before the next release

Jerry:

  • I'm in favor of that

Jesse:

  • mostly having conversations about making progress this week
  • lots of people are burned out
  • we're not hitting milestones that make people feel like they've been productive
  • I don't know that we have a good set of milestones in Perl 6
  • nor that we could lay out a series of good, dated milestones for Rakudo

Patrick:

  • I agree
  • but you just keep working away and more things become available to more people
  • one blocker is IO
  • keeps coming up

Jerry:

  • also exceptions

Patrick:

  • Larry made a comment somewhere that the design is waiting on the implementations to figure out what they need
  • we can go where we need to go
  • but we can change later
  • there's no point in waiting for now
  • there's a draft design for IO
  • having an implementation would help people do file IO in Rakudo
  • I hear Perl is good about that

Allison:

  • what do you need from Parrot?

Patrick:

  • how's the IO PDD?

Allison:

  • it's written but not implemented
  • the implementation date is June, I think

Patrick:

  • how about the basic stuff?
  • open a file and stuff

Allison:

  • that mostly stays the same
  • you can start using that interface now
  • I'll rip out the guts later when we start implementing the new system

Jesse:

  • are you comfortable doing an implementation against what's there today?

Patrick:

  • that's one of those areas I'd like to delegate

Jerry:

  • I worked on the IO design
  • I'm comfortable with Parrot's IO
  • I'll read up on Perl 6's IO

Patrick:

  • it'll make Rakudo more visible where people can use it and make something work
  • reading from and writing to files will help

c:

  • Haskell went a long time without that and it's pretty popular

Patrick:

  • it comes up on the channel regularly
  • we can write our own filters and stuff for test suites in Rakudo instead of Perl 5

Jerry:

  • eating our own dog food

Patrick:

  • do you have a feeling for the strings PDD delivery and implementation?

Allison:

  • due for implementation June 1
  • probably ready to roll in by mid-June

Patrick:

  • Rakudo is holding off on reading Perl 6 source as Unicode while waiting for that

Allison:

  • you can probably use some of Simon's optimization techniques in the PDD
  • he defines a new string type
  • you can use that before the full integration into Parrot
  • always gives you a fixed-width lookup
  • as far as I can tell, that's what's expensive

Patrick:

  • if I switched and translated everything over to UCS-2?
  • I don't want to implement any C code personally
  • what will exist in Parrot and when?

Jerry:

  • let's lean on Simon to get something working soon

Allison:

  • I can't guarantee that we'll have something before June 1
  • but we can start implementing the new string type right away
  • if we can get Simon to do a first draft, that'll help

Patrick:

  • I just don't want to switch to a variable-width encoding, which'll make parsing really slow

Allison:

  • if you transcode when something first comes in, you'll take a first hit but not subsequent hits
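
The trade-off discussed here can be sketched roughly as follows. This is an illustrative Python sketch, not Parrot code: the function names are hypothetical, and a plain list of integers stands in for a fixed-width string buffer. Indexing into UTF-8 means walking from the start on every lookup, while transcoding once on input makes every later lookup a direct index.

```python
def utf8_char_len(lead_byte):
    """Byte length of a UTF-8 sequence, judged from its lead byte."""
    if lead_byte < 0x80:
        return 1
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def codepoint_at_utf8(data, index):
    """Variable-width lookup: skip `index` characters from the start."""
    pos = 0
    for _ in range(index):
        pos += utf8_char_len(data[pos])
    width = utf8_char_len(data[pos])
    return ord(data[pos:pos + width].decode("utf-8"))


def transcode(data):
    """Pay the decoding cost once; the result is indexable in O(1)."""
    return [ord(ch) for ch in data.decode("utf-8")]
```

A parser that repeatedly asks for the character at position N pays the walk each time with the first function, and only once (inside `transcode`) with the second approach.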

Patrick:

  • the problem with transcoding to UCS-2 right now is that it requires ICU, and we don't have ICU on all platforms right now
  • I could potentially add those operations...
  • I did that for UTF-8

Larry:

  • you might be able to use the Perl 5 program that spits out Unicode tables into Perl 5 friendly tables
  • they turn into bitmaps in a way that you probably don't care about
  • could use that to write something based on UCS-4 or UCS-2ish integers

Patrick:

  • the UTF-8 code is directly based on those codepoints
  • we work only with codepoints at that level

Allison:

  • how much effort do you want to spend, knowing that the new string implementation is coming?

Patrick:

  • the lack of Unicode support in Rakudo prevents the French angles

Jesse:

  • how much of the Pugs test suite uses those?

Patrick:

  • they don't show up much
  • it's not a killer feature

Jesse:

  • seems like you could go a long way without it being a problem

Nicholas:

  • which codepoints do you need?

Patrick:

  • in a case-insensitive search, we fold everything to a single case
  • without ICU, when you hit a codepoint outside of Latin-1, Parrot throws an exception
  • we check for downcasing first, which is slow

Jerry:

  • or we could trap the exception

Patrick:

  • but a downcase on the French quotes is a no-op
  • I could catch it
  • but it's a bit of painful overhead to add

Nicholas:

  • with a UTF-16 implementation which matched downcase for Latin-1, would that work?
  • or do people expect to use accented characters and have them work?

Patrick:

  • short answer: yes
  • right now, Parrot downcases ASCII, checks for ICU and downcases, and throws an exception for everything else
  • one patch I have is smarter about the non-ASCII codepoint on the ICU part
  • if it's Latin-1, then we can figure out how to do it
  • that's pretty easy to downcase
  • not that many additional codepoints
  • if it's outside of that, we can throw the exception
  • that range includes the French quotes
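
The fallback order Patrick describes might look something like this. It is a Python sketch under stated assumptions rather than Parrot's actual C code: `downcase_codepoint` and the `icu_lower` callback are hypothetical names, and the Latin-1 branch follows the Unicode simple lowercase mappings for U+0080 to U+00FF.

```python
def downcase_codepoint(cp, icu_lower=None):
    """Lowercase one codepoint: ASCII, then ICU, then Latin-1, else fail.

    icu_lower is a hypothetical callback standing in for ICU; pass None
    to model a platform without ICU.
    """
    # ASCII is handled directly.
    if cp < 0x80:
        return cp + 0x20 if 0x41 <= cp <= 0x5A else cp
    # With ICU available, delegate everything non-ASCII to it.
    if icu_lower is not None:
        return icu_lower(cp)
    # Without ICU, Latin-1 is still small enough to handle by hand.
    if cp <= 0xFF:
        # U+00C0..U+00DE are uppercase letters, except U+00D7 (the
        # multiplication sign); each maps to the codepoint 0x20 higher.
        if 0xC0 <= cp <= 0xDE and cp != 0xD7:
            return cp + 0x20
        return cp  # includes U+00AB and U+00BB, the French quotes
    raise ValueError("cannot downcase U+%04X without ICU" % cp)
```

With this shape, downcasing the French quotes is a no-op that never reaches the exception path, so the caller does not need to trap anything for them.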

Allison:

  • let's see if we can get Simon to do an initial implementation

Patrick:

  • one of the milestones was documentation for PCT
  • is the PCT tutorial close to that?

Allison:

  • it needs to be in PDD form
  • Will's talking to him about that

Patrick:

  • I'm happy to work with him on that

Nicholas:

  • have you been struggling alone with those bugs, Larry?
  • have you had help from others?

Larry:

  • when you have a bug in TRE from a DFA that's too large and you try to cut it down, and it goes away when you cut it down, I worry that I'll have to solve it on my own

Nicholas:

  • AEvar wrapped it for Perl 5
  • he may be familiar with it

Larry:

  • it's down in the guts
  • in the long run, we may abandon TRE and write our own DFA
  • just a question if I can work around it right now
  • TRE might need modification anyway
  • it gives me the longest token, but it won't give me the second longest token if the first one fails
  • not sure how to backtrack into that
  • a parallel NFA might be more reasonable in that case

Nicholas:

  • don't you need all of them, in decreasing order of length?

Larry:

  • you make a list of all candidate token resolutions
  • find the longest unique
  • call that and hope it succeeds
  • if not, and if you're not ratcheting, you need to try something shorter
  • all the way down to nothing
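
The strategy Larry outlines can be sketched as below. This is a simplified Python illustration that assumes literal tokens only (no DFA, no TRE) and does not model the "unique" requirement; `match_token` and the `(literal, action)` pairs are hypothetical names.

```python
def match_token(text, pos, candidates, ratchet=False):
    """Try candidate tokens at `pos`, longest first.

    candidates is a list of (literal, action) pairs. action(text, end)
    returns a result, or None if the rest of the parse fails.
    """
    # Collect every candidate that matches at this position.
    matches = [(lit, act) for lit, act in candidates
               if text.startswith(lit, pos)]
    # Longest first; the sort is stable, so ties keep declaration order.
    matches.sort(key=lambda m: len(m[0]), reverse=True)
    for lit, act in matches:
        result = act(text, pos + len(lit))
        if result is not None:
            return lit, result
        if ratchet:
            # Ratcheting: never back off to a shorter token.
            break
    return None
```

An empty literal among the candidates plays the role of "nothing": it always matches and is tried last.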

Patrick:

  • once you know the longest one, it's a lot easier to find the shortest one
  • then you know the lookahead

Larry:

  • that assumes you have a hash to look up the shorter keys in

Patrick:

  • there's some value in knowing the longest one
  • it'd be better to have an automaton in this case

Larry:

  • I'd like to have this applicable beyond parsing and lexing
  • any regular expression-like thing can automatically do DFA-style matching to the extent that it's reasonable
  • and gracefully fall over to the other one
  • there are various ways of hacking around it that would work for a lexer
  • that's not the direction I want to go

Patrick:

  • you're after bigger game

Larry:

  • Perl originally integrated regex matching into the language
  • we've ignored DFA-style matching for so long, we're late in integrating it
  • but I think we can do it better than anyone else so far

Patrick:

  • is that a new motto for Perl 6?

Larry:

  • uh oh, another new motto!

Jerry:

  • one thing in Rakudo stops us from writing Perl 6 methods and classes in Perl 6
  • it's a bug or limitation in PCT, I think
  • when you compile Perl 6 code to PIR to create bytecode you can call as a library, it creates subs with the same name as other subs
  • the generator for the sub names starts with the same number in every file
  • you'll get _BLOCK_10 twice
  • will HLL and namespaces help that?
  • does something else in PCT need modification for that?

Patrick:

  • namespaces would do a lot for it
  • I don't have a good answer for that
  • the name generator needs a better universally-unique identifier for its names

Jerry:

  • a UUID generator would go a very long way to solving this problem with Parrot in its current state
  • anything we can steal?

Patrick:

  • it'd be nice to have a good UUID generator in Parrot itself

Will:

  • are they generated separately and then included in the same file?

Patrick:

  • when I say "universally unique" I mean "I unique"

Allison:

  • needs external tracking or something

Will:

  • sounds like we're solving the wrong problem there

Patrick:

  • if there's a way to make identifiers for subs static like in C -- file-scoped but don't leak out, we could use that
  • I don't know if the :anon flag does that

Allison:

  • that just makes sure that they don't get stored in the namespace

Patrick:

  • we're typically talking about nested closures
  • all in the same compilation unit
  • as long as the PIR compiler can make all of those linkages such that there's no runtime symbol lookup, there's no problem

Allison:

  • can we include the name of the sub the closure is in in the generated name?

Patrick:

  • that may be a short-term solution

Jerry:

  • if you put in the namespace and the name as part of the generated name, does that help?
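
The collision, and the naming workaround Allison and Jerry suggest, can be sketched like this. It is a Python illustration only: `BlockNamer` and `qualified_namer` are hypothetical stand-ins for PCT's name generator, and the exact name format is an assumption.

```python
import itertools


class BlockNamer:
    """Hypothetical per-file generator of block names."""

    def __init__(self, start=10, prefix=""):
        self._counter = itertools.count(start)
        self._prefix = prefix

    def next_name(self):
        return "_BLOCK_%s%d" % (self._prefix, next(self._counter))


def qualified_namer(namespace, sub_name, start=10):
    """Build a namer whose names include the namespace and sub name."""
    prefix = "_".join(list(namespace) + [sub_name]) + "_"
    return BlockNamer(start, prefix)
```

Two plain `BlockNamer()` instances, one per compiled file, both hand out `_BLOCK_10` first, which is the clash described. Qualifying the names avoids it only as long as no two files define the same sub in the same namespace, which fits Patrick's description of this as a short-term solution.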

Patrick:

  • does the :anon flag give us what we want?

Allison:

  • it may not be enough

Jerry:

  • we can test that easily enough
  • I'd like to see this problem solved before conference season
  • I want developers to be able to jump in and implement things in Perl 6 by then

Allison:

  • seems like a more urgent problem than French quotes
  • if it doesn't work, it's a bug in the current implementation of anonymous subs

Patrick:

  • is there a ticket for this?

Jerry:

  • if there isn't, I'll add one