Perl for Geoscience
Traditionally our Perl courses have been targeted at people who work with computers as a job: developers, software QA, sysadmins, database admins (DBAs), and the like. However, recently I've noticed a sharp increase in course bookings from people who don't have computers as their primary job. In particular, we've been seeing a lot of bookings from people who work in geoscience, especially hydrology. These bookings aren't just from within Australia; I'm seeing international interest as well.
It seems that Perl has gained a reputation as being a good cross-platform language for when you have a big hunk of data in one format, and you need to get that data into another format. Geoscience folks deal with lots of data, and they seem to be perpetually massaging it (often by hand, or in a spreadsheet) to get it into the format they need.
Perl is certainly well-suited for this task, and people learning Perl for the purposes of data transformation have been a mainstay on our courses since we first started running them. However, I've found our new wave of students particularly challenging to teach. They don't have any programming background, so I can't draw analogies to other languages, or even assume they'd know why subroutines are a good idea. They may have never used a command line before, so the concept of providing command-line arguments is completely foreign. Even the way they think about data is different. For many students in my last class, the idea of hierarchical data (such as a tree) was completely new; in their world, data had always been tabular in nature.
I'm thrilled that I'm able to teach people new skills, and I'm sure some of the students from my classes will be going back to their workplaces and replacing some of their colleagues with very small Perl scripts. However, I'm deeply worried that their understanding will be too incomplete. I teach a very modern Perl course, with a lot of focus on best practices, maintainability, validation, and theory. While I know some of my students can grasp this a little, others are still struggling with consistent indentation, and can't grok the long-term concepts at all. I fear they'll under-use the CPAN, reinvent the wheel, copy-and-paste from bad examples, fail to use revision control, fail to document their code, pile on technical debt, and do all the other things that inexperienced programmers do.
I don't know what to do about this. For this group of people, I think I'd get great feedback (and more bookings and money!) if I taught a class that focused on short-cuts and quick'n'dirty programming, since my students can grasp those short-term gains; they can't as easily grasp the long-term ones of maintainability, testing, source control, and correctness. One could argue those long-term goals aren't important for my new clients, since they're writing code for "once-off" tasks, but as most of us know, there's an awful lot of once-off code that's still being used decades after it was written. I feel the concepts of code quality and maintainability are more important for inexperienced programmers than anyone else, since they're the ones most likely to make these mistakes. I'd rather not teach at all than teach bad practices.
I think we'll probably extend our most popular course to five days, and slow down the material; I've got enough cool bonus material to fill an extra day if we have a class of more computer-oriented students. I've also found myself entirely dropping some of the more foreign concepts, like pipes, buffering, file locking, and running external commands; students can look these up if they ever need them.
What I really wish I could do is sit down with my new classes for a week and teach them basic programming and computer science concepts, preferably without computers getting involved at all.
Very interesting post, PJF. Just wondering how many of your scientific students have an interest in PDL. I would have thought PDL would be useful to geoscience folks, just as it is to other scientists, notably astronomers. Or is it that most of your students are Perl beginners, not yet ready for PDL?