For those who might not be aware, I got made redundant on 31st March (the day after the QA Hackathon had finished). Thankfully, I start a new job next week, so I've managed to land on my feet. However, this has meant that I've ended up having the whole of April off to do stuff. My plan was to work on some of the Open Source projects that I'm involved with, to move them further along to where I wanted them to be. As it turned out, two specific projects got my attention over the last 4 weeks, and I thought it worth giving a summary of what has been going on.
YAPC Conference Surveys
Since 2006, I've been running the conference surveys for YAPC::Europe. The results have been quite interesting and hopefully have helped organisers improve the conferences each year. For 2009 I had already planned to run the survey for YAPC::Europe in Lisbon, but this year will also see YAPC::NA in Pittsburgh having a survey of their own.
The survey site for Copenhagen in 2008 added the ability to give feedback on Master Classes and talks. The Master Classes feedback was a little more involved, as I was able to get the attendee list, but the talks feedback was quite brief. As such, I wanted to try and expand on this aspect and generally improve the process of running the surveys. Part of this involved contacting Eric and BooK to see if ACT had an API I could use to automate some of the information. I was delighted to get an email back from Eric, who very quickly incorporated an API that I could use to retrieve the necessary data to keep the survey site for a particular conference up to date, even during the conference.
With the API and updates done, it was time to focus on expanding the surveys and skinning the websites to match that of the now live conference sites. The latter was relatively easy, and only required a few minor edits to the CSS to get them to work with the survey site. The survey site now has 3 types of survey available, though only 2 are visible to anyone not taking a Master Class. Those who have taken one of the YAPC::Europe surveys will be aware I don't use logins, but a key code to access the survey. This has been extended so that it can now be used to access your portion of the survey website. The key code can now be automatically emailed to attendees before the conference, or during it if they pay on the door, and will allow everyone to give feedback on talks during the conference. On the last day of the conference the main survey will be put live, so you can then answer questions relating to your conference experience.
I'm hoping the slight change won't be too confusing, and that we'll see even greater returns for the main survey. Once it does go live, I'd be delighted to receive feedback on the survey site, so I can improve it for the future.
CPAN Testers Reports
Since taking over the CPAN Testers Reports site in June 2008, I have spent a great deal of time improving its usability for users. However, it's come at a price. By using more and more Javascript to dynamically change the contents of the core pages, it's meant that I have received a number of complaints that the site doesn't work for those with Javascript disabled or who use a browser that doesn't implement Javascript. For this reason I had decided that I should create a dynamic site and a static site. The problem with this is that the current system to create all the files takes several hours for each set of updates (currently about 16 hours per day). I needed a way to drive the site without worrying about how long everything was taking, but also add some form of prioritisation so that the more frequently requested pages would get updated more quickly than those rarely seen.
During April, JJ and I went along to the Milton Keynes Perl Mongers technical meeting. One of the talks was about memcached and it got me thinking as to whether I could use it for the Reports site. Discussing this with JJ on the way home, we threw a few ideas around and settled on a queuing system to decide what needed updating, and on better managing the current databases by adding indexes to speed up some of the complex lookups. I was still planning to use caching, but as it turned out memcached wasn't really the right way forward.
The problem with caching is that when there is too much stuff in the cache, the older stuff gets dumped. But what if the oldest item to get dumped is extremely costly on the database, and although it might not get hit very often, it's frequent enough to be worth keeping in the cache permanently? It's possible this could be engineered with memcached if this was for a handful of pages, but for the Reports site it's true for quite a few pages. So I hit on a slightly different concept of caching. As the backend builder process is creating all these static files, part of the process involves grabbing the necessary data to display the basic page, with the reports then being read in via the now static Javascript file for that page. Before dropping all the information and going on to the next in the list, the backend can simply write the data to the database. The dynamic site can then simply grab that data and display the page pretty quickly, saving a lot of database lookups. Add to that the fact that the database tables have been made more accessible to each other, and the connection overhead has also been reduced considerably.
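To make the idea a little more concrete, here is a minimal sketch of a cache that never evicts: the builder writes each page's summary into a database table, and the dynamic site reads it back with a single lookup. This is only an illustration in Python using SQLite. The table name, column names and function names are all made up for the example, and the real site's schema is not shown here.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the hypothetical page cache table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS page_cache ("
        " page TEXT PRIMARY KEY,"
        " latest_report_id INTEGER NOT NULL,"
        " summary TEXT NOT NULL)"
    )
    return db


def store_summary(db, page, latest_report_id, summary):
    """Builder side: keep the data already gathered for a page.

    Rows are only ever replaced by a newer build of the same page,
    so nothing is dropped just because the cache is 'full'.
    """
    db.execute(
        "INSERT OR REPLACE INTO page_cache (page, latest_report_id, summary)"
        " VALUES (?, ?, ?)",
        (page, latest_report_id, json.dumps(summary)),
    )
    db.commit()


def load_summary(db, page):
    """Dynamic site side: one cheap lookup instead of many costly queries."""
    row = db.execute(
        "SELECT latest_report_id, summary FROM page_cache WHERE page = ?",
        (page,),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])
```

The trade-off compared with memcached is that the data lives on disk in the database rather than in memory, but an expensive page stays cached until the builder next rebuilds it, however rarely it is requested.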
The queuing system I've implemented is extremely simple. On grabbing the data from the cache, the dynamic site checks quickly to see if there is a more recent report in existence. If there is, then an entry is added to the queue, with a high weighting to indicate that a website user is actually interested in that data. Behind the scenes the regular update system simply adds an entry in the queue to indicate that a new entry is available, but at a low weighting. The backend builder process then looks for the entries with the most and highest weightings and builds all the static files, both for the dynamic site and the static site, including all the RSS, YAML and JSON files. It seems to work well on the test system, but the live site will be where it really gets put through its paces.
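A weighted queue along these lines could be sketched as follows. Again this is an illustrative Python and SQLite example rather than the site's actual code. The weight values, the table layout and the function names are assumptions, and "most and highest weightings" is interpreted here as the largest total weight per page, with the number of queued entries as a tie-breaker.

```python
import sqlite3

# Hypothetical weights: the text only says "high" and "low".
HIGH_WEIGHT = 10  # a website visitor asked for a page that is out of date
LOW_WEIGHT = 1    # the regular update system saw a new report


def open_queue(path=":memory:"):
    """Open (or create) the hypothetical build queue table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS build_queue ("
        " page TEXT NOT NULL,"
        " weight INTEGER NOT NULL)"
    )
    return db


def enqueue(db, page, weight):
    """Add one request to rebuild a page."""
    db.execute(
        "INSERT INTO build_queue (page, weight) VALUES (?, ?)", (page, weight)
    )
    db.commit()


def on_page_view(db, page, cached_report_id, latest_report_id):
    """Dynamic site: queue a high-weight rebuild if the cache is stale."""
    if latest_report_id > cached_report_id:
        enqueue(db, page, HIGH_WEIGHT)
        return True
    return False


def next_page(db):
    """Builder: take the page with the largest total weight, then most entries."""
    row = db.execute(
        "SELECT page FROM build_queue GROUP BY page"
        " ORDER BY SUM(weight) DESC, COUNT(*) DESC, page LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # One build satisfies every queued request for that page.
    db.execute("DELETE FROM build_queue WHERE page = ?", (row[0],))
    db.commit()
    return row[0]
```

With this arrangement a page that visitors are actively looking at jumps ahead of pages that merely have new reports, while rarely viewed pages still get rebuilt eventually through their low-weight entries.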
So you could be forgiven for thinking that's it, the new site is ready to go. Well, not quite. Another part of the plan had always been to redesign the website. Leon had designed the site based on the YUI layouts, and while it works for the most part, there are some pages which don't fit well in that style. It has also been pretty much the same kind of style since it was first launched, and I had been feeling for a while that it needed a lick of paint. Following Adam's recent blog post about the state of Perl websites, I decided that after the functional changes, the site would get a redesign. It's perhaps not as revolutionary as some would want, judging from some of the ideas for skins I've seen, but then the site just needs to look professional, not state of the art. I think I've managed that.
The work to fit all the pieces together and ensure all the templates are correct is still ongoing, but I'm hopeful that at some point during May, I'll be able to launch the new look websites on the world.
So that's what I've been up to. I had hoped to work on Maisha, my other CPAN distributions, the YAPC Conference Survey data, the videos from the QA Hackathon among several other things, but alas I've not been able to stop time. These two projects perhaps have the highest importance to the Perl community, so I'm glad I've been able to get on with them and get done what I have. It's unlikely I'll have this kind of time again to concentrate solely on Open Source/Perl for several years, which in some respects is a shame, as it would be so nice to be paid to do this as a day job :) So for now, sit tight, it's coming soon...