Moving Stats

barbie on 2007-12-21T14:58:15

I was watching 'top' last night while the cpanstats scripts were running. The main stats collection scripts take just under 30 minutes, but the checking script took nearly 7 hours! (I wasn't watching for all 7 hours, I might add ;)) Watching 'top' was useful, as I noted that the checking script was using 80-90% of CPU and lots of swap, which explains why it takes so long to run. The machine is my old server, now used as my mail and backup server, and it only has 192MB RAM and 500MB swap, so all things considered it wasn't doing too badly. Although I've refactored the scripts a little, there is still going to be a large amount of data resident in memory.

Seeing as the Birmingham.pm server isn't getting swamped, I've decided to move the stats computation closer to the website. That server has 289MB RAM and 255MB swap, so although my old server has more memory available overall (RAM plus swap), I'm hoping the extra RAM will be enough to avoid hitting swap so often. I now have the weekend to watch what happens and make sure the DB doesn't get corrupted. My old server is now on standby, so if the move doesn't work I only need to restart cron there :)

There is still more refactoring to be done, but hopefully the current changes will get me closer to updating daily. The parsing scripts are a lot stricter now: anything that fails any part of the parsing gets dropped and must be reviewed manually. So far only two items have required my help in the last month, which, compared to the tens of thousands being posted each month, is a small price to pay.
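
As a rough illustration of that "parse strictly or set aside for review" approach, here is a minimal sketch. It is written in Python purely for brevity and is not the actual cpanstats code. The subject-line format, the regular expression and the name `parse_reports` are all assumptions made for the example.

```python
import re

# Assumed (hypothetical) subject format:
#   "<GRADE> <Dist-Name>-<version> <platform> <osversion>"
SUBJECT = re.compile(
    r"^(?P<grade>PASS|FAIL|NA|UNKNOWN)\s+"
    r"(?P<dist>[A-Za-z][\w-]*?)-(?P<version>v?\d[\w.]*)\s+"
    r"(?P<platform>\S+)\s+(?P<osvers>\S+)$"
)

def parse_reports(subjects):
    """Split subject lines into fully parsed records and rejects.

    A line is accepted only if every field matches; anything else is
    kept verbatim in the reject list for manual review.
    """
    accepted, rejected = [], []
    for line in subjects:
        match = SUBJECT.match(line.strip())
        if match:
            accepted.append(match.groupdict())
        else:
            rejected.append(line)
    return accepted, rejected
```

The point of the design is that nothing is guessed: a partial match never reaches the stats, it just lands in the review pile.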

I'm hoping to get some more time to work on CPAN::YACSmoke over Christmas, as I want to refactor it and bring in more of the report metrics that CPAN::Reporter uses. However, the main aim is to refactor the code so that CPANPLUS or CPAN/CPAN::Reporter can be used under the hood. I also want to make it simple to plug in the future releases of CPAN::Testers modules that are planned. If changing the transport mechanism becomes a fairly painless process, then we're more likely to get more weight behind moving, and the testers themselves are likely to move over fairly quickly. Expect more news in the coming year.