I haven't committed much to the GitHub repository for this project since my last update, because a lot of what I've been doing is fairly exploratory. Essentially I've been working through the original report with a view to replicating the findings of the original survey, and trying to segment out some more interesting analyses.
Unfortunately this has proven to be a pain. I could sit down and munge the data some more, but I don't think that would be a good use of my time. It's much better to make sure that the next set of collected data is clean and easy to use from the word go.
Don't get me wrong, there's a lot of useful stuff here, but there's also a lot of detailed unpicking of the data structure to do to ensure consistency for cross-tabulation and other analysis. I don't think the benefits of doing this would exceed the costs, particularly because the 2007 dataset is getting a bit old now. However, I can still use the 2007 data to make sure the 2009 survey is structured in a way that is easily sliced and diced.
So my next steps are as follows: