The Database That Ate the Census

Alias on 2009-11-12T03:03:19

http://ali.as/census

Following up on my previous post, and on some concerns raised about the legality of using the Google Maps geocoder, I've managed to find a way to qualify for a free unlimited subscription to Microsoft Bing Geocoder, formerly Microsoft MSN Geocoder.

So if we do hit problems with Google, I have a workable fallback option.

Moving onwards, my biggest problem with the http://geo2gov.com.au/ service has been that I link to census keys which aren't particularly usable for most people.

The Census provides a huge range of information for these keys, but behind layers and layers of web interface. This is fine for basic situations, but what if I want to go watch the Lithuania vs Argentina World Cup match to cheer against Argentina? How do I search for pubs and clubs located in areas or enclaves with high numbers of Lithuanian immigrants, so I can maximise my chance of bumping into others to cheer with?

(This is, of course, highly notional)

Fortunately, at many university libraries you can find copies of this ABS online information compiled into a single file tree. This, of course, still isn't particularly useful, because it isn't in easy-to-use file formats.

So I've fixed that problem with The Database That Ate the Census.

This is a single file download, in either PostgreSQL backup or SQLite formats, of the entire Australian 2006 Census data. Or more specifically, the CD (Collection District) layer of the BCP (Basic Community Profile) subset of the census compilation.
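As a rough sketch of the kind of question this makes answerable (the Lithuanian example above), here is how you might rank Collection Districts by a single count column in the SQLite version. The table and column names in the usage note are invented for illustration, not the real schema, so run list_tables() against the actual download first to see what is there.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def top_districts(conn, table, key_column, count_column, limit=10):
    """Return (key, count) pairs for the rows with the highest counts.

    Table and column names cannot be bound as SQL parameters, so they
    are checked against the database's own schema before being placed
    into the query text.
    """
    if table not in list_tables(conn):
        raise ValueError("unknown table: %s" % table)
    # PRAGMA table_info yields one row per column; index 1 is the name.
    columns = [row[1] for row in conn.execute('PRAGMA table_info("%s")' % table)]
    for name in (key_column, count_column):
        if name not in columns:
            raise ValueError("unknown column: %s" % name)
    query = 'SELECT "%s", "%s" FROM "%s" ORDER BY "%s" DESC LIMIT ?' % (
        key_column, count_column, table, count_column)
    return conn.execute(query, (limit,)).fetchall()
```

With a hypothetical table bcp_birthplace holding a cd_code key and a born_lithuania count, the call would look like top_districts(conn, "bcp_birthplace", "cd_code", "born_lithuania"). Matching the resulting districts to pubs and clubs is a separate geocoding step.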

The timing makes it too late to be used in the Mashup Australia competition, but just in time for the apps4nsw competition, which was launched today.


Corrupt sqlite version?

pjm on 2009-11-13T04:56:18

Hi Adam,

very nice idea, but I think the sqlite version is truncated/corrupted. I've downloaded twice and obtained a 165.3MB file both times (rather than the advertised 174MB).

sqlite3 produces "Error: database disk image is malformed" when it tries to open the unzipped db.

The md5 sum of my downloaded version (census-20091111.sqlite) is

cb3b4f03ffab095df265f4c9dc6ad6c5

if that's any help.
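
For anyone wanting to repeat the check on their own download, a minimal sketch in Python follows. The helper names are made up for this example; it only uses the standard hashlib and sqlite3 modules, and it assumes the file has already been unzipped.

```python
import hashlib
import sqlite3


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sqlite_is_intact(path):
    """Return True only if SQLite's own integrity check reports 'ok'.

    A damaged file can fail in two ways: the PRAGMA raises
    DatabaseError, or it returns rows describing the problems found.
    Note that sqlite3.connect() creates the file if it does not exist.
    """
    try:
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        return False
    return rows == [("ok",)]
```

Comparing md5_of_file("census-20091111.sqlite") against a checksum published alongside the download would show whether the transfer or the upload is at fault.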

Cheers,
Paul
   

geostuff

duckysherwood on 2009-11-13T05:14:20

Adam -- Thanks for your thoughtful comments on my mashup. Note that http://ali.as/contact.html gives a 404, alas, so I am responding here in case you don't go back and read the followup comments on the mashup site... we actually DID a lot of statistical analysis, but it got lost in an update. It's on the site now, enjoy, have fun. (And there is a bias. It isn't subtle. However, we don't know how monies are getting spent in the REST of the stim programs.)

Looks like you are doing cool stuff here -- I will have to peruse your site more carefully after I get some sleep!