PostgreSQL Replication

grantm on 2003-07-24T09:08:18

A couple of weeks ago I mentioned that I was looking into PostgreSQL replication. I'm on to other things right now but thought I'd report back on what I found.

I began by looking at Postgres-R. It's a promising looking project which is aiming for syncronous multi-master replication. That effectively means that you can distribute your database over a cluster of servers (potentially geographically dispersed) and fire your queries at any servers - any updates will be applied to all servers. The underlying theory is described in a research paper here and it is based on a messaging layer (using Spread) that guarantees delivery and 'total order' (rather than the more traditional but slower two-phase commit).

As I said, this project looks promising and is progressing (I think the plan is to have it integrated into the PostgreSQL core in 7.5 or later), but it's not ready for prime time yet. The current implementation is master-slave with auto failover in the event of the master dying. Unfortunately there are currently no tools for recovering a node that has been offline. I had numerous segfaults while I got the config sorted out and even when that was right I was never able to shut the master down cleanly.

I've since turned my attention to RServ. This is a collection of scripts and a small C library that ships with PostgreSQL in the contrib directory. It supports Master-Slave replication using triggers on the master to identify records which have changed. Then as often as you require, you can take a snapshot of those updates from the master and apply them to a slave (or store them in a file and apply them later). It's obviously not as flash as syncronous online updates but it has two significant advatanges. Number one: it seems to work now. Number two: it's simple. The small bit of C code defines functions that get called from the triggers to log updates. The remainder of the code is emminently readable Perl so it's pretty easy to get to grips with it and customise it for you requirements.

The current question I'm grappling with is the communication with the application servers - how do we decide we need to promote one of the slaves into master mode and how do the application servers know which database to talk to.