The official Postgres documentation covers log shipping (aka WAL Archiving, aka warm standby) in some depth. But it's still quite a lot to absorb. The basic idea is to give you a hot spare to fail over to if your main database server crashes. Here is a step by step guide on how to set this up on the command line in Ubuntu (10.04). I assume you have two identical servers (vm clone?), a master and a slave.
First, you need to setup a directory on the slave which will hold the WAL archives (logs). Unless otherwise notes, all commands are run as root.
# on the slave, must be outside main b/c we are going to blow that away mkdir /var/lib/postgresql/8.4/pg_wal chown postgres:postgres /var/lib/postgresql/8.4/pg_wal
We're going to use ssh/scp to copy WAL files from the master to the slave(10.177.1.247), so you need to exchange ssh keys for the postgres users on each end.
# on master, share postgres credentials with slave sudo su postgres ssh-keygen -t rsa # press enter to all prompts (don't set explicit password) ssh email@example.com mkdir -p ~/.ssh cat ~/.ssh/id_rsa.pub | ssh firstname.lastname@example.org 'cat >> ~/.ssh/authorized_keys' # test the ssh connection ssh email@example.com # should not prompt for a password exit # exit new ssh session exit # exit sudo as postgres, you're now root again # setup postgres WAL archiving to copy to slave vim /etc/postgresql/8.4/main/postgresql.conf # make the following edits archive_mode = on archive_command = 'scp %p firstname.lastname@example.org:/var/lib/postgresql/8.4/pg_wal/%f </dev/null' archive_timeout = 600 # 10 minutes # now, outside of vim, restart postgres /etc/init.d/postgresql restart
If you wait 10 minutes, you should see a new file in /var/lib/postgresql/8.4/pg_wal on the slave. Assuming that's working, you're ready to create the initial snapshot.
I was confused by the documentation on this point. Basically, you want to issue a pg_start_backup SQL command on the master, then copy the entire /var/lib/postgresql/8.4/main directory to the slave. The pg_start_backup ensures that the physical files will be in a consistent state. You can perform this copy while master is running.
# on master sudo -u postgres psql mydbname SELECT pg_start_backup('initial'); # may take a few minutes, wait then control-D to exit # zip up the main data directory, and copy it to the slave cd /var/lib/postgresql/8.4 zip -r main.zip main scp main.zip email@example.com:/var/lib/postgresql/8.4 sudo -u postgres psql mydbname SELECT pg_stop_backup();
Now the initial snapshot is on the slave, but we need to actually restore those files as the new primary files on the slave. At the same time, we can restart the slave in recovery mode, using pg_standby. Recovery is enabled by creating a .conf file right in main.
# on the SLAVE apt-get install postgresql-contrib-8.4 # installs the pg_standby tool /etc/init.d/postgresql-8.4 stop cd /var/lib/postgresql/8.4 mv main main.bak unzip main.zip rm -f main/pg_xlog/* # these are from master, and are not needed # create the recovery config file vim /var/lib/postgresql/8.4/main/recovery.conf # add the following, and save restore_command = '/usr/lib/postgresql/8.4/bin/pg_standby -d -t /tmp/pgsql.trigger.5442 /var/lib/postgresql/8.4/pg_wal %f %p %r 2>>/var/log/postgresql/standby.log' recovery_end_command = 'rm -f /tmp/pgsql.trigger.5442' # make sure the new main (as well as recovery.conf) have the correct ownership chown postgres:postgres main -R # restart, and keep and eye on the new standby.log file to see if it's working /etc/init.d/postgresql-8.4 restart tail -f /var/log/postgresql/standby.log -n 100
You should see log lines like the following.
Trigger file : /tmp/pgsql.trigger.5442 Waiting for WAL file : 0000000100000006000000AE.00000020.backup WAL file path : /var/lib/postgresql/8.4/pg_wal/0000000100000006000000AE.00000020.backup Restoring to : pg_xlog/RECOVERYHISTORY Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgresql/8.4/pg_wal/0000000100000006000000AE.00000020.backup" "pg_xlog/RECOVERYHISTORY" Keep archive history : 000000000000000000000000 and later running restore : OK Trigger file : /tmp/pgsql.trigger.5442 Waiting for WAL file : 0000000100000006000000AE WAL file path : /var/lib/postgresql/8.4/pg_wal/0000000100000006000000AE Restoring to : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgresql/8.4/pg_wal/0000000100000006000000AE" "pg_xlog/RECOVERYXLOG" Keep archive history : 000000000000000000000000 and later running restore : OK Trigger file : /tmp/pgsql.trigger.5442 Waiting for WAL file : 0000000100000006000000AF WAL file path : /var/lib/postgresql/8.4/pg_wal/0000000100000006000000AF Restoring to : pg_xlog/RECOVERYXLOG Sleep interval : 5 seconds Max wait interval : 0 forever Command for restore : cp "/var/lib/postgresql/8.4/pg_wal/0000000100000006000000AF" "pg_xlog/RECOVERYXLOG" Keep archive history : 0000000100000006000000AE and later WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file... WAL file not present yet. Checking for trigger file...