![]() |
Since version 9.0, PostgreSQL has supported
various forms of replication and as of 9.2 a complete
hot-standby system can be configured. This guide covers a few basic recipes on
PostgreSQL 9.2 using lcfg-postgresql
4.0 or newer. It was written
for DICE SL6 where the postgresql-9.2-server.h
header is
assumed.
This guide presently covers streaming replication but without log shipping. It is assumed that a consistent backup will be made available using the streaming replication mechanism. However, log shipping (with or without a base backup from which to begin) is supported by the postgresql component as of 4.0, and there should be no obstacle to its configuration in LCFG.
Full configuration instructons follow. But in practice, those wishing to replicate may simply wish to use the pre-configured replication headers, thus:
#include <dice/options/postgresql-replication-master.h> PSQL_REPLICATION_STANDBY(standby1,<standby 1 ip>) PSQL_REPLICATION_STANDBY(standby2,<standby 2 ip>) PG_OPTION(max_wal_senders,3) /* no. replias + 1 */
#define PSQL_MASTER_HOSTNAME <master>.inf.ed.ac.uk #define PSQL_REPLICATION_INIT #define PSQL_REPLICATION_STREAMING #include <dice/options/postgresql-replication-standby.h>
Restart postgresql component on all machines (master first) to activate. You will need to destroy the database (by physically removing the dbdata
directory.
Configuring the master server requires just two main changes; adding and authorising a user account and configuring replication itself.
This is straightforward and can additionally make use of the
pgluser
component to automate the addition and removal of a
PostgreSQL role with appropriate permissions. For example:
/* Assuming a pgluser "user" capability looking something like the following... */ pgluser.select_user SELECT rolname FROM pg_roles WHERE rolinherit; pgluser.create_user CREATE ROLE "%(role)s"; ALTER ROLE "%(role)s" LOGIN; pgluser.drop_user REASSIGN OWNED BY "%(role)s" TO postgres; DROP ROLE IF EXISTS "%(role)s"; [...] /* Users can be authorised for replication with a configuration like this... */ !pgluser.caps mADD(repl) pgluser.implies_repl user pgluser.comment_repl Grants replication privileges to a user. pgluser.select_repl SELECT rolname FROM pg_roles WHERE rolreplication; pgluser.create_repl ALTER ROLE "%(role)s" REPLICATION; pgluser.drop_repl ALTER ROLE "%(role)s" NOREPLICATION;
Using the above configuration, creating a new user on the database with replication permission is as simple as:
!pgluser.users_repl mADD(replicator)
This user will course require connection privileges, and more often than not
this will be from a keytab-based server-to-server connection. So, assuming a
replication server with name standby.inf.ed.ac.uk
:
!postgresql.hostssl mADD(standby1) postgresql.ssldatabase_standby1 replication postgresql.ssluser_standby1 replicator postgresql.ssladdress_standby1 129.215.x.y postgresql.sslnetmask_standby1 255.255.255.255 postgresql.sslauthtype_standby1 krb5 postgresql.sslauthopt_standby1 map=repl !postgresql.ident mADD(standby1) postgresql.idmap_standby1 repl postgresql.sysname_standby1 pgstandby/standby.inf.ed.ac.uk postgresql.pgname_standby1 replicator
This grants "pgstandby" (likely a principal owned by the postgres user on the standby) full replication capabilities. Normally you'd expect a macro to perform most of the latter configuration in one step.
How best to configure replication depends on your circumstances and it is best to read the copious documentation the Postgresql website before deciding on a course of action. However it can be as simple as:
!postgresql.options mADD(max_wal_senders wal_level) /* How many simultaneous connections to allow. * You will need one spare if you want to use basebackup -X stream */ postgresql.option_max_wal_senders 2 /* Writes sufficient data into WAL segments to allow hot standby service */ postgresql.option_wal_level 'hot_standby'
If you'd also like to make replication synchronous by default (this can always be altered on a per-command basis):
/* Set to the "appname" of a connecting server if you wish to set a preference, * otherwise "*" to make the first connecting standby a synchronous one. */ !postgresql.options mADD(synchronous_standby_names) postgresql.synchronous_standby_names 'standby1'
Noting in the configuration of the master the requirent of Kerberos to connect, we set up a job to renew connectiong credentials.
!kerberos.keys mADD(pgstandby) kerberos.keytab_pgstandby /etc/pgstandby.keytab kerberos.keytabuid_pgstandby <%postgresql.pgowner%> kerberos.keytabgid_pgstandby <%postgresql.pggroup%> postgresql.krb5ccname /path/to/pgstandby.k5cache !cron.additions mADD(pgstandby) cron.add_pgstandby @reboot /usr/bin/k5start -b -L -f <%kerberos.keytab_pgstandby%> -U \ -g <%postgresql.pggroup%> -o <%postgresql.pgowner%> \ -K 10 -k <%postgresql.krb5ccname%> -p /var/run/k5start.pgstandby.pid
Remember that standby servers should have as similar a configuration as
possible to the master. Mismatches in options (even where unrelated to data,
such as max_connections
) will cause the standby to fail to
start. Beyond any regular postgresql configuration, you will require
the following.
!postgresql.recopts mADD(standby_mode primary_conninfo)
postgresql.recopt_standby_mode 'on'
postgresql.recopt_primary_conninfo 'host=master.inf.ed.ac.uk
user=replicator'
If you wish to name the server specially, for example to mark it out for
synchronous replication, simply set the application_name
parameter
in the primary_conninfo
resource above.
Adding the above invokes the creation of the recovery.conf file, which marks the server as being a recovery server. On top of the above we can now configure additional features:
/* Enable hot standby, allowing us serve queries */ !postgresql.options mADD(hot_standby) postgresql.option_hot_standby 'on' /* Specify a trigger file whose presence should cause streaming replication to fail over */ !postgresql.recopts mADD(trigger_file) postgresql.recopt_trigger_file '/path/to/trigger/file' /** * For log shipping, you will also need... * !postgresql.recopts mADD(restore_command archive_cleanup_command) * postgresql.recopt_restore_command /* see PostgreSQL docs */ * postresql.recopt_archive_cleanup_command /* see PostgreSQL docs */ */
Last but not least we can alter the default initialisation mode of the component. This option allows us to avoid a manual reload on first connection to the master server.
postgresql.initmode basebackup postgresql.basebkargs -X stream -h standby.inf.ed.ac.uk -U replicator
Replication can now be started. The changes above require the master server to be fully restarted:
master$ om postgresql restart
And on the standby server. However, if the standby server has a data directory already, it should be moved aside so that the component can start afresh with the basebackup command.
standby$ om postgresql stop # if required [OK] postgresql: stop standby$ qxprof postgresql.pgdata pgdata=/disk/data/db standby$ mv /disk/data/db{,_old} standby$ om postgresql start [INFO] postgresql: Initializing database (from basebackup)...
Replication failure is easy to spot but not presently automated. On the the master server, you should see all connected replicating servers:
=> SELECT application_name, usename, state, sync_state FROM pg_stat_replication; application_name | usename | state | sync_state ------------------+---------+-----------+------------ host_name | repl | streaming | async (1 row)
Absence of a row matching the appropriate standby hostname (in the 'application_name
' column) indicates loss of a standby. On the standby itself, expect to see errors logged, but note that failure to synchronise will not otherwise disable the server. Most commonly with a streaming-only server, this would be because the master is too far ahead of the standby:
FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000000D00000021 has already been removed
The simplest solution is of course just to stop the postgresql component on the standby server, move the old data aside and start the component again to restore from a fresh basebackup.
Please contact us with any
comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh |
![]() |