White dot for spacing only
The Dice Project


upgrade.html,v 1.59 2019/11/08 22:13:06 gdutton Exp

Upgrading a DICE PostgreSQL Server

Getting started

Since version 9.0, PostgreSQL has featured the pg_upgrade tool to automate the process of transferring data to a new cluster data version. This guide covers its use on DICE SL7 servers, but is transferrable to any modern LCFG environment.

It's worthwhile checking the PostgreSQL Release Notes for each version involved in the upgrade. It is often also worth reading the migration notes for each major version between old and new. The pg_upgrade tool and source or target server versions could have unique limitations or provisos which are unlikely to be obvious without reading the notes.

Procedure

These steps, or some equivalent thereof, must be followed in strict order.

  1. Now might be the time to take a full backup, using:
    om postgresql backup preupgrade
    where preupgrade is just a suggested name for the backup .sql.gz file.
  2. If you have some time to wait whilst a backup takes place, you could add headers for both old and new software versions side-by-side, to make sure everything is installed and ready. After any service-specific headers which themselves include postgresql, add:
    #include <dice/options/postgresql-oldver.h> /* old version */
    #include <dice/options/postgresql-newver.h> /* new version */
  3. At this point you can run the install:
    $ om updaterpms run
  4. Once backup has completed, As yourself, stop the existing database service (and component):
    $ om postgresql stop
    (Check that no postgres processes are still running with $ ps -ef | grep postgres).
  5. In the server's profile, you should modify the two versioned includes which you added in step (2). After any service-specific headers which themselves include postgresql, add or amend:
    #include <dice/options/postgresql-oldver.h> /* old version */
    #include <dice/options/postgresql-newver-server.h> /* new version */
  6. You should also identify a location for the new data directory; for example, assuming postgresql.pgdata has the value /disk/scratch/db I would recommend something like /disk/scratch/dbNN where NN is the new version number. Add this to the machine profile, also:
    !postgresql.pgdata mSET(/path/to/new/cluster) /* e.g. /disk/scratch/dbNN */
  7. Wait for the client profile changes to propagate.
  8. $ om updaterpms run (verifying that the new postgresql packages are installed, and starting any new components (other than postgresql) such as rmirrorclient)
  9. $ om postgresql start (you will see initdb being run on a new cluster).
  10. $ om postgresql stop
  11. Then become whatever user postgresql runs as (usually postgres, as given by qxprof postgresql.pgowner), and...
  12. postgres$ mv /path/to/old/cluster /path/to/old/cluster.old
    (this is purely for safety if you changed cluster location in step (5) but mandatory otherwise.) You might need to perform this as root depending on the permissions set on the cluster directories.
  13. Perform the automated upgrade. You will need to determine the paths to the old and new binaries; typically this is /usr/pgsql-NN/bin and you can get a hint by checking for the locations of utilities as defined by qxprof postgresql | grep bin:
    postgres$ cd /tmp/ (or some other user-writable directory)
  14. postgres$ /path/to/new/binaries/pg_upgrade \
    -d /path/to/old/cluster.old -D /path/to/new/cluster \
    -b /path/to/old/binaries -B /path/to/new/binaries
  15. Wait, potentially several minutes, for the upgrade to complete. If you receive any warnings check the pg_upgrade and wider PostgreSQL documentation for advice.

The database cluster has been upgraded. If the database is to be returned to its previous location, now is the time to do it:

  1. postgres$ mv /path/to/new/cluster /path/to/old/cluster
  2. In the machine profile, remove the relocation line:
    !postgresql.pgdata mSET(/path/to/new/cluster)

Don't forget to:

  1. $ om postgresql start to start the database server in its new version.
  2. Run the analyze_new_cluster.sh script created by the upgrade script, as the postgres user.
  3. Update any service headers to include the new postgresql-newver-server.h header.
  4. Remove the explicit postgresql-xxx headers from the machine profile once they are no longer required.

Optional optimisations:

Seamless cross-host upgrades

The following procedure allows you to upgrade a server and its data from an older version of PostgreSQL on one host, to a newer version on another host, with less than a minute's downtime (and typically a half-hour window of read-only access to data whilst clients' DNS caches update). The procedure works without the upgrade step, of course, but it's particularly useful during upgrade.

PostgreSQL's data structures are particularly amenable to incremental transfer using rsync, which allows you to perform an initial, slow "dirty" copy of a running database, followed by a near-instantaneous final transfer of data with only the briefest of downtime. So, on the running database server oldhost:
postgres$ rsync -av --delete /disk/data/db/ <you>@newdb:/disk/data/olddb/

You should repeat the above step as close in time to the next steps as possible, to decrease the length of time taken by the final transfer.

Then, when you're ready to commit to the upgrade, perform the following steps on the same, existing server:

  1. In the machine profile, enable read-only transactions which prevents clients (by default) from making changes to the running server[1]:
    !postgresql.options mADD(default_transaction_read_only)
    postgresql.option_default_transaction_read_only on
  2. Without waiting for the profile change to propagate, you may shut down the database server:
    $ om postgresql stop
  3. And perform the final copy:
    postgres$ rsync -av --delete /disk/data/db/ <you>@newdb:/disk/data/olddb/
  4. Ensuring that the client profile has updated, restart postgresql:
    $ om postgresql start
  5. You may now begin to work on your new server dataset, following the upgrade instructions above.
  6. As soon as your new server is functional, move any DNS alises to the new host. Clients connecting to the new location will of course regain read-write capability immediately.

An alternative to the rsync command above is possible where there is a replication relationship (master → slave) between old and new servers. In this case one can take advantage of the the postgresql basebackup command which allows a copy to be taken of the data without downtime on the master. However, given that both the default_transaction_read_only and replication setup steps still involve a restart of the master, it is not recommended over a simple rsync.

[1] Clients can in theory override this setting and modify data, but it is expected that users will be informed of the period of read-only service. Read-only status can be enforced by careful application of DENY statements but this is considered unnecessary for the purposes of a scheduled upgrade.

Wrinkles

Servers running third-party extensions (e.g. PostGIS)

The PostgreSQL extension system allow registered extensions to be upgraded automatically and as a result no special action is usually necessary for most extensions, so long as both versions of the extension library are installed during upgrade.

Upgrades should be handled automatically by the postgresql-<extname>.h header if they are loaded after all postgresql-version headers. If you can't completely control the header ordering (e.g. pgresearch.inf where postgresql versions are determined by a stable header) you may follow the protocol below:

/* predefine new software header before main header */
#include <dice/options/postgresql-newver.h>

/* (existing header, includes oldver-server and postgis headers) */
#include <dice/options/postgresql-research-server.h>

/* post-define server header as per upgrade procedure */
#include <dice/options/postgresql-newver-server.h>

Avoiding leap-frog extension upgrades

Before (and periodically after) every PostgreSQL upgrade it is worth checking to make sure that your extension is the latest version to avoid the risk of having to perform future PostgreSQL upgrades in multiple leaps.

To upgrade extensions (within the same PostgreSQL version) you will sometimes need both software versions installed simultaneously: this doesn't map well to an RPM world, which will require something like

# rpm -iv --force extension-oldver.arch.rpm extension-newver.arch.rpm /* and on the machine profile, if changes must persist */
!profile.packages mEXTRA( +extension-oldver/arch:i +extension-newver/arch:i )

Remembering to remove the old version as soon as upgrade is comfortably complete.

Upgrading Kerberised Servers to 9.5 and above

9.5 makes a further change to the default settings of the gss authtype. Assuming you're upgrading from ≥9.3 and using the gss authtype already then for seamless forward compatibility it's recommended that, before upgrade, the following backwards-compatible auth options are added to each HBA entry:

/* For each HBA line with the gss type */
sslauthtype_tag   gss
/* Add the following auth options: */
!sslauthopts_tag   mADD(include_realm=0 krb_realm=<%kerberos.realm%>)

This preserves the old behaviour of connecting clients, and restricts the connection to the current realm only.

If your requirements are more complex, for example allowing clients from non-default realms to connect, or you are planning a new database server from scratch, these instructions need not apply: servers configured without the above will present connecting users with qualified kerberos principal names, i.e. user@MY.REALM and new installations merely need to take this change into account.


 : Units : Research_and_teaching : Documentation : Postgresql 

Mini Informatics Logo - Link to Main Informatics Page
Please contact us with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh
Spacing Line