Like a good
relational database, PostgreSQL maintains a set of transactional log
files known as write-ahead-logs (WAL) in the pg_xlog directory.
These logs are written to for every change in the database files and are used
to recovery from a crash condition. If you crash, replay all the WAL
files since the last backup and you will be back in business right at the point
of failure.
Pre-process
recommendations
Primary
Setup
Standby Setup
I required high
availability but not real time fail over so I opted to use the warm standby
which is a pseudo-built in feature of postgres as of version 8.3 that provides
easy failover that can be automated if necessary.
Well, if you
have this capability, what about keeping a warm-standby system and feeding it
all the WAL files. If you teach the warm-standby how to
continuously process the incoming write ahead logs from the live system you
will have a system ready to go at a moments notice. When you read about
this setup in other places online the primary server is known as ‘master’
and the secondary the ‘slave’.
NOTE : Both your
primary and your secondary need to be running the same major version of the
postgreSQL database.
You can follow
these steps for both a brand new postgres installation as well as an existing
heavy traffic installation. Furthermore, no downtime is required to set this
up.
Pre-process
recommendations
§
Use pg_standby for
your restore_command in the recovery.conf file on the standby. pg_standby is
included in PostgreSQL 8.3, and you can copy the source from there to compile
it for 8.2 yourself. It isn't compatible with 8.1.
§
Set up your standby host's
environment and directory structure exactly the same as your primary. Otherwise
you'll need to spend time changing any symlinks you've created on the primary
for xlogs, tablespaces, or whatnot which is really just opportunity for error.
§
Pre-configure both the
postgresql.conf and recovery.conf files for your standby. I usually keep all of
my different config files for all of my different servers in a single,
version-controlled directory that I can then check out and symlink to. Again,
consistent environment & directory setups make symlinks your best friend.
§
Use ssh keys for simply,
and safely, transferring files between hosts. Configure password postgres user primary to secondary
by below link..
§
Follow all of the advice
in the manual with respect to handling errors.
Here Primary server(master) test_db01 and
secondary server (slave) test_db02.
Archive location -- /u01/psqlwarch
PGDATA Location -- /var/lib/pgsql/data/
Primary
Setup
All we need to do on the primary server is set it up to push
archive WAL files to our NFS mount or archive location on
slave server .
1 -- (Primary server(master) test_db01) ( By postgres user ) Modify the /var/lib/pgsql/data/postgresql.conf. Set the following
parameters
archive_mode = on
archive_command =
'/var/lib/pgsql/data/archive_command.sh %p %f' # command to use to archive a
logfile segment
#archive_command = 'rsync -a %p
postgres@test_db02:/u01/psqlwarch/%f'
#archive_timeout = 3600
script for archive mode (cat
/var/lib/pgsql/data/archive_command.sh)
*********************************************************
cat /var/lib/pgsql/data/archive_command.sh
#!/bin/sh
# archive command to archive the WAL files to the archive
directory
# It also writes the name to a file which can later be
picked up by a backup command
# First argument is the full path to the WAL file and the
second argument is the filename
#REPLICANTS="test_db01"
REPLICANTS2="test_db02"
FULLPATH=$1
FILENAME=$2
echo $FULLPATH $FILENAME >> /usr/tmp/debug
export ARCHIVE_DIR=/u01/psqlwarch
cp -i $FULLPATH ${ARCHIVE_DIR}/${FILENAME}
for REPLICA in $REPLICANTS2; do
scp $FULLPATH
$REPLICA:/u01/psqlwarch
done
echo ${FILENAME}>>${ARCHIVE_DIR}/wal_files.txt
2 -- (Primary server(master) test_db01) ( By postgres user ) Restart postgres to apply the changes
(with out
downtime)
/usr/bin//pg_ctl
restart -D /var/lib/pgsql/data
(with
downtime)
(stop postgres)
/usr/bin/pg_ctl stop -D
/var/lib/pgsql/data
(start postgres)
/usr/bin/pg_ctl -D /var/lib/pgsql/data -l
logfile start
Note -- Verify that the WALs are
being shipped to their destination.
postgres=# select pg_switch_xlog();
pg_switch_xlog
----------------
0/7000088
(1 row)
Standby Setup
1 -- (secondary server (slave) test_db02) ( By root user )Before we configure
stuff we'll need to build /install pgstandby from source/RPM. It's not provided with the default posgres installation.
You can verify that it works by running which pgstandby. If we are not able to
get pgstandby need to build /install from source/RPM.Here I install by RPM.
-> List out
installed postgres RPM
[root@test_db02 ~]# rpm -qa postgres*
postgresql-docs-8.4.13-1.el6_3.x86_64
postgresql-8.4.13-1.el6_3.x86_64
postgresql-odbc-08.04.0200-1.el6.x86_64
postgresql-libs-8.4.13-1.el6_3.x86_64
postgresql-server-8.4.13-1.el6_3.x86_64
-> Install crontrib
RPM
[root@test_db02 ~]# yum install
postgresql-contrib.x86_64
-> verify pgstandby
[root@test_db02 ~]# which pg_standby
/usr/bin/pg_standby
2 -- (secondary server (slave) test_db02) ( By postgres user ) (Optional ) If postgres running then we'll stop the standby server to prepare for a
checkpoint backup from our primary system.
3 -- (Primary server(master) test_db01) ( By postgres user )
start the hot backup
3.1-> echo "SELECT
pg_start_backup('mybackup');" | psql -U postgres
-bash-4.1$ echo "SELECT
pg_start_backup('mybackup');" | psql -U postgres
pg_start_backup
-----------------
0/A000020
(1 row)
3.2 -> Create base
backup by tar or rsync command.
-bash-4.1$ pwd
/var/lib/pgsql
-bash-4.1$ tar -cvf
/u01/psqlwarch/mybackup.tar data
3.3 -> Copy base
backup from primary to secondary server
Cd /u01/psqlwarch/
Scp mybackup.tar
postgres@test_db02:/var/lib/pgsql/
4 -- (secondary server (slave) test_db02) ( By postgres user )
Restore base backup
4.1-> remove old data directory
Cd /var/lib/pgsql/
Rm - rf data
4.2-> Restore base
backup
Cd /var/lib/pgsql/
tar -xvf mybackup.tar
chown -R postgres:postgres data
5 -- (Primary server(master) test_db01) ( By postgres user ) With the backup completed, log back into the primary system and clear
the backup flag
echo "SELECT pg_stop_backup();" |
psql -U postgres
-bash-4.1$ echo "SELECT
pg_stop_backup();" | psql -U postgres
pg_stop_backup
----------------
0/A000088
(1 row)
6 -- (secondary server (slave) test_db02) ( By postgres user )
6.1-> turn off
Archive, Modify the /var/lib/pgsql/data/postgresql.conf. Set the following
parameters
archive_mode = off
6.2-> create a recovery.conf file in /var/lib/pgsql/data/ with the following
contents. This tells postgres to start up implement the log files.
Cd /var/lib/pgsql/data/
vi recovery.conf
restore_command = 'pg_standby -d -s
10 -t /tmp/pgsql.trigger.5432 /u01/psqlwarch %f %p %r 2>>standby.log'
6.3-> Start up postgres on
the standby server. It will start restoring data right away..
-bash-4.1$ /usr/bin/pg_ctl start -D /var/lib/pgsql/data
6.4-> Take a peek at the
restore to make sure there aren't too many errors. It may complain about
missing files. Ignore these warnings
Cd /var/lib/pgsql/data/
-bash-4.1$ tail -l standby.log
Trigger file : /tmp/pgsql.trigger.5432
Waiting for WAL file : 00000001000000000000000E
WAL file path : /u01/psqlwarch/00000001000000000000000E
Restoring to : pg_xlog/RECOVERYXLOG
Sleep interval : 10 seconds
Max wait interval : 0 forever
Command for restore : cp
"/u01/psqlwarch/00000001000000000000000E"
"pg_xlog/RECOVERYXLOG"
Keep archive history : 00000001000000000000000C and later
WAL file not present yet. Checking
for trigger file...
WAL file not present yet. Checking
for trigger file...
WAL file not present yet. Checking
for trigger file...
WAL file not present yet. Checking
for trigger file...
running restore : OK
removing
"/u01/psqlwarch/00000001000000000000000A"
removing
"/u01/psqlwarch/00000001000000000000000B"
Trigger file : /tmp/pgsql.trigger.5432
Waiting for WAL file : 00000001000000000000000F
WAL file path : /u01/psqlwarch/00000001000000000000000F
Restoring to : pg_xlog/RECOVERYXLOG
Sleep interval : 10 seconds
Max wait interval : 0 forever
Command for restore : cp
"/u01/psqlwarch/00000001000000000000000F" "pg_xlog/RECOVERYXLOG"
Keep archive history : 00000001000000000000000D and later
WAL file not present yet. Checking
for trigger file...
WAL file not present yet. Checking
for trigger file...
WAL file not present yet. Checking
for trigger file...
Note : -- Execute below command on Primary server(master)
test_db01 and check new archive entry
come in standby.log
postgres=# select pg_switch_xlog();
6.5-> Verify that it's
still in standby mode. It should complain with this message:
-bash-4.1$ psql -l
psql: FATAL: the database system is starting up
Doing the
Failover
what do we do
when the primary fails? Here's my process for flipping over to a standby.
1.
Log into the primary server and make sure that
postgresql is all the way down.
2.
If the primary server is not totally shut down,
take it down the rest of the way.
kill -9
the primary postgres process if
necessary. sudo kill -9
sudo cat /var/pgsql/data/postmaster.pid
| head -n 1
3.
log into the warm standby
4.
switch to the postges user
sudo -u postgres bash
5.
Verify that the server is still in standby
mode. It should report '
psql:
FATAL: the database system is starting up
'.psql
6.
Assuming that it (warm standby) is not up, create the
trigger file. This tells our standby server that it's time to become a
primary node. It's very important that only one server is primary at a time so
i hope you followed steps 1 and 2.
touch /tmp/pgsql.trigger.5432
7.
Start postgres into
warm standby and Verify that the standby server is running
psql
8.
Get your clients pointing to the new server
(change DNS, change IPs in configuration files, etc.)
Note : Postgres will want to rename the recovery.conf file to recovery.done once it’s finished.
*************************************************************************
No comments:
Post a Comment