Thread: postgresql 9.3 failover time


postgresql 9.3 failover time

From: Shay Cohavi
Date:
Hi,
I have postgresql 9.3 setup with 2 nodes (active/standby with streaming replication & continuos archiving).
I have created 2 failover & failback script in order to perform a switchover between the DB servers:
1. failover - create a trigger file in order to promote the new primary.
2. failback - perform a base backup as mentions in :
   a. start backup on the primary.
   b. stop the failed node .
       didn't delete the DB directory on the failed node
   c. performing rsync between the nodes.
   d.stopping the backup on the primary.
   e.performing rsync on the pg_xlog.
   f. creating a recovery.conf

    standby_mode = 'on'
    primary_conninfo = 'host=10.50.1.153 port=5432 user=usr password=pass'
    restore_command = 'scp 10.50.1.153:/home/postgres/archive/%f %p'
    trigger_file = '/home/postgres/databases/fabrix/trigger'
    archive_cleanup_command = 'ssh 10.50.1.153 /home/postgres/pg_utils/archive_cleanup.sh %r'

   g. start the failed node as a secondary.
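
For reference, the failback script is roughly along these lines (a simplified sketch, not the exact script; the primary host and the data directory path are taken from the recovery.conf above, everything else is approximate):

    # failback.sh - simplified sketch of steps a-g above
    PRIMARY=10.50.1.153
    PGDATA=/home/postgres/databases/fabrix     # assumed data directory

    # a. start a backup on the primary
    psql -h $PRIMARY -U postgres -c "SELECT pg_start_backup('failback', true);"

    # b. stop the failed node (the data directory is kept)
    pg_ctl -D $PGDATA stop -m fast

    # c. rsync the data directory from the primary
    rsync -av --delete --exclude=pg_xlog --exclude=postmaster.pid \
          postgres@$PRIMARY:$PGDATA/ $PGDATA/

    # d. stop the backup on the primary
    psql -h $PRIMARY -U postgres -c "SELECT pg_stop_backup();"

    # e. rsync pg_xlog
    rsync -av postgres@$PRIMARY:$PGDATA/pg_xlog/ $PGDATA/pg_xlog/

    # f. create recovery.conf with the settings shown above (omitted here)

    # g. start the failed node as a secondary
    pg_ctl -D $PGDATA start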

The switchover method (sketched below):
1. stop the primary node.
2. promote the secondary node (failover.sh).
3. perform the failback on the failed node.
4. start the failed node.
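
The switchover is driven by a small wrapper roughly like this (again only a sketch; the node names and script locations are placeholders):

    # switchover sketch - node names and script paths are placeholders
    OLD_PRIMARY=node1
    NEW_PRIMARY=node2

    # 1. stop the primary node
    ssh postgres@$OLD_PRIMARY "pg_ctl -D /home/postgres/databases/fabrix stop -m fast"

    # 2. promote the secondary node (failover.sh creates the trigger file)
    ssh postgres@$NEW_PRIMARY "/home/postgres/pg_utils/failover.sh"

    # 3. perform the failback on the failed node
    ssh postgres@$OLD_PRIMARY "/home/postgres/pg_utils/failback.sh"

    # 4. start the failed node (if not already started at the end of failback.sh)
    ssh postgres@$OLD_PRIMARY "pg_ctl -D /home/postgres/databases/fabrix start"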

This method works great!


But if I perform multiple switchovers (>20), each promotion of the new primary (via the trigger file) takes longer, because the server searches for timeline history files in the archive.

For example:

[2015-12-12 20:35:10.769 IST] LOG:  trigger file found: /home/postgres/databases/fabrix/trigger
[2015-12-12 20:35:10.769 IST] FATAL:  terminating walreceiver process due to administrator command
scp: /home/postgres/archive/0000009400000002000000DC: No such file or directory
[2015-12-12 20:35:10.893 IST] LOG:  record with zero length at 2/DC000168
[2015-12-12 20:35:10.893 IST] LOG:  redo done at 2/DC000100
scp: /home/postgres/archive/0000009400000002000000DC: No such file or directory
scp: /home/postgres/archive/0000009300000002000000DC: No such file or directory
scp: /home/postgres/archive/0000009200000002000000DC: No such file or directory
.
.
.

scp: /home/postgres/archive/0000009100000002000000DC: No such file or directory
scp: /home/postgres/archive/0000009000000002000000DC: No such file or directory
scp: /home/postgres/archive/00000095.history: No such file or directory
[2015-12-12 20:35:11.801 IST] LOG:  selected new timeline ID: 149
[2015-12-12 20:35:11.931 IST] LOG:  restored log file "00000094.history" from archive
[2015-12-12 20:35:12.173 IST] LOG:  archive recovery complete
[2015-12-12 20:35:12.181 IST] LOG:  database system is ready to accept connections
[2015-12-12 20:35:12.181 IST] LOG:  autovacuum launcher started

This can take at least 1 minute, or more.

Is there any way to skip the timeline search in order to reduce the promotion time?


Thanks,
ShayC

Re: postgresql 9.3 failover time

From: David Steele
Date:
On 12/12/15 2:08 PM, Shay Cohavi wrote:
> I have a PostgreSQL 9.3 setup with 2 nodes (active/standby with streaming
> replication & continuous archiving).
> I have created failover & failback scripts in order to perform a
> switchover between the DB servers:
> 1. failover - create a trigger file in order to promote the new primary.
> 2. failback - perform a base backup, as follows:
>    a. start a backup on the primary.
>    b. stop the failed node (the DB directory on the failed node is not deleted).
>    c. perform an rsync between the nodes.

If you use rsync here, be sure to use checksums.  The clusters are very
similar to each other, and rsync's timestamp resolution could become a problem.

>    d. stop the backup on the primary.
>    e. perform an rsync of pg_xlog.
>    f. create a recovery.conf:
>
>     standby_mode = 'on'
>     primary_conninfo = 'host=10.50.1.153 port=5432 user=usr password=pass'
>     restore_command = 'scp 10.50.1.153:/home/postgres/archive/%f %p'
>     trigger_file = '/home/postgres/databases/fabrix/trigger'
>     archive_cleanup_command = 'ssh 10.50.1.153 /home/postgres/pg_utils/archive_cleanup.sh %r'
>
>    g. start the failed node as a secondary.
>
> The switchover method:
> 1. stop the primary node.
> 2. promote the secondary node (failover.sh).
> 3. perform the failback on the failed node.
> 4. start the failed node.
>
> This method works great!
>
> But if I perform multiple switchovers (>20), each promotion of the new
> primary (via the trigger file) takes longer, because the server searches
> for timeline history files in the archive.

This is an indication that your backup/restore process is not working
correctly.  Postgres should only look for timelines that are greater
than the current timeline.
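
A quick way to see which timeline each cluster thinks it is on is pg_controldata (the data directory below is a guess based on your trigger_file path):

    # Run on both nodes and compare; after a clean switchover the standby
    # should be on the timeline the new primary just created.
    pg_controldata /home/postgres/databases/fabrix | grep TimeLineID
    # Latest checkpoint's TimeLineID:       <current timeline>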

> For example:
>
> [2015-12-12 20:35:10.769 IST] LOG:  trigger file found: /home/postgres/databases/fabrix/trigger
> [2015-12-12 20:35:10.769 IST] FATAL:  terminating walreceiver process due to administrator command
> scp: /home/postgres/archive/0000009400000002000000DC: No such file or directory
> [2015-12-12 20:35:10.893 IST] LOG:  record with zero length at 2/DC000168
> [2015-12-12 20:35:10.893 IST] LOG:  redo done at 2/DC000100
> scp: /home/postgres/archive/0000009400000002000000DC: No such file or directory
> scp: /home/postgres/archive/0000009300000002000000DC: No such file or directory
> scp: /home/postgres/archive/0000009200000002000000DC: No such file or directory
> .
> .
> .
> scp: /home/postgres/archive/0000009100000002000000DC: No such file or directory
> scp: /home/postgres/archive/0000009000000002000000DC: No such file or directory
> scp: /home/postgres/archive/00000095.history: No such file or directory
> [2015-12-12 20:35:11.801 IST] LOG:  selected new timeline ID: 149
> [2015-12-12 20:35:11.931 IST] LOG:  restored log file "00000094.history" from archive
> [2015-12-12 20:35:12.173 IST] LOG:  archive recovery complete
> [2015-12-12 20:35:12.181 IST] LOG:  database system is ready to accept connections
> [2015-12-12 20:35:12.181 IST] LOG:  autovacuum launcher started

It's not clear to me how you got to timeline 149.  Some lines have been
removed - did the history log requests go all the way to 148?

Rsync is possibly your issue here - maybe pg_control is not being copied
because the timestamp is the same on both systems (rsync only has 1-second
timestamp resolution, so this is very possible between a master and a
streaming replica).  Try rsync with checksums (--checksum) and see if
that makes a difference.
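
Something along these lines (the host and paths are only an example) forces rsync to compare file contents rather than size + mtime:

    # --checksum compares file contents instead of size + mtime, so files
    # like pg_control that change within the same second still get copied.
    rsync -av --checksum --delete --exclude=pg_xlog --exclude=postmaster.pid \
          postgres@10.50.1.153:/home/postgres/databases/fabrix/ \
          /home/postgres/databases/fabrix/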

--
-David
david@pgmasters.net

