Discussion: repmgr problem with registering standby
Hi,
I have repmgr working to some degree on a couple of servers, but am having
trouble with the "register" part on the slave.

On the master, I run:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
    --verbose --force master register

Opening configuration file: /etc/repmgr/validator/repmgr.conf
repmgr connecting to master database
repmgr connected to master, checking its state
finding node list for cluster 'validator'
Master node correctly registered for cluster validator with id 0 (conninfo:
host=10.133.54.2 port=5432 user=repmgr dbname=repmgr)

So that looks good, but then I try this on the slave:
# repmgr -f /etc/repmgr/validator/repmgr.conf \
    --verbose standby register

Opening configuration file: /etc/repmgr/validator/repmgr.conf
repmgr connecting to standby database
repmgr connected to standby, checking its state
repmgr connecting to master database
finding node list for cluster 'validator'
A master must be defined before configuring a slave

I can query the database like so though, and it seems like it's all good:
repmgr=# select * from repmgr_validator.repl_nodes;
 id |  cluster  |                       conninfo
----+-----------+------------------------------------------------------
  0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
(1 row)

Does anyone have an idea of what might be going wrong here?

Thanks,
Toby
On Wed, Jul 27, 2011 at 10:36 AM, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
> Hi,
> I have repmgr working to some degree on a couple of servers, but am having
> trouble with the "register" part on the slave.
>
> On the master, I run:
> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>     --verbose --force master register
>
> Opening configuration file: /etc/repmgr/validator/repmgr.conf
> repmgr connecting to master database
> repmgr connected to master, checking its state
> finding node list for cluster 'validator'
> Master node correctly registered for cluster validator with id 0 (conninfo:
> host=10.133.54.2 port=5432 user=repmgr dbname=repmgr)
>
> So that looks good, but then I try this on the slave:
> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>     --verbose standby register
>
> Opening configuration file: /etc/repmgr/validator/repmgr.conf
> repmgr connecting to standby database
> repmgr connected to standby, checking its state
> repmgr connecting to master database
> finding node list for cluster 'validator'
> A master must be defined before configuring a slave
>
> I can query the database like so though, and it seems like it's all good:
> repmgr=# select * from repmgr_validator.repl_nodes;
>  id |  cluster  |                       conninfo
> ----+-----------+------------------------------------------------------
>   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
> (1 row)
>
> Does anyone have an idea of what might be going wrong here?

Hi, thanks for using repmgr.

What version of repmgr are you using? What version of PostgreSQL?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
----- Original Message -----
> On Wed, Jul 27, 2011 at 10:36 AM, Toby Corkindale wrote:
> > Hi,
> > I have repmgr working to some degree on a couple of servers, but am having
> > trouble with the "register" part on the slave.
> >
> > On the master, I run:
> > # repmgr -f /etc/repmgr/validator/repmgr.conf \
> >     --verbose --force master register
> >
> > Opening configuration file: /etc/repmgr/validator/repmgr.conf
> > repmgr connecting to master database
> > repmgr connected to master, checking its state
> > finding node list for cluster 'validator'
> > Master node correctly registered for cluster validator with id 0 (conninfo:
> > host=10.133.54.2 port=5432 user=repmgr dbname=repmgr)
> >
> > So that looks good, but then I try this on the slave:
> > # repmgr -f /etc/repmgr/validator/repmgr.conf \
> >     --verbose standby register
> >
> > Opening configuration file: /etc/repmgr/validator/repmgr.conf
> > repmgr connecting to standby database
> > repmgr connected to standby, checking its state
> > repmgr connecting to master database
> > finding node list for cluster 'validator'
> > A master must be defined before configuring a slave
> >
> > I can query the database like so though, and it seems like it's all good:
> > repmgr=# select * from repmgr_validator.repl_nodes;
> >  id |  cluster  |                       conninfo
> > ----+-----------+------------------------------------------------------
> >   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
> > (1 row)
> >
> > Does anyone have an idea of what might be going wrong here?
>
> Hi, thanks for using repmgr.
>
> What version of repmgr are you using? What version of PostgreSQL?

Hi Simon,
We're using version 1.1.0 of repmgr, against PostgreSQL 9.0.4-1~bpo60+1 (i.e. the
version from backports) on Debian squeeze.

To complicate matters, we have several PostgreSQL instances per machine, using
Debian's pg cluster stuff. This seems to work elsewhere with repmgr though (as
long as we make sure the ports are specified in the repmgr configs).

However, I'm not having any success even with just the default/single instance
of pg either at the moment.

Cheers,
Toby
On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
>
> So that looks good, but then I try this on the slave:
> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>     --verbose standby register
>

can you show the content of /etc/repmgr/validator/repmgr.conf?

[...]
>
> I can query the database like so though, and it seems like it's all good:
> repmgr=# select * from repmgr_validator.repl_nodes;
>  id |  cluster  |                       conninfo
> ----+-----------+------------------------------------------------------
>   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
> (1 row)
>

this is on the master or the slave?

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
On 28/07/11 03:47, Jaime Casanova wrote:
> On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
> <toby.corkindale@strategicdata.com.au> wrote:
>>
>> So that looks good, but then I try this on the slave:
>> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>>     --verbose standby register
>>
> can you show the content of /etc/repmgr/validator/repmgr.conf?

cluster=validator
node=mel-db06
conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'

>> I can query the database like so though, and it seems like it's all good:
>> repmgr=# select * from repmgr_validator.repl_nodes;
>>  id |  cluster  |                       conninfo
>> ----+-----------+------------------------------------------------------
>>   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
>> (1 row)
>
> this is on the master or the slave?

I ran that on the slave; however I've just checked now, and the same results
are given on both nodes.

Just so you know, db06=10.133.54.1 and db07=10.133.54.2. They also have a
second address each on the 192.168.10.x network as well though.

Toby
On Wed, Jul 27, 2011 at 7:24 PM, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
> On 28/07/11 03:47, Jaime Casanova wrote:
>>
>> On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
>> <toby.corkindale@strategicdata.com.au> wrote:
>>>
>>> So that looks good, but then I try this on the slave:
>>> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>>>     --verbose standby register
>>>
>> can you show the content of /etc/repmgr/validator/repmgr.conf?
>
> cluster=validator
> node=mel-db06
> conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
>

sorry for the delay on this... do you still have this problem?

the node parameter should be an integer value, i don't think that
string should work for you

>>> I can query the database like so though, and it seems like it's all good:
>>> repmgr=# select * from repmgr_validator.repl_nodes;
>>>  id |  cluster  |                       conninfo
>>> ----+-----------+------------------------------------------------------
>>>   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
>>> (1 row)
>>

if in the standby that string you're using as node value ends up as a
0 then it never asks for the node 0 (it couldn't be the master because
you're just registering as a standby)

so i bet that's the problem, use numbers in the node parameter and
everything will be ok

i will have to add a check against this case in repmgr, though

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
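[Editor's illustration] The diagnosis above can be checked before running
"standby register". The following is only a minimal sketch, not part of repmgr:
it assumes the plain key=value layout of the repmgr.conf quoted in this thread,
and the helper names (parse_conf, atoi_like, check_node) are hypothetical.
atoi_like mimics a C atoi()-style conversion, which is an assumption about how
a hostname could end up as 0; the thread only says the string "ends up as a 0".

```python
import re


def parse_conf(text):
    """Parse simple key=value lines, as in the repmgr.conf shown in this thread."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this sketch
        settings[key.strip()] = value.strip().strip("'")
    return settings


def atoi_like(value):
    """Mimic C's atoi(): read a leading integer, else return 0 (assumed behaviour)."""
    match = re.match(r"\s*[+-]?\d+", value)
    return int(match.group()) if match else 0


def check_node(settings):
    """Return the node id as an int, or raise ValueError if it is not a plain integer."""
    node = settings.get("node", "")
    if not re.fullmatch(r"\d+", node):
        raise ValueError(f"node must be an integer, got {node!r}")
    return int(node)


conf = """\
cluster=validator
node=mel-db06
conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
"""
settings = parse_conf(conf)
print(atoi_like(settings["node"]))  # the hostname collapses to 0, the master's id here
try:
    check_node(settings)
except ValueError as err:
    print(err)
```

With node=mel-db06 the strict check rejects the file, whereas a lenient
conversion would silently yield 0 and collide with the master's id in
repl_nodes.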
On 02/08/11 01:05, Jaime Casanova wrote:
> On Wed, Jul 27, 2011 at 7:24 PM, Toby Corkindale
> <toby.corkindale@strategicdata.com.au> wrote:
>> On 28/07/11 03:47, Jaime Casanova wrote:
>>>
>>> On Wed, Jul 27, 2011 at 4:36 AM, Toby Corkindale
>>> <toby.corkindale@strategicdata.com.au> wrote:
>>>>
>>>> So that looks good, but then I try this on the slave:
>>>> # repmgr -f /etc/repmgr/validator/repmgr.conf \
>>>>     --verbose standby register
>>>>
>>> can you show the content of /etc/repmgr/validator/repmgr.conf?
>>
>> cluster=validator
>> node=mel-db06
>> conninfo='host=10.133.54.1 port=5432 user=repmgr dbname=repmgr'
>>
>
> sorry for the delay on this... do you still have this problem?

We did, yes.

> the node parameter should be an integer value, i don't think that
> string should work for you

Ah! Right, yes, changing that to integer values on all the nodes concerned has
indeed solved the problem, once I manually deleted the repmgr schema from the
database. (It wouldn't replace the master, even with --force.)

>>>> I can query the database like so though, and it seems like it's all good:
>>>> repmgr=# select * from repmgr_validator.repl_nodes;
>>>>  id |  cluster  |                       conninfo
>>>> ----+-----------+------------------------------------------------------
>>>>   0 | validator | host=10.133.54.2 port=5432 user=repmgr dbname=repmgr
>>>> (1 row)
>>>
>
> if in the standby that string you're using as node value ends up as a
> 0 then it never asks for the node 0 (it couldn't be the master because
> you're just registering as a standby)
>
> so i bet that's the problem, use numbers in the node parameter and
> everything will be ok
>
> i will have to add a check against this case in repmgr, though

Is there some documentation detailing the format of the repmgr.conf file?
Both I and another guy here have looked at it, and neither of us spotted that
node was only supposed to contain integers.

For that matter - is there a reason it has to be an integer? Allowing
hostnames there would be more friendly. Using integers means someone has to
maintain a mapping of node IDs to hostnames in a separate place, and then that
leads to mistakes, like someone thinking the standby node (2) is the master
hostname :/

Thanks for your help tracking this down!

Cheers,
Toby
> For that matter - is there a reason it has to be an integer? Allowing
> hostnames there would be more friendly. Using integers means someone has to
> maintain a mapping of node IDs to hostnames in a separate place, and then
> that leads to mistakes, like someone thinking the standby node (2) is the
> master hostname :/
>

As a quick observation, the host name, by itself, doesn't seem to be a
candidate key. It would probably have made sense to use a varchar instead of
an integer but it seems people treat such a key type as forbidden.

David J.