Discussion: Request for replication advice
Hi all,

Yes, it's yet another request for advice on replicating pgsql. I have been doing some research on this lately, and so far I am getting the impression that what I want is Slony + log shipping.

The scenario is:

* One master postgres database on the internal network, which is being frequently updated by the internal users.
* One slave postgres database which is hosted remotely, available from the Internet, and accessible via an IPsec tunnel from the intranet.

The purpose of the public slave is to allow external users read-only access to the data in the master database, without compromising the security of the internal network. To this end, we would like to "push" all inserts, updates and deletes out to the slave database from the master database, over a purely one-way connection (the slave is never allowed to connect in to the internal network).

It is important that the data in the slave be as current as possible with the live data. A delay of several minutes for any given update to propagate to the slave would be tolerable, but any more than that and the application would lose much of its appeal. So it's okay if the replication is asynchronous, so long as it is fast. The slave also needs to be highly available 24/7.

So, my question for the list is: is Slony + log shipping the direction I should be investigating, or is there something else out there that I ought to consider? My understanding of WAL-based replication is that the slave is purely a standby, and cannot be used to service queries.

TIA,
BJ
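[A minimal sketch of the one-way "push" described above, assuming the replication system writes archive files into a spool directory. The file name and paths are hypothetical, and a local directory stands in for the remote slave so the sketch is runnable; in production the copy would be an outbound-only scp/rsync over the IPsec tunnel.]

```shell
# Hypothetical one-way push of replication archive files to the remote slave.
# A local directory stands in for the remote host; in production the copy
# would be an outbound-only scp/rsync over the IPsec tunnel (the slave never
# initiates a connection back to the internal network).
src=$(mktemp -d)   # spool directory the internal subscriber writes archives into
dst=$(mktemp -d)   # stands in for the remote slave's incoming directory

# simulate an archive file arriving in the spool (name is illustrative)
printf 'insert into t values (1);\n' > "$src/slony1_log_1_00000000000000000001.sql"

for f in "$src"/*.sql; do
    # production: scp "$f" slave:/var/spool/slony/incoming/
    cp "$f" "$dst/"
done

ls "$dst"
```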
On Sat, 2006-11-11 at 06:34 +1100, Brendan Jurd wrote:
> So, my question for the list is: is Slony + log shipping the direction
> I should be investigating, or is there something else out there that I
> ought to consider? My understanding of WAL-based replication is that the

This is certainly the direction to look at. How well replication keeps up on a log shipped node is going to depend on the frequency of inserts/updates/deletes, your class of hardware, and your network connectivity.

You also need to be absolutely certain that long running queries and 'idle in transaction' transactions are minimized as much as possible (or completely eliminated). Anything that prevents vacuums from doing their job properly is going to hurt a Slony replica.

Oh, and regarding log shipping itself, I recommend waiting for the release of 1.1.6 or 1.2.1; there is a fairly serious bug in the current version that causes log shipping to fall over if you have more than 2 nodes in your config (not just log shipped nodes).

If you have more questions, please sign up for the Slony list.

-- 
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
"Brendan Jurd" <direvus@gmail.com> writes:
> So, my question for the list is: is Slony + log shipping the direction
> I should be investigating, or is there something else out there that I
> ought to consider?

Those are two different methods: you'd use one or the other, not both.

Slony-I is much the more battle-tested of the two at the moment. In theory WAL log shipping should be higher performance for heavy-update scenarios, but its latency is variable (low update rate = higher latency), and not easy to put a bound on pre-8.2.

Have you looked at the replication info recently added to the PG docs?
http://developer.postgresql.org/pgdocs/postgres/failover.html

			regards, tom lane
On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> "Brendan Jurd" <direvus@gmail.com> writes:
> > So, my question for the list is: is Slony + log shipping the direction
> > I should be investigating, or is there something else out there that I
> > ought to consider?
>
> Those are two different methods: you'd use one or the other, not both.

Slony has its own log shipping; I think that was what he was referring to.

> Slony-I is much the more battle-tested of the two at the moment. In
> theory WAL log shipping should be higher performance for heavy-update
> scenarios, but its latency is variable (low update rate = higher
> latency), and not easy to put a bound on pre-8.2.

I'm not entirely sure how battle tested the Slony log shipping stuff actually is.

-- 
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
Brad Nicholson <bnichols@ca.afilias.info> writes:
> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
>> Those are two different methods: you'd use one or the other, not both.
>
> Slony has its own log shipping, I think that was what he was referring
> to.

Oh, OK, I was thinking of the trigger-driven version.

			regards, tom lane
On Fri, 2006-11-10 at 15:16 -0500, Tom Lane wrote:
> Brad Nicholson <bnichols@ca.afilias.info> writes:
> > On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> >> Those are two different methods: you'd use one or the other, not both.
> >
> > Slony has its own log shipping, I think that was what he was referring
> > to.
>
> Oh, OK, I was thinking of the trigger-driven version.

Same thing, actually. There's an option that tells the slon daemon to write the data syncs to files instead of applying them directly to a subscriber.

-- 
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
tgl@sss.pgh.pa.us (Tom Lane) writes:
> Brad Nicholson <bnichols@ca.afilias.info> writes:
>> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
>>> Those are two different methods: you'd use one or the other, not both.
>
>> Slony has its own log shipping, I think that was what he was referring
>> to.
>
> Oh, OK, I was thinking of the trigger-driven version.

Slony-I is /always/ trigger-driven...

The "extra" that "log shipping" offers is that you can configure a subscriber to serialize queries into a file as well as applying them to the subscriber. (It would be nice to have the option of not bothering with the subscriber, but that hasn't been tried yet.)

Thus, you have a series of "log files" that contain Plain Ordinary SQL. (There's a *little* bit of Slony-I-specific stuff in them, that could be trivially trimmed out.)

Those log files get spooled in a directory; you could do just about anything you like with them, such as FTPing them somewhere, burning them onto a DVD, or copying them onto a little USB key like this one <http://www.kingmaxdigi.com/product/superstick.htm> and attaching it to the leg of a bird, to implement some "avian data transfer protocol" :-).

-- 
let name="cbbrowne" and tld="acm.org" in String.concat "@" [name;tld];;
http://linuxdatabases.info/info/logshipping.html
Why are there flotation devices under plane seats instead of parachutes?
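[As a concrete illustration of the "Plain Ordinary SQL plus a little Slony-specific stuff" point above: a hedged sketch of what trimming an archive might look like. The file contents, the cluster name `_mycluster`, and the `setsyncTracking_offline` bookkeeping call are illustrative guesses at the shape of an archive, not copied from a real one.]

```shell
# Illustrative contents of a log-shipping archive file: mostly plain SQL
# plus a little Slony-specific bookkeeping.  The exact shape shown here
# (cluster name, bookkeeping call) is a guess for illustration only.
cat > archive.sql <<'EOF'
start transaction;
select "_mycluster".setsyncTracking_offline(1, '42');
insert into public.accounts (id, balance) values (7, 100);
commit;
EOF

# trimming out the Slony-specific line leaves ordinary, portable SQL
grep -v setsyncTracking_offline archive.sql
```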
On 11/11/06, Brad Nicholson <bnichols@ca.afilias.info> wrote:
> On Fri, 2006-11-10 at 15:07 -0500, Tom Lane wrote:
> > "Brendan Jurd" <direvus@gmail.com> writes:
> > > So, my question for the list is: is Slony + log shipping the direction
> > > I should be investigating, or is there something else out there that I
> > > ought to consider?
> >
> > Those are two different methods: you'd use one or the other, not both.
>
> Slony has its own log shipping, I think that was what he was referring
> to.

Indeed I was; sorry if my terminology caused confusion.

The reason I am looking at Slony with log shipping is that it can operate across a one-way connection, whereas plain Slony requires communication in both directions. A bi-directional connection would negate the purpose of having two separate databases, which is to protect the internal database (and the internal network as a whole) from a compromised external system.

If we were willing to have a bi-directional connection, I don't see any further disadvantage in allowing the external application(s) to connect straight into our internal postgres database over the IPsec tunnel, and ignoring the replication issue entirely.
direvus@gmail.com ("Brendan Jurd") writes:
> Indeed I was; sorry if my terminology caused confusion.
>
> The reason I am looking at Slony with log shipping is that it can
> operate across a one-way connection, whereas plain Slony requires
> communication in both directions. A bi-directional connection would
> negate the purpose of having two separate databases, which is to
> protect the internal database (and the internal network as a whole)
> from a compromised external system.
>
> If we were willing to have a bi-directional connection, I don't see
> any further disadvantage in allowing the external application(s) to
> connect straight into our internal postgres database over the IPsec
> tunnel, and ignoring the replication issue entirely.

Let me point out one possible downside to using Slony-I log shipping; it may not be an issue for you, but it's worth observing...

Log shipping works via serializing the subscription work done on a subscriber to files. Thus, you MUST have at least one subscriber in order to have log shipping work. If that's a problem, that's a problem...

-- 
(format nil "~S@~S" "cbbrowne" "acm.org")
http://linuxfinances.info/info/oses.html
Microsoft Outlook: Deploying Viruses Has Never Been This Easy!
On 11/11/06, Chris Browne <cbbrowne@acm.org> wrote:
> Let me point out one possible downside to using Slony-I log shipping;
> it may not be an issue for you, but it's worth observing...
>
> Log shipping works via serializing the subscription work done on a
> subscriber to files. Thus, you MUST have at least one subscriber in
> order to have log shipping work. If that's a problem, that's a
> problem...

So I would have a normal Slony subscriber sitting somewhere on the internal network, which pushes its log files out to the remote server. And the remote server then has a process sitting on it which inhales the log files into the database as they arrive.

Have I got the right idea?

Why *does* Slony require a bi-directional connection to the subscriber? The data is travelling in one direction only ... what needs to come back the other way?

This seems to be getting rather messy. I wonder if I might not be better off just writing AFTER triggers on all the tables I'm interested in, which replicate the query to the slave system with psql. It would probably be relatively labour intensive, and increase the burden of administering the schema, but it would also be a much more direct and simple approach.

BJ
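[The "process which inhales the log files" could be as simple as a loop that applies archives in sequence order. A minimal sketch under assumptions: the file names are hypothetical, and `cat` stands in for `psql -f "$f" slavedb` so the sketch runs without a database.]

```shell
# Sketch of the remote-side apply loop: pick up archive files in sync
# order and apply each exactly once.  File names are hypothetical; `cat`
# stands in for `psql -f "$f" slavedb` so the sketch runs standalone.
spool=$(mktemp -d)
done_dir="$spool/applied"; mkdir "$done_dir"

# simulate two archives arriving (sequence number embedded in the name,
# so a lexical sort gives the correct apply order)
printf 'insert into t values (1);\n' > "$spool/slony1_log_1_00000000000000000001.sql"
printf 'insert into t values (2);\n' > "$spool/slony1_log_1_00000000000000000002.sql"

for f in $(ls "$spool"/*.sql | sort); do
    cat "$f"             # production: psql -f "$f" slavedb
    mv "$f" "$done_dir"  # don't apply the same archive twice
done
```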
Brendan Jurd wrote:
> Why *does* Slony require a bi-directional connection to the
> subscriber? The data is travelling in one direction only ... what
> needs to come back the other way?

So the slave can say "yes, I got that data; you can remove it from my TODO list"?

> This seems to be getting rather messy. I wonder if I might not be
> better off just writing AFTER triggers on all the tables I'm
> interested in, which replicate the query to the slave system with
> psql. It would probably be relatively labour intensive, and increase
> the burden of administering the schema, but it would also be a much
> more direct and simple approach.

That sounds like contrib/dbmirror.

-- 
Shane Ambler
pgSQL@007Marketing.com
Get Sheeky @ http://Sheeky.Biz
direvus@gmail.com ("Brendan Jurd") writes:
> On 11/11/06, Chris Browne <cbbrowne@acm.org> wrote:
>> Let me point out one possible downside to using Slony-I log shipping;
>> it may not be an issue for you, but it's worth observing...
>>
>> Log shipping works via serializing the subscription work done on a
>> subscriber to files. Thus, you MUST have at least one subscriber in
>> order to have log shipping work. If that's a problem, that's a
>> problem...
>
> So I would have a normal Slony subscriber sitting somewhere on the
> internal network, which pushes its log files out to the remote server.
> And the remote server then has a process sitting on it which inhales
> the log files into the database as they arrive.
>
> Have I got the right idea?

Exactly.

> Why *does* Slony require a bi-directional connection to the
> subscriber? The data is travelling in one direction only ... what
> needs to come back the other way?

- So that any node might be promoted to be origin in case of emergency.
- So that nodes know when logged data (sl_log_{1,2}) can be safely purged.

> This seems to be getting rather messy. I wonder if I might not be
> better off just writing AFTER triggers on all the tables I'm
> interested in, which replicate the query to the slave system with
> psql. It would probably be relatively labour intensive, and
> increase the burden of administering the schema, but it would also
> be a much more direct and simple approach.

There might be some improved elegance in that; it is quite possible that Slony-I has more functionality than you need.

-- 
"cbbrowne","@","linuxdatabases.info"
http://linuxfinances.info/info/slony.html
"In man-machine symbiosis, it is man who must adjust: The machines can't." -- Alan J. Perlis
On 11/14/06, Chris Browne <cbbrowne@acm.org> wrote:
> direvus@gmail.com ("Brendan Jurd") writes:
> > This seems to be getting rather messy. I wonder if I might not be
> > better off just writing AFTER triggers on all the tables I'm
> > interested in, which replicate the query to the slave system with
> > psql. It would probably be relatively labour intensive, and
> > increase the burden of administering the schema, but it would also
> > be a much more direct and simple approach.
>
> There might be some improved elegance in that; it is quite possible
> that Slony-I has more functionality than you need.

I think that's the case.

Well, I believe I have all the information I need on this. Thanks to everyone who replied; I appreciate your help.

BJ