Discussion: Architecture of walreceiver (Streaming Replication)
Hi,

Recently, the development of SR is not progressing because of the indecision on whether walreceiver should be a subprocess of the startup process (i.e., a stand-alone program), or of postmaster. Since time is running out, I'd like to discuss about this and advance the project.

The related threads are:
http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php

IMO, walreceiver should be a subprocess of postmaster for the following reasons.

1. It's not easy to give a GUC parameter to a stand-alone walreceiver program. A simple approach is giving a parameter as a command-line argument. But this wouldn't cover a reload of parameter.

2. It's not easy to treat the log messages generated by a stand-alone walreceiver as well as the other postgres messages. A straightforward approach is that the startup process passes along the messages to the logger process. But this is not simple.

I agree that a stand-alone walreceiver is useful for some cases. But I think that it's sufficient to provide that as contrib or pgfoundry tool. Not need to provide that in core. The communication interface to walsender is going to be provided as libpq, so it's not difficult to implement such a stand-alone tool.

Thought? Please feel free to comment.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
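Editorial illustration of point 1 above: a rough Python sketch of why a value passed on the command line cannot follow a configuration reload, while a value re-read from a file can. This is not PostgreSQL code; the file name `demo.conf`, the setting name `log_level`, and the `ReloadableSettings` class are all hypothetical and exist only for this example.

```python
import os
import tempfile


def parse_conf(text):
    """Parse 'name = value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip("'")
    return settings


class ReloadableSettings:
    """Hypothetical settings holder that can re-read its file on request."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        self.reload()

    def reload(self):
        # In a long-running process this would be called from the reload
        # handler (for example after SIGHUP); here it is called directly.
        with open(self.path) as f:
            self.values = parse_conf(f.read())


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.conf")  # hypothetical config file
        with open(path, "w") as f:
            f.write("log_level = 'info'\n")

        # What a stand-alone program started with a command-line argument
        # would hold: a copy taken once at start-up.
        frozen = parse_conf(open(path).read())["log_level"]
        live = ReloadableSettings(path)

        # The administrator edits the file and asks for a reload.
        with open(path, "w") as f:
            f.write("log_level = 'debug'  # changed\n")
        live.reload()

        return frozen, live.values["log_level"]
```

Calling `demo()` returns the start-up copy and the reloaded value side by side; only the second one reflects the edit to the file.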
On Nov 2, 2009, at 5:06 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

> Hi,
>
> Recently, the development of SR is not progressing because of
> the indecision on whether walreceiver should be a subprocess
> of the startup process (i.e., a stand-alone program), or of
> postmaster. Since time is running out, I'd like to discuss
> about this and advance the project.
>
> The related threads are:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php
>
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
>
> 1. It's not easy to give a GUC parameter to a stand-alone
> walreceiver program. A simple approach is giving a
> parameter as a command-line argument. But this wouldn't
> cover a reload of parameter.
>
> 2. It's not easy to treat the log messages generated by
> a stand-alone walreceiver as well as the other postgres
> messages. A straightforward approach is that the startup
> process passes along the messages to the logger process.
> But this is not simple.
>
> I agree that a stand-alone walreceiver is useful for some
> cases. But I think that it's sufficient to provide that as
> contrib or pgfoundry tool. Not need to provide that in core.
> The communication interface to walsender is going to be
> provided as libpq, so it's not difficult to implement such
> a stand-alone tool.
>
> Thought? Please feel free to comment.

I agree. A stand-alone tool seems like a good idea (which is why I proposed it) but I don't think that should mean that we can't have a tightly integrated core facility. We can decide later whether it is helpful for those things to share code; right now, we should focus on getting an initial version of this feature out the door.

Speaking of getting things out the door, what's up with Hot Standby? It seemed like the outstanding issues were just about dealt with, and then the discussion died off...

...Robert
Fujii Masao wrote:
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
>

+1. I agree that the first version should be as close as possible to postmaster. My points are: (i) it will be easier to install (no need to install another third-party software), (ii) it will be easier to administrate (the options will be available in one central point -- postgresql.conf), and (iii) it will be easier to control (it is a postmaster subprocess).

But I see some value if it would be possible to design it in a way that other third-party software could replace it completely (even if it couldn't take advantage of some postmaster features). Of course, there is no need to develop such a POC external walreceiver tool. You just need to have in mind that available interfaces should be accessible by external tools. If someone decides to code a tool to mimic walreceiver but with some additional features such as WAL filtering then (s)he is free to do it because we provide entry points in the API.

BTW, are you going to submit another WIP patch for next commitfest?

--
Euler Taveira de Oliveira
http://www.timbira.com/
On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira <euler@timbira.com> wrote:
> BTW, are you going to submit another WIP patch for next commitfest?

Well, Heikki was going to keep working on this and Hot Standby between CommitFests "until it gets committed", but things seem to be stalled at the moment, possibly because Heikki is tied up with internal EnterpriseDB projects. I don't think the hold-up is with Fujii Masao.

...Robert
Robert Haas wrote:
> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
> <euler@timbira.com> wrote:
>> BTW, are you going to submit another WIP patch for next commitfest?
>
> Well, Heikki was going to keep working on this and Hot Standby between
> CommitFests "until it gets committed", but things seem to be stalled
> at the moment, possibly because Heikki is tied up with internal
> EnterpriseDB projects. I don't think the hold-up is with Fujii Masao.

Right. I got dragged away into other stuff for the last week or so.

wrt. synchronous replication, if someone else has the cycles to look at it, that would be great. I got stuck on the postmaster-process or not question Fujii raised again now, not being able to decide.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Euler Taveira de Oliveira wrote:
> Fujii Masao wrote:
>> IMO, walreceiver should be a subprocess of postmaster for
>> the following reasons.
>>
> +1. I agree that the first version should be as close as possible to
> postmaster. My points are: (i) it will be easier to install (no need to
> install another third-party software), (ii) it will be easier to administrate
> (the options will be available in one central point -- postgresql.conf), and
> (iii) it will be easier to control (it is a postmaster subprocess).

None of these points are really for or against either approach. In any case, we would ship with all the required components, so no need to install 3rd party software. The recovery related options would come from recovery.conf in both models, although that could be changed if we wanted to. Not sure what easier to control (iii) means, although admittedly it's a bit tricky to make walreceiver behave correctly as a subprocess of the startup process, making sure it responds to shutdown requests etc.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Hi,

On Tue, Nov 3, 2009 at 3:23 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> wrt. synchronous replication, if someone else has the cycles to look at
> it, that would be great. I got stuck on the postmaster-process or not
> question Fujii raised again now, not being able to decide.

What is your worry about the postmaster-subprocess walreceiver? One such worry is that the startup process could get stuck if launching walreceiver fails, and I have addressed that.
http://archives.postgresql.org/pgsql-hackers/2009-09/msg02003.php

If you have another worry, I'll address that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
On Tue, Nov 3, 2009 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
> <euler@timbira.com> wrote:
>> BTW, are you going to submit another WIP patch for next commitfest?
>
> Well, Heikki was going to keep working on this and Hot Standby between
> CommitFests "until it gets committed", but things seem to be stalled
> at the moment, possibly because Heikki is tied up with internal
> EnterpriseDB projects. I don't think the hold-up is with Fujii Masao.

BTW, my replication patch is on git repository:

git://git.postgresql.org/git/users/fujii/postgres.git
branch: replication

The changes against Heikki's repository (git://git.postgresql.org/git/users/heikki/postgres.git, branch: replication-orig) are:

- Prevent pq_wait from being called more than once for the connection which has already turned out to have data ready to be read.

  Sometimes walsender was calling pq_wait more than once for the connection before actually reading data. This is OK on Linux, where the subsequent pq_wait returns immediately. OTOH, on Windows, this makes the subsequent pq_wait get stuck, i.e., the pq_wait doesn't return even if there is data ready to be read in the connection. This seems to be derived from the half-baked implementation of pgwin32_select. So I changed pq_wait not to call select/poll until data was read from the connection, once it turned out to be available.

- Fix the bug where a logid boundary was crossed incorrectly. This bug was introduced by sr-paging-rework.patch.
  http://archives.postgresql.org/pgsql-hackers/2009-10/msg00384.php

- Apply the sr_rework_1001.patch.
  http://archives.postgresql.org/pgsql-hackers/2009-09/msg01996.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
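Editorial illustration of the pq_wait change described in the message above: a minimal Python sketch of the general idea, in which a wrapper remembers that a connection was already reported readable and skips further select calls until the data has actually been read. This is not the patch itself; the class name `ReadyOnceWaiter` and its fields are hypothetical, and the real change lives in C inside the backend.

```python
import select
import socket


class ReadyOnceWaiter:
    """Remember that a socket is readable until the caller reads from it."""

    def __init__(self, sock):
        self.sock = sock
        self.data_ready = False  # set by wait(), cleared by read()
        self.select_calls = 0    # counted only to make the effect visible

    def wait(self, timeout=None):
        # If an earlier wait already reported data, do not enter select
        # again; a second wait on the same pending data is the situation
        # the message describes as getting stuck on Windows.
        if self.data_ready:
            return True
        self.select_calls += 1
        readable, _, _ = select.select([self.sock], [], [], timeout)
        self.data_ready = bool(readable)
        return self.data_ready

    def read(self, nbytes):
        data = self.sock.recv(nbytes)
        # The pending data has been consumed, so the next wait must ask
        # the operating system again.
        self.data_ready = False
        return data


def demo():
    a, b = socket.socketpair()
    try:
        b.sendall(b"wal")
        waiter = ReadyOnceWaiter(a)
        first = waiter.wait(1.0)
        second = waiter.wait(1.0)  # answered from the remembered flag
        data = waiter.read(3)
        return first, second, waiter.select_calls, data
    finally:
        a.close()
        b.close()
```

In `demo()`, the two consecutive waits result in a single select call, and the flag is reset once the three bytes have been read.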
> Recently, the development of SR is not progressing because of
> the indecision on whether walreceiver should be a subprocess
> of the startup process (i.e., a stand-alone program), or of
> postmaster. Since time is running out, I'd like to discuss
> about this and advance the project.
>
> The related threads are:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01101.php
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01291.php
>
> IMO, walreceiver should be a subprocess of postmaster for
> the following reasons.
>
> 1. It's not easy to give a GUC parameter to a stand-alone
> walreceiver program. A simple approach is giving a
> parameter as a command-line argument. But this wouldn't
> cover a reload of parameter.
>
> 2. It's not easy to treat the log messages generated by
> a stand-alone walreceiver as well as the other postgres
> messages. A straightforward approach is that the startup
> process passes along the messages to the logger process.
> But this is not simple.
>
> I agree that a stand-alone walreceiver is useful for some
> cases. But I think that it's sufficient to provide that as
> contrib or pgfoundry tool. Not need to provide that in core.
> The communication interface to walsender is going to be
> provided as libpq, so it's not difficult to implement such
> a stand-alone tool.

+1. I agree with the idea that walreceiver runs as a subprocess of postmaster.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
Fujii Masao wrote:
> On Tue, Nov 3, 2009 at 12:33 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Nov 2, 2009 at 10:14 AM, Euler Taveira de Oliveira
>> <euler@timbira.com> wrote:
>>> BTW, are you going to submit another WIP patch for next commitfest?
>> Well, Heikki was going to keep working on this and Hot Standby between
>> CommitFests "until it gets committed", but things seem to be stalled
>> at the moment, possibly because Heikki is tied up with internal
>> EnterpriseDB projects. I don't think the hold-up is with Fujii Masao.
>
> BTW, my replication patch is on git repository:
>
> git://git.postgresql.org/git/users/fujii/postgres.git
> branch: replication

Thanks, I started to look at this again now. The consensus seems to be to keep the current architecture where walreceiver is a child of postmaster.

I found the global LogstreamResult variable very confusing. It meant different things in different processes. So I replaced it with static globals in walsender.c and walreceiver.c, and renamed the fields to match the purpose better. I removed some variables from shared memory that are not necessary, at least not before we have synchronous mode: Walsender only needs to publish how far it has sent, and walreceiver only needs to tell startup process how far it has fsync'd.

I changed walreceiver so that it only lets the startup process apply WAL that it has fsync'd to disk, per recent discussion on hackers. Maybe we want to support more esoteric modes in the future, but that's the least surprising and most useful one.

Plus some other minor simplifications. My changes are in my git repo at git://git.postgresql.org/git/users/heikki/postgres.git, branch "replication".

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
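Editorial illustration of the rule described in the message above, that the replaying process may only apply WAL the receiver has already fsync'd: a small Python sketch under stated assumptions. The class `WalReceiverSketch`, its field names, and the helper `replayable_bytes` are hypothetical and only model the idea of publishing a single "flushed up to" position; the real code keeps this in shared memory and is written in C.

```python
import os
import tempfile


class WalReceiverSketch:
    """Hypothetical receiver that publishes only its fsync'd position."""

    def __init__(self, path):
        self.file = open(path, "ab")
        self.written_upto = 0  # private: bytes handed to the OS so far
        self.flushed_upto = 0  # published: bytes known to be on disk

    def receive(self, chunk):
        # Data is written but may still sit in buffers, so the published
        # position is deliberately left unchanged here.
        self.file.write(chunk)
        self.written_upto += len(chunk)

    def flush(self):
        self.file.flush()
        os.fsync(self.file.fileno())
        # Only after fsync returns is the new position made visible.
        self.flushed_upto = self.written_upto

    def close(self):
        self.file.close()


def replayable_bytes(receiver, applied_upto):
    """Bytes the replaying side may apply now, never past the flushed point."""
    return max(0, receiver.flushed_upto - applied_upto)


def demo():
    with tempfile.TemporaryDirectory() as d:
        receiver = WalReceiverSketch(os.path.join(d, "wal.bin"))
        try:
            receiver.receive(b"x" * 100)
            before_flush = replayable_bytes(receiver, 0)
            receiver.flush()
            after_flush = replayable_bytes(receiver, 0)
            receiver.receive(b"y" * 50)  # received but not yet flushed
            pending = replayable_bytes(receiver, 100)
            return before_flush, after_flush, pending
        finally:
            receiver.close()
```

The design choice mirrored here is that the replaying side never sees the receiver's private write position, so it cannot apply data that might be lost if the standby crashes before the fsync completes.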
Hi,

On Fri, Nov 20, 2009 at 5:54 AM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> Thanks, I started to look at this again now.

Thanks a lot!

> I found the global LogstreamResult variable very confusing. It meant
> different things in different processes. So I replaced it with static
> globals in walsender.c and walreceiver.c, and renamed the fields to
> match the purpose better. I removed some variables from shared memory
> that are not necessary, at least not before we have synchronous mode:
> Walsender only needs to publish how far it has sent, and walreceiver
> only needs to tell startup process how far it has fsync'd.

OK.

> I changed walreceiver so that it only lets the startup process to apply
> WAL that it has fsync'd to disk, per recent discussion on hackers. Maybe
> we want to support more esoteric modes in the future, but that's the
> least surprising and most useful one.

OK. We'll need to go forward in stages.

> Plus some other minor simplifications. My changes are in my git repo at
> git://git.postgresql.org/git/users/heikki/postgres.git, branch
> "replication".

I fixed one bug. I also looked through the code over and over again.

git://git.postgresql.org/git/users/fujii/postgres.git
branch: replication

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center