Discussion: WIP partial replication patch
Hi,

attached is a WIP patch that will eventually implement partial
replication, with the following syntax:

  CREATE REPLICA CLASS classname
      [ EXCLUDING RELATION ( relname [ , ... ] ) ]
      [ EXCLUDING DATABASE ( dbname [ , ... ] ) ]

  ALTER REPLICA CLASS classname
      [ { INCLUDING | EXCLUDING } RELATION ( relname [ , ... ] ) ]
      [ { INCLUDING | EXCLUDING } DATABASE ( dbname [ , ... ] ) ]

The use case is a secondary server where read-only access is allowed
but (perhaps for space reasons) some tables and databases are excluded
from replication. The standby server keeps those tables at the state of
the last full backup; no further modifications are applied to them.

The current patch adds two new global system tables, pg_replica and
pg_replicaitem, and three new indexes to maintain the classes and their
contents. The startup process in standby mode connects to a new
database called "replication", which is created at initdb time. This is
needed because a transaction context is required for accessing the
syscache for the new tables.

There is one nasty detail in the patch as it stands: the RelFileNode
triplet is currently treated as if it carried the relation OID, but
that's not actually true — the RelFileNode contains the relfilenode ID.
Initially, before any table-rewriting DDL, the OID equals the
relfilenode, which was enough for a proof-of-concept patch. I will need
to extend the relmapper so it can carry more than one database's
"database-local" mapping info, so the filter can work on all databases
at once. To do this, every database's pg_class would have to be read
initially and re-read during relmapper cache invalidation. As a side
note, this work may also serve as a basis for full cross-database
relation access.

Best regards,
Zoltán Böszörményi
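For concreteness, the proposed grammar might be used like this. Note this is the WIP patch's proposed syntax, never committed to PostgreSQL, and the class, relation, and database names below are purely hypothetical:

```sql
-- Create a replica class that skips two large tables
-- and an entire scratch database (hypothetical names):
CREATE REPLICA CLASS reporting_standby
    EXCLUDING RELATION ( audit_log, clickstream )
    EXCLUDING DATABASE ( scratchdb );

-- Later, start replicating audit_log after all,
-- and additionally exclude another database:
ALTER REPLICA CLASS reporting_standby
    INCLUDING RELATION ( audit_log )
    EXCLUDING DATABASE ( tempdb );
```

Per the description above, the class definitions would land in the new pg_replica / pg_replicaitem catalogs, and the standby's startup process would consult them to filter WAL application.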
Attachments
Boszormenyi Zoltan <zb@cybertec.at> writes:
> attached is a WIP patch that will eventually implement
> partial replication, with the following syntax:

This fundamentally cannot work, as it relies on system catalogs to be
valid during recovery. Another rather basic problem is that you've got
to pass system catalog updates downstream (in case they affect the
tables being replicated), but if you want partial replication then many
of those updates will be incorrect for the slave machine.

More generally, though, we are going to have our hands full for the
foreseeable future trying to get the existing style of replication
bug-free and performant. I don't think we want to undertake any large
expansion of the replication feature set, at least not for some time to
come. So you can count on me to vote against committing anything like
this into core.

regards, tom lane
Tom Lane wrote:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> attached is a WIP patch that will eventually implement
>> partial replication, with the following syntax:
>
> This fundamentally cannot work, as it relies on system catalogs to be
> valid during recovery.

Just like Hot Standby, no? What is the difference here? Sorry for being
ignorant.

> Another rather basic problem is that you've got to pass system catalog
> updates downstream (in case they affect the tables being replicated)
> but if you want partial replication then many of those updates will be
> incorrect for the slave machine.

Yes, that's true. But there is an easy solution to that: querying such
tables can be forbidden — we were talking about truncating the excluded
relations internally. Currently, querying excluded tables is allowed
only so one can verify that DML indeed doesn't modify them. As I said,
at the moment it's only a proof-of-concept patch.

> More generally, though, we are going to have our hands full for the
> foreseeable future trying to get the existing style of replication
> bug-free and performant. I don't think we want to undertake any large
> expansion of the replication feature set, at least not for some time
> to come. So you can count on me to vote against committing anything
> like this into core.

Understood.

Best regards,
Zoltán Böszörményi
On Fri, Aug 13, 2010 at 09:36:00PM +0200, Boszormenyi Zoltan wrote:
> Tom Lane wrote:
>> Boszormenyi Zoltan <zb@cybertec.at> writes:
>>
>>> attached is a WIP patch that will eventually implement
>>> partial replication, with the following syntax:
>>
>> This fundamentally cannot work, as it relies on system catalogs to be
>> valid during recovery.
>
> Just like Hot Standby, no? What is the difference here?
> Sorry for being ignorant.

In HS you can only connect after you've found a restartpoint — only
after that do you know that you have reached a consistent point for the
system.

I think this is fixable by keeping more WAL on the standbys, but I need
to think more about it.

Andres
> Another rather basic problem is that you've got to pass system catalog
> updates downstream (in case they affect the tables being replicated)
> but if you want partial replication then many of those updates will be
> incorrect for the slave machine.

Couldn't this be taken care of by replicating the objects but not any
data for them? That is, the tables and indexes would exist, but be
empty?

> More generally, though, we are going to have our hands full for the
> foreseeable future trying to get the existing style of replication
> bug-free and performant. I don't think we want to undertake any large
> expansion of the replication feature set, at least not for some time
> to come. So you can count on me to vote against committing anything
> like this into core.

I imagine it'll take more than a year to get this to work, if we ever
do. It would probably be good to put it on a git branch so that those
who want to can continue long-term work on it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
>> Another rather basic problem is that you've got to pass system
>> catalog updates downstream (in case they affect the tables being
>> replicated) but if you want partial replication then many of those
>> updates will be incorrect for the slave machine.
>
> Couldn't this be taken care of by replicating the objects but not any
> data for them? That is, the tables and indexes would exist, but be
> empty?

Seems a bit pointless. What exactly is the use-case for a slave whose
system catalogs match the master exactly (as they must) but whose data
does not?

Notice also that you have to shove the entire WAL downstream anyway ---
the proposed patch filters at the point of application, and would have
a hard time doing better because LSNs have to remain consistent.

It would also be rather tricky to identify which objects have to have
updates applied, eg, if you replicate a table you'd damn well better
replicate the data for each and every one of its indexes (which is a
non-constant set in general), because queries on the slave will expect
them all to be valid. Maybe it's possible to keep track of that, though
I bet things will be tricky when there are uncommitted DDL changes
(consider data changes associated with a CREATE INDEX on a replicated
table). In any case xlog replay functions are not the place to have
that kind of logic.

regards, tom lane
Andres Freund wrote:
> On Fri, Aug 13, 2010 at 09:36:00PM +0200, Boszormenyi Zoltan wrote:
>> Tom Lane wrote:
>>> This fundamentally cannot work, as it relies on system catalogs to
>>> be valid during recovery.
>>
>> Just like Hot Standby, no? What is the difference here?
>> Sorry for being ignorant.
>
> In HS you can only connect after you've found a restartpoint — only
> after that do you know that you have reached a consistent point for
> the system.

And in this patch, the startup process only tries to connect after
signalling the postmaster that a consistent state is reached. And the
connection has a reasonable timeout built in.

> I think this is fixable by keeping more WAL on the standbys, but I
> need to think more about it.
>
> Andres

Best regards,
Zoltán Böszörményi
On Sat, Aug 14, 2010 at 08:40:24AM +0200, Boszormenyi Zoltan wrote:
> Andres Freund wrote:
>> In HS you can only connect after you've found a restartpoint — only
>> after that do you know that you have reached a consistent point for
>> the system.
>
> And in this patch, the startup process only tries to connect
> after signalling the postmaster that a consistent state is reached.
> And the connection has a reasonable timeout built in.

I don't think you can currently guarantee that you always have enough
local WAL to even reach a consistent point. Which is not a problem of
your patch, don't get me wrong...

Andres
Andres Freund <andres@anarazel.de> writes:
> On Sat, Aug 14, 2010 at 08:40:24AM +0200, Boszormenyi Zoltan wrote:
>> And in this patch, the startup process only tries to connect
>> after signalling the postmaster that a consistent state is reached.
>> And the connection has a reasonable timeout built in.
>
> I don't think you currently can guarantee you always have enough
> local WAL to even reach a consistent point.

Even if you do, the patch will malfunction (and perhaps corrupt the
database) while reading that WAL. Yes, it'd work once you reach a
consistent database state, but bootstrapping a slave into that
condition will be far more painful than it is with the current
replication code.

regards, tom lane