Discussion: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
> Heikki Linnakangas wrote:
>
> I don't like the idea of adding the origin id to the record header.
> It's only required in some occasions, and on some record types.

Right.

> And I'm worried it might not even be enough in more complicated
> scenarios.
>
> Perhaps we need a more generic WAL record annotation system, where
> a plugin can tack arbitrary information to WAL records. The extra
> information could be stored in the WAL record after the rmgr
> payload, similar to how backup blocks are stored. WAL replay could
> just ignore the annotations, but a replication system could use it
> to store the origin id or whatever extra information it needs.

Not only would that handle absolute versus relative updates and origin
id, but application frameworks could take advantage of such a system for
passing transaction metadata. I've held back on one concern so far that
I'll bring up now because this suggestion would address it nicely.

Our current trigger-driven logical replication includes a summary which
covers transaction run time, commit time, the transaction type
identifier, the source code line from which that transaction was
invoked, the user ID with which the user connected to the application
(which isn't the same as the database login), etc. Being able to
"decorate" a database transaction with arbitrary (from the DBMS POV)
metadata would be very valuable. In fact, our shop can't maintain the
current level of capabilities without *some* way to associate such
information with a transaction.

I think that using up the only unused space in the fixed header to
capture one piece of the transaction metadata needed for logical
replication, and that only in some configurations, is short-sighted. If
we solve the general problem of transaction metadata, this one specific
case will fall out of that.

I think removing origin ID from this patch and submitting a separate
patch for a generalized transaction metadata system is the sensible way
to go.

-Kevin
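The annotation layout Heikki proposes, with extra information stored after the rmgr payload and ignored by replay, can be sketched roughly like this. All struct and function names here are invented for illustration; this is not PostgreSQL's actual record format:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Hypothetical layout sketch: annotations appended after the rmgr
 * payload, the way backup blocks follow it today.  Only the two
 * length fields matter for the idea; everything here is invented. */
typedef struct SketchRecord {
    uint32_t xl_tot_len;   /* total length: payload + annotations */
    uint32_t xl_len;       /* length of the rmgr payload only */
    /* payload bytes, then annotation bytes, follow the header */
} SketchRecord;

/* WAL replay would read only the first xl_len bytes; a replication
 * plugin could look past them for its annotation, if any. */
static const char *annotation_start(const SketchRecord *rec,
                                    const char *data)
{
    if (rec->xl_tot_len <= rec->xl_len)
        return NULL;            /* no annotation present */
    return data + rec->xl_len;  /* annotations follow the payload */
}

static uint32_t annotation_len(const SketchRecord *rec)
{
    return rec->xl_tot_len - rec->xl_len;
}
```

Replay stays oblivious: it never looks beyond `xl_len`, so a record with or without an annotation replays identically.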
On 20 June 2012 20:37, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:

>> Heikki Linnakangas wrote:
>>
>> I don't like the idea of adding the origin id to the record header.
>> It's only required in some occasions, and on some record types.
>
> Right.

Wrong, as explained.

>> And I'm worried it might not even be enough in more complicated
>> scenarios.
>>
>> Perhaps we need a more generic WAL record annotation system, where
>> a plugin can tack arbitrary information to WAL records. The extra
>> information could be stored in the WAL record after the rmgr
>> payload, similar to how backup blocks are stored. WAL replay could
>> just ignore the annotations, but a replication system could use it
>> to store the origin id or whatever extra information it needs.
>
> Not only would that handle absolute versus relative updates and
> origin id, but application frameworks could take advantage of such a
> system for passing transaction metadata. I've held back on one
> concern so far that I'll bring up now because this suggestion would
> address it nicely.
>
> Our current trigger-driven logical replication includes a summary
> which covers transaction run time, commit time, the transaction type
> identifier, the source code line from which that transaction was
> invoked, the user ID with which the user connected to the application
> (which isn't the same as the database login), etc. Being able to
> "decorate" a database transaction with arbitrary (from the DBMS POV)
> metadata would be very valuable. In fact, our shop can't maintain
> the current level of capabilities without *some* way to associate
> such information with a transaction.
>
> I think that using up the only unused space in the fixed header to
> capture one piece of the transaction metadata needed for logical
> replication, and that only in some configurations, is short-sighted.
> If we solve the general problem of transaction metadata, this one
> specific case will fall out of that.

The proposal now includes flag bits that would allow the addition of a
variable-length header, should that ever become necessary. So the unused
space in the fixed header is not being "used up", as you say. In any
case, the fixed header still has 4 wasted bytes on 64-bit systems even
after the patch is applied. So this claim of short-sightedness is just
plain wrong.

It isn't true that this is needed only for some configurations of
multi-master, per discussion.

This is not transaction metadata; it is WAL record metadata required for
multi-master replication. See the later point.

We need to add information to every WAL record that is used as the
source for generating LCRs. It is also possible to add this to HEAP and
HEAP2 records, but doing that *will* bloat the WAL stream, whereas using
the *currently wasted* bytes in the WAL record header does *not* bloat
the WAL stream.

> I think removing origin ID from this patch and submitting a separate
> patch for a generalized transaction metadata system is the sensible
> way to go.

We already have a very flexible WAL system for recording data of
interest to various resource managers. If you wish to annotate a
transaction, you can either generate a new kind of WAL record or enhance
a commit record. There are already unused flag bits on commit records
for just such a purpose. XLOG_NOOP records can already be generated by
your application if you wish to inject additional metadata into the WAL
stream. So no changes are required for you to implement the generalised
transaction metadata scheme you say you require. I'm not sure how or why
that relates to requirements for multi-master.

Please note that I've suggested review changes to Andres' work myself.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
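For concreteness, the header layout being argued about can be sketched as below. Field and flag names are invented for illustration; the real structure is PostgreSQL's XLogRecord, and this sketch only mirrors its rough shape, with a 2-byte origin id occupying bytes that would otherwise be alignment padding:

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical flag bit reserving the option of a variable-length
 * header extension later, as the proposal describes. */
#define SKETCH_INFO_VAR_HEADER 0x80

typedef struct SketchHeader {
    uint64_t xl_prev;      /* stand-in for the previous-record pointer */
    uint32_t xl_xid;       /* transaction id */
    uint32_t xl_tot_len;   /* total record length */
    uint8_t  xl_info;      /* flag bits */
    uint8_t  xl_rmid;      /* resource manager id */
    uint16_t xl_origin_id; /* lives where 2 padding bytes used to be */
    /* on a typical 64-bit build, 4 trailing padding bytes remain,
     * matching the "4 wasted bytes" mentioned in the thread */
} SketchHeader;

/* A reader checks the flag before looking for an extended header. */
static int has_var_header(const SketchHeader *h)
{
    return (h->xl_info & SKETCH_INFO_VAR_HEADER) != 0;
}
```

Because the origin id sits in former padding, `sizeof(SketchHeader)` is unchanged by adding the field, which is the crux of the "does not bloat the WAL stream" argument.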
Simon Riggs <simon@2ndQuadrant.com> wrote:

> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>> Heikki Linnakangas wrote:
>>>
>>> I don't like the idea of adding the origin id to the record
>>> header. It's only required in some occasions, and on some record
>>> types.
>>
>> Right.
>
> Wrong, as explained.

The point is not wrong; you are simply not responding to what is being
said.

You have not explained why an origin ID is required when there is no
replication, or if there is master/slave logical replication, or there
are multiple masters with non-overlapping primary keys replicating to a
single table in a consolidated database, or each master replicates to
all other masters directly, or any of various other scenarios raised on
this thread. You've only explained why it's necessary for certain
configurations of multi-master replication where all rows in a table can
be updated on any of the masters. I understand that this is the
configuration you find most interesting, at least for initial
implementation. That does not mean that the other situations don't exist
as use cases or should not be considered in the overall design.

I don't think there is anyone here who would not love to see this effort
succeed, all the way to multi-master replication in the configuration
you are emphasizing. What is happening is that people are expressing
concerns about parts of the design which they feel are problematic, and
brainstorming about possible alternatives. As I'm sure you know, fixing
a design problem at this stage in development is a lot less expensive
than letting the problem slide and trying to deal with it later.

> It isn't true that this is needed only for some configurations of
> multi-master, per discussion.

I didn't get that out of the discussion; I saw a lot of cases mentioned
as not needing it, to which you simply did not respond.

> This is not transaction metadata, it is WAL record metadata
> required for multi-master replication, see later point.
>
> We need to add information to every WAL record that is used as the
> source for generating LCRs.

If the origin ID of a transaction doesn't count as transaction metadata
(i.e., data about the transaction), what does? It may be a metadata
element about which you have special concerns, but it is transaction
metadata. You don't plan on supporting individual WAL records within a
transaction containing different values for origin ID, do you? If not,
why is it something to store in every WAL record rather than once per
transaction? That's not intended to be a rhetorical question. I think
it's because you're still thinking of the WAL stream as *the medium* for
logical replication data rather than *the source* of logical replication
data.

As long as the WAL stream is the medium, options are very constrained.
You can code a very fast engine to handle a single type of configuration
that way, and perhaps that should be a supported feature, but it's not a
configuration I've needed yet. (Well, on reflection, if it had been
available and easy to use, I can think of *one* time I *might* have used
it for a pair of nodes.) It seems to me that you are so focused on this
one use case that you are not considering how design choices which
facilitate fast development of that use case paint us into a corner in
terms of expanding to other use cases.

>> I think removing origin ID from this patch and submitting a
>> separate patch for a generalized transaction metadata system is
>> the sensible way to go.
>
> We already have a very flexible WAL system for recording data of
> interest to various resource managers. If you wish to annotate a
> transaction, you can either generate a new kind of WAL record or
> you can enhance a commit record.

Right. Like many of us are suggesting should be done for origin ID.

> XLOG_NOOP records can already be generated by your application if
> you wish to inject additional metadata to the WAL stream. So no
> changes are required for you to implement the generalised
> transaction metadata scheme you say you require.

I'm glad it's that easy. Are there SQL functions for that yet?

> Not sure how or why that relates to requirements for multi-master.

That depends on whether you want to leave the door open to other logical
replication than the one use case on which you are currently focused. I
even consider some of those other cases multi-master, especially when
multiple databases are replicating to a single table on another server.
I'm not clear on your definition -- it seems to be rather more narrow.
Maybe we need to define some terms somewhere to facilitate discussion.
Is there a Wiki page where that would make sense?

-Kevin
On Wednesday, June 20, 2012 05:34:42 PM Kevin Grittner wrote:

> Simon Riggs <simon@2ndQuadrant.com> wrote:
>> This is not transaction metadata, it is WAL record metadata
>> required for multi-master replication, see later point.
>>
>> We need to add information to every WAL record that is used as the
>> source for generating LCRs.
>
> If the origin ID of a transaction doesn't count as transaction
> metadata (i.e., data about the transaction), what does? It may be a
> metadata element about which you have special concerns, but it is
> transaction metadata. You don't plan on supporting individual WAL
> records within a transaction containing different values for origin
> ID, do you? If not, why is it something to store in every WAL
> record rather than once per transaction? That's not intended to be
> a rhetorical question.

It's definitely possible to store it per transaction (see the discussion
around
http://archives.postgresql.org/message-id/201206201605.43634.andres@2ndquadrant.com);
it just makes filtering via the originating node a considerably more
complex thing. With our proposal you can do it without any complexity
involved, on a low level. Storing it per transaction means you can only
stream out the data to other nodes *after* fully reassembling the
transaction. That's a pity, especially if we go for a design where the
decoding happens in a proxy instance.

Other metadata will not be needed on such a low level.

I also have to admit that I am very hesitant to start developing some
generic "transaction metadata" framework at the moment. That seems to be
a good way to spend a good part of the time in discussion and
disagreement. IMO that's something for later.

> I think it's because you're still thinking
> of the WAL stream as *the medium* for logical replication data
> rather than *the source* of logical replication data.

I don't think that's true. See the above-referenced subthread for the
reasons why I think the origin id is important.

Andres

--
Andres Freund
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20 June 2012 23:34, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:

> Simon Riggs <simon@2ndQuadrant.com> wrote:
>> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>>>> Heikki Linnakangas wrote:
>>>>
>>>> I don't like the idea of adding the origin id to the record
>>>> header. It's only required in some occasions, and on some record
>>>> types.
>>>
>>> Right.
>>
>> Wrong, as explained.
>
> The point is not wrong; you are simply not responding to what is
> being said.

Heikki said that the origin ID was not required for all MMR
configs/scenarios. IMHO that is wrong, with the explanation given. By
agreeing with him, I assumed you were sharing that assertion, rather
than saying something else.

> You have not explained why an origin ID is required when there is no
> replication, or if there is master/slave logical replication, or ...

You're right; I never claimed it was needed. The origin id is only
needed for multi-master replication, and that is the only context in
which I've discussed it.

>> This is not transaction metadata, it is WAL record metadata
>> required for multi-master replication, see later point.
>>
>> We need to add information to every WAL record that is used as the
>> source for generating LCRs.
>
> If the origin ID of a transaction doesn't count as transaction
> metadata (i.e., data about the transaction), what does? It may be a
> metadata element about which you have special concerns, but it is
> transaction metadata. You don't plan on supporting individual WAL
> records within a transaction containing different values for origin
> ID, do you? If not, why is it something to store in every WAL
> record rather than once per transaction? That's not intended to be
> a rhetorical question. I think it's because you're still thinking
> of the WAL stream as *the medium* for logical replication data
> rather than *the source* of logical replication data.
>
> As long as the WAL stream is the medium, options are very
> constrained. You can code a very fast engine to handle a single
> type of configuration that way, and perhaps that should be a
> supported feature, but it's not a configuration I've needed yet.
> (Well, on reflection, if it had been available and easy to use, I
> can think of *one* time I *might* have used it for a pair of nodes.)
> It seems to me that you are so focused on this one use case that you
> are not considering how design choices which facilitate fast
> development of that use case paint us into a corner in terms of
> expanding to other use cases.
>
>>> I think removing origin ID from this patch and submitting a
>>> separate patch for a generalized transaction metadata system is
>>> the sensible way to go.
>>
>> We already have a very flexible WAL system for recording data of
>> interest to various resource managers. If you wish to annotate a
>> transaction, you can either generate a new kind of WAL record or
>> you can enhance a commit record.
>
> Right. Like many of us are suggesting should be done for origin ID.
>
>> XLOG_NOOP records can already be generated by your application if
>> you wish to inject additional metadata to the WAL stream. So no
>> changes are required for you to implement the generalised
>> transaction metadata scheme you say you require.
>
> I'm glad it's that easy. Are there SQL functions for that yet?

Yes, another possible design is to generate a new kind of WAL record for
the origin id. Doing it that way will slow down multi-master by a
measurable amount, and slightly bloat the WAL stream.

The proposed way uses space that is currently wasted and likely to
remain so. Only 2 of the 6 available bytes are proposed for use, with a
flag design that allows future extension if required. When MMR is not in
use, the WAL records would look completely identical to the way they
look now, in size, settings and speed of writing them. Putting the
origin id onto each WAL record allows very fast and simple stateless
filtering.

I suggest using those bytes because they have been sitting there unused
for close to 10 years now and no better use springs to mind. The
proposed design is the fastest way of implementing MMR, without any loss
for non-users. As I noted before, slowing down MMR by a small amount
causes geometric losses in performance across the whole cluster.

>> Not sure how or why that relates to requirements for multi-master.
>
> That depends on whether you want to leave the door open to other
> logical replication than the one use case on which you are currently
> focused. I even consider some of those other cases multi-master,
> especially when multiple databases are replicating to a single table
> on another server. I'm not clear on your definition -- it seems to
> be rather more narrow. Maybe we need to define some terms somewhere
> to facilitate discussion. Is there a Wiki page where that would
> make sense?

The project is called BiDirectional Replication to ensure that people
understand this is not just multi-master. But that doesn't mean that
multi-master can't have its own specific requirements.

Adding the origin id is also useful for the use case you mention, since
it's useful to know where the data came from for validation. So having
an origin id on each insert record would be important. That case must
also handle conflicts from duplicate inserts, and origin id priority is
then an option for conflict handling.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
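The "fast and simple stateless filtering" claim can be illustrated with a minimal sketch. All names here are hypothetical; the point is that a per-record origin id lets a sender decide record by record, with no transaction reassembly or any other state:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Invented stand-in for a record header carrying an origin id. */
typedef struct SketchRecHeader {
    uint16_t xl_origin_id;
    /* ... rest of the header and payload would follow ... */
} SketchRecHeader;

/* Forward a record to 'target_node' only if it did not originate
 * there; this is what stops changes echoing back to their source
 * in a multi-master topology. */
static int should_forward(const SketchRecHeader *rec,
                          uint16_t target_node)
{
    return rec->xl_origin_id != target_node;
}

/* Count how many records in a buffer would be forwarded; note the
 * loop needs no per-transaction bookkeeping at all. */
static size_t count_forwarded(const SketchRecHeader *recs, size_t n,
                              uint16_t target_node)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (should_forward(&recs[i], target_node))
            kept++;
    return kept;
}
```

Storing the origin per transaction instead would force this loop to first group records by transaction, which is the extra complexity Andres objects to upthread.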
On Wed, Jun 20, 2012 at 11:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:

> On Wednesday, June 20, 2012 05:34:42 PM Kevin Grittner wrote:
>> Simon Riggs <simon@2ndQuadrant.com> wrote:
>>> This is not transaction metadata, it is WAL record metadata
>>> required for multi-master replication, see later point.
>>>
>>> We need to add information to every WAL record that is used as the
>>> source for generating LCRs.
>>
>> If the origin ID of a transaction doesn't count as transaction
>> metadata (i.e., data about the transaction), what does? It may be a
>> metadata element about which you have special concerns, but it is
>> transaction metadata. You don't plan on supporting individual WAL
>> records within a transaction containing different values for origin
>> ID, do you? If not, why is it something to store in every WAL
>> record rather than once per transaction? That's not intended to be
>> a rhetorical question.
>
> It's definitely possible to store it per transaction (see the
> discussion around
> http://archives.postgresql.org/message-id/201206201605.43634.andres@2ndquadrant.com);
> it just makes filtering via the originating node a considerably more
> complex thing. With our proposal you can do it without any complexity
> involved, on a low level. Storing it per transaction means you can
> only stream out the data to other nodes *after* fully reassembling
> the transaction. That's a pity, especially if we go for a design
> where the decoding happens in a proxy instance.

I guess I'm not seeing the purpose of having the origin node id in the
WAL stream either.

We have it in the Slony sl_log_* stream; however, there is a crucial
difference, in that sl_log_* is expressly a shared structure. In
contrast, WAL isn't directly sharable; you don't mix together multiple
WAL streams.

It seems as though the point in time at which you need to know the
origin ID is the moment at which you're deciding to read data from the
WAL files, and knowing which stream you are reading from is an assertion
that might be satisfied by looking at configuration that doesn't need to
be in the WAL stream itself. It might be *nice* for the WAL stream to be
self-identifying, but that doesn't seem to be forcibly necessary.

The case where it *would* be needful is if you are in the process of
assembling together updates coming in from multiple masters, and need to
know:

- This INSERT was replicated from node #1, so should be ignored downstream
- That INSERT was replicated from node #2, so should be ignored downstream
- This UPDATE came from the local node, so needs to be passed to downstream users

Or perhaps something else is behind the node id being deeply embedded
into the stream that I'm not seeing altogether.

> Other metadata will not be needed on such a low level.
>
> I also have to admit that I am very hesitant to start developing some
> generic "transaction metadata" framework at the moment. That seems to
> be a good way to spend a good part of the time in discussion and
> disagreement. IMO that's something for later.

Well, I see there being a use in there being at least 3 sorts of LCR
records:

a) Capturing literal SQL that is to be replayed downstream.

   This parallels two use cases existing in existing replication
   systems:

   i) In pre-2.2 versions of Slony, statements are replayed literally.
      So there's a stream of INSERT/UPDATE/DELETE statements.

   ii) DDL capture and replay. In existing replication systems, DDL
      isn't captured implicitly, the way Dimitri's Event Triggers are
      intended to do, but rather is captured explicitly. There should
      be a function to allow injecting such SQL explicitly; that is
      sure to be a useful sort of thing to be able to do.

b) Capturing tuple updates in a binary form that can be turned readily
   into heap updates on a replica.

   Unfortunately, this form is likely not to play well when replicating
   across platforms or Postgres versions, so I suspect that this
   performance optimization should be implemented as a *last* resort,
   rather than first. Michael Jackson had some "rules of optimization"
   that said "don't do it", and, for the expert, "don't do it YET..."

c) Capturing tuple data in some reasonably portable and readily
   re-writable form.

   Slony 2.2 changes from "SQL fragments" (of a) i) above) to storing
   updates as an array of text values indicating:
   - relation name
   - attribute names
   - attribute values, serialized into strings

   I don't know that this provably represents the *BEST*
   representation, but it definitely will be portable where b) would
   not be, and lends itself to being able to reuse query plans, where
   a) requires extraordinary amounts of parsing work, today. So I'm
   pretty sure it's better than a) and b) for a sizable set of cases.

--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
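Option c) can be sketched as a simple structure. This is illustrative only, not Slony's actual format; it just shows how a relation name plus parallel arrays of attribute names and text-serialized values stays portable across platforms and versions, and easy to rewrite:

```c
#include <string.h>
#include <stddef.h>
#include <assert.h>

#define LCR_MAX_ATTS 16   /* arbitrary bound for the sketch */

/* A tuple change carried as text: nothing here depends on the
 * on-disk heap tuple layout, endianness, or server version. */
typedef struct LcrTuple {
    const char *relname;                 /* target relation */
    int         natts;                   /* number of columns carried */
    const char *attnames[LCR_MAX_ATTS];  /* column names */
    const char *values[LCR_MAX_ATTS];    /* values serialized as text */
} LcrTuple;

/* Look up a column value by attribute name; NULL if absent.  An
 * applier could instead bind all values positionally to a prepared
 * statement, which is where the query-plan reuse comes from. */
static const char *lcr_get(const LcrTuple *t, const char *attname)
{
    for (int i = 0; i < t->natts; i++)
        if (strcmp(t->attnames[i], attname) == 0)
            return t->values[i];
    return NULL;
}
```

Because every value is already a string, the apply side can build one parameterized `INSERT`/`UPDATE` per relation and reuse its plan, avoiding the per-statement parsing cost of option a).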
Hi Chris!

On Wednesday, June 20, 2012 07:06:28 PM Christopher Browne wrote:

> On Wed, Jun 20, 2012 at 11:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On Wednesday, June 20, 2012 05:34:42 PM Kevin Grittner wrote:
>>> Simon Riggs <simon@2ndQuadrant.com> wrote:
>>>> This is not transaction metadata, it is WAL record metadata
>>>> required for multi-master replication, see later point.
>>>>
>>>> We need to add information to every WAL record that is used as the
>>>> source for generating LCRs.
>>>
>>> If the origin ID of a transaction doesn't count as transaction
>>> metadata (i.e., data about the transaction), what does? It may be a
>>> metadata element about which you have special concerns, but it is
>>> transaction metadata. You don't plan on supporting individual WAL
>>> records within a transaction containing different values for origin
>>> ID, do you? If not, why is it something to store in every WAL
>>> record rather than once per transaction? That's not intended to be
>>> a rhetorical question.
>>
>> It's definitely possible to store it per transaction (see the
>> discussion around
>> http://archives.postgresql.org/message-id/201206201605.43634.andres@2ndquadrant.com);
>> it just makes filtering via the originating node a considerably more
>> complex thing. With our proposal you can do it without any
>> complexity involved, on a low level. Storing it per transaction
>> means you can only stream out the data to other nodes *after* fully
>> reassembling the transaction. That's a pity, especially if we go for
>> a design where the decoding happens in a proxy instance.
>
> I guess I'm not seeing the purpose of having the origin node id in
> the WAL stream either.
>
> We have it in the Slony sl_log_* stream, however there is a crucial
> difference, in that sl_log_* is expressly a shared structure. In
> contrast, WAL isn't directly sharable; you don't mix together
> multiple WAL streams.
>
> It seems as though the point in time at which you need to know the
> origin ID is the moment at which you're deciding to read data from
> the WAL files, and knowing which stream you are reading from is an
> assertion that might be satisfied by looking at configuration that
> doesn't need to be in the WAL stream itself. It might be *nice* for
> the WAL stream to be self-identifying, but that doesn't seem to be
> forcibly necessary.
>
> The case where it *would* be needful is if you are in the process of
> assembling together updates coming in from multiple masters, and need
> to know:
> - This INSERT was replicated from node #1, so should be ignored downstream
> - That INSERT was replicated from node #2, so should be ignored downstream
> - This UPDATE came from the local node, so needs to be passed to downstream users

Exactly, that is the point. And you want to do that in an efficient
manner without too much logic; that's why something simple like the
record header is so appealing.

>> I also have to admit that I am very hesitant to start developing
>> some generic "transaction metadata" framework at the moment. That
>> seems to be a good way to spend a good part of the time in
>> discussion and disagreement. IMO that's something for later.
>
> Well, I see there being a use in there being at least 3 sorts of LCR
> records:
> a) Capturing literal SQL that is to be replayed downstream
> b) Capturing tuple updates in a binary form that can be turned
>    readily into heap updates on a replica.
> c) Capturing tuple data in some reasonably portable and readily
>    re-writable form

I think we should provide the utilities to do all of those. a) is a
consequence of being able to do c). That doesn't really have anything to
do with this subthread, though; the part you quoted above was my
response to the suggestion to add some generic framework to attach
metadata to individual transactions on the generating side. We quite
possibly will end up needing that, but I personally don't think we
should be designing that part at the moment.

> b) Capturing tuple updates in a binary form that can be turned
>    readily into heap updates on a replica.
>
>    Unfortunately, this form is likely not to play well when
>    replicating across platforms or Postgres versions, so I suspect
>    that this performance optimization should be implemented as a
>    *last* resort, rather than first. Michael Jackson had some "rules
>    of optimization" that said "don't do it", and, for the expert,
>    "don't do it YET..."

Well, apply is a bottleneck. Besides field experience, I/we have
benchmarked it, and it's rather plausible that it is. And I don't think
we can magically make that faster in pg in general, so my plan is to
remove the biggest cost factor I can see. And yes, it will have
restrictions...

Regards,

Andres Freund

--
Andres Freund
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 21 June 2012 01:06, Christopher Browne <cbbrowne@gmail.com> wrote:

> I guess I'm not seeing the purpose of having the origin node id in
> the WAL stream either.
>
> We have it in the Slony sl_log_* stream, however there is a crucial
> difference, in that sl_log_* is expressly a shared structure. In
> contrast, WAL isn't directly sharable; you don't mix together
> multiple WAL streams.

Unfortunately, you do. That's really the core of how this differs from
current Slony.

Every change we make creates WAL records, whether those are changes
originating on the current node, or changes originating on upstream
nodes that need to be applied on the current node. The WAL stream is
then read and filtered for changes to pass on to other nodes, so we want
to be able to filter out the applied changes to avoid passing them back
to the original nodes.

Having each record know its origin makes the filtering much simpler, so
if it's possible to do that efficiently then it's the best design. It
turns out to be the best way to do this so far known. There are other
designs, however, as noted. In all cases we need the origin id in the
WAL.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20.06.2012 16:46, Simon Riggs wrote:

> The proposal now includes flag bits that would allow the addition of a
> variable length header, should that ever become necessary. So the
> unused space in the fixed header is not being "used up" as you say. In
> any case, the fixed header still has 4 wasted bytes on 64bit systems
> even after the patch is applied. So this claim of short sightedness is
> just plain wrong.
>
> ...
>
> We need to add information to every WAL record that is used as the
> source for generating LCRs. It is also possible to add this to HEAP
> and HEAP2 records, but doing that *will* bloat the WAL stream, whereas
> using the *currently wasted* bytes on a WAL record header does *not*
> bloat the WAL stream.

Or, we could provide a mechanism for resource managers to use those
padding bytes for whatever data they wish to use. Or modify the record
format so that the last 4 bytes of the data in the WAL record are always
automatically stored in those padding bytes, thus making all WAL records
4 bytes shorter. That would make the WAL even more compact, with only a
couple of extra CPU instructions in the critical path.

My point is that it's wrong to think that it's free to use those bytes,
just because they're currently unused. If we use them for one thing, we
can't use them for other things anymore. If we're so concerned about WAL
bloat that we can't afford to add any more bytes to the WAL record
header or heap WAL records, then it would be equally fruitful to look at
ways to use those padding bytes to save that precious WAL space.

I don't think we're *that* concerned about the WAL bloat, however. So
let's see what is the most sensible place to add whatever extra
information we need in the WAL, from the point of view of
maintainability, flexibility, readability etc. Then we can decide where
to put it.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
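Heikki's second idea, automatically stashing the last 4 bytes of each record's data in the header's padding bytes, could look roughly like the sketch below. The names and layout are purely illustrative, not code from any actual proposal:

```c
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Invented header: 4 former padding bytes now hold the data tail. */
typedef struct PackedHeader {
    uint32_t xl_tot_len;   /* original (unpacked) data length */
    uint8_t  stashed[4];   /* former padding, holds the last 4 bytes */
} PackedHeader;

/* Split 'data' (len must be >= 4): the tail goes into the header,
 * and only the first len-4 bytes would be written after it.
 * Returns the shortened on-disk body length. */
static uint32_t pack_tail(PackedHeader *h, const uint8_t *data,
                          uint32_t len)
{
    h->xl_tot_len = len;
    memcpy(h->stashed, data + len - 4, 4);
    return len - 4;
}

/* Reassemble the full data into 'out', which must have room for
 * h->xl_tot_len bytes; 'body' is the shortened on-disk portion. */
static void unpack_tail(const PackedHeader *h, const uint8_t *body,
                        uint8_t *out)
{
    memcpy(out, body, h->xl_tot_len - 4);
    memcpy(out + h->xl_tot_len - 4, h->stashed, 4);
}
```

The two `memcpy` calls are the "couple of extra CPU instructions in the critical path" Heikki refers to; every record ends up 4 bytes shorter on disk.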
On 21 June 2012 02:45, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> On 20.06.2012 16:46, Simon Riggs wrote:
>> The proposal now includes flag bits that would allow the addition of a
>> variable length header, should that ever become necessary. So the
>> unused space in the fixed header is not being "used up" as you say. In
>> any case, the fixed header still has 4 wasted bytes on 64bit systems
>> even after the patch is applied. So this claim of short sightedness is
>> just plain wrong.
>>
>> ...
>>
>> We need to add information to every WAL record that is used as the
>> source for generating LCRs. It is also possible to add this to HEAP
>> and HEAP2 records, but doing that *will* bloat the WAL stream, whereas
>> using the *currently wasted* bytes on a WAL record header does *not*
>> bloat the WAL stream.

Wonderful ideas; these look good.

> Or, we could provide a mechanism for resource managers to use those
> padding bytes for whatever data they wish to use.

Sounds better to me.

> Or modify the record format so that
> the last 4 bytes of the data in the WAL record are always automatically
> stored in those padding bytes, thus making all WAL records 4 bytes
> shorter. That would make the WAL even more compact, with only a couple
> of extra CPU instructions in the critical path.

Sounds cool, but a little weird, even for me.

> My point is that it's wrong to think that it's free to use those bytes,
> just because they're currently unused. If we use them for one thing, we
> can't use them for other things anymore. If we're so concerned about
> WAL bloat that we can't afford to add any more bytes to the WAL record
> header or heap WAL records, then it would be equally fruitful to look
> at ways to use those padding bytes to save that precious WAL space.

Agreed. Thanks for sharing those ideas. Exactly why I like the list
(really...)

> I don't think we're *that* concerned about the WAL bloat, however. So
> let's see what is the most sensible place to add whatever extra
> information we need in the WAL, from the point of view of
> maintainability, flexibility, readability etc. Then we can decide where
> to put it.

Removing FPWs is still the most important aspect there.

I think allowing rmgrs to redefine the wasted bytes in the header is the
best idea.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 21 June 2012 02:56, Simon Riggs <simon@2ndquadrant.com> wrote:

> I think allowing rmgrs to redefine the wasted bytes in the header is
> the best idea.

Hmm, I think the best idea is to save 2 bytes off the WAL header for
all records, so there are no wasted bytes on 64bit or 32bit.

That way the potential for use goes away and there's benefit for all,
plus no argument about how to use those bytes in rarer cases.

I'll work on that.

And then we just put the originid on each heap record for MMR, in some
manner, discussed later.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
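[Why the header has "wasted" bytes at all comes down to struct alignment, which is worth making concrete. The layouts below are hypothetical, simplified stand-ins for the real XLogRecord with invented field names: a uint64_t member forces 8-byte alignment on typical 64-bit ABIs, so the compiler rounds the struct size up, leaving trailing padding. Claiming those bytes for a 2-byte field costs nothing, while merely removing 2 bytes of fields would not shrink the struct unless the total drops past an alignment boundary.]

```c
#include <assert.h>
#include <stdint.h>

/* Fields sum to 22 bytes, but the uint64_t forces 8-byte alignment
 * on common 64-bit ABIs, so sizeof rounds up to 24: 2 padding bytes. */
typedef struct
{
    uint64_t    xl_prev;        /* 8 bytes, forces 8-byte alignment */
    uint32_t    xl_tot_len;     /* 4 */
    uint32_t    xl_xid;         /* 4 */
    uint32_t    xl_crc;         /* 4 */
    uint8_t     xl_info;        /* 1 */
    uint8_t     xl_rmid;        /* 1 */
    /* 2 trailing padding bytes inserted by the compiler */
} PaddedHeader;

/* Same fields plus a 2-byte origin id occupying the padding:
 * the struct is no bigger than before. */
typedef struct
{
    uint64_t    xl_prev;
    uint32_t    xl_tot_len;
    uint32_t    xl_xid;
    uint32_t    xl_crc;
    uint8_t     xl_info;
    uint8_t     xl_rmid;
    uint16_t    xl_origin;      /* claims the padding bytes for free */
} FilledHeader;
```

This is the sense in which using the padding does not bloat the WAL stream, and also why "saving 2 bytes" only helps if it pushes the header across an alignment boundary.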
On 20.06.2012 22:11, Simon Riggs wrote:
> On 21 June 2012 02:56, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>> I think allowing rmgrs to redefine the wasted bytes in the header is
>> the best idea.
>
> Hmm, I think the best idea is to save 2 bytes off the WAL header for
> all records, so there are no wasted bytes on 64bit or 32bit.
>
> That way the potential for use goes away and there's benefit for all,
> plus no argument about how to use those bytes in rarer cases.
>
> I'll work on that.

I don't think that's actually necessary, the WAL bloat isn't *that* bad
that we need to start shaving bytes from there. I was just trying to
make a point.

> And then we just put the originid on each heap record for MMR, in some
> manner, discussed later.

I reserve the right to object to that, too :-). Others raised the
concern that a 16-bit integer is not a very intuitive identifier. Also,
as discussed, for more complex scenarios just the originid is not
sufficient. ISTM that we need more flexibility.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
On 21 June 2012 03:23, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

>> And then we just put the originid on each heap record for MMR, in some
>> manner, discussed later.
>
> I reserve the right to object to that, too :-).

OK. But that would be only for MMR, using special record types.

> Others raised the concern that a 16-bit integer is not a very
> intuitive identifier.

Of course.

> Also, as discussed, for more complex scenarios just the originid is
> not sufficient. ISTM that we need more flexibility.

Of course.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wednesday, June 20, 2012 09:23:34 PM Heikki Linnakangas wrote:
>> And then we just put the originid on each heap record for MMR, in some
>> manner, discussed later.
>
> I reserve the right to object to that, too :-). Others raised the
> concern that a 16-bit integer is not a very intuitive identifier. Also,
> as discussed, for more complex scenarios just the originid is not
> sufficient. ISTM that we need more flexibility.

I think the "16bit integer is unintuitive" argument isn't that
interesting. As pointed out by multiple people in the thread, the
origin_id can be local and mapped to something more complex in the
communication between the different nodes and in the configuration.
Before applying changes from another node you look up their "complex
id" in the local mapping to get the 16bit origin_id, which then gets
written into the WAL stream. When decoding the WAL stream into the LCR
stream it is mapped the other way.

We might need more information than that at a later point, but that
probably won't be needed during the low-level filtering of WAL before
reassembling it into transactions...

Andres

-- 
Andres Freund                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
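[The two-way mapping Andres describes can be sketched as a small local registry. This is illustrative only: the registry, its size limit, and the function names are invented here, and a real implementation would presumably keep the mapping in a catalog and make it crash-safe. The point is just that the WAL stream only ever carries the compact local 16-bit id, while the "complex" node identifier lives outside the WAL.]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local registry: maps a node's "complex" identifier
 * (here a string, e.g. a UUID or connection string) to the compact
 * 16-bit origin_id that actually gets written into the WAL stream. */
#define MAX_ORIGINS 16

static const char *origin_names[MAX_ORIGINS];
static int         n_origins = 0;

/* Look up (or assign) the local 16-bit id for a remote node,
 * used before applying that node's changes. */
static uint16_t
origin_id_for(const char *node_name)
{
    for (int i = 0; i < n_origins; i++)
        if (strcmp(origin_names[i], node_name) == 0)
            return (uint16_t) i;
    assert(n_origins < MAX_ORIGINS);
    origin_names[n_origins] = node_name;
    return (uint16_t) n_origins++;
}

/* Map back when decoding the WAL stream into the LCR stream. */
static const char *
origin_name_for(uint16_t id)
{
    assert(id < n_origins);
    return origin_names[id];
}
```

Low-level filtering then only needs an integer comparison on origin_id per record; the expensive name lookup happens once per node, not once per record.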