Thread: Re: [BUGS] server crash in very big transaction [postgresql 8.0beta1]

Re: [BUGS] server crash in very big transaction [postgresql 8.0beta1]

From
Tom Lane
Date:
I wrote:
> What is happening of course is that more than 16K subtransaction IDs
> won't fit in a commit record (since XLOG records have a 16-bit length
> field).  We're gonna have to rethink the representation of subxact
> commit in XLOG.
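
As a rough check of the arithmetic behind the 16K figure: a 16-bit length field caps a record's data at 65535 bytes, and each subtransaction ID is a 4-byte TransactionId. The sketch below is only illustrative. Python is used purely for the arithmetic, and `fixed_overhead` is a hypothetical parameter standing in for whatever fixed fields the commit record carries; no such value is given in the thread.

```python
XID_BYTES = 4               # a TransactionId is a 32-bit value
MAX_LEN_16BIT = 2**16 - 1   # largest value a 16-bit length field can hold


def max_subxids(fixed_overhead: int = 0, max_len: int = MAX_LEN_16BIT) -> int:
    """Number of 4-byte XIDs that fit after `fixed_overhead` bytes (hypothetical)."""
    return max(0, (max_len - fixed_overhead) // XID_BYTES)


print(max_subxids())  # 16383, i.e. just under 16K
```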

After some further thought, I think there are basically two ways to
attack this:

1. Allow XLOG records to be larger than 64K.

2. Split transaction commit into multiple XLOG records when there are
   many subtransactions.

#2 looks pretty painful because of the need to ensure that transaction
commit is still an atomic action.  It's probably doable in principle
with something similar to the solution we are using for btree page split
logging (ie, record enough info so that the replay logic can complete
the commit even if the later records aren't recoverable from the log).
But I don't see all the details right off, and it sure seems risky.

I'm inclined to go with #1.  There are various ways we could do it
but the most straightforward would be to just widen the xl_len field
to 32 bits.  This would cost either 4 or 8 bytes per XLOG record
(because of MAXALIGN restrictions) but we could more than buy that back
by eliminating the xl_prev and/or xl_xact_prev fields, which have no use
in the current system.  (They were intended to support UNDO but it seems
clear that we will never do that.)
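
The "4 or 8 bytes" cost can be illustrated with a small calculation. This is a sketch under assumptions: the names `xl_len`, `xl_prev` and `xl_xact_prev` come from the message above, but the other fields and all byte sizes are a hypothetical header layout, and summing field sizes before rounding up ignores any per-field padding a compiler might add.

```python
def maxalign(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment (MAXALIGN-style padding)."""
    return (size + alignment - 1) // alignment * alignment


# Hypothetical header layout; sizes in bytes are assumptions for illustration.
CURRENT = {"crc": 8, "xl_prev": 8, "xl_xact_prev": 8,
           "xid": 4, "xl_len": 2, "info": 1, "rmid": 1}


def header_size(fields: dict, alignment: int) -> int:
    """Total header size after padding to the given alignment."""
    return maxalign(sum(fields.values()), alignment)


# Widen xl_len from 16 to 32 bits, then drop one of the unused fields.
widened = {**CURRENT, "xl_len": 4}
slimmed = {k: v for k, v in widened.items() if k != "xl_xact_prev"}

for alignment in (4, 8):
    print(alignment,
          header_size(CURRENT, alignment),
          header_size(widened, alignment),
          header_size(slimmed, alignment))
```

Under these assumed sizes, widening costs 4 bytes with 4-byte alignment and 8 bytes with 8-byte alignment, and dropping a single 8-byte field brings the header back to the original size or below.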

Or we could assign an rmgr value to represent an "extension" record that
is to be merged with a following "normal" record.  This is kinda klugy
but would avoid wasting bits on xl_len in the vast majority of records.
Also we'd not have to force an initdb since the file format would
remain upward-compatible.
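
One way to picture the "extension record" idea is sketched below. This is not the PostgreSQL implementation: the `(rmgr, data)` tuples, the `RM_EXTENSION` marker, and both function names are hypothetical, chosen only to show oversized data being carried in extension records that replay merges into the following normal record.

```python
MAX_LEN = 2**16 - 1    # per-record data limit imposed by a 16-bit length field
RM_EXTENSION = "ext"   # hypothetical rmgr value marking an extension record


def split_record(rmgr: str, payload: bytes, max_len: int = MAX_LEN):
    """Emit zero or more extension records followed by one normal record."""
    chunks = [payload[i:i + max_len]
              for i in range(0, len(payload), max_len)] or [b""]
    records = [(RM_EXTENSION, chunk) for chunk in chunks[:-1]]
    records.append((rmgr, chunks[-1]))
    return records


def merge_records(records):
    """Replay side: fold each run of extension records into the normal
    record that follows it."""
    pending = b""
    for rmgr, data in records:
        if rmgr == RM_EXTENSION:
            pending += data
        else:
            yield rmgr, pending + data
            pending = b""


payload = bytes(100_000)                 # larger than one record can hold
records = split_record("xact", payload)
assert list(merge_records(records)) == [("xact", payload)]
```

Records that already fit are emitted unchanged as a single normal record, which is why the common case pays nothing for the mechanism.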

Thoughts?
        regards, tom lane


Re: [BUGS] server crash in very big transaction [postgresql

From
Gavin Sherry
Date:
On Tue, 24 Aug 2004, Tom Lane wrote:

> I wrote:
> > What is happening of course is that more than 16K subtransaction IDs
> > won't fit in a commit record (since XLOG records have a 16-bit length
> > field).  We're gonna have to rethink the representation of subxact
> > commit in XLOG.
>
> After some further thought, I think there are basically two ways to
> attack this:
>
> 1. Allow XLOG records to be larger than 64K.
>
> 2. Split transaction commit into multiple XLOG records when there are
>    many subtransactions.
>

[snip]

> I'm inclined to go with #1.  There are various ways we could do it
> but the most straightforward would be to just widen the xl_len field
> to 32 bits.  This would cost either 4 or 8 bytes per XLOG record
> (because of MAXALIGN restrictions) but we could more than buy that back
> by eliminating the xl_prev and/or xl_xact_prev fields, which have no use
> in the current system.  (They were intended to support UNDO but it seems
> clear that we will never do that.)

If we have to do an initdb for a subsequent beta, could we just remove
these anyway? By my count, we've got at least 16 bytes there.

As for extending the length of xl_len, what happens if someone now has
2^30 subtransaction IDs (as unlikely as that sounds)? What I mean is, it
would be good if we could detect this at a point when we can issue an
ERROR. If we go down this path, we should also document the maximum number
of subtransaction IDs which can be used within a single block so that
people who look at doing stuff on that scale are aware of the
limitations.
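
The kind of early check described here could look roughly like the following. It is a sketch under assumptions: `check_commit_record_size` is a hypothetical helper, Python's `ValueError` stands in for a backend ERROR, and the record is assumed to hold nothing but 4-byte XIDs.

```python
XID_BYTES = 4  # a TransactionId is a 32-bit value


def check_commit_record_size(n_subxids: int, len_field_bits: int = 32) -> int:
    """Return the data size in bytes, or raise before anything is written."""
    max_len = 2**len_field_bits - 1
    needed = n_subxids * XID_BYTES
    if needed > max_len:
        raise ValueError(
            f"{n_subxids} subtransaction IDs need {needed} bytes, "
            f"but the length field holds at most {max_len}"
        )
    return needed
```

With a 32-bit length field, 2^30 XIDs need exactly 2^32 bytes, one more than the field can express, so the check fires at that count and passes for 2^30 - 1.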

>
> Or we could assign an rmgr value to represent an "extension" record that
> is to be merged with a following "normal" record.  This is kinda klugy
> but would avoid wasting bits on xl_len in the vast majority of records.
> Also we'd not have to force an initdb since the file format would
> remain upward-compatible.

This is a better idea, I think, as it avoids the problems above and, as
you say, will be binary compatible.

Gavin


Re: [BUGS] server crash in very big transaction [postgresql

From
Alvaro Herrera
Date:
On Wed, Aug 25, 2004 at 11:21:49AM +1000, Gavin Sherry wrote:
> On Tue, 24 Aug 2004, Tom Lane wrote:

> > 1. Allow XLOG records to be larger than 64K.
> >
> > 2. Split transaction commit into multiple XLOG records when there are
> >    many subtransactions.
>
> [snip]
> 
> > I'm inclined to go with #1.  There are various ways we could do it
> > but the most straightforward would be to just widen the xl_len field
> > to 32 bits.  This would cost either 4 or 8 bytes per XLOG record
> > (because of MAXALIGN restrictions) but we could more than buy that back
> > by eliminating the xl_prev and/or xl_xact_prev fields, which have no use
> > in the current system.  (They were intended to support UNDO but it seems
> > clear that we will never do that.)

If we agree to never implement UNDO, there's a bunch of other code that
could be removed.  Is there anyone that thinks we have any chance of not
doing it?

OTOH, if those fields are unused, we could just remove them for now in
any case.  It's unlikely that there won't be a catalog update for some
other reason before someone implements UNDO anyway.

> As for extending the length of xl_len, what happens if someone now has
> 2^30 subtransaction IDs (as unlikely as that sounds)?

The commit xlog record also carries dropped table information, 12 bytes
apiece (on 32 bit machines?).  It's unlikely that anyone will drop 2^13
tables in a single transaction, but it adds to the child xid list.


> > Or we could assign an rmgr value to represent an "extension" record that
> > is to be merged with a following "normal" record.  This is kinda klugy
> > but would avoid wasting bits on xl_len in the vast majority of records.
> > Also we'd not have to force an initdb since the file format would
> > remain upward-compatible.
> 
> This is a better idea, I think, as it avoids the problems above and, as
> you say, will be binary compatible.

I also think this is a good idea.  Would it be generalized or only
applicable to xl_xact_{commit,abort} records?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte" (Andre Breton)



Re: [BUGS] server crash in very big transaction [postgresql 8.0beta1]

From
Tom Lane
Date:
Gavin Sherry <swm@linuxworld.com.au> writes:
> As for extending the length of xl_len, what happens if someone now has
> 2^30 subtransaction IDs (as unlikely as that sounds)?

They'll have run out of RAM to store the subxact-related storage before
that (not to mention most likely have exhausted the CommandCounter
range, not to mention exhausted their patience --- it takes a good while
even to exercise the 2^16-subxact case).  I'm satisfied if we can
approach that limit.  Exceeding it will be a task for some other release.
        regards, tom lane


Re: [BUGS] server crash in very big transaction [postgresql

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> If we agree to never implement UNDO, there's a bunch of other code that
> could be removed.

Yeah, I've been thinking of going around and cleaning out the deadwood,
but beta is not the time for it.

> The commit xlog record also carries dropped table information, 12 bytes
> apiece (on 32 bit machines?).

Good point --- someone will eventually hit that case too, if we don't
increase the XLOG record size limit.

>>> Or we could assign an rmgr value to represent an "extension" record that
>>> is to be merged with a following "normal" record.

> I also think this is a good idea.  Would it be generalized or only
> applicable to xl_xact_{commit,abort} records?

I was envisioning it as a general mechanism --- I see no point in
restricting it to commit/abort records.  If anything it would take extra
code to restrict it to that case ...
        regards, tom lane


Re: [BUGS] server crash in very big transaction [postgresql

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Wed, Aug 25, 2004 at 11:21:49AM +1000, Gavin Sherry wrote:
>> On Tue, 24 Aug 2004, Tom Lane wrote:
>>> Or we could assign an rmgr value to represent an "extension" record that
>>> is to be merged with a following "normal" record.  This is kinda klugy
>>> but would avoid wasting bits on xl_len in the vast majority of records.
>>> Also we'd not have to force an initdb since the file format would
>>> remain upward-compatible.
>> 
>> This is a better idea, I think, as it avoids the problems above and, as
>> you say, will be binary compatible.

> I also think this is a good idea.  Would it be generalized or only
> applicable to xl_xact_{commit,abort} records?

After looking into this I've decided that it's not very practical --- it
would require major rewriting of XLogInsert, which I'm disinclined to do
at this stage of the beta cycle.  Widening the xl_len field seems much
safer.  It's not really an initdb-forcing change anyway; all you need to
do to upgrade an existing 8.0beta1 installation is run pg_resetxlog
(assuming you shut down the old postmaster cleanly).
        regards, tom lane


Re: [BUGS] server crash in very big transaction [postgresql

From
Bruce Momjian
Date:
This has just been fixed by Tom and will be in beta2.

---------------------------------------------------------------------------

Tom Lane wrote:
> Gavin Sherry <swm@linuxworld.com.au> writes:
> > As for extending the length of xl_len, what happens if someone now has
> > 2^30 subtransaction IDs (as unlikely as that sounds)?
> 
> They'll have run out of RAM to store the subxact-related storage before
> that (not to mention most likely have exhausted the CommandCounter
> range, not to mention exhausted their patience --- it takes a good while
> even to exercise the 2^16-subxact case).  I'm satisfied if we can
> approach that limit.  Exceeding it will be a task for some other release.
> 
>             regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073