Discussion: [sqlsmith] Crash in GetOldestSnapshot()

[sqlsmith] Crash in GetOldestSnapshot()

From
Andreas Seltenreich
Date:
Hi,

since updating master from c93d873..fc509cd, I see crashes in
GetOldestSnapshot() on update/delete returning statements.

I reduced the triggering statements down to this:
   update clstr_tst set d = d returning d;

Backtrace below.

regards,
Andreas

Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  GetOldestSnapshot () at snapmgr.c:422
#1  0x00000000004b8279 in init_toast_snapshot (toast_snapshot=0x7ffcd824b010) at tuptoaster.c:2314
#2  0x00000000004b83bc in toast_fetch_datum (attr=<optimized out>) at tuptoaster.c:1869
#3  0x00000000004b9ab5 in heap_tuple_untoast_attr (attr=0x18226c8) at tuptoaster.c:179
#4  0x00000000007f71ad in pg_detoast_datum_packed (datum=<optimized out>) at fmgr.c:2266
#5  0x00000000007cfc12 in text_to_cstring (t=0x18226c8) at varlena.c:186
#6  0x00000000007f5735 in FunctionCall1Coll (flinfo=flinfo@entry=0x18221c0, collation=collation@entry=0, arg1=arg1@entry=25306824) at fmgr.c:1297
#7  0x00000000007f68ee in OutputFunctionCall (flinfo=0x18221c0, val=25306824) at fmgr.c:1946
#8  0x0000000000478bc1 in printtup (slot=0x1821f80, self=0x181ce48) at printtup.c:359
#9  0x00000000006f9c8e in RunFromStore (portal=portal@entry=0x177cbf8, direction=direction@entry=ForwardScanDirection, count=count@entry=0, dest=0x181ce48) at pquery.c:1117
#10 0x00000000006f9d52 in PortalRunSelect (portal=portal@entry=0x177cbf8, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x181ce48) at pquery.c:942
#11 0x00000000006fb41e in PortalRun (portal=portal@entry=0x177cbf8, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x181ce48, altdest=altdest@entry=0x181ce48, completionTag=completionTag@entry=0x7ffcd824b920 "") at pquery.c:787
#12 0x00000000006f822b in exec_simple_query (query_string=0x17db878 "update clstr_tst set d = d returning d;") at postgres.c:1094
#13 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1781ce0, dbname=0x1781b40 "regression", username=<optimized out>) at postgres.c:4074
#14 0x000000000046c9bd in BackendRun (port=0x1786920) at postmaster.c:4262
#15 BackendStartup (port=0x1786920) at postmaster.c:3936
#16 ServerLoop () at postmaster.c:1693
#17 0x0000000000693044 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x175d5f0) at postmaster.c:1301
#18 0x000000000046dd26 in main (argc=3, argv=0x175d5f0) at main.c:228
(gdb) list
417
418        if (OldestActiveSnapshot != NULL)
419            ActiveLSN = OldestActiveSnapshot->as_snap->lsn;
420
421        if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > ActiveLSN)
422            return OldestActiveSnapshot->as_snap;
423
424        return OldestRegisteredSnapshot;
425    }
426
(gdb) bt full
#0  GetOldestSnapshot () at snapmgr.c:422
        OldestRegisteredSnapshot = <optimized out>
        RegisteredLSN = <optimized out>
        ActiveLSN = <optimized out>
#1  0x00000000004b8279 in init_toast_snapshot (toast_snapshot=0x7ffcd824b010) at tuptoaster.c:2314
        snapshot = <optimized out>
#2  0x00000000004b83bc in toast_fetch_datum (attr=<optimized out>) at tuptoaster.c:1869
        toastrel = 0x7f8b4ca88920
        toastidxs = 0x18447c8
        toastkey = {
          sk_flags = 0,
          sk_attno = 1,
          sk_strategy = 3,
          sk_subtype = 0,
          sk_collation = 100,
          sk_func = {
            fn_addr = 0x77c490 <oideq>,
            fn_oid = 184,
            fn_nargs = 2,
            fn_strict = 1 '\001',
            fn_retset = 0 '\000',
            fn_stats = 2 '\002',
            fn_extra = 0x0,
            fn_mcxt = 0x18282a8,
            fn_expr = 0x0
          },
          sk_argument = 34491
        }
        toastscan = <optimized out>
        ttup = <optimized out>
        toasttupDesc = 0x7f8b4ca88c50
        result = 0x18422d8
        toast_pointer = <optimized out>
        ressize = 5735
        residx = <optimized out>
        nextidx = 0
        numchunks = 3
        chunk = <optimized out>
        isnull = <optimized out>
        chunkdata = <optimized out>
        chunksize = <optimized out>
        num_indexes = 1
        validIndex = 0
        SnapshotToast = {
          satisfies = 0x112,
          xmin = 3626283536,
          xmax = 32764,
          xip = 0xf8ac628,
          xcnt = 5221870,
          subxip = 0x0,
          subxcnt = 0,
          suboverflowed = 0 '\000',
          takenDuringRecovery = 0 '\000',
          copied = 0 '\000',
          curcid = 14,
          speculativeToken = 0,
          active_count = 260753304,
          regd_count = 0,
          ph_node = {
            first_child = 0xf8ac680,
            next_sibling = 0xa40000000000112,
            prev_or_parent = 0x0
          },
          whenTaken = 274,
          lsn = 0
        }
        __func__ = "toast_fetch_datum"
#3  0x00000000004b9ab5 in heap_tuple_untoast_attr (attr=0x18226c8) at tuptoaster.c:179
No locals.
#4  0x00000000007f71ad in pg_detoast_datum_packed (datum=<optimized out>) at fmgr.c:2266
No locals.
#5  0x00000000007cfc12 in text_to_cstring (t=0x18226c8) at varlena.c:186
        tunpacked = <optimized out>
        result = <optimized out>
#6  0x00000000007f5735 in FunctionCall1Coll (flinfo=flinfo@entry=0x18221c0, collation=collation@entry=0, arg1=arg1@entry=25306824) at fmgr.c:1297
        fcinfo = {
          flinfo = 0x18221c0,
          context = 0x0,
          resultinfo = 0x0,
          fncollation = 0,
          isnull = 0 '\000',
          nargs = 1,
          arg = {25306824, 6868497, 0, 25356976, 1966, 1966, 0, 25207736, 1966, 8470236, 1966, 1966, 140236265213248, 7113528, 0, 4, 1, 25358272,
            0, 309237645256, 140236281446912, 1966, 25349336, 25207736, 1966, 8470236, 1966, 1966, 140236265213248, 7113605, 47,
            148110127398913, 25346992, 25358272, 25346992, 1, 25348352, 6188273, 3689292519771913624, 140723934769840,
            140723934770607, 140723934769840, 0, 10024654, 140723934770200, 0, 140723934770544, 140236311507497, 4222418944,
            140723934770544, 0, 8226869, 0, 140236263291992, 25, 140723934769968, 24856000, 8232733, 140236263291992, 15294443587,
            25, 140723934770016, 1125891316908032, 0, 7849104, 281483566645432, 2, 0, 24651624, 0, 25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            107, 8192, 8240, 128, 755914244609, 16, 2, 481036337259, 0, 0, 532575944823, 0, 140236314839840, 8192, 1024, 8192, 0,
            25304960, 140236311549156},
          argnull = "\000\000\000\000\000\000\000\000\250\202\202\001\000\000\000\000\250\202\202\001\000\000\000\000p\203\202\001\000\000\000\000\000\004\000\000\000\000\000\000\000\004\000\000\000\000\000\000\360\264$\330\374\177\000\000H\316\201\001\000\000\000\000\b\261\200\001\000\000\000\000\002\000\000\000\000\000\000\000\360\264$\330\374\177\000\000\033\314`\000\000\000\000\000\000\000\000"
        }
        result = <optimized out>
        __func__ = "FunctionCall1Coll"
#7  0x00000000007f68ee in OutputFunctionCall (flinfo=0x18221c0, val=25306824) at fmgr.c:1946
        result = <optimized out>
        pushed = 0 '\000'
#8  0x0000000000478bc1 in printtup (slot=0x1821f80, self=0x181ce48) at printtup.c:359
        outputstr = <optimized out>
        thisState = <optimized out>
        attr = <optimized out>
        typeinfo = <optimized out>
        myState = 0x181ce48
        oldcontext = 0x180b108
        buf = {
          data = 0x182aac8 "",
          len = 2,
          maxlen = 1024,
          cursor = 68
        }
        natts = 1
        i = 0
#9  0x00000000006f9c8e in RunFromStore (portal=portal@entry=0x177cbf8, direction=direction@entry=ForwardScanDirection, count=count@entry=0, dest=0x181ce48) at pquery.c:1117
        oldcontext = 0x180b108
        ok = <optimized out>
        forward = 1 '\001'
        current_tuple_count = 14
        slot = 0x1821f80
#10 0x00000000006f9d52 in PortalRunSelect (portal=portal@entry=0x177cbf8, forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, dest=dest@entry=0x181ce48) at pquery.c:942
        queryDesc = 0x0
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#11 0x00000000006fb41e in PortalRun (portal=portal@entry=0x177cbf8, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x181ce48, altdest=altdest@entry=0x181ce48, completionTag=completionTag@entry=0x7ffcd824b920 "") at pquery.c:787
        save_exception_stack = 0x7ffcd824b9a0
        save_context_stack = 0x0
        local_sigjmp_buf = {{
            __jmpbuf = {25020000, 8117249591578047072, 25020088, 2, 25284168, 24503472, -8115502085312499104, 8117247189865575008},
            __mask_was_saved = 0,
            __saved_mask = {
              __val = {8368164, 1, 24911592, 10273694, 2, 1, 2, 140723934771202, 88, 24628216, 25020088, 2, 8459620, 25020000, 2, 24628216}
            }
          }}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x1782df8
        saveTopTransactionContext = 0x175e4b0
        saveActivePortal = 0x0
        saveResourceOwner = 0x1782df8
        savePortalContext = 0x0
        saveMemoryContext = 0x175e4b0
        __func__ = "PortalRun"
#12 0x00000000006f822b in exec_simple_query (query_string=0x17db878 "update clstr_tst set d = d returning d;") at postgres.c:1094
        parsetree = 0x17dc660
        portal = 0x177cbf8
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = "\000ELECT 1\000\377\377\177", '\000' <repeats 12 times>, "\240\364\272O\213\177\000\000\000\000\000\000\000\000\000\000\"\000\000\000\000\000\000\000\330\367u\001\000\000\000\000\310\327u\001\000\000\000"
        querytree_list = <optimized out>
        plantree_list = <optimized out>
        receiver = 0x181ce48
        format = 0
        dest = DestRemote
        parsetree_list = 0x17dc6e0
        save_log_statement_stats = 0 '\000'
        was_logged = 0 '\000'
        msec_str = "\310\317$\330\374\177", '\000' <repeats 25 times>
        parsetree_item = 0x17dc6b8
        isTopLevel = 1 '\001'
#13 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1781ce0, dbname=0x1781b40 "regression", username=<optimized out>) at postgres.c:4074
        query_string = 0x17db878 "update clstr_tst set d = d returning d;"
        firstchar = 25020000
        input_message = {
          data = 0x17db878 "update clstr_tst set d = d returning d;",
          len = 40,
          maxlen = 1024,
          cursor = 40
        }
        local_sigjmp_buf = {{
            __jmpbuf = {24648928, 8117247467474629216, 24648480, 0, 0, 24636608, -8115502085385899424, 8117247183479878240},
            __mask_was_saved = 1,
            __saved_mask = {
              __val = {0, 24648856, 24648480, 24648512, 1024, 140723934771904, 24648928, 0, 8459304, 24632953, 8454626, 13256160, 140723934771904, 24648928, 8376332, 24633416}
            }
          }}
        send_ready_for_query = 0 '\000'
        disable_idle_in_transaction_timeout = <optimized out>
        __func__ = "PostgresMain"



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Michael Paquier
Date:
On Sat, Aug 6, 2016 at 6:32 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
> since updating master from c93d873..fc509cd, I see crashes in
> GetOldestSnapshot() on update/delete returning statements.
>
> I reduced the triggering statements down to this:
>
>     update clstr_tst set d = d returning d;
>
> Backtrace below.

3e2f3c2e is likely to blame here. I have moved the open item
"old_snapshot_threshold allows heap:toast disagreement" back to the
list of open items.
-- 
Michael



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Andrew Gierth
Date:
>>>>> "Andreas" == Andreas Seltenreich <seltenreich@gmx.de> writes:

418   if (OldestActiveSnapshot != NULL)
419       ActiveLSN = OldestActiveSnapshot->as_snap->lsn;
420
421   if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > ActiveLSN)
422       return OldestActiveSnapshot->as_snap;

This second conditional should clearly be inside the first one...
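
As a minimal sketch of that restructuring, the comparison and the
dereference of the active snapshot both move inside the NULL check.
The types below are simplified stand-ins, not the real snapmgr.c
definitions, and oldest_snapshot() with its two parameters is a
hypothetical name chosen so the logic can be exercised in isolation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types (assumptions, not the real ones). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r) ((r) == InvalidXLogRecPtr)

typedef struct SnapshotData
{
    XLogRecPtr  lsn;
} SnapshotData;
typedef SnapshotData *Snapshot;

typedef struct ActiveSnapshotElt
{
    Snapshot    as_snap;
} ActiveSnapshotElt;

/*
 * Return the older of the oldest active and the oldest registered snapshot.
 * Either argument may be NULL; the result is NULL when both are.
 */
static Snapshot
oldest_snapshot(const ActiveSnapshotElt *oldest_active,
                Snapshot oldest_registered)
{
    XLogRecPtr  registered_lsn = InvalidXLogRecPtr;

    if (oldest_registered != NULL)
        registered_lsn = oldest_registered->lsn;

    if (oldest_active != NULL)
    {
        XLogRecPtr  active_lsn = oldest_active->as_snap->lsn;

        /* The active snapshot is dereferenced only inside the NULL check. */
        if (XLogRecPtrIsInvalid(registered_lsn) || registered_lsn > active_lsn)
            return oldest_active->as_snap;
    }

    return oldest_registered;
}
```

With both arguments NULL this returns NULL instead of crashing, so a
caller still has to decide what to do when no snapshot exists at all.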

-- 
Andrew (irc:RhodiumToad)



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Amit Kapila
Date:
On Sat, Aug 6, 2016 at 5:51 PM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>>>>>> "Andreas" == Andreas Seltenreich <seltenreich@gmx.de> writes:
>
> 418   if (OldestActiveSnapshot != NULL)
> 419       ActiveLSN = OldestActiveSnapshot->as_snap->lsn;
> 420
> 421   if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > ActiveLSN)
> 422       return OldestActiveSnapshot->as_snap;
>
> This second conditional should clearly be inside the first one...
>

Sure, that is the reason of crash, but even if we do that it will lead
to an error "no known snapshots".  Here, what is going on is that we
initialized toast snapshot when there is no active snapshot in the
backend, so GetOldestSnapshot() won't return any snapshot.  I think
for such situations, we need to initialize the lsn and whenTaken of
ToastSnapshot as we do in GetSnapshotData() [1].  We need to do this
when snapshot returned by GetOldestSnapshot() is NULL.

Thoughts?


[1]
In the code below:

if (old_snapshot_threshold < 0)
{
    ..
}
else
{
    /*
     * Capture the current time and WAL stream location in case this
     * snapshot becomes old enough to need to fall back on the special
     * "old snapshot" logic.
     */
    snapshot->lsn = GetXLogInsertRecPtr();
    snapshot->whenTaken = GetSnapshotCurrentTimestamp();
    MaintainOldSnapshotTimeMapping(snapshot->whenTaken, xmin);
}


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Andrew Gierth
Date:
>>>>> "Amit" == Amit Kapila <amit.kapila16@gmail.com> writes:

 Amit> Sure, that is the reason of crash, but even if we do that it will
 Amit> lead to an error "no known snapshots".  Here, what is going on is
 Amit> that we initialized toast snapshot when there is no active
 Amit> snapshot in the backend, so GetOldestSnapshot() won't return any
 Amit> snapshot.

Hmm.

So this happens because RETURNING queries run to completion immediately
and populate a tuplestore with the results, and the portal then fetches
from the tuplestore to send to the destination. The assumption is that
the tuplestore output can be processed without needing a snapshot, which
obviously is not true now if it contains toasted data.

In a similar case in the past involving holdable cursors, the solution
was to detoast _before_ storing in the tuplestore (see
PersistHoldablePortal). I guess the question now is, under what
circumstances is it now allowable to detoast a datum with no active
snapshot? (Wouldn't it be necessary in such a case to know what the
oldest snapshot ever used in the transaction was?)
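
The idea of detoasting before storing can be illustrated with a toy
model. This is only a sketch under stated assumptions: ToyToastStore,
ToyDatum, toy_datum_get() and toy_datum_flatten() are hypothetical
names, and a single pointer that may be cleared stands in for toast
rows that VACUUM is free to remove once no snapshot protects them:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy out-of-line store: one value that "VACUUM" may clear at any time. */
typedef struct ToyToastStore
{
    const char *chunk;          /* NULL once the toast rows are removed */
} ToyToastStore;

/* Toy datum: either an owned inline copy or a pointer into the store. */
typedef struct ToyDatum
{
    char       *inline_value;   /* non-NULL once flattened */
    const ToyToastStore *external;  /* consulted only while not flattened */
} ToyDatum;

/* Fetch the value; returns NULL if the out-of-line data is already gone. */
static const char *
toy_datum_get(const ToyDatum *d)
{
    if (d->inline_value != NULL)
        return d->inline_value;
    return d->external->chunk;
}

/*
 * Copy the out-of-line value into the datum while it is still protected.
 * Returns 0 on success and -1 if the value is gone or memory runs out.
 */
static int
toy_datum_flatten(ToyDatum *d)
{
    const char *src;
    char       *copy;

    if (d->inline_value != NULL)
        return 0;               /* already flattened */

    src = d->external->chunk;
    if (src == NULL)
        return -1;

    copy = malloc(strlen(src) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, src);
    d->inline_value = copy;
    return 0;
}
```

A datum flattened before the store is cleared stays readable afterwards,
while one that still points into the store does not.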

 Amit> I think for such situations, we need to initialize the lsn and
 Amit> whenTaken of ToastSnapshot as we do in GetSnapshotData() [1].

Would that not give a too-recent LSN, resulting in possibly failing to
fetch the toast rows?

-- 
Andrew (irc:RhodiumToad)



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> In a similar case in the past involving holdable cursors, the solution
> was to detoast _before_ storing in the tuplestore (see
> PersistHoldablePortal). I guess the question now is, under what
> circumstances is it now allowable to detoast a datum with no active
> snapshot? (Wouldn't it be necessary in such a case to know what the
> oldest snapshot ever used in the transaction was?)

After looking at this a bit, I think probably the appropriate solution
is to register the snapshot that was used by the query and store it as
a property of the Portal, releasing it when the Portal is destroyed.
Essentially, this views possession of a relevant snapshot as a resource
that is required to make toast dereferences safe.
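
A rough sketch of treating the snapshot as a resource owned by the
portal might look like the following. It is not the actual patch:
ToySnapshot, ToyPortal and the toy_* functions are hypothetical names,
and a bare reference count stands in for snapshot registration:

```c
#include <assert.h>
#include <stddef.h>

/* Toy snapshot: only the registration count matters here. */
typedef struct ToySnapshot
{
    int         regd_count;     /* number of registrations still held */
} ToySnapshot;

/* Toy portal that may own one registration of a snapshot. */
typedef struct ToyPortal
{
    ToySnapshot *hold_snapshot; /* NULL when the portal holds none */
} ToyPortal;

/* Register the query's snapshot on behalf of the portal. */
static void
toy_portal_hold_snapshot(ToyPortal *portal, ToySnapshot *snap)
{
    assert(portal->hold_snapshot == NULL);
    snap->regd_count++;
    portal->hold_snapshot = snap;
}

/* Release the registration when the portal is destroyed. */
static void
toy_portal_drop(ToyPortal *portal)
{
    if (portal->hold_snapshot != NULL)
    {
        portal->hold_snapshot->regd_count--;
        portal->hold_snapshot = NULL;
    }
}

/* Toast dereferences are treated as safe only while the portal holds a snapshot. */
static int
toy_detoast_is_safe(const ToyPortal *portal)
{
    return portal->hold_snapshot != NULL &&
        portal->hold_snapshot->regd_count > 0;
}
```

The point of the design is that the lifetime of the protection matches
the lifetime of the portal's stored tuples, not the lifetime of the
query execution that produced them.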

I think there has been a bug here for awhile.  Consider a committed-dead
row with some associated toast data, and suppose the query's snapshot
was the last one that could see that row.  Once we destroy the query's
snapshot, there is nothing preventing a concurrent VACUUM from removing
the dead row and the toast data.  When the RETURNING code was originally
written, I think this was safe enough, because the bookkeeping that
determined when VACUUM could remove data was based on transactions'
advertised xmins, and those did not move once set for the life of the
transaction.  So dereferencing a toast pointer you'd fetched was safe
for the rest of the transaction.  But when we changed over to
oldest-snapshot-based xmin advertisement, and made it so that a
transaction holding no snapshots advertised no xmin, we created a hazard
for data held in portals.
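
The hazard can be made concrete with a toy model of snapshot-based xmin
advertisement. Everything here is an illustrative assumption: the
ToyXid type and both functions are hypothetical, transaction-ID
wraparound is ignored, only a single backend is considered, and every
snapshot xmin is assumed to be a valid (nonzero) ID:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

/*
 * xmin a backend advertises: the smallest xmin among the snapshots it
 * currently holds, or TOY_INVALID_XID when it holds none.
 */
static ToyXid
toy_advertised_xmin(const ToyXid *snapshot_xmins, size_t nsnapshots)
{
    ToyXid      result = TOY_INVALID_XID;
    size_t      i;

    for (i = 0; i < nsnapshots; i++)
    {
        if (result == TOY_INVALID_XID || snapshot_xmins[i] < result)
            result = snapshot_xmins[i];
    }
    return result;
}

/*
 * Whether a committed-dead row deleted by deleter_xid may be removed,
 * given the only backend's advertised xmin.
 */
static int
toy_vacuum_may_remove(ToyXid deleter_xid, ToyXid advertised_xmin)
{
    if (advertised_xmin == TOY_INVALID_XID)
        return 1;               /* nothing advertised: nothing protects the row */
    return deleter_xid < advertised_xmin;
}
```

While a snapshot with xmin 90 is held, a row deleted by transaction 100
is protected; once all snapshots are released, nothing is advertised
and the same row (and its toast data) becomes removable.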

In this view of things, flattening toast pointers in "held" tuplestores
is still necessary, but it's because their protective snapshot is going
away not because the transaction as a whole is going away.  But as long
as we hold onto the relevant snapshot, we don't have to do that for
portals used intra-transaction.

It's interesting to think about whether we could let snapshots outlive
transactions and thereby not need to flatten "held" tuplestores either.
I kinda doubt it's a good idea because of the potential bad effects
for vacuum not being able to remove dead rows for a long time; but
it seems at least possible to do it, in this world-view.
        regards, tom lane



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Robert Haas
Date:
On Sat, Aug 6, 2016 at 9:00 AM, Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
> Hmm.
>
> So this happens because RETURNING queries run to completion immediately
> and populate a tuplestore with the results, and the portal then fetches
> from the tuplestore to send to the destination. The assumption is that
> the tuplestore output can be processed without needing a snapshot, which
> obviously is not true now if it contains toasted data.
>
> In a similar case in the past involving holdable cursors, the solution
> was to detoast _before_ storing in the tuplestore (see
> PersistHoldablePortal). I guess the question now is, under what
> circumstances is it now allowable to detoast a datum with no active
> snapshot? (Wouldn't it be necessary in such a case to know what the
> oldest snapshot ever used in the transaction was?)

Yes, I think you're right.  Suppose we leave "snapshot too old" to one
side; assume that feature is disabled.  If the backend fetches the
heap tuples without de-TOAST-ing, releases its snapshot (presumably
resetting xmin), and then goes into the tank for a while, those tuples
could be pruned.  When the backend wakes up again and tries to de-TOAST,
we would get the exact sort of heap:toast disagreement that we set
out to prevent here.  That's not likely to occur because in most cases
the number of tuples returned will be small, and VACUUM is quite
unlikely to remove them before we de-TOAST.  But this report makes me
suspect it can happen (I have not tested).

With "snapshot too old" enabled, it becomes much more likely.  The
offending prune operation can happen at any time after our snapshot
times out and before we de-TOAST, rather than needing to happen after
the query ends and before we de-TOAST.

So I think in the short term what we should do about this is just fix
it so it doesn't crash.  In the longer term, we might want to think a
bit more carefully about the way we handle de-TOASTing overall; we've
had a number of different bugs that are all rooted in failure to think
carefully about the fact that the query's snapshot needs to live at
least as long as any TOAST pointers that we might want to de-TOAST
later (3f8c8e3c61cef5729980ee4372ec159862a979f1,
ec543db77b6b72f24d0a637c4a4a419cf8311d0b).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> So I think in the short term what we should do about this is just fix
> it so it doesn't crash.

Well, we clearly need to fix GetOldestSnapshot so it won't crash,
but I do not think that having RETURNING queries randomly returning
"ERROR: no known snapshots" is acceptable even for a beta release.
If we aren't prepared to do something about that before Monday,
I think we need to revert 3e2f3c2e until we do have a fix for it.

What I suggested just now in <2850.1470592623@sss.pgh.pa.us> might
be implementable with a couple hours' work, though.  Do you have a
reason to think it'd be insufficient?
        regards, tom lane



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Robert Haas
Date:
On Sun, Aug 7, 2016 at 1:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> In a similar case in the past involving holdable cursors, the solution
>> was to detoast _before_ storing in the tuplestore (see
>> PersistHoldablePortal). I guess the question now is, under what
>> circumstances is it now allowable to detoast a datum with no active
>> snapshot? (Wouldn't it be necessary in such a case to know what the
>> oldest snapshot ever used in the transaction was?)
>
> After looking at this a bit, I think probably the appropriate solution
> is to register the snapshot that was used by the query and store it as
> a property of the Portal, releasing it when the Portal is destroyed.
> Essentially, this views possession of a relevant snapshot as a resource
> that is required to make toast dereferences safe.

Hmm, good idea.

> I think there has been a bug here for awhile.  Consider a committed-dead
> row with some associated toast data, and suppose the query's snapshot
> was the last one that could see that row.  Once we destroy the query's
> snapshot, there is nothing preventing a concurrent VACUUM from removing
> the dead row and the toast data.

Yeah: as you see, I came to the same conclusion.

> When the RETURNING code was originally
> written, I think this was safe enough, because the bookkeeping that
> determined when VACUUM could remove data was based on transactions'
> advertised xmins, and those did not move once set for the life of the
> transaction.  So dereferencing a toast pointer you'd fetched was safe
> for the rest of the transaction.  But when we changed over to
> oldest-snapshot-based xmin advertisement, and made it so that a
> transaction holding no snapshots advertised no xmin, we created a hazard
> for data held in portals.

But I missed this aspect of it.

> In this view of things, flattening toast pointers in "held" tuplestores
> is still necessary, but it's because their protective snapshot is going
> away not because the transaction as a whole is going away.  But as long
> as we hold onto the relevant snapshot, we don't have to do that for
> portals used intra-transaction.
>
> It's interesting to think about whether we could let snapshots outlive
> transactions and thereby not need to flatten "held" tuplestores either.
> I kinda doubt it's a good idea because of the potential bad effects
> for vacuum not being able to remove dead rows for a long time; but
> it seems at least possible to do it, in this world-view.

EnterpriseDB has had complaints from customers about the cost of
flattening TOAST pointers when tuplestores are held, so I think
providing an option to skip the flattening (at the risk of increased
bloat) would be a good idea even if we chose not to change the default
behavior.  It's worth noting that the ability to set
old_snapshot_threshold serves as a way of limiting the damage that can
be caused by the open snapshot, so that optional behavior might be
more appealing now than heretofore.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Robert Haas
Date:
On Sun, Aug 7, 2016 at 2:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> So I think in the short term what we should do about this is just fix
>> it so it doesn't crash.
>
> Well, we clearly need to fix GetOldestSnapshot so it won't crash,
> but I do not think that having RETURNING queries randomly returning
> "ERROR: no known snapshots" is acceptable even for a beta release.
> If we aren't prepared to do something about that before Monday,
> I think we need to revert 3e2f3c2e until we do have a fix for it.
>
> What I suggested just now in <2850.1470592623@sss.pgh.pa.us> might
> be implementable with a couple hours' work, though.  Do you have a
> reason to think it'd be insufficient?

No - if you can implement that quickly, I think it sounds like a great idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Aug 7, 2016 at 2:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I suggested just now in <2850.1470592623@sss.pgh.pa.us> might
>> be implementable with a couple hours' work, though.  Do you have a
>> reason to think it'd be insufficient?

> No - if you can implement that quickly, I think it sounds like a great idea.

I'll look into it.
        regards, tom lane



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Aug 7, 2016 at 2:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I suggested just now in <2850.1470592623@sss.pgh.pa.us> might
>> be implementable with a couple hours' work, though.  Do you have a
>> reason to think it'd be insufficient?

> No - if you can implement that quickly, I think it sounds like a great idea.

Pushed.
        regards, tom lane



Re: [sqlsmith] Crash in GetOldestSnapshot()

From
Robert Haas
Date:
On Sun, Aug 7, 2016 at 5:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Aug 7, 2016 at 2:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> What I suggested just now in <2850.1470592623@sss.pgh.pa.us> might
>>> be implementable with a couple hours' work, though.  Do you have a
>>> reason to think it'd be insufficient?
>
>> No - if you can implement that quickly, I think it sounds like a great idea.
>
> Pushed.

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company