Re: OOM in spgist insert

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: OOM in spgist insert
Дата
Msg-id 1543946.1620923512@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: OOM in spgist insert  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Список pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2021-May-13, Tom Lane wrote:
>> BTW, another nasty thing I discovered while testing this is that
>> the CHECK_FOR_INTERRUPTS() at line 2146 is useless, because
>> we're holding a buffer lock there so InterruptHoldoffCount > 0.
>> So once you get into this loop you can't even cancel the query.
>> Seems like that needs a fix, too.

Attached is a WIP patch for that part.  Basically, if it looks
like there's an interrupt pending, make spgdoinsert() fall out of
its loop, so it'll release the buffer locks, and then recheck
CHECK_FOR_INTERRUPTS() at a point where it can really throw the
error.  Now, the hole in this approach is what if ProcessInterrupts
still declines to throw the error?  I've made the patch make use of
the existing provision to retry spgdoinsert() in case of deadlock,
so that it just goes around and tries again.  But that doesn't seem
terribly satisfactory, because if InterruptPending remains set then
we'll just have an infinite loop at that outer level.  IOW, this
patch can only cope with the situation where there's not any outer
function holding a buffer lock (or other reason to prevent the
query cancel from taking effect).  Maybe that's good enough, but
I'm not terribly pleased with it.

> This comment made me remember a patch I've had for a while, which splits
> the CHECK_FOR_INTERRUPTS() definition in two -- one of them is
> INTERRUPTS_PENDING_CONDITION() which let us test the condition
> separately; that allows the lock we hold to be released prior to
> actually processing the interrupts.

Hm.  Yeah, I was feeling that this patch is unduly friendly with
the innards of CHECK_FOR_INTERRUPTS().  So maybe some refactoring
is called for, but I'm not quite sure what it should look like.
Testing the condition separately is fine, but what about the case
of ProcessInterrupts refusing to pull the trigger?

            regards, tom lane

diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index 4d380c99f0..82e68c4ab7 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1905,13 +1905,14 @@ spgSplitNodeAction(Relation index, SpGistState *state,
  * Insert one item into the index.
  *
  * Returns true on success, false if we failed to complete the insertion
- * because of conflict with a concurrent insert.  In the latter case,
- * caller should re-call spgdoinsert() with the same args.
+ * (typically because of conflict with a concurrent insert).  In the latter
+ * case, caller should re-call spgdoinsert() with the same args.
  */
 bool
 spgdoinsert(Relation index, SpGistState *state,
             ItemPointer heapPtr, Datum *datums, bool *isnulls)
 {
+    bool        result = true;
     TupleDesc    leafDescriptor = state->leafTupDesc;
     bool        isnull = isnulls[spgKeyColumn];
     int            level = 0;
@@ -2019,9 +2020,17 @@ spgdoinsert(Relation index, SpGistState *state,
         /*
          * Bail out if query cancel is pending.  We must have this somewhere
          * in the loop since a broken opclass could produce an infinite
-         * picksplit loop.
+         * picksplit loop.  Moreover, because it's typically the case that
+         * we're holding buffer lock(s), ProcessInterrupts() won't be able to
+         * throw an error now.  Hence, if we see that InterruptPending remains
+         * true, break out of the loop and deal with the problem below.
          */
         CHECK_FOR_INTERRUPTS();
+        if (unlikely(InterruptPending))
+        {
+            result = false;
+            break;
+        }

         if (current.blkno == InvalidBlockNumber)
         {
@@ -2140,10 +2149,15 @@ spgdoinsert(Relation index, SpGistState *state,
              * spgAddNode and spgSplitTuple cases will loop back to here to
              * complete the insertion operation.  Just in case the choose
              * function is broken and produces add or split requests
-             * repeatedly, check for query cancel.
+             * repeatedly, check for query cancel (see comments above).
              */
     process_inner_tuple:
             CHECK_FOR_INTERRUPTS();
+            if (unlikely(InterruptPending))
+            {
+                result = false;
+                break;
+            }

             innerTuple = (SpGistInnerTuple) PageGetItem(current.page,
                                                         PageGetItemId(current.page, current.offnum));
@@ -2267,5 +2281,12 @@ spgdoinsert(Relation index, SpGistState *state,
         UnlockReleaseBuffer(parent.buffer);
     }

-    return true;
+    /*
+     * Finally, check for interrupts again.  If there was one,
+     * ProcessInterrupts() will be able to throw the error now.  But should it
+     * not do so for some reason, return false to cause a retry.
+     */
+    CHECK_FOR_INTERRUPTS();
+
+    return result;
 }

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Maciek Sakrejda
Дата:
Сообщение: Re: compute_query_id and pg_stat_statements
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: compute_query_id and pg_stat_statements