Re: BUG #17821: Assertion failed in heap_update() due to heap pruning

Поиск
Список
Период
Сортировка
От Alexander Lakhin
Тема Re: BUG #17821: Assertion failed in heap_update() due to heap pruning
Дата
Msg-id 25c3399b-58a3-d727-7f97-93394fd1afa9@gmail.com
обсуждение исходный текст
Ответ на Re: BUG #17821: Assertion failed in heap_update() due to heap pruning  (Andres Freund <andres@anarazel.de>)
Ответы Re: BUG #17821: Assertion failed in heap_update() due to heap pruning  (Alexander Lakhin <exclusion@gmail.com>)
Список pgsql-bugs
Hi,

04.03.2023 19:48, Andres Freund wrote:
> My first suspicion would be that we aren't holding a lock in the right moment, to enforce cache invalidation
processing.
>

I've encountered another case of that assertion failure (this time with pg_statistic):
numclients=10
for ((c=1;c<=numclients;c++)); do
   createdb db$c
done

for ((i=1;i<=50;i++)); do
   echo "iteration $i"

   for ((c=1;c<=numclients;c++)); do
     echo "
CREATE TABLE t(i int4, t1 text[], t2 text[]);
INSERT INTO t SELECT g,  ARRAY[repeat('x', g)], ARRAY[repeat('x', g)] FROM generate_series(1, 200) g;
ANALYZE t;
     " | psql -d db$c >/dev/null
   done

   for ((c=1;c<=numclients;c++)); do
     echo "
ANALYZE t;
     " | psql -d db$c >/dev/null &

     echo "
SELECT * FROM t WHERE i = 1;
ANALYZE t;
     " | psql -d db$c >/dev/null &
   done
   wait
   grep 'TRAP:' server.log && { break; }

   for ((c=1;c<=numclients;c++)); do
     echo "
DROP TABLE t; VACUUM pg_statistic;
     " | psql -d db$c >/dev/null
   done
done

...
iteration 13
server closed the connection unexpectedly
         This probably means the server terminated abnormally
         before or while processing the request.
connection to server was lost
TRAP: failed Assert("ItemIdIsNormal(lp)"), File: "heapam.c", Line: 3074, PID: 89697

Reproduced on REL_12_STABLE (starting from 5e0928005) .. master.
(Modifying pruneheap.c as in [1] not needed for this case.)

Here we have the following sequence of events:
Session 1
ANALYZE t; // needed just to produce material for pruning
     do_analyze_rel() -> update_attstats() -> CatalogTupleUpdateWithInfo() -> heap_update(ctid(18,16)); 
PrepareToInvalidateCacheTuple(hashvalue: {hash1})
                 Session 2:
                 SELECT * FROM t WHERE i = 1;
                     SearchCatCacheInternal(hashValue: {hash1})
                         SearchCatCacheMiss(hashValue: {hash1})
     ProcessInvalidationMessages()
     CatCacheInvalidate(hashValue: {hash1})
     CatCacheInvalidate(hashValue: {hash2})
                             CatCacheInvalidate(hashValue: {hash1})
                         SearchCatCacheMiss returns tid(18,16)

                 ANALYZE t;
                     do_analyze_rel() -> update_attstats()
                         attno=0 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash2}) ->
SearchCatCacheMiss()
 
-> heap_page_prune(block: 18)
                         attno=1 SearchSysCache3() -> SearchCatCacheInternal(hashValue: {hash1}) -> oldtup =
tid(18,16)
                         CatalogTupleUpdateWithInfo() -> heap_update(tid(18, 16))

(A server.log with additional tracing messages for one of the failed runs
is attached.)

In other words, a backend can get some tuple id from catcache and then prune
the underlying page while searching for another (missing) entry in the cache
(any other reading of the page that may trigger pruning will do as well).
After that, an attempt to update that tuple might trigger the Assert().

[1] https://www.postgresql.org/message-id/17821-dd8c334263399284%40postgresql.org

Best regards,
Alexander
Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Peter Smith
Дата:
Сообщение: Re: [16+] subscription can end up in inconsistent state
Следующее
От: Laurenz Albe
Дата:
Сообщение: Re: BUG #18149: Incorrect lexeme for english token "proxy"