RE: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode

From: Reid Thompson
Subject: RE: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode
Msg-id: SJ0PR11MB4848B9E3A92CEA5F74FE7FE09E499@SJ0PR11MB4848.namprd11.prod.outlook.com
In response to: Re: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List: pgsql-general

Alvaro,
Thanks for the responses and the explanation. It is appreciated.  We will investigate rewriting the query to minimize
the result set.

Thanks again.
reid

-----Original Message-----
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Sent: Monday, April 19, 2021 11:59 AM
To: Reid Thompson <Reid.Thompson@omnicell.com>
Cc: pgsql-general@lists.postgresql.org
Subject: Re: could not read from hash-join temporary file: SUCCESS && DB goes into recovery mode

On 2021-Apr-19, Reid Thompson wrote:

> Thanks - I found that, which seems to fix the error handling right? Or
> does it actually correct the cause of the segfault also?

Uh, what segfault?  You didn't mention one.  Yes, it fixes the error handling, so when the system runs out of disk
space, that's correctly reported instead of continuing.

... Ah, I see now that you mentioned in the subject line that the DB goes into recovery mode.  That's exactly why I was
looking at that problem last year.  What I saw is that the hash-join spill-to-disk phase runs out of disk, so the disk
file is corrupt; later the hash-join reads that data back into memory, but because it is incomplete, it follows a broken
pointer somewhere and causes a crash.

(In our customer case it was actually a bit more complicated: they had
*two* sessions running the same large hash-join query, and one of them filled up the disk first, then the other also did
that; some time later one of them raised an ERROR, freeing up disk space, which allowed the other to continue until it
tried to read hash-join data back and crashed).

So, yes, the fix will avoid the crash, because once you run out of disk space the hash-join will be aborted
and nothing will try to read broken data.
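
As an illustration of the failure mode discussed above, here is a minimal sketch in C. It is not PostgreSQL's code; the helper names write_all and read_exact are hypothetical. It shows two things: a write path that treats a short write as an error (assuming "no space left" when the OS did not set errno), and a read path that reports a short read explicitly rather than formatting errno, which would still be 0 in that case and print as "Success" on glibc. That is presumably where the "SUCCESS" in the subject line comes from.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write exactly len bytes or fail.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;              /* real error, e.g. ENOSPC */
        }
        if (n == 0) {
            /* No progress and no errno: assume the disk is full. */
            errno = ENOSPC;
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

/*
 * Hypothetical helper: read exactly len bytes or fail.
 * Returns 0 on success, -1 on failure; msg receives a description.
 */
int read_exact(int fd, void *buf, size_t len, char *msg, size_t msglen)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            /* A failed system call: errno is meaningful here. */
            snprintf(msg, msglen, "could not read from temporary file: %s",
                     strerror(errno));
            return -1;
        }
        if (n == 0) {
            /* End of file before len bytes: errno is NOT meaningful,
             * so report the byte counts instead of strerror(errno). */
            snprintf(msg, msglen,
                     "could not read from temporary file: read only %zu of %zu bytes",
                     got, len);
            return -1;
        }
        got += (size_t) n;
    }
    return 0;
}
```

If the writer aborts as soon as write_all fails, a truncated file is never handed to a reader. If a truncated file is read anyway, read_exact fails with a message naming the byte counts instead of returning incomplete data to the caller.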

You'll probably have to rewrite your query to avoid eating 2TB of disk space.

--
Álvaro Herrera       Valdivia, Chile


