Thread: Make relcache init write errors not be fatal

Make relcache init write errors not be fatal

From
Jeff Janes
Date:
After running a testing server out of storage, I tried to track down why it was so hard to get it back up again.  (Rather than what I usually do which is just throwing it away and making the test be smaller).

I couldn't start a backend because it couldn't write the relcache init file.

I found this comment, but it did not carry its sentiment to completion:

        /*
         * We used to consider this a fatal error, but we might as well
         * continue with backend startup ...
         */

With the attached patch applied, I could at least get a backend going so I could drop some tables/indexes and free up space.

I'm not enamoured with the implementation of passing a flag down to write_item, but it seemed better than making write_item return an error code and then checking the return status in a dozen places.  Maybe we could turn write_item into a macro, so the macro can implement the "return" from the outer function directly?

Cheers,

Jeff
Attachments
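
The design question raised in the message above (pass a flag down to the write helper, or hide the early return in a macro) can be illustrated with a minimal C sketch. This is not the attached patch. `Sink`, `sink_write`, `write_item_flag`, and `WRITE_ITEM_OR_RETURN` are hypothetical names, a small fixed-size buffer stands in for a disk that runs out of space, and `fprintf(stderr, ...)` stands in for PostgreSQL's WARNING reporting.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the init file: a buffer that can "fill up". */
typedef struct Sink
{
    char   buf[8];
    size_t used;
} Sink;

/* Raw write: fails when the sink has no room left, like a full disk. */
static bool
sink_write(Sink *s, const void *data, size_t len)
{
    if (len > sizeof(s->buf) - s->used)
        return false;
    memcpy(s->buf + s->used, data, len);
    s->used += len;
    return true;
}

/* Variant A: a flag is passed down. On the first failure the helper warns
 * and clears *ok; every later call then becomes a no-op. */
static void
write_item_flag(Sink *s, const void *data, size_t len, bool *ok)
{
    if (!*ok)
        return;
    if (!sink_write(s, data, len))
    {
        fprintf(stderr, "WARNING: could not write init file\n");
        *ok = false;
    }
}

/* Variant B: a macro that returns from the *calling* function on failure,
 * so the caller needs no explicit status check after each item. */
#define WRITE_ITEM_OR_RETURN(s, data, len) \
    do { \
        if (!sink_write((s), (data), (len))) \
        { \
            fprintf(stderr, "WARNING: could not write init file\n"); \
            return false; \
        } \
    } while (0)

static bool
write_file_flag(Sink *s)
{
    bool ok = true;

    write_item_flag(s, "head", 4, &ok);
    write_item_flag(s, "body!", 5, &ok);    /* does not fit in 8 bytes */
    write_item_flag(s, "tail", 4, &ok);     /* skipped: ok is already false */
    return ok;
}

static bool
write_file_macro(Sink *s)
{
    WRITE_ITEM_OR_RETURN(s, "head", 4);
    WRITE_ITEM_OR_RETURN(s, "body!", 5);    /* returns false from here */
    WRITE_ITEM_OR_RETURN(s, "tail", 4);
    return true;
}
```

Both variants stop writing after the first failure and report it to the caller without aborting. The flag version keeps control flow explicit at the cost of an extra parameter on every call; the macro version keeps call sites short but hides a `return` inside something that looks like a function call. The third option mentioned in the message, returning a status from the helper and checking it at every call site, would look like the macro body written out by hand each time.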

Re: Make relcache init write errors not be fatal

From
Andres Freund
Date:
Hi,

On 2018-12-22 20:49:58 -0500, Jeff Janes wrote:
> After running a testing server out of storage, I tried to track down why it
> was so hard to get it back up again.  (Rather than what I usually do which
> is just throwing it away and making the test be smaller).
> 
> I couldn't start a backend because it couldn't write the relcache init file.
> 
> I found this comment, but it did not carry its sentiment to completion:
> 
>         /*
>          * We used to consider this a fatal error, but we might as well
>          * continue with backend startup ...
>          */
> 
> With the attached patch applied, I could at least get a backend going so I
> could drop some tables/indexes and free up space.

Why is this a good idea?  It'll just cause hard to debug performance
issues imo.

Greetings,

Andres Freund


Re: Make relcache init write errors not be fatal

From
Jeff Janes
Date:
On Sat, Dec 22, 2018 at 8:54 PM Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2018-12-22 20:49:58 -0500, Jeff Janes wrote:
> > After running a testing server out of storage, I tried to track down why it
> > was so hard to get it back up again.  (Rather than what I usually do which
> > is just throwing it away and making the test be smaller).
> >
> > I couldn't start a backend because it couldn't write the relcache init file.
> >
> > I found this comment, but it did not carry its sentiment to completion:
> >
> >         /*
> >          * We used to consider this a fatal error, but we might as well
> >          * continue with backend startup ...
> >          */
> >
> > With the attached patch applied, I could at least get a backend going so I
> > could drop some tables/indexes and free up space.
>
> Why is this a good idea?  It'll just cause hard to debug performance
> issues imo.


You get lots of WARNINGs, so it shouldn't be too hard to debug.  And once you drop a table or an index, the init will succeed and you wouldn't have the performance issues at all anymore.

The alternative, barring finding extraneous data on the same partition that can be removed, seems to be having indefinite downtime until you can locate a larger hard drive and move everything to it, or using dangerous hacks.

Cheers,

Jeff
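
The recovery path argued for in the last message (startup continues with a warning while the disk is full, and the init file is written normally once space has been freed) can be modelled with a small C toy. It is only a sketch under stated assumptions: `Disk`, `INIT_FILE_SIZE`, `try_write_init_file`, `start_backend`, and `drop_table` are hypothetical names that do not correspond to PostgreSQL functions, and the sizes are arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model of a data partition; every name here is hypothetical. */
typedef struct Disk
{
    size_t free_bytes;      /* space left on the partition */
    bool   has_init_file;   /* whether a cache file has been written */
} Disk;

#define INIT_FILE_SIZE 100  /* arbitrary size of the cache file */

/* Try to save the cache file; fails when there is not enough free space. */
static bool
try_write_init_file(Disk *d)
{
    if (d->free_bytes < INIT_FILE_SIZE)
        return false;
    d->free_bytes -= INIT_FILE_SIZE;
    d->has_init_file = true;
    return true;
}

/* Returns true when the backend is usable. A missing cache file only means
 * the cache has to be rebuilt, which is slower but still works. */
static bool
start_backend(Disk *d)
{
    if (d->has_init_file)
        return true;        /* fast path: load the existing cache file */

    /* Slow path: rebuild the cache, then try to save it for next time.
     * A failed save is reported but does not stop startup. */
    if (!try_write_init_file(d))
        fprintf(stderr, "WARNING: could not write init file\n");
    return true;
}

/* Dropping a table gives its space back to the partition. */
static void
drop_table(Disk *d, size_t table_bytes)
{
    d->free_bytes += table_bytes;
}
```

In this model a backend started on a full disk comes up without a cache file and emits a warning each time; after `drop_table` frees enough space, the next `start_backend` call writes the file and later starts take the fast path. The concern raised earlier in the thread corresponds to the slow path being taken repeatedly while the disk stays full.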