Re: Long running query causing XID limit breach

Поиск
Список
Период
Сортировка
От sud
Тема Re: Long running query causing XID limit breach
Дата
Msg-id CAD=mzVU9zMVJx__Z1MeWPNWoMscJwLfjvrmK2fsmSrOOSf9=Cw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Long running query causing XID limit breach  (sud <suds1434@gmail.com>)
Ответы Re: Long running query causing XID limit breach
Список pgsql-general
Also,if i am getting it correct, it means we should not run any transaction (even if it's legitimate one like for e.g. a big Reporting "SELECT" query) beyond 10hrs, as that will end up consuming 10*200million XID per hour= 2billion XID limit saturation and thus causing system failure. Hope my understanding is correct here.

On Thu, May 23, 2024 at 11:41 AM sud <suds1434@gmail.com> wrote:

On Thu, May 23, 2024 at 10:42 AM Muhammad Salahuddin Manzoor <salahuddin.m@bitnine.net> wrote:
Greetings,

Running `VACUUM table_name;` on a partitioned table will vacuum each partition individually, not the whole table as a single unit.

Yes, running `VACUUM table_name;` frequently on tables or partitions with heavy DML is recommended.

Regular `VACUUM` does not lock the table for reads or writes, so it won't disrupt ongoing 24/7 data operations.

"optimize autovacuum"
Yes. Adjust following parameters as per your system/environment requirement
autovacuum_max_workers, 
autovacuum_freeze_max_age , 
autovacuum_vacuum_cost_delay

Following need to be first tested thoroughly in a test environment.
Recommended Alert Threshold
Alert at 50% Usage: Set the alert threshold at 1 billion used XIDs. This provides a significant buffer, giving you ample time to take corrective action before reaching the critical limit.

Calculation Rationale
Daily XID Usage: Approximately 4 billion rows per day implies high XID consumption.
Buffer Time: At 1 billion XIDs, you would still have 1 billion XIDs remaining, giving you roughly 12 hours to address the issue if your system consumes 200 million XIDs per hour.



Thank you so much. That helps.
So apart from setting these alerts on "Maximumusedtxnids" and making the vacuum optimized by tweaking above parameters, should we also need to have monitoring in place to ensure the Vacuum is not taking longer as compared to its normal runtime and also if it's getting blocked/failed by something? Like for example in our case where the select query was running longer , so the vacuum must not be able to succeed every time it attempts, so is it really worth having that level of alerting?  and also how can we get an idea regarding if the vacuum is not succeeding or getting failed etc to avoid such upcoming issues? 

В списке pgsql-general по дате отправления:

Предыдущее
От: sud
Дата:
Сообщение: Re: Long running query causing XID limit breach
Следующее
От: Muhammad Salahuddin Manzoor
Дата:
Сообщение: Re: Long running query causing XID limit breach