Discussion: Index bloat and REINDEX/VACUUM optimization for partial index


Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k
Date:
Hi,

TL;DR:
Observations:
  1. REINDEX requires a full table scan
    • Roughly: create a new index, rename it, drop the old one.
    • REINDEX is not incremental; running it frequently does not reduce future REINDEX time.
  2. REINDEX does not use the index itself
  3. VACUUM does not clean up the indexes (relpages >> reltuples). I understand VACUUM is supposed to remove pages only if there are no live tuples in the page, but even immediately after a vacuum I see relpages significantly greater than reltuples. I would have assumed relpages <= reltuples.
  4. The query planner does not consider index bloat, so it uses a highly bloated partial index that is terribly slow instead of another index.
Question: Is there a way to optimize Postgres VACUUM/REINDEX when using partial indexes?

We have a large table (tasks) that keeps track of all the tasks that are created and their statuses. Around 1.4 million tasks are created every day (~15 inserts per second).

One of the columns is an int `status` that can be one of (1 - Init, 2 - InProgress, 3 - Success, 4 - Aborted, 5 - Failure). (There are actually more statuses, but this gives the idea.)

On average, a task completes in around a minute, with some outliers that can run for as long as a few weeks. A periodic heartbeat updates the last-updated time in the table.

At any moment, there are around 1000-1500 tasks in pending statuses (Init + InProgress) out of around 500 million tasks. 

Now, we have a task monitoring query that will look for all pending tasks that have not received any update in the last n minutes.

```
SELECT [columns list]
  FROM tasks
  WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND updated < NOW() - interval '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This worked great initially; however, it started to get bloated very quickly, because every task starts in a pending state, gets multiple updates (many of them not HOT updates; we are working on optimizing the fill factor now), and is eventually deleted from the index (as its status changes to success).


```
\d+ tasks
                                                                           Table "public.tasks"
            Column             |            Type            | Collation | Nullable |              Default              | Storage  | Compression | Stats target | Description
-------------------------------+----------------------------+-----------+----------+-----------------------------------+----------+-------------+--------------+-------------
 id                            | bigint                     |           | not null | nextval('tasks_id_seq'::regclass) | plain    |             |              |
 client_id                     | bigint                     |           | not null |                                   | plain    |             |              |
 status                        | integer                    |           | not null |                                   | plain    |             |              |
 description                   | character varying(128)     |           | not null |                                   | extended |             |              |
 current_count                 | bigint                     |           | not null |                                   | plain    |             |              |
 target_count                  | bigint                     |           | not null |                                   | plain    |             |              |
 status_msg                    | character varying(4096)    |           |          |                                   | extended |             |              |
 blob_key                      | bigint                     |           |          |                                   | plain    |             |              |
 created                       | timestamp with time zone   |           | not null |                                   | plain    |             |              |
 updated                       | timestamp with time zone   |           | not null |                                   | plain    |             |              |
 idle_time                     | integer                    |           | not null | 0                                 | plain    |             |              |
 started                       | timestamp with time zone   |           |          |                                   | plain    |             |              |
Indexes:
    "tasks_pkey" PRIMARY KEY, btree (id)
    "tasks_created_idx" btree (created)
    "tasks_pending_status_created_idx" btree (status, created) WHERE status <> ALL (ARRAY[3, 4, 5])
    "tasks_client_id_status_created_idx" btree (client_id, status, created DESC)
    "tasks_status_idx" btree (status)
Access method: heap
Options: autovacuum_vacuum_scale_factor=0.02, autovacuum_analyze_scale_factor=0.02, fillfactor=70
```

Immediately after REINDEX

```
SELECT relname,reltuples,relpages FROM pg_class WHERE relname like 'tasks%idx%';

              relname               |   reltuples    | relpages
------------------------------------+----------------+----------
 tasks_pending_status_created_idx   |          34175 |      171
 tasks_created_idx                  |  5.3920026e+08 | 11288121
 tasks_client_id_status_created_idx |  5.3920026e+08 |  7031615
 tasks_status_idx                   |  5.3920026e+08 |  2215403
(9 rows)
```

A couple of days after a manual full REINDEX:
```
SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE (relname like 'tasks%idx%' OR relname='tasks');
              relname               | relpages |   reltuples    | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
------------------------------------+----------+----------------+---------------+---------+----------+----------------+------------+---------------
 tasks_pending_status_created_idx   |    79664 |         201831 |             0 | i       |        3 | f              |            |     652771328
 tasks_created_idx                  | 11384992 |    5.42238e+08 |             0 | i       |        1 | f              |            |   93481443328
 tasks_client_id_status_created_idx |  7167147 |    5.42274e+08 |             0 | i       |        5 | f              |            |   58727710720
 tasks_status_idx                   |  2258820 |  5.4223546e+08 |             0 | i       |        1 | f              |            |   18508734464
 tasks                              | 71805187 |   5.171037e+08 |      71740571 | r       |       30 | f              |            |  613282308096

```


Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Peter Geoghegan
Date:
On Mon, Aug 28, 2023 at 5:33 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:
> REINDEX requires a full table scan
>
> Roughly: create a new index, rename it, drop the old one.
> REINDEX is not incremental; running it frequently does not reduce future REINDEX time.

You didn't say which Postgres version you're on. Note that Postgres 14
can deal with index bloat a lot better than earlier versions could.
This is known to work well with partial indexes. See:

https://www.postgresql.org/message-id/flat/CAL9smLAjt9mZC2%3DqBeJwuNPq7KMAYGTWWQw_hvA-Lfo0b3ycow%40mail.gmail.com

--
Peter Geoghegan



Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k
Date:
Thanks Peter. It is 14.4, but on an AWS RDS Aurora instance. I am trying to read the links you shared (B-Tree deletion and deduplication, etc.). I still don't fully understand what I need to do. From the B-Tree documentation:

The average and worst-case number of versions per logical row can be kept low purely through targeted incremental deletion passes. It's quite possible that the on-disk size of certain indexes will never increase by even one single page/block despite constant version churn from UPDATEs.

In our case, almost all tuples eventually stop being covered by the index as they fail the predicate, and only a few thousand rows pass the index predicate at any point in time. But we still see the index size continue to increase, index lookups become slow over time, and vacuum (non-FULL) doesn't reduce the index size much.

Do we need to do anything specific to better utilize the targeted incremental deletion passes?


SELECT VERSION();
                                             version                                            
-------------------------------------------------------------------------------------------------
 PostgreSQL 14.4 on x86_64-pc-linux-gnu, compiled by x86_64-pc-linux-gnu-gcc (GCC) 7.4.0, 64-bit
(1 row)


Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Jeff Janes
Date:
On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:
Hi,

TL;DR:
Observations:
  1. REINDEX requires a full table scan
    • Roughly: create a new index, rename it, drop the old one.
    • REINDEX is not incremental; running it frequently does not reduce future REINDEX time.
  2. REINDEX does not use the index itself
  3. VACUUM does not clean up the indexes (relpages >> reltuples). I understand VACUUM is supposed to remove pages only if there are no live tuples in the page, but even immediately after a vacuum I see relpages significantly greater than reltuples. I would have assumed relpages <= reltuples.
  4. The query planner does not consider index bloat, so it uses a highly bloated partial index that is terribly slow instead of another index.
Your points 3 and 4 are not correct. Empty index pages are put on a freelist for future reuse; they are not physically removed from the underlying index files. Maybe they are not actually getting put on the freelist, or are not being reused from the freelist for some reason, but that would be a different issue. Use the extension pgstattuple to see what its function pgstatindex says about the index.
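
For example, a minimal check (pgstatindex's avg_leaf_density and deleted_pages columns are the interesting ones here):

```
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- Reports tree level, leaf density, empty/deleted pages, etc.
SELECT * FROM pgstatindex('tasks_pending_status_created_idx');
```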

The planner does take index bloat into consideration, but its effect size is low. Which it should be, as empty or irrelevant pages should be efficiently skipped during the course of most index operations. To figure out what is going on with your queries, you should do an EXPLAIN (ANALYZE, BUFFERS) of them, both when they are slow and when they are fast.
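
For instance, with the monitoring query from upthread (the column list is filled in here purely for illustration):

```
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, status, created, updated
  FROM tasks
 WHERE status NOT IN (3, 4, 5)
   AND created > NOW() - INTERVAL '30 days'
   AND updated < NOW() - INTERVAL '30 minutes';
```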
 
Question: Is there a way to optimize postgres vacuum/reindex when using partial indexes?

Without knowing what is actually going wrong, I can only offer generalities. Make sure you don't have long-lived transactions which prevent efficient cleanup. Increase the frequency at which vacuum runs on the table. It can't reduce the size of an already bloated index, but by keeping the freelist stocked it should be able to prevent the index from getting bloated in the first place. Also, it can remove empty pages from being linked into the index tree structure, which means they won't need to be scanned even though they are still in the file. It can also free up space inside non-empty pages for future reuse within that same page, so that index tuples don't need to be chased down in the table only to be found not visible.
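
A sketch of both checks (the one-hour cutoff and the scale factor value are arbitrary illustrations, not recommendations):

```
-- Look for long-lived transactions that prevent cleanup
SELECT pid, state, xact_start, left(query, 60) AS query
  FROM pg_stat_activity
 WHERE xact_start < NOW() - INTERVAL '1 hour';

-- Vacuum this table more often (a per-table setting, not system-wide)
ALTER TABLE tasks SET (autovacuum_vacuum_scale_factor = 0.005);
```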
 
```
SELECT [columns list]
  FROM tasks
  WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND updated < NOW() - interval '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This looks like a poorly designed index.  Since the status condition exactly matches the index where clause, there is no residual point in having "status" be the first column in the index, it can only get in the way (for this particular query).  Move it to the end, or remove it altogether.
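
Something along these lines (the new index name is made up; CONCURRENTLY keeps the table writable during the build):

```
CREATE INDEX CONCURRENTLY tasks_pending_created_idx
    ON tasks (created)
    WHERE status NOT IN (3, 4, 5);
-- once the new index is built and valid:
DROP INDEX CONCURRENTLY tasks_pending_status_created_idx;
```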

Within the tuples which pass the status check, which inequality is more selective, the "created" one or "updated" one?
 
Cheers,

Jeff

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k
Date:


On Tue, Aug 29, 2023, 12:43 PM Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:
Hi,

TL;DR:
Observations:
  1. REINDEX requires a full table scan
    • Roughly: create a new index, rename it, drop the old one.
    • REINDEX is not incremental; running it frequently does not reduce future REINDEX time.
  2. REINDEX does not use the index itself
  3. VACUUM does not clean up the indexes (relpages >> reltuples). I understand VACUUM is supposed to remove pages only if there are no live tuples in the page, but even immediately after a vacuum I see relpages significantly greater than reltuples. I would have assumed relpages <= reltuples.
  4. The query planner does not consider index bloat, so it uses a highly bloated partial index that is terribly slow instead of another index.
Your points 3 and 4 are not correct. Empty index pages are put on a freelist for future reuse; they are not physically removed from the underlying index files. Maybe they are not actually getting put on the freelist, or are not being reused from the freelist for some reason, but that would be a different issue. Use the extension pgstattuple to see what its function pgstatindex says about the index.

The planner does take index bloat into consideration, but its effect size is low. Which it should be, as empty or irrelevant pages should be efficiently skipped during the course of most index operations. To figure out what is going on with your queries, you should do an EXPLAIN (ANALYZE, BUFFERS) of them, both when they are slow and when they are fast.
 
Question: Is there a way to optimize postgres vacuum/reindex when using partial indexes?

Without knowing what is actually going wrong, I can only offer generalities. Make sure you don't have long-lived transactions which prevent efficient cleanup. Increase the frequency at which vacuum runs on the table. It can't reduce the size of an already bloated index, but by keeping the freelist stocked it should be able to prevent the index from getting bloated in the first place. Also, it can remove empty pages from being linked into the index tree structure, which means they won't need to be scanned even though they are still in the file. It can also free up space inside non-empty pages for future reuse within that same page, so that index tuples don't need to be chased down in the table only to be found not visible.
 
```
SELECT [columns list]
  FROM tasks
  WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND updated < NOW() - interval '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This looks like a poorly designed index.  Since the status condition exactly matches the index where clause, there is no residual point in having "status" be the first column in the index, it can only get in the way (for this particular query).  Move it to the end, or remove it altogether.
Interesting. I don't understand why it would get in the way. Unfortunately we have a few other cases where status is used in a filter. That said, I will consider how to make this work.
Would removing status from the index columns improve the HOT update percentage? For example, changing status from 1->2 doesn't change anything in the index (assuming the other criteria for HOT updates are met), but I am not sure how it is implemented.
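
For reference, the table's current HOT-update ratio can be read from pg_stat_user_tables:

```
SELECT n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
  FROM pg_stat_user_tables
 WHERE relname = 'tasks';
```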


Within the tuples which pass the status check, which inequality is more selective, the "created" one or "updated" one?
Obviously the updated time is more selective (after status); the created time is included only to exclude some bugs in our system that had left old tasks stuck in progress (and for sorting). We do try to clean up occasionally, but not every time.
However, we cannot add an index on the `updated` column, because that timestamp gets updated over 10x on average for each task. If any index uses a column, then an update to that column is not HOT, and every index needs to be updated. That would clearly add bloat to every index. Did I miss something?
 
 
Cheers,

Jeff

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Maxim Boguk
Date:

At any moment, there are around 1000-1500 tasks in pending statuses (Init + InProgress) out of around 500 million tasks. 

Now, we have a task monitoring query that will look for all pending tasks that have not received any update in the last n minutes.

```
SELECT [columns list]
  FROM tasks
  WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND updated < NOW() - interval '30 minutes'
```

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This worked great initially; however, it started to get bloated very quickly, because every task starts in a pending state, gets multiple updates (many of them not HOT updates; we are working on optimizing the fill factor now), and is eventually deleted from the index (as its status changes to success).

From my experience, I suspect the problem lies with the "around 500 million tasks" part.
Autovacuum indeed cleans old dead index entries, but how many such dead index entries will accumulate in a 500M-row table before autovacuum kicks in?

With the default autovacuum_vacuum_scale_factor (0.2, i.e. 20% of the table size), the index will collect something like 100M outdated/dead index entries before autovacuum kicks in and cleans them all (in the worst case), and of course that leads to huge index bloat and awful performance.

Even if you scale autovacuum_vacuum_scale_factor down to some unreasonably low value like 0.01, the index still bloats to 5M dead entries before an autovacuum run, and constant vacuuming of a huge 500M-row table will put a huge load on the database server.

Unfortunately there is no easy way out of this situation on the database side. In general I recommend not implementing a fast-paced queue-like workload inside a huge, constantly growing table; it never works well, because you cannot keep the partial indexes for the queue in a clean/non-bloated state.

In my opinion the best solution is to keep the list of entries to process ("around 1000-1500 tasks in pending statuses") duplicated in a separate tiny table (via triggers, or implemented at the application level); in that case autovacuum will be able to quickly clean dead entries from the index.
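
A minimal sketch of that idea (all names are made up; a real implementation would also need to handle DELETEs on tasks and an initial backfill):

```
CREATE TABLE pending_tasks (
    task_id bigint PRIMARY KEY,
    created timestamptz NOT NULL
);

CREATE OR REPLACE FUNCTION sync_pending_tasks() RETURNS trigger AS $$
BEGIN
    IF NEW.status NOT IN (3, 4, 5) THEN
        -- task is (still) pending: make sure it is in the queue table
        INSERT INTO pending_tasks (task_id, created)
        VALUES (NEW.id, NEW.created)
        ON CONFLICT (task_id) DO NOTHING;
    ELSE
        -- task reached a final status: drop it from the queue table
        DELETE FROM pending_tasks WHERE task_id = NEW.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tasks_pending_sync
    AFTER INSERT OR UPDATE OF status ON tasks
    FOR EACH ROW EXECUTE FUNCTION sync_pending_tasks();
```

The monitoring query can then scan pending_tasks (joining back to tasks when it needs the other columns), and autovacuum keeps the tiny table and its indexes clean almost for free.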

Kind Regards,
Maxim


--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k
Date:
Thanks Maxim, that's something we are considering now: keep the in-progress tasks in one table and periodically move the old, completed tasks to an archive table.
We could use a view that unions them for most queries.

I'm not sure if that's the best alternative though, and we want to know if there are any gotchas to worry about.


Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Jeff Janes
Date:
On Wed, Aug 30, 2023 at 8:43 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:


On Tue, Aug 29, 2023, 12:43 PM Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Aug 28, 2023 at 8:33 PM jayaprabhakar k <jayaprabhakar@gmail.com> wrote:

Since we are only interested in the pending tasks, I created a partial index
 `"tasks_pending_status_created_type_idx" btree (status, created, task_type) WHERE status <> ALL (ARRAY[3, 4, 5])`.

This looks like a poorly designed index.  Since the status condition exactly matches the index where clause, there is no residual point in having "status" be the first column in the index, it can only get in the way (for this particular query).  Move it to the end, or remove it altogether.
Interesting. I don't understand why it would get in the way. Unfortunately we have a few other cases where status is used in a filter. That said, I will consider how to make this work.
Would removing status from the index columns improve the HOT update percentage? For example, changing status from 1->2 doesn't change anything in the index (assuming the other criteria for HOT updates are met), but I am not sure how it is implemented.

No, changes to the status column will not qualify as HOT updates, even if status is only in the WHERE clause and not the index body.  I don't know if there is a fundamental reason that those can't be done as HOT, or if it is just an optimization that no one implemented.
 


Within the tuples which pass the status check, which inequality is more selective, the "created" one or "updated" one?
Obviously the updated time is more selective (after status); the created time is included only to exclude some bugs in our system that had left old tasks stuck in progress (and for sorting). We do try to clean up occasionally, but not every time.

If "created" were the leading column in the index, then it could jump directly to the part of the index which meets the `created > ...` without having to scroll through all of them and throw them out one by one.  But it sounds like there are so few of them that being able to skip them wouldn't be worth very much.
 
 
However, we cannot add an index on the `updated` column, because that timestamp gets updated over 10x on average for each task. If any index uses a column, then an update to that column is not HOT, and every index needs to be updated. That would clearly add bloat to every index. Did I miss something?

Why does it get updated so much?  It seems like status should go from 1 to 2, then from 2 to 3,4,or 5, and then be done.  So only 2 updates, not 10.  Maybe the feature which needs this frequent update could be done in some other way which is less disruptive.

But anyway, PostgreSQL has features to prevent index bloat from becoming too severe of a problem, and you should figure out why they are not working for you. The most common ones I know of are 1) long open snapshots preventing cleanup, 2) all index scans being bitmap index scans, which don't do micro-vacuuming/index hinting the way ordinary btree index scans do, and 3) running the queries on a hot standby, where index hint bits must be ignored. If you could identify and solve this issue, then you wouldn't need to twist yourself into knots avoiding non-HOT updates.

Cheers,

Jeff

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Jeff Janes
Date:
On Thu, Aug 31, 2023 at 11:06 AM Maxim Boguk <maxim.boguk@gmail.com> wrote:
 
With the default autovacuum_vacuum_scale_factor (0.2, i.e. 20% of the table size), the index will collect something like 100M outdated/dead index entries before autovacuum kicks in and cleans them all (in the worst case), and of course that leads to huge index bloat and awful performance.

Index bloat doesn't automatically lead to awful performance.  There must be some additional factor at play.
 
Even if you scale autovacuum_vacuum_scale_factor down to some unreasonably low value like 0.01, the index still bloats to 5M dead entries before an autovacuum run, and constant vacuuming of a huge 500M-row table will put a huge load on the database server.

For this type of situation, I would generally set autovacuum_vacuum_scale_factor to 0, and use autovacuum_vacuum_threshold to drive the vacuuming instead. But I'd make those changes just on the queue table(s), not system-wide. Due to the visibility map, the load on the server does not need to be huge just due to the table, as the stable part of the table can be ignored. The problem is that each index still needs to be read entirely for each vacuum cycle, which would not be much of a problem for the partial indexes, but certainly could be for the full indexes. There are some very recent improvements in this area, but I don't think they can be applied selectively to specific indexes.
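
For example (the threshold value is only an illustration):

```
ALTER TABLE tasks SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 100000
);
```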

 

Unfortunately there is no easy way out of this situation on the database side. In general I recommend not implementing a fast-paced queue-like workload inside a huge, constantly growing table; it never works well, because you cannot keep the partial indexes for the queue in a clean/non-bloated state.

In my opinion the best solution is to keep the list of entries to process ("around 1000-1500 tasks in pending statuses") duplicated in a separate tiny table (via triggers, or implemented at the application level); in that case autovacuum will be able to quickly clean dead entries from the index.

You should be able to use declarative partitioning to separate the "final" tuples from the "active" tuples, to get the same benefit but with less work.
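
A rough sketch of that idea (hypothetical names, and only a few of the columns shown):

```
CREATE TABLE tasks_p (
    id      bigint NOT NULL,
    status  integer NOT NULL,
    created timestamptz NOT NULL,
    updated timestamptz NOT NULL
    -- ... remaining columns ...
) PARTITION BY LIST (status);

-- active (pending) and final tuples live in separate partitions
CREATE TABLE tasks_active PARTITION OF tasks_p FOR VALUES IN (1, 2);
CREATE TABLE tasks_final  PARTITION OF tasks_p FOR VALUES IN (3, 4, 5);
```

When a task's status changes to a final value, the row moves to the other partition automatically (PostgreSQL 11+), so the active partition and its indexes stay tiny.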

Cheers,

Jeff

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Maxim Boguk
Date:
But anyway, PostgreSQL has features to prevent index bloat from becoming too severe of a problem, and you should figure out why they are not working for you. The most common ones I know of are 1) long open snapshots preventing cleanup, 2) all index scans being bitmap index scans, which don't do micro-vacuuming/index hinting the way ordinary btree index scans do, and 3) running the queries on a hot standby, where index hint bits must be ignored. If you could identify and solve this issue, then you wouldn't need to twist yourself into knots avoiding non-HOT updates.

I am not sure that kill bits can be a complete fix for indexes with tens of millions of dead entries and only a handful of live entries. As I understand the mechanics of kill bits, they help avoid excessive heap visibility checks for dead tuples, but tuples with the kill bit set still have to be read from the index first. And with many millions of dead entries that isn't free.

PS: ignoring kill bits on hot standby slaves is a source of endless pain in many cases.

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k
Date:
Thanks Maxim and Jeff. 
1. Do you have any pointers to the kill-bits issue on hot standby slaves? We use a hot standby instance for many queries, so I want to learn more about it.
2. I am now considering partitioning the table. I am curious whether we can set up partitions on mutable columns. More specifically, <status, created>, where status is mutable and usually ends up in a terminal state (success, failure or aborted).

I could not find any documentation on the performance implications of partitioning by a mutable column; any guidance would be helpful. I had previously underestimated the impact of an index on a mutable column, so I want to be cautious this time.
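
For context on that question: since PostgreSQL 11, an UPDATE that changes the partition key is allowed and moves the row to the right partition, but internally it is a DELETE plus INSERT, so such updates are never HOT and a concurrent update of a moved row will error out. Using the hypothetical layout sketched upthread:

```
-- With LIST partitioning on status, completing a task moves its row
-- from the "active" partition to the "final" one (DELETE + INSERT):
UPDATE tasks_p SET status = 3 WHERE id = 42;
```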
