Обсуждение: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes

Поиск
Список
Период
Сортировка

BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17586
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system:   Ubuntu Linux
Description:

Details about the data structure
pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
partition of exports_new table (Partition key: RANGE (epoch))

During the reindex job there were quite a lot new inserts into
exports_new_2022_32w table via partition head.


2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545267 txid:354388805] [INSERT] LOG:  duration: 2711.995 ms
 statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
                (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
                GROUP BY 1,2,3)


2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/1.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/0.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/2.0", size 252370944
2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/1.0", size 253599744
2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/0.0", size 251535360
2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT:  REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index


2022-08-11 02:47:30.023 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545272 txid:354389107] [INSERT] LOG:  duration: 1991.666 ms
 statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
                (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
                GROUP BY 1,2,3)
2022-08-11 02:47:55.962 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545274 txid:354389164] [INSERT] LOG:  duration: 4027.248 ms
 statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
                (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
                GROUP BY 1,2,3)
2022-08-11 02:48:02.883 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545275 txid:354389277] [INSERT] LOG:  duration: 1055.236 ms
 statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
                (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
                GROUP BY 1,2,3)
2022-08-11 02:48:24.914 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545278 txid:354389390] [INSERT] LOG:  duration: 2020.759 ms
 statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
                (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
                GROUP BY 1,2,3)

2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478386 txid:0] [SELECT] ERROR:  missing chunk number 0 for toast
value 2546878653 in pg_toast_2502234687
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478386 txid:0] [SELECT] STATEMENT:  SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
$3

2022-08-11 02:48:49.651 UTC 1187482 postgres@***_exports from [local]
[vxid:14/0 txid:0] [REINDEX] LOG:  duration: 187669.153 ms  statement:
REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index

2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478398 txid:0] [SELECT] ERROR:  missing chunk number 0 for toast
value 2546878502 in pg_toast_2502234687
2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478398 txid:0] [SELECT] STATEMENT:  SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
$3

and so on...

A follow-up secondary manual REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index resolved the problem with damaged
toast.

Problems do not always happen (rarely) and I cannot build a quick reproducer
so far.

Kind Regards,
Maxim


Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes

От
Andres Freund
Дата:
Hi,

On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
> Bug reference:      17586
> Logged by:          Maxim Boguk
> Email address:      maxim.boguk@gmail.com
> PostgreSQL version: 14.4
> Operating system:   Ubuntu Linux
> Description:
>
> Details about the data structure
> pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
> partition of exports_new table (Partition key: RANGE (epoch))
>
> During the reindex job there were quite a lot new inserts into
> exports_new_2022_32w table via partition head.
>

> 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
> [vxid:12/63545267 txid:354388805] [INSERT] LOG:  duration: 2711.995 ms
>  statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
> field_ids)
>                 (SELECT field_board_id, (field_id % 256)::int
> field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
> field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
> t
>                 GROUP BY 1,2,3)
>
>
> 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
> [vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
> "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192



> [vxid:10/63478386 txid:0] [SELECT] ERROR:  missing chunk number 0 for toast
> value 2546878653 in pg_toast_2502234687
> 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
> [vxid:10/63478386 txid:0] [SELECT] STATEMENT:  SELECT field_ids FROM
> exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
> $3

Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the time?

This issue could be due to the xmin horizon going backward, rather than a bug
in reindex concurrently itself.


Too bad we don't have an errcontext set up for this, it'd be quite helpful to
know what row the missing toast value is in, so we could check its visibility.

Greetings,

Andres Freund



Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes

От
Maxim Boguk
Дата:


On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
> Bug reference:      17586
> Logged by:          Maxim Boguk
> Email address:      maxim.boguk@gmail.com
> PostgreSQL version: 14.4
> Operating system:   Ubuntu Linux
> Description:
>
> Details about the data structure
> pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
> partition of exports_new table (Partition key: RANGE (epoch))
>
> During the reindex job there were quite a lot new inserts into
> exports_new_2022_32w table via partition head.
>

> 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
> [vxid:12/63545267 txid:354388805] [INSERT] LOG:  duration: 2711.995 ms
>  statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
> field_ids)
>                 (SELECT field_board_id, (field_id % 256)::int
> field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
> field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
> t
>                 GROUP BY 1,2,3)
>
>
> 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
> [vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
> "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192



> [vxid:10/63478386 txid:0] [SELECT] ERROR:  missing chunk number 0 for toast
> value 2546878653 in pg_toast_2502234687
> 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
> [vxid:10/63478386 txid:0] [SELECT] STATEMENT:  SELECT field_ids FROM
> exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
> $3

Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the time?

This issue could be due to the xmin horizon going backward, rather than a bug
in reindex concurrently itself.


Too bad we don't have an errcontext set up for this, it'd be quite helpful to
know what row the missing toast value is in, so we could check its visibility.

Hi,
hot_standby_feedback enabled replica - yes (but with minimal activity)
use replication slots - no and no other apps with replication slots
Did it reconnect at the time - most definitely no, pg_stat_replication shows
backend_start    | 2022-06-30 07:36:51.095073+00
for replication connection.
Also I have log_replication_commands enabled and nothing about replication logged during the last week.


--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"

Re: BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes

От
Andres Freund
Дата:
Hi,

On 2022-08-16 10:34:14 +0200, Maxim Boguk wrote:
> On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
> > > Bug reference:      17586
> > > Logged by:          Maxim Boguk
> > > Email address:      maxim.boguk@gmail.com
> > > PostgreSQL version: 14.4
> > > Operating system:   Ubuntu Linux
> > > Description:
> > >
> > > Details about the data structure
> > > pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
> > > partition of exports_new table (Partition key: RANGE (epoch))
> > >
> > > During the reindex job there were quite a lot new inserts into
> > > exports_new_2022_32w table via partition head.
> > >
> >
> > > 2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
> > > [vxid:12/63545267 txid:354388805] [INSERT] LOG:  duration: 2711.995 ms
> > >  statement: INSERT INTO exports_new(field_board_id, field_id_mod_256,
> > epoch,
> > > field_ids)
> > >                 (SELECT field_board_id, (field_id % 256)::int
> > > field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
> > > field_board_id, epoch, unnest(field_ids) AS field_id FROM
> > temp_export_ids)
> > > t
> > >                 GROUP BY 1,2,3)
> > >
> > >
> > > 2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
> > > [vxid:14/66875173 txid:0] [REINDEX] LOG:  temporary file: path
> > > "base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192
> >
> >
> >
> > > [vxid:10/63478386 txid:0] [SELECT] ERROR:  missing chunk number 0 for
> > toast
> > > value 2546878653 in pg_toast_2502234687
> > > 2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from
> > [local]
> > > [vxid:10/63478386 txid:0] [SELECT] STATEMENT:  SELECT field_ids FROM
> > > exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND
> > epoch =
> > > $3
> >
> > Hm. What is the setup of this system? Do you have a hot_standby_feedback
> > enabled replica? Does it use replication slots? Did it reconnect at the
> > time?
> >
> > This issue could be due to the xmin horizon going backward, rather than a
> > bug
> > in reindex concurrently itself.
> >
> >
> > Too bad we don't have an errcontext set up for this, it'd be quite helpful
> > to
> > know what row the missing toast value is in, so we could check its
> > visibility.
> >
> 
> Hi,
> hot_standby_feedback enabled replica - yes (but with minimal activity)
> use replication slots - no and no other apps with replication slots
> Did it reconnect at the time - most definitely no, pg_stat_replication
> shows
> backend_start    | 2022-06-30 07:36:51.095073+00
> for replication connection.
> Also I have log_replication_commands enabled and nothing about replication
> logged during the last week.

Hm. How often does this occur? And could you run a server with a bit of added
debugging? Otherwise I don't see how we can narrow this down, unfortunately.

Greetings,

Andres Freund