Обсуждение: PATCH: Add REINDEX tag to event triggers

Поиск
Список
Период
Сортировка

PATCH: Add REINDEX tag to event triggers

От
Garrett Thornburg
Дата:
Added my v1 patch to add REINDEX to event triggers.

I originally built this against pg15 but rebased to master for the patch to hopefully make it easier for maintainers to merge. The rebase was automatic so it should be easy to include into any recent version. I'd love for this to land in pg16 if possible.

I added regression tests and they are passing. I also formatted the code using the project tools. Docs are included too.

A few notes:
1. I tried to not touch the function parameters too much and opted to create a convenience function that makes it easy to attach index or table OIDs to the current event trigger list.
2. I made sure both concurrent and regular reindex of index, table, database work as expected.
3. The ddl end command will make available all indexes that were modified by the reindex command. This is a large list when you run "reindex database" but works really well. I debated returning records of the table or DB that were reindexed but that doesn't really make sense. Returning each index fits my use case of building an audit record around the index lifecycle (hence the motivation for the patch).

Thanks for your time,

Garrett Thornburg
Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Thu, Jul 20, 2023 at 10:47:00PM -0600, Garrett Thornburg wrote:
> Added my v1 patch to add REINDEX to event triggers.
>
> I originally built this against pg15 but rebased to master for the patch
> to hopefully make it easier for maintainers to merge. The rebase was
> automatic so it should be easy to include into any recent version. I'd love
> for this to land in pg16 if possible.

The development of Postgres 16 has finished last April and this
version is in a stabilization period.  We are currently running the
first commit fest of PostgreSQL 17, though.  I would recommend to
register your patch here so as it is not lost:
https://commitfest.postgresql.org/44/

> I added regression tests and they are passing. I also formatted the code
> using the project tools. Docs are included too.

Hmm.  What are the use cases you have in mind where you need to know
the list of indexes listed by an event trigger like this?  Just
monitoring to know which indexes have been rebuilt?  We have a
table_rewrite which is equivalent to check if a relfilenode has
changed, so why not, I guess..

Assuming that ddl_command_start is not supported is incorrect.  For
example, with your patch applied, if I do the following setup:
create event trigger regress_event_trigger on ddl_command_start
  execute procedure test_event_trigger();
create function test_event_trigger() returns event_trigger as $$
  BEGIN
      RAISE NOTICE 'test_event_trigger: % %', tg_event, tg_tag;
  END
  $$ language plpgsql;

Then a REINDEX gives me that:
reindex table pg_class;
NOTICE:  00000: test_event_trigger: ddl_command_start REINDEX

The first paragraphs of event-trigger.sgml describe the following:
     The <literal>ddl_command_start</literal> event occurs just before the
     execution of a <literal>CREATE</literal>, <literal>ALTER</literal>, <literal>DROP</literal>,
     <literal>SECURITY LABEL</literal>,
     <literal>COMMENT</literal>, <literal>GRANT</literal> or <literal>REVOKE</literal>
[...]
    The <literal>ddl_command_end</literal> event occurs just after the execution of
    this same set of commands.  To obtain more details on the <acronym>DDL</acronym>

So it would need a refresh to list REINDEX.

        case T_ReindexStmt:
-           lev = LOGSTMT_ALL;  /* should this be DDL? */
+           lev = LOGSTMT_DDL;

This, unfortunately, may not be as right as it is simple because
REINDEX is not a DDL as far as I know (it is not in the SQL
standard).
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Fri, Jul 21, 2023 at 12:47 PM Garrett Thornburg <film42@gmail.com> wrote:
>
> Added my v1 patch to add REINDEX to event triggers.
>
> I originally built this against pg15 but rebased to master for the patch to hopefully make it easier for maintainers
tomerge. The rebase was automatic so it should be easy to include into any recent version. I'd love for this to land in
pg16if possible. 
>
> I added regression tests and they are passing. I also formatted the code using the project tools. Docs are included
too.
>
> A few notes:
> 1. I tried to not touch the function parameters too much and opted to create a convenience function that makes it
easyto attach index or table OIDs to the current event trigger list. 
> 2. I made sure both concurrent and regular reindex of index, table, database work as expected.
> 3. The ddl end command will make available all indexes that were modified by the reindex command. This is a large
listwhen you run "reindex database" but works really well. I debated returning records of the table or DB that were
reindexedbut that doesn't really make sense. Returning each index fits my use case of building an audit record around
theindex lifecycle (hence the motivation for the patch). 
>
> Thanks for your time,
>
> Garrett Thornburg


main/postgres/src/backend/tcop/utility.c
535: /*
536:  * standard_ProcessUtility itself deals only with utility commands for
537:  * which we do not provide event trigger support.  Commands that do have
538:  * such support are passed down to ProcessUtilitySlow, which contains the
539:  * necessary infrastructure for such triggers.
540:  *
541:  * This division is not just for performance: it's critical that the
542:  * event trigger code not be invoked when doing START TRANSACTION for
543:  * example, because we might need to refresh the event trigger cache,
544:  * which requires being in a valid transaction.
545:  */

so  T_ReindexStmt should only be in  ProcessUtilitySlow, if you want
to create an event trigger on reindex?

regression tests work fine. I even play with partitions.



Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
> so  T_ReindexStmt should only be in  ProcessUtilitySlow, if you want
> to create an event trigger on reindex?
>
> regression tests work fine. I even play with partitions.

It would be an idea to have some regression tests for partitions,
actually, so as some patterns around ReindexMultipleInternal() are
checked.  We could have a REINDEX DATABASE in a TAP test with an event
trigger, as well, but I don't feel strongly about the need to do that
much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
partitions cover the multi-table case.
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Wed, Jul 26, 2023 at 7:51 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
> > so  T_ReindexStmt should only be in  ProcessUtilitySlow, if you want
> > to create an event trigger on reindex?
> >
> > regression tests work fine. I even play with partitions.
>
> It would be an idea to have some regression tests for partitions,
> actually, so as some patterns around ReindexMultipleInternal() are
> checked.  We could have a REINDEX DATABASE in a TAP test with an event
> trigger, as well, but I don't feel strongly about the need to do that
> much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
> partitions cover the multi-table case.
> --
> Michael

quite verbose, copied from partition-info.sql. meet the expectation:
partitioned index will do nothing, partition index will trigger event
trigger.
------------------------------------------------
DROP EVENT TRIGGER  IF EXISTS  end_reindex_command  CASCADE;
DROP EVENT TRIGGER  IF EXISTS  start_reindex_command  CASCADE;

BEGIN;
CREATE OR REPLACE FUNCTION reindex_end_command()
RETURNS event_trigger AS $$
DECLARE
  obj record;
BEGIN
    raise notice 'begin of reindex_end_command';

    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
      RAISE NOTICE
        'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
        ,obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
      RAISE NOTICE 'ddl_end_command -- REINDEX: %', pg_get_indexdef(obj.objid);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION start_reindex_command()
RETURNS event_trigger AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        RAISE NOTICE
        'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
        , obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
        RAISE NOTICE 'ddl_start_command -- REINDEX: %',
pg_get_indexdef(obj.objid);
    END LOOP;
    raise notice 'end of start_reindex_command';
END;
$$ LANGUAGE plpgsql;

BEGIN;
CREATE EVENT TRIGGER end_reindex_command ON ddl_command_end
    WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE reindex_end_command();
CREATE EVENT TRIGGER start_reindex_command ON ddl_command_start
    WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE start_reindex_command();
COMMIT;

-- test Reindex Event Trigger
BEGIN;
drop table if EXISTS ptif_test CASCADE;
CREATE TABLE ptif_test (a int, b int) PARTITION BY range (a);
CREATE TABLE ptif_test0 PARTITION OF ptif_test
  FOR VALUES FROM (minvalue) TO (0) PARTITION BY list (b);
CREATE TABLE ptif_test01 PARTITION OF ptif_test0 FOR VALUES IN (1);
CREATE TABLE ptif_test1 PARTITION OF ptif_test
  FOR VALUES FROM (0) TO (100) PARTITION BY list (b);
CREATE TABLE ptif_test11 PARTITION OF ptif_test1 FOR VALUES IN (1);

CREATE TABLE ptif_test2 PARTITION OF ptif_test
  FOR VALUES FROM (100) TO (200);
-- This partitioned table should remain with no partitions.
CREATE TABLE ptif_test3 PARTITION OF ptif_test
  FOR VALUES FROM (200) TO (maxvalue) PARTITION BY list (b);

-- Test index partition tree
CREATE INDEX ptif_test_index ON ONLY ptif_test (a);

CREATE INDEX ptif_test0_index ON ONLY ptif_test0 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test0_index;

CREATE INDEX ptif_test01_index ON ptif_test01 (a);
ALTER INDEX ptif_test0_index ATTACH PARTITION ptif_test01_index;

CREATE INDEX ptif_test1_index ON ONLY ptif_test1 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test1_index;

CREATE INDEX ptif_test11_index ON ptif_test11 (a);
ALTER INDEX ptif_test1_index ATTACH PARTITION ptif_test11_index;

CREATE INDEX ptif_test2_index ON ptif_test2 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test2_index;

CREATE INDEX ptif_test3_index ON ptif_test3 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test3_index;
COMMIT;

--top level partitioned index. will recurse to each partition index.
REINDEX INDEX CONCURRENTLY public.ptif_test_index;

--ptif_test0 is partitioned table. it will index partition: ptif_test01_index
-- event trigger will log  ptif_test01_index
REINDEX INDEX CONCURRENTLY public.ptif_test0_index;

--ptif_test1_index is partitioned index. it will index partition:
ptif_test11_index
-- event trigger will effect on partion index:ptif_test11_index
REINDEX INDEX CONCURRENTLY public.ptif_test1_index;

--ptif_test2 is a partition. event trigger will log ptif_test2_index
REINDEX INDEX CONCURRENTLY public.ptif_test2_index;

--no partitions. event trigger won't do anything.
REINDEX INDEX CONCURRENTLY public.ptif_test3_index;

reindex table ptif_test; --top level.  will recurse to each partition index.
reindex table ptif_test0; -- will direct to ptif_test01
reindex table ptif_test01; -- will index it's associtaed index
reindex table ptif_test11; -- will index it's associtaed index
reindex table ptif_test2;  -- will index it's associtaed index
reindex table ptif_test3;  -- no partion, index won't do anything.

DROP EVENT TRIGGER  IF EXISTS  end_reindex_command  CASCADE;
DROP EVENT TRIGGER  IF EXISTS  start_reindex_command  CASCADE;
DROP FUNCTION IF EXISTS reindex_start_command;
DROP FUNCTION IF EXISTS reindex_end_command;
DROP TABLE if EXISTS ptif_test CASCADE;
-----------------------



Re: PATCH: Add REINDEX tag to event triggers

От
Garrett Thornburg
Дата:
Thanks! I added the patch to the commitfest. TIL.

> Hmm. What are the use cases you have in mind where you need to know the list of indexes listed by an event trigger like this?

Direct use case is general index health and rebuild. We had a recent upgrade go poorly (our fault) that left a bunch of indexes in a corrupted state. We ran a mass rebuild a few times. Would have been great to be able to observe changes happening in scripts. That left us wishing there was a way to monitor all changes around indexes. Figured I'd add support since my first thread seemed somewhat supportive of the idea.

> We have a table_rewrite which is equivalent to check if a relfilenode has changed, so why not, I guess..

Yeah. I was actually debating if we need to add a new reindex event instead of building on top of the DDL start/end. You are right that the documentation does not include nonsql commands like reindex. Maybe I should consider going that direction again?

> Assuming that ddl_command_start is not supported is incorrect.

Also great callout. Using `SELECT * FROM pg_event_trigger_ddl_commands()` in the event does not yield anything. Bad assumption there. I'll add a test with my next patch version.

Re: LOGSTMT_ALL

Agree here. I think this is more for reference than anything else so I'm fine to call it LOGSTMT_ALL, but that goes back to the previous question around ddl events vs a new event similar to table_rewrite.

P.S. Sorry for the double post. Sent from the wrong email so the mailing list didn't get the message.

On Tue, Jul 25, 2023 at 12:55 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Jul 20, 2023 at 10:47:00PM -0600, Garrett Thornburg wrote:
> Added my v1 patch to add REINDEX to event triggers.
>
> I originally built this against pg15 but rebased to master for the patch
> to hopefully make it easier for maintainers to merge. The rebase was
> automatic so it should be easy to include into any recent version. I'd love
> for this to land in pg16 if possible.

The development of Postgres 16 has finished last April and this
version is in a stabilization period.  We are currently running the
first commit fest of PostgreSQL 17, though.  I would recommend to
register your patch here so as it is not lost:
https://commitfest.postgresql.org/44/

> I added regression tests and they are passing. I also formatted the code
> using the project tools. Docs are included too.

Hmm.  What are the use cases you have in mind where you need to know
the list of indexes listed by an event trigger like this?  Just
monitoring to know which indexes have been rebuilt?  We have a
table_rewrite which is equivalent to check if a relfilenode has
changed, so why not, I guess..

Assuming that ddl_command_start is not supported is incorrect.  For
example, with your patch applied, if I do the following setup:
create event trigger regress_event_trigger on ddl_command_start
  execute procedure test_event_trigger();
create function test_event_trigger() returns event_trigger as $$
  BEGIN
      RAISE NOTICE 'test_event_trigger: % %', tg_event, tg_tag;
  END
  $$ language plpgsql;

Then a REINDEX gives me that:
reindex table pg_class;
NOTICE:  00000: test_event_trigger: ddl_command_start REINDEX

The first paragraphs of event-trigger.sgml describe the following:
     The <literal>ddl_command_start</literal> event occurs just before the
     execution of a <literal>CREATE</literal>, <literal>ALTER</literal>, <literal>DROP</literal>,
     <literal>SECURITY LABEL</literal>,
     <literal>COMMENT</literal>, <literal>GRANT</literal> or <literal>REVOKE</literal>
[...]
    The <literal>ddl_command_end</literal> event occurs just after the execution of
    this same set of commands.  To obtain more details on the <acronym>DDL</acronym>

So it would need a refresh to list REINDEX.

        case T_ReindexStmt:
-           lev = LOGSTMT_ALL;  /* should this be DDL? */
+           lev = LOGSTMT_DDL;

This, unfortunately, may not be as right as it is simple because
REINDEX is not a DDL as far as I know (it is not in the SQL
standard).
--
Michael

Re: PATCH: Add REINDEX tag to event triggers

От
Garrett Thornburg
Дата:
Thank you for this, jian he! I will include it in the next patch version.

P.S. Sorry for the double post. Sent from the wrong email address so I'm resending so the mailing list gets the email. My apologies!

On Wed, Jul 26, 2023 at 4:30 AM jian he <jian.universality@gmail.com> wrote:
On Wed, Jul 26, 2023 at 7:51 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
> > so  T_ReindexStmt should only be in  ProcessUtilitySlow, if you want
> > to create an event trigger on reindex?
> >
> > regression tests work fine. I even play with partitions.
>
> It would be an idea to have some regression tests for partitions,
> actually, so as some patterns around ReindexMultipleInternal() are
> checked.  We could have a REINDEX DATABASE in a TAP test with an event
> trigger, as well, but I don't feel strongly about the need to do that
> much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
> partitions cover the multi-table case.
> --
> Michael

quite verbose, copied from partition-info.sql. meet the expectation:
partitioned index will do nothing, partition index will trigger event
trigger.
------------------------------------------------
DROP EVENT TRIGGER  IF EXISTS  end_reindex_command  CASCADE;
DROP EVENT TRIGGER  IF EXISTS  start_reindex_command  CASCADE;

BEGIN;
CREATE OR REPLACE FUNCTION reindex_end_command()
RETURNS event_trigger AS $$
DECLARE
  obj record;
BEGIN
    raise notice 'begin of reindex_end_command';

    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
      RAISE NOTICE
        'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
        ,obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
      RAISE NOTICE 'ddl_end_command -- REINDEX: %', pg_get_indexdef(obj.objid);
    END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION start_reindex_command()
RETURNS event_trigger AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        RAISE NOTICE
        'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
        , obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
        RAISE NOTICE 'ddl_start_command -- REINDEX: %',
pg_get_indexdef(obj.objid);
    END LOOP;
    raise notice 'end of start_reindex_command';
END;
$$ LANGUAGE plpgsql;

BEGIN;
CREATE EVENT TRIGGER end_reindex_command ON ddl_command_end
    WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE reindex_end_command();
CREATE EVENT TRIGGER start_reindex_command ON ddl_command_start
    WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE start_reindex_command();
COMMIT;

-- test Reindex Event Trigger
BEGIN;
drop table if EXISTS ptif_test CASCADE;
CREATE TABLE ptif_test (a int, b int) PARTITION BY range (a);
CREATE TABLE ptif_test0 PARTITION OF ptif_test
  FOR VALUES FROM (minvalue) TO (0) PARTITION BY list (b);
CREATE TABLE ptif_test01 PARTITION OF ptif_test0 FOR VALUES IN (1);
CREATE TABLE ptif_test1 PARTITION OF ptif_test
  FOR VALUES FROM (0) TO (100) PARTITION BY list (b);
CREATE TABLE ptif_test11 PARTITION OF ptif_test1 FOR VALUES IN (1);

CREATE TABLE ptif_test2 PARTITION OF ptif_test
  FOR VALUES FROM (100) TO (200);
-- This partitioned table should remain with no partitions.
CREATE TABLE ptif_test3 PARTITION OF ptif_test
  FOR VALUES FROM (200) TO (maxvalue) PARTITION BY list (b);

-- Test index partition tree
CREATE INDEX ptif_test_index ON ONLY ptif_test (a);

CREATE INDEX ptif_test0_index ON ONLY ptif_test0 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test0_index;

CREATE INDEX ptif_test01_index ON ptif_test01 (a);
ALTER INDEX ptif_test0_index ATTACH PARTITION ptif_test01_index;

CREATE INDEX ptif_test1_index ON ONLY ptif_test1 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test1_index;

CREATE INDEX ptif_test11_index ON ptif_test11 (a);
ALTER INDEX ptif_test1_index ATTACH PARTITION ptif_test11_index;

CREATE INDEX ptif_test2_index ON ptif_test2 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test2_index;

CREATE INDEX ptif_test3_index ON ptif_test3 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test3_index;
COMMIT;

--top level partitioned index. will recurse to each partition index.
REINDEX INDEX CONCURRENTLY public.ptif_test_index;

--ptif_test0 is partitioned table. it will index partition: ptif_test01_index
-- event trigger will log  ptif_test01_index
REINDEX INDEX CONCURRENTLY public.ptif_test0_index;

--ptif_test1_index is partitioned index. it will index partition:
ptif_test11_index
-- event trigger will effect on partion index:ptif_test11_index
REINDEX INDEX CONCURRENTLY public.ptif_test1_index;

--ptif_test2 is a partition. event trigger will log ptif_test2_index
REINDEX INDEX CONCURRENTLY public.ptif_test2_index;

--no partitions. event trigger won't do anything.
REINDEX INDEX CONCURRENTLY public.ptif_test3_index;

reindex table ptif_test; --top level.  will recurse to each partition index.
reindex table ptif_test0; -- will direct to ptif_test01
reindex table ptif_test01; -- will index it's associtaed index
reindex table ptif_test11; -- will index it's associtaed index
reindex table ptif_test2;  -- will index it's associtaed index
reindex table ptif_test3;  -- no partion, index won't do anything.

DROP EVENT TRIGGER  IF EXISTS  end_reindex_command  CASCADE;
DROP EVENT TRIGGER  IF EXISTS  start_reindex_command  CASCADE;
DROP FUNCTION IF EXISTS reindex_start_command;
DROP FUNCTION IF EXISTS reindex_end_command;
DROP TABLE if EXISTS ptif_test CASCADE;
-----------------------

Re: PATCH: Add REINDEX tag to event triggers

От
Garrett Thornburg
Дата:
I added a v2 patch for adding REINDEX to event triggers. The following has changed:

1. I fixed the docs to note that ddl_command_start is supported for REINDEX. Thanks, Michael!
2. I added Jian He's excellent partition table test and updated the expectations to include that regression test.

What is still up for debate:

Michael pointed out that REINDEX probably isn't DDL because the docs only specify commands in the standard like CREATE, ALTER, etc. It might be worth creating a new event to trigger on (similar to what was done for table_rewrite) to make a REINDEX trigger fit in a little nicer. Let me know what you think here. Until then, I left the code as `lev = LOGSTMT_DDL` for `T_ReindexStmt`.

Thanks,
Garrett

On Thu, Jul 20, 2023 at 10:47 PM Garrett Thornburg <film42@gmail.com> wrote:
Added my v1 patch to add REINDEX to event triggers.

I originally built this against pg15 but rebased to master for the patch to hopefully make it easier for maintainers to merge. The rebase was automatic so it should be easy to include into any recent version. I'd love for this to land in pg16 if possible.

I added regression tests and they are passing. I also formatted the code using the project tools. Docs are included too.

A few notes:
1. I tried to not touch the function parameters too much and opted to create a convenience function that makes it easy to attach index or table OIDs to the current event trigger list.
2. I made sure both concurrent and regular reindex of index, table, database work as expected.
3. The ddl end command will make available all indexes that were modified by the reindex command. This is a large list when you run "reindex database" but works really well. I debated returning records of the table or DB that were reindexed but that doesn't really make sense. Returning each index fits my use case of building an audit record around the index lifecycle (hence the motivation for the patch).

Thanks for your time,

Garrett Thornburg
Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Jim Jones
Дата:

Greetings

On 27.07.23 06:43, Garrett Thornburg wrote:
I added a v2 patch for adding REINDEX to event triggers. The following has changed:

1. I fixed the docs to note that ddl_command_start is supported for REINDEX. Thanks, Michael!
2. I added Jian He's excellent partition table test and updated the expectations to include that regression test.

What is still up for debate:

Michael pointed out that REINDEX probably isn't DDL because the docs only specify commands in the standard like CREATE, ALTER, etc. It might be worth creating a new event to trigger on (similar to what was done for table_rewrite) to make a REINDEX trigger fit in a little nicer. Let me know what you think here. Until then, I left the code as `lev = LOGSTMT_DDL` for `T_ReindexStmt`.

Thanks,
Garrett

Thanks for the patch, Garrett!

I was unable to apply it in 7ef5f5f

$ git apply -v v2-0001-Add-REINDEX-tag-to-event-triggers.patch
Checking patch doc/src/sgml/event-trigger.sgml...
Checking patch src/backend/commands/indexcmds.c...
error: while searching for:
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
                                                 bool isTopLevel);
static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
                                                                                        Oid relId, Oid oldRelId, void *arg);
static Oid      ReindexTable(RangeVar *relation, ReindexParams *params,

error: patch failed: src/backend/commands/indexcmds.c:94
error: src/backend/commands/indexcmds.c: patch does not apply
Checking patch src/backend/tcop/utility.c...
Checking patch src/include/tcop/cmdtaglist.h...
Checking patch src/test/regress/expected/event_trigger.out...
Hunk #1 succeeded at 556 (offset 2 lines).
Checking patch src/test/regress/sql/event_trigger.sql...

Am I missing something?

Jim

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Wed, Aug 30, 2023 at 8:38 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
>
> Greetings
>
>
>
> I was unable to apply it in 7ef5f5f
>
> $ git apply -v v2-0001-Add-REINDEX-tag-to-event-triggers.patch
> Checking patch doc/src/sgml/event-trigger.sgml...
> Checking patch src/backend/commands/indexcmds.c...
> error: while searching for:
> static List *ChooseIndexColumnNames(List *indexElems);
> static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
>                                                  bool isTopLevel);
> static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
>                                                                                         Oid relId, Oid oldRelId, void
*arg);
> static Oid      ReindexTable(RangeVar *relation, ReindexParams *params,
>
> error: patch failed: src/backend/commands/indexcmds.c:94
> error: src/backend/commands/indexcmds.c: patch does not apply
> Checking patch src/backend/tcop/utility.c...
> Checking patch src/include/tcop/cmdtaglist.h...
> Checking patch src/test/regress/expected/event_trigger.out...
> Hunk #1 succeeded at 556 (offset 2 lines).
> Checking patch src/test/regress/sql/event_trigger.sql...
>
> Am I missing something?
>
> Jim

because the change made in here:

https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7

I manually slight edited the patch content:
from
 static List *ChooseIndexColumnNames(List *indexElems);
 static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
  bool isTopLevel);
to
static List *ChooseIndexColumnNames(const List *indexElems);
static void ReindexIndex(const RangeVar *indexRelation, const
ReindexParams *params,
bool isTopLevel);

can you try the attached one.

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Jim Jones
Дата:
On 01.09.23 11:23, jian he wrote:
> because the change made in here:
>
https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
>
> I manually slight edited the patch content:
> from
>   static List *ChooseIndexColumnNames(List *indexElems);
>   static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
>    bool isTopLevel);
> to
> static List *ChooseIndexColumnNames(const List *indexElems);
> static void ReindexIndex(const RangeVar *indexRelation, const
> ReindexParams *params,
> bool isTopLevel);
>
> can you try the attached one.

The patch applies cleanly. However, now the compiler complains about a 
qualifier mismatch

indexcmds.c:2707:33: warning: assignment discards ‘const’ qualifier from 
pointer target type [-Wdiscarded-qualifiers]
  2707 |         currentReindexStatement = stmt;


Perhaps 'const ReindexStmt *currentReindexStatement'?

Tests also work just fine. I also tested it against unique and partial 
indexes, and it worked as expected.

Jim




Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Fri, Sep 1, 2023 at 6:40 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
>
> On 01.09.23 11:23, jian he wrote:
> > because the change made in here:
> >
https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
> >
> > I manually slight edited the patch content:
> > from
> >   static List *ChooseIndexColumnNames(List *indexElems);
> >   static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
> >    bool isTopLevel);
> > to
> > static List *ChooseIndexColumnNames(const List *indexElems);
> > static void ReindexIndex(const RangeVar *indexRelation, const
> > ReindexParams *params,
> > bool isTopLevel);
> >
> > can you try the attached one.
>
> The patch applies cleanly. However, now the compiler complains about a
> qualifier mismatch
>
> indexcmds.c:2707:33: warning: assignment discards ‘const’ qualifier from
> pointer target type [-Wdiscarded-qualifiers]
>   2707 |         c
>
>
> Perhaps 'const ReindexStmt *currentReindexStatement'?
>
> Tests also work just fine. I also tested it against unique and partial
> indexes, and it worked as expected.
>
> Jim
>

Thanks for pointing this out!
also due to commit
https://git.postgresql.org/cgit/postgresql.git/commit/?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
ExecReindex function input arguments also changed. so we need to
rebase this patch.

change to
currentReindexStatement = unconstify(ReindexStmt*, stmt);
seems to work for me. I tested it, no warning. But I am not 100% sure.

anyway I refactored the patch, making it git applyable
also change from "currentReindexStatement = stmt;" to
"currentReindexStatement = unconstify(ReindexStmt*, stmt);" due to
ExecReindex function changes.

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Jim Jones
Дата:
On 01.09.23 18:10, jian he wrote:
> Thanks for pointing this out! 
Thanks for the fix!
> also due to commit
> https://git.postgresql.org/cgit/postgresql.git/commit/?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
> ExecReindex function input arguments also changed. so we need to
> rebase this patch.
>
> change to
> currentReindexStatement = unconstify(ReindexStmt*, stmt);
> seems to work for me. I tested it, no warning. But I am not 100% sure.
>
> anyway I refactored the patch, making it git applyable
> also change from "currentReindexStatement = stmt;" to
> "currentReindexStatement = unconstify(ReindexStmt*, stmt);" due to
> ExecReindex function changes.

LGTM. It applies and builds cleanly, all tests pass and documentation 
also builds ok. The CFbot seems also much happier now :)

I tried this "small stress test" to see if the code was leaking .. but 
it all looks ok to me:

DO $$
BEGIN
   FOR i IN 1..10000 LOOP
     REINDEX INDEX reindex_test.reindex_test1_idx1;
     REINDEX TABLE reindex_test.reindex_tester1;
   END LOOP;
END;
$$;


Jim




Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Mon, Sep 04, 2023 at 08:00:52PM +0200, Jim Jones wrote:
> LGTM. It applies and builds cleanly, all tests pass and documentation also
> builds ok. The CFbot seems also much happier now :)

+    /*
+     * Open and lock the relation.  ShareLock is sufficient since we only need
+     * to prevent schema and data changes in it.  The lock level used here
+     * should match catalog's reindex_relation().
+     */
+    rel = try_table_open(relid, ShareLock);

I was eyeing at 0003, and this strikes me as incorrect.  Sure, this
matches what reindex_relation() does, but you've missed that
CONCURRENTLY takes a lighter ShareUpdateExclusiveLock, and ShareLock
conflicts with it.  See:
https://www.postgresql.org/docs/devel/explicit-locking.html

So, doesn't this disrupt a concurrent REINDEX?
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Fri, Oct 27, 2023 at 3:15 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Sep 04, 2023 at 08:00:52PM +0200, Jim Jones wrote:
> > LGTM. It applies and builds cleanly, all tests pass and documentation also
> > builds ok. The CFbot seems also much happier now :)
>
> +       /*
> +        * Open and lock the relation.  ShareLock is sufficient since we only need
> +        * to prevent schema and data changes in it.  The lock level used here
> +        * should match catalog's reindex_relation().
> +        */
> +       rel = try_table_open(relid, ShareLock);
>
> I was eyeing at 0003, and this strikes me as incorrect.  Sure, this
> matches what reindex_relation() does, but you've missed that
> CONCURRENTLY takes a lighter ShareUpdateExclusiveLock, and ShareLock
> conflicts with it.  See:
> https://www.postgresql.org/docs/devel/explicit-locking.html
>
> So, doesn't this disrupt a concurrent REINDEX?
> --
> Michael

ReindexPartitions called ReindexMultipleInternal
ReindexRelationConcurrently add reindex_event_trigger_collect to cover
it. (line 3869)
ReindexIndex has the function reindex_event_trigger_collect. (line 2853)

reindex_event_trigger_collect_relation called in
ReindexMultipleInternal, ReindexTable (line 2979).
Both are "under concurrent is false" branches.

So it should be fine.



Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Sat, Oct 28, 2023 at 12:15:22PM +0800, jian he wrote:
> reindex_event_trigger_collect_relation called in
> ReindexMultipleInternal, ReindexTable (line 2979).
> Both are "under concurrent is false" branches.
>
> So it should be fine.

Sorry for the delay in replying.

Indeed, you're right.  reindex_event_trigger_collect_relation() gets
only called in two paths for the non-concurrent cases just after
reindex_relation(), which is not a safe location to call it because
this may be used in CLUSTER or TRUNCATE, and the intention of the
patch is only to count for the objects in REINDEX.

Anyway, I think that the current implementation is incorrect.  The
object is collected after the reindex is done, which is right.
However, an object may be reindexed but not be reported if it was
dropped between the reindex and its endwhen collecting all the objects
of a single relation.  Perhaps it does not matter because the object
is gone, but that still looks incorrect to me because we finish by not
reporting everything we should.  I think that we should put the
collection deeper into the stack, where we know that we are holding
the locks on the objects we are collecting.  Another side effect is
that the patch seems to be missing toast table indexes, which are also
reindexed for a REINDEX TABLE, for instance.  That's not right.

Actually, could there be an argument for pushing the collection down
to reindex_relation() instead to count for all the commands that?
Even better, reindex_index() would be a better candidate because it is
itself called within reindex_relation() for each individual indexes?
If a code path should not be covered for event triggers, we could use
a new REINDEXOPT_ to control that, optionally.  In short, it looks
like we should try to reduce the number of paths calling
reindex_event_trigger_collect(), while collect_relation() ought to be
removed.

Should REINDEX TABLE CONCURRENTLY's coverage be extended so as two new
indexes are reported instead of just one?

REINDEX SCHEMA is a command that perhaps should be tested to collect
all the indexes rebuilt through it?

A side-effect of this patch is that it impacts ddl_command_start and
ddl_command_end with the addition of REINDEX.  Mixing that with the
addition of a new event is strange.  I think that the addition of
REINDEX to the existing events should be done first, as a separate
patch.  Then a second patch should deal with the collection and the
support of the new event.

I have also dug into the archives to note that control commands have
been proposed in the past to be added as part of event triggers, and
it happens that I've mentioned in a comment that that this was a
concept perhaps contrary to what event triggers should do, as these
are intended for DDLs:
https://www.postgresql.org/message-id/CAB7nPqTqZ2-YcNzOQ5KVBUJYHQ4kDSd4Q55Mc-fBzM8GH0bV2Q@mail.gmail.com

Philosophically, I'm open on all that and having more commands
depending on the cases that are satisfied, FWIW, but let's organize
the changes.
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Mon, Nov 20, 2023 at 3:34 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Oct 28, 2023 at 12:15:22PM +0800, jian he wrote:
> > reindex_event_trigger_collect_relation called in
> > ReindexMultipleInternal, ReindexTable (line 2979).
> > Both are "under concurrent is false" branches.
> >
> > So it should be fine.
>
> Sorry for the delay in replying.
>
> Indeed, you're right.  reindex_event_trigger_collect_relation() gets
> only called in two paths for the non-concurrent cases just after
> reindex_relation(), which is not a safe location to call it because
> this may be used in CLUSTER or TRUNCATE, and the intention of the
> patch is only to count for the objects in REINDEX.
>
> Anyway, I think that the current implementation is incorrect.  The
> object is collected after the reindex is done, which is right.
> However, an object may be reindexed but not be reported if it was
> dropped between the reindex and its endwhen collecting all the objects
> of a single relation.  Perhaps it does not matter because the object
> is gone, but that still looks incorrect to me because we finish by not
> reporting everything we should.  I think that we should put the
> collection deeper into the stack, where we know that we are holding
> the locks on the objects we are collecting.  Another side effect is
> that the patch seems to be missing toast table indexes, which are also
> reindexed for a REINDEX TABLE, for instance.  That's not right.
>
> Actually, could there be an argument for pushing the collection down
> to reindex_relation() instead to count for all the commands that?
> Even better, reindex_index() would be a better candidate because it is
> itself called within reindex_relation() for each individual indexes?
> If a code path should not be covered for event triggers, we could use
> a new REINDEXOPT_ to control that, optionally.  In short, it looks
> like we should try to reduce the number of paths calling
> reindex_event_trigger_collect(), while collect_relation() ought to be
> removed.
>
> Should REINDEX TABLE CONCURRENTLY's coverage be extended so as two new
> indexes are reported instead of just one?
>
> REINDEX SCHEMA is a command that perhaps should be tested to collect
> all the indexes rebuilt through it?
>
> A side-effect of this patch is that it impacts ddl_command_start and
> ddl_command_end with the addition of REINDEX.  Mixing that with the
> addition of a new event is strange.  I think that the addition of
> REINDEX to the existing events should be done first, as a separate
> patch.  Then a second patch should deal with the collection and the
> support of the new event.
>
> I have also dug into the archives to note that control commands have
> been proposed in the past to be added as part of event triggers, and
> it happens that I've mentioned in a comment that that this was a
> concept perhaps contrary to what event triggers should do, as these
> are intended for DDLs:
> https://www.postgresql.org/message-id/CAB7nPqTqZ2-YcNzOQ5KVBUJYHQ4kDSd4Q55Mc-fBzM8GH0bV2Q@mail.gmail.com
>
> Philosophically, I'm open on all that and having more commands
> depending on the cases that are satisfied, FWIW, but let's organize
> the changes.
> --
> Michael


Hi.
Top level func is ExecReIndex. it will call {ReindexIndex,
ReindexTable, ReindexMultipleTables}
then trace deep down, it will call {ReindexRelationConcurrently, reindex_index}.

Imitate the CREATE INDEX logic, we need pass `const ReindexStmt *stmt`
to all the following functions:
ReindexIndex
ReindexTable
ReindexMultipleTables
ReindexPartitions
ReindexMultipleInternal
ReindexRelationConcurrently
reindex_relation
reindex_index

because the EventTriggerCollectSimpleCommand needs the `(ReindexStmt
*) parsetree`.
and we call EventTriggerCollectSimpleCommand at reindex_index or
ReindexRelationConcurrently.

The new test will cover the following case.
reindex (concurrently) schema;
reindex (concurrently) partitioned index;
reindex (concurrently) partitioned table;

not sure this is the right approach. But now the reindex event
collection is after we call index_open.
same static function (reindex_event_trigger_collect) in
src/backend/commands/indexcmds.c and src/backend/catalog/index.c for
now.
given that ProcessUtilitySlow supports the event triggers facility.
so probably no need for another REINDEXOPT_ option?

I also added a test for disabled event triggers.

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Fri, Nov 24, 2023 at 08:00:00AM +0800, jian he wrote:
> not sure this is the right approach. But now the reindex event
> collection is after we call index_open.
> same static function (reindex_event_trigger_collect) in
> src/backend/commands/indexcmds.c and src/backend/catalog/index.c for
> now.

Knowing that there are two calls of reindex_relation() and four
callers of reindex_index(), passing the stmt looks acceptable to me to
be able to get more context from the reindex statement where a given
index has been rebuilt.

Anyway, as a whole, I am still confused by the shape of the patch..
This relies on pg_event_trigger_ddl_commands(), that reports a list of
DDL commands, but the use case you are proposing is to report a list
of the *indexes* rebuilt.  So, in its current shape, we'd report the
equivalent of N DDL commands matching the N indexes rebuilt in a
single command.  Shouldn't we use a new event type and a dedicated SQL
function to report the list of indexes newly rebuilt to get a full
list of the work that has happened?

> given that ProcessUtilitySlow supports the event triggers facility.
> so probably no need for another REINDEXOPT_ option?

Right, perhaps we don't need that if we just supply the stmt.  If
there is no requirement for it, let's avoid that, then.

-     ReindexMultipleTables(stmt->name, stmt->kind, ¶ms);
+     ReindexMultipleTables(stmt, stmt->name, stmt->kind, ¶ms);

If we begin including the stmt, there is no need for these extra
arguments in the extra routines of indexcmds.c.

As far as I can see, this patch is doing too much as presented.  Could
you split the patch into more pieces, please?  Based on v4 you have
sent, there are refactoring and basic piece parts like:
- Patch to make event triggers ddl_command_start and ddl_command_stop
react on ReindexStmt, so as a command is reported in
pg_event_trigger_ddl_commands().
- Patch for GetCommandLogLevel() to switch ReindexStmt from
LOGSTMT_ALL to LOGSTMT_DDL.  True that this could be switched.
- Patch to refactor the routines of indexcmds.c and index.c to use the
ReindexStmt as argument, as a preparation of the next patch...
- Patch to add a new event type with a new SQL function to return a
list of the indexes rebuilt, with the ReindexStmt involved when the
index OID was trapped by the collection.

1) and 2) are a minimal feature in themselves, which may be enough to
satisfy your original case, and where you'd know when a REINDEX has
been run in event triggers.  3) and 4) are what you are trying to
achieve, to get the list of the indexes rebuilt knowing the context of
a given command.
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Fri, Nov 24, 2023 at 10:44 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> As far as I can see, this patch is doing too much as presented.  Could
> you split the patch into more pieces, please?  Based on v4 you have
> sent, there are refactoring and basic piece parts like:
> - Patch to make event triggers ddl_command_start and ddl_command_stop
> react on ReindexStmt, so as a command is reported in
> pg_event_trigger_ddl_commands().
> - Patch for GetCommandLogLevel() to switch ReindexStmt from
> LOGSTMT_ALL to LOGSTMT_DDL.  True that this could be switched.
> - Patch to refactor the routines of indexcmds.c and index.c to use the
> ReindexStmt as argument, as a preparation of the next patch...
> - Patch to add a new event type with a new SQL function to return a
> list of the indexes rebuilt, with the ReindexStmt involved when the
> index OID was trapped by the collection.
>
> 1) and 2) are a minimal feature in themselves, which may be enough to
> satisfy your original case, and where you'd know when a REINDEX has
> been run in event triggers.  3) and 4) are what you are trying to
> achieve, to get the list of the indexes rebuilt knowing the context of
> a given command.
> --
> Michael

hi.
v5-0001. changed the REINDEX command tag from event_trigger_ok: false
to event_trigger_ok: true.
Move ReindexStmt moves from standard_ProcessUtility to ProcessUtilitySlow.
By default ProcessUtilitySlow will call trigger related functions.
So the event trigger facility can support reindex statements.

v5-0002. In GetCommandLogLevel, change T_ReindexStmt from lev =
LOGSTMT_ALL to lev = LOGSTMT_DDL. so log_statement (enum) >= 'ddl'
will make the reindex statement be logged.

v5-0003. Refactor the following functions: {ReindexIndex,
ReindexTable, ReindexMultipleTables,
ReindexPartitions,ReindexMultipleInternal
,ReindexRelationConcurrently, reindex_relation,reindex_index} by
adding `const ReindexStmt *stmt` as their first argument.
This is for event trigger support reindex. We need to pass both the
newly generated indexId and the ReindexStmt to
EventTriggerCollectSimpleCommand right after the newly index gets
their lock. To do that, we have to refactor related functions.

v5-0004. Event trigger support reindex command implementation,
documentation, regress test, helper function pass reindex info to
function EventTriggerCollectSimpleCommand.

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Ajin Cherian
Дата:
On Mon, Nov 27, 2023 at 11:00 AM jian he <jian.universality@gmail.com> wrote:
>
> On Fri, Nov 24, 2023 at 10:44 AM Michael Paquier <michael@paquier.xyz> wrote:
> hi.
> v5-0001. changed the REINDEX command tag from event_trigger_ok: false
> to event_trigger_ok: true.
> Move ReindexStmt moves from standard_ProcessUtility to ProcessUtilitySlow.
> By default ProcessUtilitySlow will call trigger related functions.
> So the event trigger facility can support reindex statements.
>
> v5-0002. In GetCommandLogLevel, change T_ReindexStmt from lev =
> LOGSTMT_ALL to lev = LOGSTMT_DDL. so log_statement (enum) >= 'ddl'
> will make the reindex statement be logged.
>
> v5-0003. Refactor the following functions: {ReindexIndex,
> ReindexTable, ReindexMultipleTables,
> ReindexPartitions,ReindexMultipleInternal
> ,ReindexRelationConcurrently, reindex_relation,reindex_index} by
> adding `const ReindexStmt *stmt` as their first argument.
> This is for event trigger support reindex. We need to pass both the
> newly generated indexId and the ReindexStmt to
> EventTriggerCollectSimpleCommand right after the newly index gets
> their lock. To do that, we have to refactor related functions.
>
> v5-0004. Event trigger support reindex command implementation,
> documentation, regress test, helper function pass reindex info to
> function EventTriggerCollectSimpleCommand.

I just started reviewing the patch. Some minor comments:
In patch 0001:
In standard_ProcessUtility(), since you are unconditionally calling
ProcessUtilitySlow() in case of T_ReindexStmt, you really don't need
the case statement for T_ReindexStmt just like all the other commands
which have event trigger support. It will call ProcessUtilitySlow() as
default.

In patch 0004:
No need to duplicate reindex_event_trigger_collect in indexcmds.c
since it is already present in index.c. Add it to index.h and make the
function extern so that it can be accessed in both index.c and
indexcmds.c

regards,
Ajin Cherian
Fujitsu Australia



Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Mon, Nov 27, 2023 at 6:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> I just started reviewing the patch. Some minor comments:
> In patch 0001:
> In standard_ProcessUtility(), since you are unconditionally calling
> ProcessUtilitySlow() in case of T_ReindexStmt, you really don't need
> the case statement for T_ReindexStmt just like all the other commands
> which have event trigger support. It will call ProcessUtilitySlow() as
> default.
>
> In patch 0004:
> No need to duplicate reindex_event_trigger_collect in indexcmds.c
> since it is already present in index.c. Add it to index.h and make the
> function extern so that it can be accessed in both index.c and
> indexcmds.c
>
> regards,
> Ajin Cherian
> Fujitsu Australia

Thanks for reviewing it!
The above 2 suggestions are good, now the code is more neat.

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Tue, Nov 28, 2023 at 12:28:50AM +0800, jian he wrote:
> On Mon, Nov 27, 2023 at 6:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
>> Ajin Cherian

Dammit.  I've just noticed that I missed mentioning you in the commit
log as a reviewer.  Sorry about that.

> The above 2 suggestions are good, now the code is more neat.

Right, the extra bits for the default case of
standard_ProcessUtility() was not necessary, because we don't need to
apply any object-level checks to see as we just collect a list of
indexes.

Anyway, I've been working on the patch for the last few days, and
applied it after tweaking a bit its style, code and comments.

Mainly, I did not see a need for reindex_event_trigger_collect() as
wrapper for EventTriggerCollectSimpleCommand() as it is only used in
two code paths across two files.  I have come back to the regression
tests, and trimmed them down to a minimum as event_trigger.sql cannot
be run in parallel so this eats in run time, concluding that knowing
the list of indexes rebuilt is also something that we track with
relfilenode change tracking in the other test suites (including the
SCHEMA, toast and partition cases), so doing that again had limited
impact.

One thing that I have been unable to conclude is if we should change
GetCommandLogLevel() for ReindexStmt.  I agree that there's a good
argument for it, but I am going to raise a different thread about that
to raise awareness, and this does not prevent the feature to work
(even if it feels inconsistent with pg_event_trigger_ddl_commands()).
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Alexander Lakhin
Дата:
Hi Michael,

04.12.2023 04:04, Michael Paquier wrote:
> Anyway, I've been working on the patch for the last few days, and
> applied it after tweaking a bit its style, code and comments.

Please look at the assertion failure triggered when REINDEX processed by an event trigger:
CREATE FUNCTION etf() RETURNS EVENT_TRIGGER AS $$ BEGIN PERFORM 1; END $$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER et ON ddl_command_end EXECUTE FUNCTION etf();
CREATE TABLE t (i int);
REINDEX (CONCURRENTLY) TABLE t;

Core was generated by `postgres: law regression [local] REINDEX                                      '.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/338911' in core file too small.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140462399108928, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fbff2c09476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fbff2bef7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000055d88b8bb8de in ExceptionalCondition (conditionName=0x55d88baac6c0 "portal->portalSnapshot == NULL", 
fileName=0x55d88baac2c4 "pquery.c", lineNumber=1796) at assert.c:66
#6  0x000055d88b6d6186 in EnsurePortalSnapshotExists () at pquery.c:1796
#7  0x000055d88b47d21a in _SPI_execute_plan (plan=0x55d88bfd2a80, options=0x7ffdc53b94c0, snapshot=0x0, 
crosscheck_snapshot=0x0, fire_triggers=true) at spi.c:2579
#8  0x000055d88b479f37 in SPI_execute_plan_with_paramlist (plan=0x55d88bfd2a80, params=0x0, read_only=false, tcount=0)

at spi.c:749
#9  0x00007fbfe6de57ed in exec_run_select (estate=0x7ffdc53b97c0, expr=0x55d88bfba7f8, maxtuples=0, portalP=0x0) at 
pl_exec.c:5815
#10 0x00007fbfe6dddce9 in exec_stmt_perform (estate=0x7ffdc53b97c0, stmt=0x55d88bfba950) at pl_exec.c:2171
#11 0x00007fbfe6ddd87c in exec_stmts (estate=0x7ffdc53b97c0, stmts=0x55d88bfbad90) at pl_exec.c:2023
#12 0x00007fbfe6ddd5f8 in exec_stmt_block (estate=0x7ffdc53b97c0, block=0x55d88bfbade0) at pl_exec.c:1942
#13 0x00007fbfe6ddccff in exec_toplevel_block (estate=0x7ffdc53b97c0, block=0x55d88bfbade0) at pl_exec.c:1633
#14 0x00007fbfe6ddbb6a in plpgsql_exec_event_trigger (func=0x55d88bf66840, trigdata=0x7ffdc53b9b00) at pl_exec.c:1198
#15 0x00007fbfe6df74c7 in plpgsql_call_handler (fcinfo=0x7ffdc53b9aa0) at pl_handler.c:272
#16 0x000055d88b34f592 in EventTriggerInvoke (fn_oid_list=0x55d88beffa40, trigdata=0x7ffdc53b9b00) at
event_trigger.c:1087
#17 0x000055d88b34edea in EventTriggerDDLCommandEnd (parsetree=0x55d88bed6000) at event_trigger.c:803
#18 0x000055d88b6d9a41 in ProcessUtilitySlow (pstate=0x55d88beff930, pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0 
"REINDEX (CONCURRENTLY) TABLE t;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55d88bed6370,
     qc=0x7ffdc53ba310) at utility.c:1937
#19 0x000055d88b6d7ae9 in standard_ProcessUtility (pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0 "REINDEX 
(CONCURRENTLY) TABLE t;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
     dest=0x55d88bed6370, qc=0x7ffdc53ba310) at utility.c:1074
#20 0x000055d88b6d69ea in ProcessUtility (pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0 "REINDEX (CONCURRENTLY)
TABLE
 
t;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55d88bed6370,
     qc=0x7ffdc53ba310) at utility.c:530
#21 0x000055d88b6d52b8 in PortalRunUtility (portal=0x55d88bf509f0, pstmt=0x55d88bed60b0, isTopLevel=true, 
setHoldSnapshot=false, dest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:1158
#22 0x000055d88b6d552f in PortalRunMulti (portal=0x55d88bf509f0, isTopLevel=true, setHoldSnapshot=false, 
dest=0x55d88bed6370, altdest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:1315
#23 0x000055d88b6d4979 in PortalRun (portal=0x55d88bf509f0, count=9223372036854775807, isTopLevel=true, run_once=true,

dest=0x55d88bed6370, altdest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:791
#24 0x000055d88b6cd74a in exec_simple_query (query_string=0x55d88bed54f0 "REINDEX (CONCURRENTLY) TABLE t;") at 
postgres.c:1273
#25 0x000055d88b6d26f6 in PostgresMain (dbname=0x55d88bf0c860 "regression", username=0x55d88bf0c848 "law") at 
postgres.c:4653
#26 0x000055d88b5f2014 in BackendRun (port=0x55d88bf00f80) at postmaster.c:4425
#27 0x000055d88b5f1636 in BackendStartup (port=0x55d88bf00f80) at postmaster.c:4104
#28 0x000055d88b5edcf6 in ServerLoop () at postmaster.c:1772
#29 0x000055d88b5ed5f3 in PostmasterMain (argc=3, argv=0x55d88becf700) at postmaster.c:1471
#30 0x000055d88b49d2ca in main (argc=3, argv=0x55d88becf700) at main.c:198

Best regards,
Alexander



Re: PATCH: Add REINDEX tag to event triggers

От
jian he
Дата:
On Mon, Dec 4, 2023 at 11:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>
> Hi Michael,
>
> 04.12.2023 04:04, Michael Paquier wrote:
> > Anyway, I've been working on the patch for the last few days, and
> > applied it after tweaking a bit its style, code and comments.
>
> Please look at the assertion failure triggered when REINDEX processed by an event trigger:
> CREATE FUNCTION etf() RETURNS EVENT_TRIGGER AS $$ BEGIN PERFORM 1; END $$ LANGUAGE plpgsql;
> CREATE EVENT TRIGGER et ON ddl_command_end EXECUTE FUNCTION etf();
> CREATE TABLE t (i int);
> REINDEX (CONCURRENTLY) TABLE t;
>
> Core was generated by `postgres: law regression [local] REINDEX                                      '.
> Program terminated with signal SIGABRT, Aborted.
>
> warning: Section `.reg-xstate/338911' in core file too small.
> #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:44
> 44      ./nptl/pthread_kill.c: No such file or directory.
> (gdb) bt
> #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=6, threadid=140462399108928) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140462399108928, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
> #3  0x00007fbff2c09476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> #4  0x00007fbff2bef7f3 in __GI_abort () at ./stdlib/abort.c:79
> #5  0x000055d88b8bb8de in ExceptionalCondition (conditionName=0x55d88baac6c0 "portal->portalSnapshot == NULL",
> fileName=0x55d88baac2c4 "pquery.c", lineNumber=1796) at assert.c:66
> #6  0x000055d88b6d6186 in EnsurePortalSnapshotExists () at pquery.c:1796
> #7  0x000055d88b47d21a in _SPI_execute_plan (plan=0x55d88bfd2a80, options=0x7ffdc53b94c0, snapshot=0x0,
> crosscheck_snapshot=0x0, fire_triggers=true) at spi.c:2579
> #8  0x000055d88b479f37 in SPI_execute_plan_with_paramlist (plan=0x55d88bfd2a80, params=0x0, read_only=false,
tcount=0)
> at spi.c:749
> #9  0x00007fbfe6de57ed in exec_run_select (estate=0x7ffdc53b97c0, expr=0x55d88bfba7f8, maxtuples=0, portalP=0x0) at
> pl_exec.c:5815
> #10 0x00007fbfe6dddce9 in exec_stmt_perform (estate=0x7ffdc53b97c0, stmt=0x55d88bfba950) at pl_exec.c:2171
> #11 0x00007fbfe6ddd87c in exec_stmts (estate=0x7ffdc53b97c0, stmts=0x55d88bfbad90) at pl_exec.c:2023
> #12 0x00007fbfe6ddd5f8 in exec_stmt_block (estate=0x7ffdc53b97c0, block=0x55d88bfbade0) at pl_exec.c:1942
> #13 0x00007fbfe6ddccff in exec_toplevel_block (estate=0x7ffdc53b97c0, block=0x55d88bfbade0) at pl_exec.c:1633
> #14 0x00007fbfe6ddbb6a in plpgsql_exec_event_trigger (func=0x55d88bf66840, trigdata=0x7ffdc53b9b00) at pl_exec.c:1198
> #15 0x00007fbfe6df74c7 in plpgsql_call_handler (fcinfo=0x7ffdc53b9aa0) at pl_handler.c:272
> #16 0x000055d88b34f592 in EventTriggerInvoke (fn_oid_list=0x55d88beffa40, trigdata=0x7ffdc53b9b00) at
event_trigger.c:1087
> #17 0x000055d88b34edea in EventTriggerDDLCommandEnd (parsetree=0x55d88bed6000) at event_trigger.c:803
> #18 0x000055d88b6d9a41 in ProcessUtilitySlow (pstate=0x55d88beff930, pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0
> "REINDEX (CONCURRENTLY) TABLE t;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55d88bed6370,
>      qc=0x7ffdc53ba310) at utility.c:1937
> #19 0x000055d88b6d7ae9 in standard_ProcessUtility (pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0 "REINDEX
> (CONCURRENTLY) TABLE t;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
>      dest=0x55d88bed6370, qc=0x7ffdc53ba310) at utility.c:1074
> #20 0x000055d88b6d69ea in ProcessUtility (pstmt=0x55d88bed60b0, queryString=0x55d88bed54f0 "REINDEX (CONCURRENTLY)
TABLE
> t;", readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55d88bed6370,
>      qc=0x7ffdc53ba310) at utility.c:530
> #21 0x000055d88b6d52b8 in PortalRunUtility (portal=0x55d88bf509f0, pstmt=0x55d88bed60b0, isTopLevel=true,
> setHoldSnapshot=false, dest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:1158
> #22 0x000055d88b6d552f in PortalRunMulti (portal=0x55d88bf509f0, isTopLevel=true, setHoldSnapshot=false,
> dest=0x55d88bed6370, altdest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:1315
> #23 0x000055d88b6d4979 in PortalRun (portal=0x55d88bf509f0, count=9223372036854775807, isTopLevel=true,
run_once=true,
> dest=0x55d88bed6370, altdest=0x55d88bed6370, qc=0x7ffdc53ba310) at pquery.c:791
> #24 0x000055d88b6cd74a in exec_simple_query (query_string=0x55d88bed54f0 "REINDEX (CONCURRENTLY) TABLE t;") at
> postgres.c:1273
> #25 0x000055d88b6d26f6 in PostgresMain (dbname=0x55d88bf0c860 "regression", username=0x55d88bf0c848 "law") at
> postgres.c:4653
> #26 0x000055d88b5f2014 in BackendRun (port=0x55d88bf00f80) at postmaster.c:4425
> #27 0x000055d88b5f1636 in BackendStartup (port=0x55d88bf00f80) at postmaster.c:4104
> #28 0x000055d88b5edcf6 in ServerLoop () at postmaster.c:1772
> #29 0x000055d88b5ed5f3 in PostmasterMain (argc=3, argv=0x55d88becf700) at postmaster.c:1471
> #30 0x000055d88b49d2ca in main (argc=3, argv=0x55d88becf700) at main.c:198
>
> Best regards,
> Alexander

In indexcmd.c, function, ReindexRelationConcurrently
if (indexIds == NIL)
{
PopActiveSnapshot();
return false;
}

So there is no snapshot left, then PERFORM 1; need a snapshot.



Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Tue, Dec 05, 2023 at 12:49:12AM +0800, jian he wrote:
> On Mon, Dec 4, 2023 at 11:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>> Please look at the assertion failure triggered when REINDEX processed by an event trigger:
>> CREATE FUNCTION etf() RETURNS EVENT_TRIGGER AS $$ BEGIN PERFORM 1; END $$ LANGUAGE plpgsql;
>> CREATE EVENT TRIGGER et ON ddl_command_end EXECUTE FUNCTION etf();
>> CREATE TABLE t (i int);
>> REINDEX (CONCURRENTLY) TABLE t;
>
> In indexcmd.c, function, ReindexRelationConcurrently
> if (indexIds == NIL)
> {
> PopActiveSnapshot();
> return false;
> }
>
> So there is no snapshot left, then PERFORM 1; need a snapshot.

Popping a snapshot at this stage when there are no indexes has been a
decision taken by the original commit in 5dc92b844e68 because we had
no need for it yet, but we may do now depending on the function
triggered.  I have been looking at the whole stack and something like
the attached to make a pop conditional seems to be sufficient to
satisfy all the cases I've been able to come up with, including the
one reported here.

Alexander, do you see any hole in that?  Perhaps the snapshot push
should be more aggressive at the end of ReindexRelationConcurrently()
as well (in the last transaction when rebuilds happen)?
--
Michael

Вложения

Re: PATCH: Add REINDEX tag to event triggers

От
Alexander Lakhin
Дата:
Hi Michael,

05.12.2023 02:45, Michael Paquier wrote:
> Popping a snapshot at this stage when there are no indexes has been a
> decision taken by the original commit in 5dc92b844e68 because we had
> no need for it yet, but we may do now depending on the function
> triggered.  I have been looking at the whole stack and something like
> the attached to make a pop conditional seems to be sufficient to
> satisfy all the cases I've been able to come up with, including the
> one reported here.
>
> Alexander, do you see any hole in that?  Perhaps the snapshot push
> should be more aggressive at the end of ReindexRelationConcurrently()
> as well (in the last transaction when rebuilds happen)?

Thank you for the fix proposed!

I agree with it. I had worried a bit about ReindexRelationConcurrently()
becoming twofold for callers (it can leave the snapshot or pop it), but I
couldn't find a way to hide this twofoldness inside without adding more
complexity. On the other hand, ReindexRelationConcurrently() now satisfies
EnsurePortalSnapshotExists() in all cases.

Best regards,
Alexander



Re: PATCH: Add REINDEX tag to event triggers

От
Michael Paquier
Дата:
On Wed, Dec 06, 2023 at 10:00:01AM +0300, Alexander Lakhin wrote:
> I agree with it. I had worried a bit about ReindexRelationConcurrently()
> becoming twofold for callers (it can leave the snapshot or pop it), but I
> couldn't find a way to hide this twofoldness inside without adding more
> complexity. On the other hand, ReindexRelationConcurrently() now satisfies
> EnsurePortalSnapshotExists() in all cases.

Thanks, applied that with a few more tests, covering a bit more than
the code path you've reported with a failure.

I was wondering if this should be backpatched, actually, but could not
make a case for it as we've never needed a snapshot after a reindex
until now, AFAIK.
--
Michael

Вложения