Re: Support logical replication of DDLs
От | Ajin Cherian |
---|---|
Тема | Re: Support logical replication of DDLs |
Дата | |
Msg-id | CAFPTHDZ-fVFbD5eFHdx=13K8Tk5hoE57EaRYhGBVt117kQpX4g@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Support logical replication of DDLs (shveta malik <shveta.malik@gmail.com>) |
Список | pgsql-hackers |
On Thu, Jun 8, 2023 at 10:02 PM shveta malik <shveta.malik@gmail.com> wrote: > Please find new set of patches addressing below: > a) Issue mentioned by Wang-san in [1], > b) Comments from Peter given in [2] > c) Comments from Amit given in the last 2 emails. > > [1]: https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com > [2]: https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com > > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and > Hou-san for contributing in (c). On Fri, May 5, 2023 at 8:10 PM Peter Smith <smithpb2250@gmail.com> wrote: > > I revisited the 0005 patch to verify all changes made to address my > previous review comments [1][2][3][4] were OK. > > Not all changes were made quite as expected, and there were a few > other things I noticed in passing. > > ====== > > 1. General > > I previously [1] wrote a comment: > Use consistent uppercase for JSON and DDL instead of sometimes json > and ddl. There are quite a few random examples in the commit message > but might be worth searching the entire patch to make all comments > also consistent case. > > Now I still find a number of lowercase "json" and "ddl" strings. > fixed > > 3. Commit message > > Executing a non-immutable expression during the table rewrite phase is not > allowed, as it may result in different data between publisher and subscriber. > While some may suggest converting the rewrite inserts to updates and replicate > them afte the ddl command to maintain data consistency. But it doesn't work if > the replica identity column is altered in the command. This is because the > rewrite inserts do not contain the old values and therefore cannot be converted > to update. > > ~ > > 3a. > Grammar and typo need fixing for "While some may suggest converting > the rewrite inserts to updates and replicate them afte the ddl command > to maintain data consistency. But it doesn't work if the replica > identity column is altered in the command." > > ~ > > 3b. > "rewrite inserts to updates" > Consider using uppercase for the INSERTs and UPDATEs > > ~~~ > > 4. > LIMIT: > > --> LIMITATIONS: (??) > Fixed. > > ====== > contrib/test_decoding/sql/ddl.sql > > 5. > +SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394, > 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I > %{authorization}s", "name": "foo", "authorization": {"fmt": > "AUTHORIZATION %{authorization_role}I", "present": false, > "authorization_role": null}, "if_not_exists": ""}'); > > Previously ([4]#1) I had asked what is the point of setting a JSON > payload here when the JSON payload is never used. You might as well > pass the string "banana" to achieve the same thing AFAICT. I think the > reply [5] to the question was wrong. If this faked JSON is used for > some good reason then there ought to be a test comment to say the > reason. Otherwise, the fake JSON just confuses the purpose of this > test so it should be removed/simplified. > added a comment explainig this > ====== > contrib/test_decoding/test_decoding.c > > 6. pg_decode_ddl_message > > Previously ([4] #4b) I asked if it was necessary to use > appendBinaryStringInfo, instead of just appendStringInfo. I guess it > doesn't matter much, but I think the question was not answered. > Although we're using plain strings, the API supports binary. Other plugins could do use binary. > ====== > doc/src/sgml/catalogs.sgml > > 7. > + <row> > + <entry role="catalog_table_entry"><para role="column_definition"> > + <structfield>evtisinternal</structfield> <type>char</type> > + </para> > + <para> > + True if the event trigger is internally generated. > + </para></entry> > + </row> > > Why was this called a 'char' type instead of a 'bool' type? > fixed. > ====== > doc/src/sgml/logical-replication.sgml > > 8. > + <para> > + For example, a CREATE TABLE command executed on the publisher gets > + WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER > + SUBSCRIPTION ... REFRESH PUBLICATION" is performed on the > subscriber database so any > + following DML changes on the new table can be replicated. > + </para> > > In my previous review comments ([2] 11b) I suggested for this to say > "then an implicit ALTER..." instead of "a subsequent ALTER...". I > think the "implicit" part got accidentally missed. > fixed. > ~~~ > > 9. > + <listitem> > + <para> > + In <literal>ADD COLUMN ... DEFAULT</literal> clause and > + <literal>ALTER COLUMN TYPE</literal> clause of <command>ALTER > + TABLE</command> command, the functions and operators used in > + expression must be immutable. > + </para> > + </listitem> > > IMO this is hard to read. It might be easier if expressed as 2 > separate bullet points. > > SUGGESTION > For ALTER TABLE ... ADD COLUMN ... DEFAULT, the functions and > operators used in expressions must be immutable. > > For ALTER TABLE ... ADD COLUMN TYPE, the functions and operators used > in expressions must be immutable. > fixed. > ~~~ > > 10. > + <para> > + To change the column type, first add a new column of the desired > + type, then update the new column value with the old column value, > + and finnally drop the old column and rename the new column to the > + old column. > + </para> > > /finnally/finally/ > fixed. > ====== > .../access/rmgrdesc/logicalddlmsgdesc.c > > 11. logicalddlmsg_desc > > I previously wrote some suggestions about improving the Assert in this > code (see [3]#2). But, the reply [5] "The array index is already > length + 1 as indices start from 0." did not make sense, because I am > not saying the code has wrong indices. I am only saying the way the > Asserts are done was inconsistent with other similar MESSAGE msg, and > IMO there is a more intuitive way to assert that the DDL Message has > got some payload in it. > fixed. > ====== > src/backend/catalog/pg_publication.c > > 12. > pub->pubactions.pubinsert = pubform->pubinsert; > pub->pubactions.pubupdate = pubform->pubupdate; > pub->pubactions.pubdelete = pubform->pubdelete; > + pub->pubactions.pubddl_table = pubform->pubddl_table; > pub->pubactions.pubtruncate = pubform->pubtruncate; > pub->pubviaroot = pubform->pubviaroot; > > IMO all the insert/update/delete/truncate belong together because they > all came from the 'publish' parameter. I don't think pubddl_table just > be jammed into the middle of them. > fixed. > ====== > src/backend/commands/event_trigger.c > > 13. > static Oid insert_event_trigger_tuple(const char *trigname, const > char *eventname, > - Oid evtOwner, Oid funcoid, List *taglist); > + Oid evtOwner, Oid funcoid, List *taglist, bool isinternal); > > /isinternal/is_internal/ > fixed. > ~~~ > > 14. CreateEventTrigger > > * Create an event trigger. > */ > Oid > -CreateEventTrigger(CreateEventTrigStmt *stmt) > +CreateEventTrigger(CreateEventTrigStmt *stmt, bool isinternal) > > /isinternal/is_internal/ > fixed. > ~~~ > > 15. insert_event_trigger_tuple > > /* Insert catalog entries. */ > return insert_event_trigger_tuple(stmt->trigname, stmt->eventname, > - evtowner, funcoid, tags); > + evtowner, funcoid, tags, isinternal); > > /isinternal/is_internal/ > > ~~~ > > 16. > if (filter_event_trigger(tag, item)) > { > - /* We must plan to fire this trigger. */ > - runlist = lappend_oid(runlist, item->fnoid); > + static const char *trigger_func_prefix = "publication_deparse_%s"; > + char trigger_func_name[NAMEDATALEN]; > + Oid pub_funcoid; > + List *pubfuncname; > + > + /* Get function oid of the publication's ddl deparse event trigger */ > + snprintf(trigger_func_name, sizeof(trigger_func_name), trigger_func_prefix, > + eventstr); > + pubfuncname = SystemFuncName(trigger_func_name); > + pub_funcoid = LookupFuncName(pubfuncname, 0, NULL, true); > + > + if (item->fnoid != pub_funcoid) > + runlist = lappend_oid(runlist, item->fnoid); > + else > + { > + /* Only the first ddl deparse event trigger needs to be invoked */ > + if (pub_deparse_func_cnt++ == 0) > + runlist = lappend_oid(runlist, item->fnoid); > + } > > 16a. > I somehow felt this logic would be more natural/readable if the check > was for == pub_funcoid instead of != pub_funcoid. > fixed. > ~ > > 16b. > Maybe use /pubfuncname/pub_funcname/ for consistent variable naming. > > ====== > src/backend/commands/publicationcmds.c > > 17. DropDDLReplicaEventTriggers > > +static void > +DropDDLReplicaEventTriggers(Oid puboid) > +{ > + DropDDLReplicaEventTrigger(PUB_TRIG_DDL_CMD_START, puboid); > + DropDDLReplicaEventTrigger(PUB_TRIG_DDL_CMD_END, puboid); > + DropDDLReplicaEventTrigger(PUB_TRIG_TBL_REWRITE, puboid); > + DropDDLReplicaEventTrigger(PUB_TRIG_TBL_INIT_WRITE, puboid); > +} > + > + > > Double blank lines. > > fixed. > ====== > src/backend/replication/logical/ddltrigger.c > > > 18. > +/* > + * Check if the command can be publishable. > + * > + * XXX Executing a non-immutable expression during the table rewrite phase is > + * not allowed, as it may result in different data between publisher and > + * subscriber. While some may suggest converting the rewrite inserts to updates > + * and replicate them after the ddl command to maintain data > consistency. But it > + * doesn't work if the replica identity column is altered in the command. This > + * is because the rewrite inserts do not contain the old values and therefore > + * cannot be converted to update. > + * > + * Apart from that, commands contain volatile functions are not > allowed. Because > + * it's possible the functions contain DDL/DML in which case these operations > + * will be executed twice and cause duplicate data. In addition, we don't know > + * whether the tables being accessed by these DDL/DML are published or not. So > + * blindly allowing such functions can allow unintended clauses like the tables > + * accessed in those functions may not even exist on the subscriber. > + */ > +static void > +check_command_publishable(ddl_deparse_context context, bool is_rewrite) > > 18a. > "can be publishable" --> "can be published" > > ~ > > 18b. > While some may suggest converting the rewrite inserts to updates and > replicate them after the ddl command to maintain data consistency. But > it doesn't work if the replica identity column is altered in the > command. > > Grammar? Why is this split into 2 sentences? > fixed. > ~ > > 18c. > Apart from that, commands contain volatile functions are not allowed. > > /contain/containing/ > fixed. > ~~~ > > 19. check_command_publishable > > + if (context.func_volatile == PROVOLATILE_VOLATILE) > + ereport(ERROR, > + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("cannot use volatile function in ALTER TABLE command because > it cannot be replicated in DDL replication")); > +} > > Is it correct for this message to name ALTER TABLE when that is not > even part of the check? Is that the only scenario where this is > possible to occur? > fixed. > ~~~ > > 20. publication_deparse_ddl_command_end > > + /* handle drop commands which appear in the SQLDropList */ > + slist_foreach(iter, &(currentEventTriggerState->SQLDropList)) > + { > + SQLDropObject *obj; > + EventTriggerData *trigdata; > + char *command; > + DeparsedCommandType cmdtype; > + > + trigdata = (EventTriggerData *) fcinfo->context; > + > + obj = slist_container(SQLDropObject, next, iter.cur); > + > + if (!obj->original) > + continue; > + > + if (strcmp(obj->objecttype, "table") == 0) > + cmdtype = DCT_TableDropEnd; > + else > + continue; > + > + command = deparse_drop_command(obj->objidentity, obj->objecttype, > + trigdata->parsetree); > + > + if (command) > + LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, > + command, strlen(command) + 1); > + } > > > 20a. > Uppercase the comment. > fixed. > ~ > > 20b. > Also, it's not a very good comment -- because not giving any more > information than the line of code; can you give a more detailed > explanation? > fixed. > ~ > > 20c. > The way the continues are arranged seems a bit strange. Since this is > all DROP code wouldn't it make more sense to write it like this: > > BEFORE > + if (strcmp(obj->objecttype, "table") == 0) > + cmdtype = DCT_TableDropEnd; > + else > + continue; > + > + command = deparse_drop_command(obj->objidentity, obj->objecttype, > + trigdata->parsetree); > + > + if (command) > + LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, > + command, strlen(command) + 1); > > AFTER > if (strcmp(obj->objecttype, "table") == 0) > { > DeparsedCommandType cmdtype = DCT_TableDropEnd; > char *command; > > command = deparse_drop_command(obj->objidentity, obj->objecttype, > trigdata->parsetree); > if (command) > LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, > command, strlen(command) + 1); > } > fixed. > ~~~~ > > 21. publication_deparse_table_init_write > > + } > + return PointerGetDatum(NULL); > > Add a blank line before the return; > > fixed. > > > ====== > .../replication/logical/reorderbuffer.c > > 22. ReorderBufferRestoreChange > > + /* read prefix */ > + memcpy(&prefix_size, data, sizeof(Size)); > + data += sizeof(Size); > + memcpy(&change->data.ddl.relid, data, sizeof(Oid)); > + data += sizeof(Oid); > + memcpy(&change->data.ddl.cmdtype, data, sizeof(DeparsedCommandType)); > + data += sizeof(int); > + change->data.ddl.prefix = MemoryContextAlloc(rb->context, prefix_size); > + memcpy(change->data.ddl.prefix, data, prefix_size); > + Assert(change->data.ddl.prefix[prefix_size - 1] == '\0'); > + data += prefix_size; > > I had suggested before ([3] #23) that it would be better to use: > data += sizeof(DeparsedCommandType); > > instead of: > data += sizeof(int); > > You already changed this OK in another place but this instance got > accidentally missed. > fixed. > ====== > src/backend/replication/logical/worker.c > > 23. preprocess_create_table > > + if (castmt->objtype == OBJECT_TABLE) > + { > + /* > + * Force skipping data population to avoid data > + * inconsistency. Data should be replicated from the > + * publisher instead. > + */ > + castmt->into->skipData = true; > + } > > I had suggested before ([4] #16b) that the "Force skipping" comments > are not necessary because the function header comment already says the > same thing. One of the "Force skipping" comments was removed OK, but > there is still one more remaining that should be removed. > fixed. > ~~~ > > 24. postprocess_ddl_create_table > > + commandTag = CreateCommandTag((Node *) command); > + cstmt = (CreateStmt *) command->stmt; > + rv = cstmt->relation; > + > + if (commandTag != CMDTAG_CREATE_TABLE) > + return; > + > + cstmt = (CreateStmt *) command->stmt; > + rv = cstmt->relation; > + if (!rv) > + return; > > This code is still flawed as previously described (see [4]#18). There > are duplicate assignments of 'cstmt' and 'rv'. > fixed. > ~~~ > > 25. apply_handle_ddl > > +/* > + * Handle DDL replication messages. Convert the json string into a query > + * string and run it through the query portal. > + */ > +static void > +apply_handle_ddl(StringInfo s) > > IMO for consistency this should use the same style as the other > function comments. So after the first sentence, put a more detailed > description after a blank line. > fixed. > ~~~ > > 26. apply_handle_ddl > > I previously ([4]#21) asked a questio: > There seems to be an assumption here that the only kind of command > processed here would be TABLE related. Maybe that is currently true, > but shouldn't there be some error checking just to make sure it cannot > execute unexpected commands? > > ~ > > IMO this question remains relevant -- I think this ddl code needs some > kind of additional guards/checks in it otherwise it will attempt to > deparse commands that it does not understand (e.g. imagine a later > version publisher which supports more DDL than the subscriber does). > Currently, according to the design, there is no distinction between what publisher supports and subscriber supports. Only the publication decides what is replicated and not, subscription has no control. > ====== > src/backend/replication/pgoutput/pgoutput.c > > 27. PGOutputTxnData > > typedef struct PGOutputTxnData > { > bool sent_begin_txn; /* flag indicating whether BEGIN has been sent */ > + List *deleted_relids; /* maintain list of deleted table oids */ > } PGOutputTxnData; > > Actually, from my previous review (see [4]#22) I meant for this to be > a more detailed *structure* level comment to say why this is necessary > even to have this member; not just a basic field comment like what has > been added. > fixed. > ~~~ > > 28. is_object_published > > + /* > + * Only send this ddl if we don't publish ddl message or the ddl > + * need to be published via its root relation. > + */ > + if (relentry->pubactions.pubddl_table && > + relentry->publish_as_relid == objid) > + return true; > > The comment seems wrong/confused – "Only send this ddl if we don't > publish ddl message" (??) > fixed. > ====== > src/bin/pg_dump/pg_dump.c > > 29. getEventTriggers > > + /* skip internally created event triggers by checking evtisinternal */ > appendPQExpBufferStr(query, > "SELECT e.tableoid, e.oid, evtname, evtenabled, " > "evtevent, evtowner, " > > Uppercase the comment. > fixed. > ====== > src/include/catalog/pg_event_trigger.h > > 33. > @@ -36,7 +36,7 @@ CATALOG(pg_event_trigger,3466,EventTriggerRelationId) > * called */ > char evtenabled; /* trigger's firing configuration WRT > * session_replication_role */ > - > + bool evtisinternal; /* trigger is system-generated */ > #ifdef CATALOG_VARLEN > text evttags[1]; /* command TAGs this event trigger targets */ > #endif > > ~ > > This change should not remove the blank line that previously existed > before the #ifdef CATALOG_VARLEN. > fixed. > ====== > src/include/catalog/pg_publication. > > 34. > +/* Publication trigger events */ > +#define PUB_TRIG_DDL_CMD_START "ddl_command_start" > +#define PUB_TRIG_DDL_CMD_END "ddl_command_end" > +#define PUB_TRIG_TBL_REWRITE "table_rewrite" > +#define PUB_TRIG_TBL_INIT_WRITE "table_init_write" > > Elsewhere in PG15 code there are already hardcoded literal strings for > these triggers, so I am wondering if these constants should really be > defined in some common place where everybody can make use of them > instead of having a mixture of string literals and macros for the same > strings. > fixed. > ====== > src/include/commands/event_trigger.h > > 35. > -extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt); > +extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt, bool isinternal); > extern Oid get_event_trigger_oid(const char *trigname, bool missing_ok); > > IMO a better name is 'is_internal' (Using a snake-case name matches > like the other 'missing_ok') > fixed. > ====== > src/include/replication/ddlmessage.h > > 36. > + * Copyright (c) 2022, PostgreSQL Global Development Group > > Copyright for the new file should be 2023? > fixed. > ====== > src/include/tcop/ddldeparse.h > > 37. > * ddldeparse.h > * > * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group > * Portions Copyright (c) 1994, Regents of the University of California > * > * src/include/tcop/ddldeparse.h > > ~ > > I think this is a new file for the feature so why is the copyright > talking about old dates like 1994,1996 etc? > fixed. regards, Ajin Cherian
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Masahiro IkedaДата:
Сообщение: Re: Introduce WAIT_EVENT_EXTENSION and WAIT_EVENT_BUFFER_PIN
Следующее
От: "Drouvot, Bertrand"Дата:
Сообщение: Re: Introduce WAIT_EVENT_EXTENSION and WAIT_EVENT_BUFFER_PIN