Re: Support logical replication of DDLs
От | Peter Smith |
---|---|
Тема | Re: Support logical replication of DDLs |
Дата | |
Msg-id | CAHut+Pv9vPbUQc0fzrKmDkKOsS_bj-hup_E+sLHNEX+6F+SY5Q@mail.gmail.com обсуждение исходный текст |
Ответ на | RE: Support logical replication of DDLs ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Список | pgsql-hackers |
I revisited the 0005 patch to verify all changes made to address my previous review comments [1][2][3][4] were OK. Not all changes were made quite as expected, and there were a few other things I noticed in passing. ====== 1. General I previously [1] wrote a comment: Use consistent uppercase for JSON and DDL instead of sometimes json and ddl. There are quite a few random examples in the commit message but might be worth searching the entire patch to make all comments also consistent case. Now I still find a number of lowercase "json" and "ddl" strings. ~~~ 2. Some of my previous review comments were replied [5] as "Will add this in a later version"... or "Keeping this same for now". IMO it would be much better to add a "TODO" or a "FIXME" comment directly into the source for anything described like that, otherwise the list of things "to do later" is just going to get misplaced in all the long thread posts, and/or other reviews are going to report exactly the same missing stuff again later. This review comment applies also to PG DOCS which are known as incomplete or under discussion - I think there should be a TODO/FIXME in those SGML files or the Commit message. e.g. [1]#9b e.g. [2]#12abc e.g. [2]#13c e.g. [2]#14b e.g. [2]#15ab e.g. [2]#17 e.g. [3] (table excessively long) e.g. [4]#2 e.g. [4]#11 ~~~ 3. Commit message Executing a non-immutable expression during the table rewrite phase is not allowed, as it may result in different data between publisher and subscriber. While some may suggest converting the rewrite inserts to updates and replicate them afte the ddl command to maintain data consistency. But it doesn't work if the replica identity column is altered in the command. This is because the rewrite inserts do not contain the old values and therefore cannot be converted to update. ~ 3a. Grammar and typo need fixing for "While some may suggest converting the rewrite inserts to updates and replicate them afte the ddl command to maintain data consistency. But it doesn't work if the replica identity column is altered in the command." ~ 3b. "rewrite inserts to updates" Consider using uppercase for the INSERTs and UPDATEs ~~~ 4. LIMIT: --> LIMITATIONS: (??) ====== contrib/test_decoding/sql/ddl.sql 5. +SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394, 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I %{authorization}s", "name": "foo", "authorization": {"fmt": "AUTHORIZATION %{authorization_role}I", "present": false, "authorization_role": null}, "if_not_exists": ""}'); Previously ([4]#1) I had asked what is the point of setting a JSON payload here when the JSON payload is never used. You might as well pass the string "banana" to achieve the same thing AFAICT. I think the reply [5] to the question was wrong. If this faked JSON is used for some good reason then there ought to be a test comment to say the reason. Otherwise, the fake JSON just confuses the purpose of this test so it should be removed/simplified. ====== contrib/test_decoding/test_decoding.c 6. pg_decode_ddl_message Previously ([4] #4b) I asked if it was necessary to use appendBinaryStringInfo, instead of just appendStringInfo. I guess it doesn't matter much, but I think the question was not answered. ====== doc/src/sgml/catalogs.sgml 7. + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>evtisinternal</structfield> <type>char</type> + </para> + <para> + True if the event trigger is internally generated. + </para></entry> + </row> Why was this called a 'char' type instead of a 'bool' type? ====== doc/src/sgml/logical-replication.sgml 8. + <para> + For example, a CREATE TABLE command executed on the publisher gets + WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER + SUBSCRIPTION ... REFRESH PUBLICATION" is performed on the subscriber database so any + following DML changes on the new table can be replicated. + </para> In my previous review comments ([2] 11b) I suggested for this to say "then an implicit ALTER..." instead of "a subsequent ALTER...". I think the "implicit" part got accidentally missed. ~~~ 9. + <listitem> + <para> + In <literal>ADD COLUMN ... DEFAULT</literal> clause and + <literal>ALTER COLUMN TYPE</literal> clause of <command>ALTER + TABLE</command> command, the functions and operators used in + expression must be immutable. + </para> + </listitem> IMO this is hard to read. It might be easier if expressed as 2 separate bullet points. SUGGESTION For ALTER TABLE ... ADD COLUMN ... DEFAULT, the functions and operators used in expressions must be immutable. For ALTER TABLE ... ADD COLUMN TYPE, the functions and operators used in expressions must be immutable. ~~~ 10. + <para> + To change the column type, first add a new column of the desired + type, then update the new column value with the old column value, + and finnally drop the old column and rename the new column to the + old column. + </para> /finnally/finally/ ====== .../access/rmgrdesc/logicalddlmsgdesc.c 11. logicalddlmsg_desc I previously wrote some suggestions about improving the Assert in this code (see [3]#2). But, the reply [5] "The array index is already length + 1 as indices start from 0." did not make sense, because I am not saying the code has wrong indices. I am only saying the way the Asserts are done was inconsistent with other similar MESSAGE msg, and IMO there is a more intuitive way to assert that the DDL Message has got some payload in it. ====== src/backend/catalog/pg_publication.c 12. pub->pubactions.pubinsert = pubform->pubinsert; pub->pubactions.pubupdate = pubform->pubupdate; pub->pubactions.pubdelete = pubform->pubdelete; + pub->pubactions.pubddl_table = pubform->pubddl_table; pub->pubactions.pubtruncate = pubform->pubtruncate; pub->pubviaroot = pubform->pubviaroot; IMO all the insert/update/delete/truncate belong together because they all came from the 'publish' parameter. I don't think pubddl_table just be jammed into the middle of them. ====== src/backend/commands/event_trigger.c 13. static Oid insert_event_trigger_tuple(const char *trigname, const char *eventname, - Oid evtOwner, Oid funcoid, List *taglist); + Oid evtOwner, Oid funcoid, List *taglist, bool isinternal); /isinternal/is_internal/ ~~~ 14. CreateEventTrigger * Create an event trigger. */ Oid -CreateEventTrigger(CreateEventTrigStmt *stmt) +CreateEventTrigger(CreateEventTrigStmt *stmt, bool isinternal) /isinternal/is_internal/ ~~~ 15. insert_event_trigger_tuple /* Insert catalog entries. */ return insert_event_trigger_tuple(stmt->trigname, stmt->eventname, - evtowner, funcoid, tags); + evtowner, funcoid, tags, isinternal); /isinternal/is_internal/ ~~~ 16. if (filter_event_trigger(tag, item)) { - /* We must plan to fire this trigger. */ - runlist = lappend_oid(runlist, item->fnoid); + static const char *trigger_func_prefix = "publication_deparse_%s"; + char trigger_func_name[NAMEDATALEN]; + Oid pub_funcoid; + List *pubfuncname; + + /* Get function oid of the publication's ddl deparse event trigger */ + snprintf(trigger_func_name, sizeof(trigger_func_name), trigger_func_prefix, + eventstr); + pubfuncname = SystemFuncName(trigger_func_name); + pub_funcoid = LookupFuncName(pubfuncname, 0, NULL, true); + + if (item->fnoid != pub_funcoid) + runlist = lappend_oid(runlist, item->fnoid); + else + { + /* Only the first ddl deparse event trigger needs to be invoked */ + if (pub_deparse_func_cnt++ == 0) + runlist = lappend_oid(runlist, item->fnoid); + } 16a. I somehow felt this logic would be more natural/readable if the check was for == pub_funcoid instead of != pub_funcoid. ~ 16b. Maybe use /pubfuncname/pub_funcname/ for consistent variable naming. ====== src/backend/commands/publicationcmds.c 17. DropDDLReplicaEventTriggers +static void +DropDDLReplicaEventTriggers(Oid puboid) +{ + DropDDLReplicaEventTrigger(PUB_TRIG_DDL_CMD_START, puboid); + DropDDLReplicaEventTrigger(PUB_TRIG_DDL_CMD_END, puboid); + DropDDLReplicaEventTrigger(PUB_TRIG_TBL_REWRITE, puboid); + DropDDLReplicaEventTrigger(PUB_TRIG_TBL_INIT_WRITE, puboid); +} + + Double blank lines. ====== src/backend/replication/logical/ddltrigger.c 18. +/* + * Check if the command can be publishable. + * + * XXX Executing a non-immutable expression during the table rewrite phase is + * not allowed, as it may result in different data between publisher and + * subscriber. While some may suggest converting the rewrite inserts to updates + * and replicate them after the ddl command to maintain data consistency. But it + * doesn't work if the replica identity column is altered in the command. This + * is because the rewrite inserts do not contain the old values and therefore + * cannot be converted to update. + * + * Apart from that, commands contain volatile functions are not allowed. Because + * it's possible the functions contain DDL/DML in which case these operations + * will be executed twice and cause duplicate data. In addition, we don't know + * whether the tables being accessed by these DDL/DML are published or not. So + * blindly allowing such functions can allow unintended clauses like the tables + * accessed in those functions may not even exist on the subscriber. + */ +static void +check_command_publishable(ddl_deparse_context context, bool is_rewrite) 18a. "can be publishable" --> "can be published" ~ 18b. While some may suggest converting the rewrite inserts to updates and replicate them after the ddl command to maintain data consistency. But it doesn't work if the replica identity column is altered in the command. Grammar? Why is this split into 2 sentences? ~ 18c. Apart from that, commands contain volatile functions are not allowed. /contain/containing/ ~~~ 19. check_command_publishable + if (context.func_volatile == PROVOLATILE_VOLATILE) + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot use volatile function in ALTER TABLE command because it cannot be replicated in DDL replication")); +} Is it correct for this message to name ALTER TABLE when that is not even part of the check? Is that the only scenario where this is possible to occur? ~~~ 20. publication_deparse_ddl_command_end + /* handle drop commands which appear in the SQLDropList */ + slist_foreach(iter, &(currentEventTriggerState->SQLDropList)) + { + SQLDropObject *obj; + EventTriggerData *trigdata; + char *command; + DeparsedCommandType cmdtype; + + trigdata = (EventTriggerData *) fcinfo->context; + + obj = slist_container(SQLDropObject, next, iter.cur); + + if (!obj->original) + continue; + + if (strcmp(obj->objecttype, "table") == 0) + cmdtype = DCT_TableDropEnd; + else + continue; + + command = deparse_drop_command(obj->objidentity, obj->objecttype, + trigdata->parsetree); + + if (command) + LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, + command, strlen(command) + 1); + } 20a. Uppercase the comment. ~ 20b. Also, it's not a very good comment -- because not giving any more information than the line of code; can you give a more detailed explanation? ~ 20c. The way the continues are arranged seems a bit strange. Since this is all DROP code wouldn't it make more sense to write it like this: BEFORE + if (strcmp(obj->objecttype, "table") == 0) + cmdtype = DCT_TableDropEnd; + else + continue; + + command = deparse_drop_command(obj->objidentity, obj->objecttype, + trigdata->parsetree); + + if (command) + LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, + command, strlen(command) + 1); AFTER if (strcmp(obj->objecttype, "table") == 0) { DeparsedCommandType cmdtype = DCT_TableDropEnd; char *command; command = deparse_drop_command(obj->objidentity, obj->objecttype, trigdata->parsetree); if (command) LogLogicalDDLMessage("deparse", obj->address.objectId, cmdtype, command, strlen(command) + 1); } ~~~~ 21. publication_deparse_table_init_write + } + return PointerGetDatum(NULL); Add a blank line before the return; ====== .../replication/logical/reorderbuffer.c 22. ReorderBufferRestoreChange + /* read prefix */ + memcpy(&prefix_size, data, sizeof(Size)); + data += sizeof(Size); + memcpy(&change->data.ddl.relid, data, sizeof(Oid)); + data += sizeof(Oid); + memcpy(&change->data.ddl.cmdtype, data, sizeof(DeparsedCommandType)); + data += sizeof(int); + change->data.ddl.prefix = MemoryContextAlloc(rb->context, prefix_size); + memcpy(change->data.ddl.prefix, data, prefix_size); + Assert(change->data.ddl.prefix[prefix_size - 1] == '\0'); + data += prefix_size; I had suggested before ([3] #23) that it would be better to use: data += sizeof(DeparsedCommandType); instead of: data += sizeof(int); You already changed this OK in another place but this instance got accidentally missed. ====== src/backend/replication/logical/worker.c 23. preprocess_create_table + if (castmt->objtype == OBJECT_TABLE) + { + /* + * Force skipping data population to avoid data + * inconsistency. Data should be replicated from the + * publisher instead. + */ + castmt->into->skipData = true; + } I had suggested before ([4] #16b) that the "Force skipping" comments are not necessary because the function header comment already says the same thing. One of the "Force skipping" comments was removed OK, but there is still one more remaining that should be removed. ~~~ 24. postprocess_ddl_create_table + commandTag = CreateCommandTag((Node *) command); + cstmt = (CreateStmt *) command->stmt; + rv = cstmt->relation; + + if (commandTag != CMDTAG_CREATE_TABLE) + return; + + cstmt = (CreateStmt *) command->stmt; + rv = cstmt->relation; + if (!rv) + return; This code is still flawed as previously described (see [4]#18). There are duplicate assignments of 'cstmt' and 'rv'. ~~~ 25. apply_handle_ddl +/* + * Handle DDL replication messages. Convert the json string into a query + * string and run it through the query portal. + */ +static void +apply_handle_ddl(StringInfo s) IMO for consistency this should use the same style as the other function comments. So after the first sentence, put a more detailed description after a blank line. ~~~ 26. apply_handle_ddl I previously ([4]#21) asked a questio: There seems to be an assumption here that the only kind of command processed here would be TABLE related. Maybe that is currently true, but shouldn't there be some error checking just to make sure it cannot execute unexpected commands? ~ IMO this question remains relevant -- I think this ddl code needs some kind of additional guards/checks in it otherwise it will attempt to deparse commands that it does not understand (e.g. imagine a later version publisher which supports more DDL than the subscriber does). ====== src/backend/replication/pgoutput/pgoutput.c 27. PGOutputTxnData typedef struct PGOutputTxnData { bool sent_begin_txn; /* flag indicating whether BEGIN has been sent */ + List *deleted_relids; /* maintain list of deleted table oids */ } PGOutputTxnData; Actually, from my previous review (see [4]#22) I meant for this to be a more detailed *structure* level comment to say why this is necessary even to have this member; not just a basic field comment like what has been added. ~~~ 28. is_object_published + /* + * Only send this ddl if we don't publish ddl message or the ddl + * need to be published via its root relation. + */ + if (relentry->pubactions.pubddl_table && + relentry->publish_as_relid == objid) + return true; The comment seems wrong/confused – "Only send this ddl if we don't publish ddl message" (??) ====== src/bin/pg_dump/pg_dump.c 29. getEventTriggers + /* skip internally created event triggers by checking evtisinternal */ appendPQExpBufferStr(query, "SELECT e.tableoid, e.oid, evtname, evtenabled, " "evtevent, evtowner, " Uppercase the comment. ====== src/bin/pg_dump/t/002_pg_dump.pl 30. A reply to my comment ([1]9b) about missing test cases says "test case are in later patches". In general, I don't think it is a good idea to separate the test cases from the functionality. Eventually, they must be merged anyway because it would be wrong to push untested code to HEADe. So why not include all the relevant test cases here and now? Or at least put a "FIXME" in this patch so you know there is a test case that *must* be handled in a subsequent patch, otherwise, this information will be lost. ====== src/bin/psql/describe.c 31. listPublications @@ -6210,6 +6210,10 @@ listPublications(const char *pattern) gettext_noop("Inserts"), gettext_noop("Updates"), gettext_noop("Deletes")); + if (pset.sversion >= 160000) + appendPQExpBuffer(&buf, + ",\n pubddl_table AS \"%s\"", + gettext_noop("Table DDLs")); if (pset.sversion >= 110000) IMO the order of the columns is wrong. I think the insert/update/delete/truncate should all be grouped together because they are all determined by the same publication parameter ('publish'). ~~~ 32. describePublications " puballtables, pubinsert, pubupdate, pubdelete"); + if (has_pubddl) + appendPQExpBufferStr(&buf, + ", pubddl_table"); if (has_pubtruncate) appendPQExpBufferStr(&buf, Same as previous comment about the column order ====== src/include/catalog/pg_event_trigger.h 33. @@ -36,7 +36,7 @@ CATALOG(pg_event_trigger,3466,EventTriggerRelationId) * called */ char evtenabled; /* trigger's firing configuration WRT * session_replication_role */ - + bool evtisinternal; /* trigger is system-generated */ #ifdef CATALOG_VARLEN text evttags[1]; /* command TAGs this event trigger targets */ #endif ~ This change should not remove the blank line that previously existed before the #ifdef CATALOG_VARLEN. ====== src/include/catalog/pg_publication. 34. +/* Publication trigger events */ +#define PUB_TRIG_DDL_CMD_START "ddl_command_start" +#define PUB_TRIG_DDL_CMD_END "ddl_command_end" +#define PUB_TRIG_TBL_REWRITE "table_rewrite" +#define PUB_TRIG_TBL_INIT_WRITE "table_init_write" Elsewhere in PG15 code there are already hardcoded literal strings for these triggers, so I am wondering if these constants should really be defined in some common place where everybody can make use of them instead of having a mixture of string literals and macros for the same strings. ====== src/include/commands/event_trigger.h 35. -extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt); +extern Oid CreateEventTrigger(CreateEventTrigStmt *stmt, bool isinternal); extern Oid get_event_trigger_oid(const char *trigname, bool missing_ok); IMO a better name is 'is_internal' (Using a snake-case name matches like the other 'missing_ok') ====== src/include/replication/ddlmessage.h 36. + * Copyright (c) 2022, PostgreSQL Global Development Group Copyright for the new file should be 2023? ====== src/include/tcop/ddldeparse.h 37. * ddldeparse.h * * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * * src/include/tcop/ddldeparse.h ~ I think this is a new file for the feature so why is the copyright talking about old dates like 1994,1996 etc? ====== src/test/regress/expected/psql.out 38. IMO the column output is not in a good order. See review comments for describe.c ====== src/test/regress/expected/publication.out 39. I previously posted a review comment about some missing test cases ([1] #21). The reply was "they will be added in later patches". I think it's better to put the relevant tests in the same patch that introduced the functionality. ------ My previous review posts for this patch [1] https://www.postgresql.org/message-id/CAHut%2BPtzpuuRFrLnjkQePq296ip_0WfmQ4CtydM9JDR6gEf%3DQw%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAHut%2BPtMkVoweJrd%3DSLw7BfpW883skasdnemoj4N19NnyjrT3Q%40mail.gmail.com [3] https://www.postgresql.org/message-id/CAHut+PuG8J8uA5V-F-o4TczhvFSWGG1B8qL+EZO0HjWWEEYG+g@mail.gmail.com [4] https://www.postgresql.org/message-id/CAHut%2BPtOODRybaptKRKUWZnGw-PZuLF2BxaitnMSNeAiU8-yPg%40mail.gmail.com [5] Houz replies to my previous review comments - https://www.postgresql.org/message-id/OS0PR01MB571690B2DB46B1C4AF61D184946F9%40OS0PR01MB5716.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
В списке pgsql-hackers по дате отправления:
Следующее
От: "Drouvot, Bertrand"Дата:
Сообщение: Re: Add two missing tests in 035_standby_logical_decoding.pl