RE: speed up a logical replica setup
От | Hayato Kuroda (Fujitsu) |
---|---|
Тема | RE: speed up a logical replica setup |
Дата | |
Msg-id | TYCPR01MB12077C74ACCF1F16777EE76F1F5502@TYCPR01MB12077.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: speed up a logical replica setup (vignesh C <vignesh21@gmail.com>) |
Ответы |
Re: speed up a logical replica setup
(vignesh C <vignesh21@gmail.com>)
|
Список | pgsql-hackers |
Dear Vignesh, Thanks for giving comments! > Few comments for v22-0001 patch: > 1) The second "if (strcmp(PQgetvalue(res, 0, 1), "t") != 0)"" should > be if (strcmp(PQgetvalue(res, 0, 2), "t") != 0): > + if (strcmp(PQgetvalue(res, 0, 1), "t") != 0) > + { > + pg_log_error("permission denied for database %s", > dbinfo[0].dbname); > + return false; > + } > + if (strcmp(PQgetvalue(res, 0, 1), "t") != 0) > + { > + pg_log_error("permission denied for function \"%s\"", > + > "pg_catalog.pg_replication_origin_advance(text, pg_lsn)"); > + return false; > + } I have already pointed out as comment #8 [1] and fixed in v22-0005. > 2) pg_createsubscriber fails if a table is parallely created in the > primary node: > 2024-02-20 14:38:49.005 IST [277261] LOG: database system is ready to > accept connections > 2024-02-20 14:38:54.346 IST [277270] ERROR: relation "public.tbl5" > does not exist > 2024-02-20 14:38:54.346 IST [277270] STATEMENT: CREATE SUBSCRIPTION > pg_createsubscriber_5_277236 CONNECTION ' dbname=postgres' PUBLICATION > pg_createsubscriber_5 WITH (create_slot = false, copy_data = false, > enabled = false) > > If we are not planning to fix this, at least it should be documented The error will be occurred when tables are created after the promotion, right? I think it cannot be fixed until DDL logical replication would be implemented. So, +1 to add descriptions. > 3) Error conditions is verbose mode has invalid error message like > "out of memory" messages like in below: > pg_createsubscriber: waiting the postmaster to reach the consistent state > pg_createsubscriber: postmaster reached the consistent state > pg_createsubscriber: dropping publication "pg_createsubscriber_5" on > database "postgres" > pg_createsubscriber: creating subscription > "pg_createsubscriber_5_278343" on database "postgres" > pg_createsubscriber: error: could not create subscription > "pg_createsubscriber_5_278343" on database "postgres": out of memory Because some places use PQerrorMessage() wrongly. It should be PQresultErrorMessage(). Fixed in v22-0005. > 4) In error cases we try to drop this publication again resulting in error: > + /* > + * Since the publication was created before the > consistent LSN, it is > + * available on the subscriber when the physical > replica is promoted. > + * Remove publications from the subscriber because it > has no use. > + */ > + drop_publication(conn, &dbinfo[i]); > > Which throws these errors(because of drop publication multiple times): > pg_createsubscriber: dropping publication "pg_createsubscriber_5" on > database "postgres" > pg_createsubscriber: error: could not drop publication > "pg_createsubscriber_5" on database "postgres": ERROR: publication > "pg_createsubscriber_5" does not exist > pg_createsubscriber: dropping publication "pg_createsubscriber_5" on > database "postgres" > pg_createsubscriber: dropping the replication slot > "pg_createsubscriber_5_278343" on database "postgres" Right. One approach is to use DROP PUBLICATION IF EXISTS statement. Thought? > 5) In error cases, wait_for_end_recovery waits even though it has > identified that the replication between primary and standby is > stopped: > +/* > + * Is recovery still in progress? > + * If the answer is yes, it returns 1, otherwise, returns 0. If an error occurs > + * while executing the query, it returns -1. > + */ > +static int > +server_is_in_recovery(PGconn *conn) > +{ > + PGresult *res; > + int ret; > + > + res = PQexec(conn, "SELECT pg_catalog.pg_is_in_recovery()"); > + > + if (PQresultStatus(res) != PGRES_TUPLES_OK) > + { > + PQclear(res); > + pg_log_error("could not obtain recovery progress"); > + return -1; > + } > + > > You can simulate this by stopping the primary just before > wait_for_end_recovery and you will see these error messages, but > pg_createsubscriber will continue to wait: > pg_createsubscriber: error: could not obtain recovery progress > pg_createsubscriber: error: could not obtain recovery progress > pg_createsubscriber: error: could not obtain recovery progress > pg_createsubscriber: error: could not obtain recovery progress Yeah, v22-0001 cannot detect the disconnection from primary and standby. V22-0007 can detect the standby crash, but v22 set could not detect the primary crash. Euler came up with an approach [2] for it but not implemented yet. [1]: https://www.postgresql.org/message-id/TYCPR01MB12077756323B79042F29DDAEDF54C2%40TYCPR01MB12077.jpnprd01.prod.outlook.com [2]: https://www.postgresql.org/message-id/2231a04b-f2d4-4a4e-b5cd-56be8b002427%40app.fastmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Quan ZongliangДата:
Сообщение: Re: Change the bool member of the Query structure to bits
Следующее
От: David RowleyДата:
Сообщение: Re: Add bump memory context type and use it for tuplesorts