Re: logical decoding and replication of sequences, take 2

Поиск
Список
Период
Сортировка
От Jonathan S. Katz
Тема Re: logical decoding and replication of sequences, take 2
Дата
Msg-id eb893ca5-97a9-806d-6298-e7d2ddceeede@postgresql.org
обсуждение исходный текст
Ответ на Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers
Hi,

On 2/16/23 10:50 AM, Tomas Vondra wrote:
> Hi,
> 
> Here's a rebased patch, without the last bit which is now unnecessary
> thanks to c981d9145dea.

Thanks for continuing to work on this patch! I tested the latest version 
and have some feedback/clarifications.

I did some testing using a demo-app-based-on-a-real-world app I had 
conjured up[1]. This uses integer sequences as surrogate keys.

In general things seemed to work, but I had a couple of 
observations/questions.

1. Sequence IDs after a "failover". I believe this is a design decision, 
but I noticed that after simulating a failover, the IDs were replicating 
from a higher value, e.g.

INSERT INTO room (name) VALUES ('room 1');
INSERT INTO room (name) VALUES ('room 2');
INSERT INTO room (name) VALUES ('room 3');
INSERT INTO room (name) VALUES ('room 4');

The values of room_id_seq on each instance:

instance 1:

  last_value | log_cnt | is_called
------------+---------+-----------
           4 |      29 | t

  instance 2:

   last_value | log_cnt | is_called
------------+---------+-----------
          33 |       0 | t

After the switchover on instance 2:

INSERT INTO room (name) VALUES ('room 5') RETURNING id;

  id
----
  34

I don't see this as an issue for most applications, but we should at 
least document the behavior somewhere.

2. Using with origin=none with nonconflicting sequences.

I modified the example in [1] to set up two schemas with non-conflicting 
sequences[2], e.g. on instance 1:

CREATE TABLE public.room (
     id int GENERATED BY DEFAULT AS IDENTITY (INCREMENT 2 START WITH 1) 
PRIMARY KEY,
     name text NOT NULL
);

and instance 2:

CREATE TABLE public.room (
     id int GENERATED BY DEFAULT AS IDENTITY (INCREMENT 2 START WITH 2) 
PRIMARY KEY,
     name text NOT NULL
);

I ran the following on instance 1:

INSERT INTO public.room ('name') VALUES ('room 1-e');

This committed and successfully replicated.

However, when I ran the following on instance 2, I received a conlifct 
error:

INSERT INTO public.room ('name') VALUES ('room 1-w');

The conflict came further down the trigger change, i.e. to a change in 
the `public.calendar` table:

2023-02-22 01:49:12.293 UTC [87235] ERROR:  duplicate key value violates 
unique constraint "calendar_pkey"
2023-02-22 01:49:12.293 UTC [87235] DETAIL:  Key (id)=(661) already exists.

After futzing with the logging and restarting, I was also able to 
reproduce a similar conflict with the same insert pattern into 'room'.

I did notice that the sequence values kept bouncing around between the 
servers. Without any activity, this is what "SELECT * FROM room_id_seq" 
would return with queries run ~4s apart:

  last_value | log_cnt | is_called
------------+---------+-----------
         131 |       0 | t

  last_value | log_cnt | is_called
------------+---------+-----------
          65 |       0 | t

The values were more varying on "calendar". Again, this is under no 
additional write activity, these numbers kept fluctuating:

  last_value | log_cnt | is_called
------------+---------+-----------
         197 |       0 | t

  last_value | log_cnt | is_called
------------+---------+-----------
         461 |       0 | t

  last_value | log_cnt | is_called
------------+---------+-----------
         263 |       0 | t

  last_value | log_cnt | is_called
------------+---------+-----------
         527 |       0 | t

To handle this case for now, I adapted the schema to create sequences 
that we clearly independently named[3]. I did learn that I had to create 
sequences on both instances to support this behavior, e.g.:

-- instance 1
CREATE SEQUENCE public.room_id_1_seq AS int INCREMENT BY 2 START WITH 1;
CREATE SEQUENCE public.room_id_2_seq AS int INCREMENT BY 2 START WITH 2;
CREATE TABLE public.room (
     id int DEFAULT nextval('room_id_1_seq') PRIMARY KEY,
     name text NOT NULL
);

-- instance 2
CREATE SEQUENCE public.room_id_1_seq AS int INCREMENT BY 2 START WITH 1;
CREATE SEQUENCE public.room_id_2_seq AS int INCREMENT BY 2 START WITH 2;
CREATE TABLE public.room (
     id int DEFAULT nextval('room_id_2_seq') PRIMARY KEY,
     name text NOT NULL
);

After building out [3] this did work, but it was more tedious.

Is it possible to support IDENTITY columns (or serial columns) where the 
values of the sequence are set to different intervals on the 
publisher/subscriber?

Thanks,

Jonathan

[1] 
https://github.com/CrunchyData/postgres-realtime-demo/blob/main/examples/demo/demo1.sql
[2] https://gist.github.com/jkatz/5c34bf1e401b3376dfe8e627fcd30af3
[3] https://gist.github.com/jkatz/1599e467d55abec88ab487d8ac9dc7c3


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Следующее
От: David Rowley
Дата:
Сообщение: Re: Add support for unit "B" to pg_size_pretty()