Re: Could postgres12 support millions of sequences? (like 10 million)
From | Adrian Klaver
---|---
Subject | Re: Could postgres12 support millions of sequences? (like 10 million)
Date | |
Msg-id | 032b7a7f-e08a-fb7d-abf9-4a227f815f7a@aklaver.com
In reply to | Re: Could postgres12 support millions of sequences? (like 10 million) (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses | Re: Could postgres12 support millions of sequences? (like 10 million) (pabloa98 <pabloa98@gmail.com>)
List | pgsql-general
On 3/20/20 9:59 AM, Adrian Klaver wrote:
> On 3/19/20 10:31 PM, pabloa98 wrote:
>> I see.
>>
>> Any suggestion? It should behave like a sequence in the sense that
>> concurrent transactions will get different numbers from this
>> alternative sequence-like solution.
>>
>> In our case, we will need to do a call nextval('some_seq') (or
>> similar) from different processes no more than twice every minute.
>>
>> It would be nice to have a sequence data type. Correct me if I am
>> wrong, but it seems to me that a sequence data type would cost the
>> same or less than the current sequence implementation.
>>
>> The schema would be clearer too. We could have a table like:
>>
>> CREATE TABLE pair(
>>     group INT NOT NULL,
>>     element INT NOT NULL,
>>     seq SEQUENCE INCREMENT 1
>>         START 1
>>         CACHE 1
>>         MINVALUE 1
>>         MAXVALUE 99999999
>>         NOT NULL,
>>     CONSTRAINT PRIMARY KEY (group, element)
>> );
>>
>> And then:
>>
>> INSERT INTO event(group, element, code)
>> VALUES (
>>     1,
>>     1,
>>     ( SELECT seq.nextval('code_1_1_seq') FROM pair p WHERE
>>       p.group=1 AND p.element=1 )
>> );
>>
>> Or perhaps storing all the sequences in the same table as rows will
>> have the same behavior.
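The per-(group, element) counter idea Pablo sketches above can be emulated today with a single counters table instead of millions of sequence objects. Below is a minimal sketch of that technique; sqlite3 is used only so the example is self-contained, and all names (`pair_counter`, `next_code`, `last_val`) are hypothetical. In PostgreSQL the same idea would typically be a single `UPDATE ... RETURNING` inside the inserting transaction; note that, unlike real sequences, this serializes concurrent writers on the same (group, element) row.

```python
# Sketch: emulating many per-(group_id, element) counters with one table.
# sqlite3 is stdlib and used here purely for a runnable demonstration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pair_counter (
        group_id INT NOT NULL,
        element  INT NOT NULL,
        last_val INT NOT NULL DEFAULT 0,
        PRIMARY KEY (group_id, element)
    )
""")

def next_code(conn, group_id, element):
    # Seed the counter row on first use, then bump and read it.
    conn.execute(
        "INSERT OR IGNORE INTO pair_counter (group_id, element, last_val) "
        "VALUES (?, ?, 0)", (group_id, element))
    conn.execute(
        "UPDATE pair_counter SET last_val = last_val + 1 "
        "WHERE group_id = ? AND element = ?", (group_id, element))
    (val,) = conn.execute(
        "SELECT last_val FROM pair_counter "
        "WHERE group_id = ? AND element = ?", (group_id, element)).fetchone()
    return val

codes = [next_code(conn, 1, 1), next_code(conn, 1, 1), next_code(conn, 2, 1)]
print(codes)  # [1, 2, 1] -- each (group_id, element) pair counts independently
```

Because the increment happens in the caller's transaction, this variant is gapless on rollback, which real sequences are not.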
>
> If code is just something to show the sequence of insertion for group,
> element combinations then maybe something like below:
>
> CREATE TABLE event(
>     group_id INT NOT NULL, -- changed as group is a reserved word
>     element INT NOT NULL,
>     insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
>     PRIMARY KEY(group_id, element, insert_ts)
> );
>
> insert into event(group_id, element) VALUES
> (1, 1),
> (1, 1),
> (1, 1),
> (2, 1),
> (1, 1),
> (1, 3),
> (1, 1),
> (1, 3),
> (2, 1),
> (2, 1);
>
> select * from event;
>  group_id | element |           insert_ts
> ----------+---------+--------------------------------
>         1 |       1 | 03/20/2020 09:51:12.675926 PDT
>         1 |       1 | 03/20/2020 09:51:12.675985 PDT
>         1 |       1 | 03/20/2020 09:51:12.675991 PDT
>         2 |       1 | 03/20/2020 09:51:12.675996 PDT
>         1 |       1 | 03/20/2020 09:51:12.676 PDT
>         1 |       3 | 03/20/2020 09:51:12.676004 PDT
>         1 |       1 | 03/20/2020 09:51:12.676008 PDT
>         1 |       3 | 03/20/2020 09:51:12.676012 PDT
>         2 |       1 | 03/20/2020 09:51:12.676016 PDT
>         2 |       1 | 03/20/2020 09:51:12.67602 PDT
> (10 rows)
>
> select group_id, element, row_number() OVER (partition by (group_id,
> element) order by (group_id, element)) AS code from event;
>  group_id | element | code
> ----------+---------+------
>         1 |       1 |    1
>         1 |       1 |    2
>         1 |       1 |    3
>         1 |       1 |    4
>         1 |       1 |    5
>         1 |       3 |    1
>         1 |       3 |    2
>         2 |       1 |    1
>         2 |       1 |    2
>         2 |       1 |    3
> (10 rows)

Oops, the above actually returned:

select group_id, element, row_number() OVER (partition by (group_id,
element) order by (group_id, element)) AS code, insert_ts from event;
 group_id | element | code |           insert_ts
----------+---------+------+--------------------------------
        1 |       1 |    1 | 03/20/2020 09:51:12.675926 PDT
        1 |       1 |    2 | 03/20/2020 09:51:12.675985 PDT
        1 |       1 |    3 | 03/20/2020 09:51:12.675991 PDT
        1 |       1 |    4 | 03/20/2020 09:51:12.676008 PDT
        1 |       1 |    5 | 03/20/2020 09:51:12.676 PDT
        1 |       3 |    1 | 03/20/2020 09:51:12.676004 PDT
        1 |       3 |    2 | 03/20/2020 09:51:12.676012 PDT
        2 |       1 |    1 | 03/20/2020 09:51:12.67602 PDT
        2 |       1 |    2 | 03/20/2020 09:51:12.676016 PDT
        2 |       1 |    3 | 03/20/2020 09:51:12.675996 PDT
(10 rows)

Needs to be:

select group_id, element, row_number() OVER (partition by (group_id,
element) order by (group_id, element, insert_ts)) AS code, insert_ts from event;
 group_id | element | code |           insert_ts
----------+---------+------+--------------------------------
        1 |       1 |    1 | 03/20/2020 09:51:12.675926 PDT
        1 |       1 |    2 | 03/20/2020 09:51:12.675985 PDT
        1 |       1 |    3 | 03/20/2020 09:51:12.675991 PDT
        1 |       1 |    4 | 03/20/2020 09:51:12.676 PDT
        1 |       1 |    5 | 03/20/2020 09:51:12.676008 PDT
        1 |       3 |    1 | 03/20/2020 09:51:12.676004 PDT
        1 |       3 |    2 | 03/20/2020 09:51:12.676012 PDT
        2 |       1 |    1 | 03/20/2020 09:51:12.675996 PDT
        2 |       1 |    2 | 03/20/2020 09:51:12.676016 PDT
        2 |       1 |    3 | 03/20/2020 09:51:12.67602 PDT
(10 rows)

>
>>
>> Pablo
>>
>> On Thu, Mar 19, 2020 at 7:56 PM Tom Lane <tgl@sss.pgh.pa.us
>> <mailto:tgl@sss.pgh.pa.us>> wrote:
>>
>>     Michael Lewis <mlewis@entrata.com <mailto:mlewis@entrata.com>> writes:
>>     > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston
>>     <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote:
>>     >> However, one other consideration with sequences: do you care that
>>     >> PostgreSQL will cache/pin (i.e., no release) every single sequence you
>>     >> touch for the lifetime of the session? (I do not think DISCARD matters
>>     >> here but I'm just guessing)
>>
>>     > Would you expand on this point or is there someplace specific in the
>>     > documentation on this?
>>
>>     I think what David is worried about is that a sequence object is a
>>     one-row table in PG's implementation. Thus
>>
>>     (1) each sequence requires a dozen or two rows in assorted system
>>     catalogs (not sure exactly how many offhand).
>>
>>     (2) each sequence uses up 8KB on disk for its table file.
>>
>>     (3) each sequence you actually access within a particular session
>>     results in creation of relcache and catcache entries in that
>>     session's local memory.
>>     I'm not sure offhand how big those are
>>     either, but a few KB per sequence would be a reasonable guess.
>>
>>     (4) each sequence competes for space in the shared-buffer arena,
>>     since its 8K block has to be swapped into there whenever you try
>>     to access/increment the sequence's value.
>>
>>     This is all mighty inefficient of course, and there's been talk
>>     of trying to reduce the per-sequence overhead; but I don't know
>>     of anyone actively working on that. As things stand, I think
>>     having millions of sequences would be quite painful performance-wise,
>>     especially if your workload were actively hitting a lot
>>     of them concurrently. It would work, for some value of "work",
>>     but it wouldn't perform very well.
>>
>>     Also, as multiple people mentioned already, this still wouldn't
>>     guarantee gap-free sequences of ID values.
>>
>>     regards, tom lane
>

--
Adrian Klaver
adrian.klaver@aklaver.com
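To put Tom Lane's per-sequence figures in perspective for the 10 million sequences the thread title asks about, here is a back-of-envelope calculation. The 8 KB block per sequence is stated above; the 2 KB per-session cache figure is an assumption standing in for his "a few KB per sequence" guess.

```python
# Rough cost of 10 million sequence objects, using the figures quoted above.
n_sequences = 10_000_000

# (2) one 8 KB table file block per sequence
disk_gib = n_sequences * 8 * 1024 / 1024**3
print(f"table files: ~{disk_gib:.1f} GiB")       # ~76.3 GiB on disk

# (3) assumed ~2 KB of relcache/catcache per sequence touched in a session
cache_gib = n_sequences * 2 * 1024 / 1024**3
print(f"session cache (if all touched): ~{cache_gib:.1f} GiB")  # ~19.1 GiB
```

Even before catalog rows and shared-buffer contention, the raw storage alone suggests why the counter-table approaches discussed in this thread scale better than millions of real sequences.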