[Proposal] global sequence implemented by snowflake ID

Поиск

Список

Период

Сортировка

От	Hayato Kuroda (Fujitsu)
Тема	[Proposal] global sequence implemented by snowflake ID
Дата	23 ноября 2023 г. 13:18:59
Msg-id	TY3PR01MB988983D23E4F1DA10567BC5BF5B9A@TY3PR01MB9889.jpnprd01.prod.outlook.com обсуждение исходный текст
Ответы	Re: [Proposal] global sequence implemented by snowflake ID (Michael Paquier <michael@paquier.xyz>)
Список	pgsql-hackers

Дерево обсуждения

Hi hackers,

I want to discuss a new feature for assigning a snowflake ID[1], which can be
cluster-wide unique numbers. Also, Snowflake ID can be allocated standalone.

# Use case

A typical use case is a multi-master system constructed by logical replication.
This feature allows multi-node system to use GENERATED values. IIUC, this is
desired in another thread [2].

When the postgres is standalone, it is quite often that a sequence is used as
default value of the primary key. However, this cannot be done on the multi-master
system as it is because the value on nodeA might be already used on nodeB.
Logical decoding of sequence partially solves the issue, but not sufficient -
what about the case of asynchronous replication? Managing chucks of values is worse.

# What is the formats of Snowflake ID?

Snowflake ID has a below form:

[1bit - unused] + [41bit millisecond timestamp] + [10bit machine ID] + [12bit local sequence number]

Trivially, the millisecond timestamp represents the time when the number is allocated.
I.e., the time nextval() is called. Using a UNIX time seems an easiest way.

Machine ID can be an arbitrary number, but recommended to be unique in the system.
Duplicated machine ID might trigger a conflict.

## Characteristics of snowflake ID

Snowflake ID can generate a unique numbers standalone. According to the old discussion,
allocating value spaces to each nodes was considered [3], but it must communicating
with other nodes, this brings extra difficulties. (e.g., Which protocol would be used?)

Also, Snowflake IDs are roughly time ordered. As Andres pointed out in the old
discussions [4], large indexes over random values perform worse.
Snowflake can avoid the situation.

Moreover, Snowflake IDs are 64-bit integer, shorter than UUID (128-bit).

# Implementation

There are several approaches for implementing a snowflake ID. For example,

* Implement as contrib module. Features needed for each components of snowflakeID
have already been implemented in core, so basically it can be.
* Implement as a variant of sequence access method. I found that sequence AM was
proposed many years ago [5], but it has not been active now. It might be a
fundamental way but needs a huge works.

Attached patch adds a minimal contrib module which can be used for testing my proposal.
Below shows an usage.

```
-- Create an extension
postgres=# CREATE EXTENSION snowflake_sequence ;
CREATE EXTENSION
-- Create a sequence which generates snowflake IDs
postgres=# SELECT snowflake_sequence.create_sequence('test_sequence');
create_sequence
-----------------

(1 row)
-- Get next snowflake ID
postgres=# SELECT snowflake_sequence.nextval('test_sequence');
nextval
---------------------
3162329056562487297
(1 row)
```

How do you think?

[1]: https://github.com/twitter-archive/snowflake/tree/b3f6a3c6ca8e1b6847baa6ff42bf72201e2c2231
[2]: https://www.postgresql.org/message-id/1b25328f-5f4d-9b75-b3f2-f9d9931d1b9d%40postgresql.org
[3]: https://www.postgresql.org/message-id/CA%2BU5nMLSh4fttA4BhAknpCE-iAWgK%2BBG-_wuJS%3DEAcx7hTYn-Q%40mail.gmail.com
[4]: https://www.postgresql.org/message-id/201210161515.54895.andres%402ndquadrant.com
[5]: https://www.postgresql.org/message-id/flat/CA%2BU5nMLV3ccdzbqCvcedd-HfrE4dUmoFmTBPL_uJ9YjsQbR7iQ%40mail.gmail.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Вложения

0001-initial-commit-for-snowflake_sequence.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Ajin Cherian
Дата: 23 ноября 2023 г., 13:06:03
Сообщение: Re: Synchronizing slots from primary to standby

Следующее

От: John Naylor
Дата: 23 ноября 2023 г., 13:41:08
Сообщение: Re: Change GUC hashtable to use simplehash?

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

[Proposal] global sequence implemented by snowflake ID

Вложения

Предыдущее

Следующее