Re: RFI: Extending the TOAST Pointer

From Aleksander Alekseev
Subject Re: RFI: Extending the TOAST Pointer
Date
Msg-id CAJ7c6TNAYyeMYKVkiwOZChy7UpE_CkjpYOk73gcWTXMkLkEyzw@mail.gmail.com
In response to RFI: Extending the TOAST Pointer  (Nikita Malakhov <hukutoc@gmail.com>)
Responses Re: RFI: Extending the TOAST Pointer  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
Hi Nikita,

> this part of the PostgreSQL screams to be revised and improved

I completely agree. The problem with TOAST pointers is that they are
not extensible at the moment, which prevents adding new compression
algorithms (e.g. ZSTD), new features like compression dictionaries
[1], etc. I suggest we add extensibility in order to solve this
problem for the foreseeable future for everyone.

> where Custom TOAST Pointer is distinguished from Regular one by va_flag field
> which is a part of varlena header

I don't think that the varlena header is the best place to distinguish
a classical TOAST pointer from an extended one. On top of that, I don't
see any free bits that would allow adding such a flag to the on-disk
varlena representation [2].

The current on-disk TOAST pointer representation is as follows:

```
typedef struct varatt_external
{
    int32  va_rawsize;     /* Original data size (includes header) */
    uint32 va_extinfo;     /* External saved size (without header) and
                            * compression method */
    Oid    va_valueid;     /* Unique ID of value within TOAST table */
    Oid    va_toastrelid;  /* RelID of TOAST table containing it */
} varatt_external;
```

Note that currently only 2 compression methods are supported:

```
typedef enum ToastCompressionId
{
    TOAST_PGLZ_COMPRESSION_ID = 0,
    TOAST_LZ4_COMPRESSION_ID = 1,
    TOAST_INVALID_COMPRESSION_ID = 2
} ToastCompressionId;
```

I suggest adding a new flag that will mark an extended TOAST format:

```
typedef enum ToastCompressionId
{
    TOAST_PGLZ_COMPRESSION_ID = 0,
    TOAST_LZ4_COMPRESSION_ID = 1,
    TOAST_RESERVED_COMPRESSION_ID = 2,
    TOAST_HAS_EXTENDED_FORMAT = 3,
} ToastCompressionId;
```
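
To illustrate why value 3 is available, here is a minimal standalone
sketch. It assumes the layout PostgreSQL currently uses for va_extinfo,
where the low 30 bits hold the external saved size and the top 2 bits
hold the compression method. The helper names (extinfo_pack,
extinfo_size, extinfo_method, pointer_is_extended) are hypothetical and
are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the current va_extinfo layout:
 * low 30 bits = external saved size, top 2 bits = compression method. */
#define EXTSIZE_BITS 30
#define EXTSIZE_MASK ((1U << EXTSIZE_BITS) - 1)

/* Values from the proposal above; HAS_EXTENDED_FORMAT is not in
 * PostgreSQL today. */
enum
{
    PGLZ_ID = 0,
    LZ4_ID = 1,
    RESERVED_ID = 2,
    HAS_EXTENDED_FORMAT = 3
};

/* Hypothetical helper: combine size and method into one 32-bit word. */
static uint32_t
extinfo_pack(uint32_t extsize, uint32_t method)
{
    assert(extsize <= EXTSIZE_MASK);
    assert(method <= 3);        /* only 2 bits are available */
    return (method << EXTSIZE_BITS) | extsize;
}

/* Hypothetical helper: external saved size, without the method bits. */
static uint32_t
extinfo_size(uint32_t extinfo)
{
    return extinfo & EXTSIZE_MASK;
}

/* Hypothetical helper: the 2-bit compression method field. */
static uint32_t
extinfo_method(uint32_t extinfo)
{
    return extinfo >> EXTSIZE_BITS;
}

/* A reader would branch on this before touching any extended data. */
static int
pointer_is_extended(uint32_t extinfo)
{
    return extinfo_method(extinfo) == HAS_EXTENDED_FORMAT;
}
```

Since the field is only 2 bits wide, 3 is the last unused value, which
is why it makes sense to spend it on an escape into an extensible
format rather than on one more compression method.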

For the extended format, we add a varint (UTF-8-like) bitmask right
after varatt_external that marks the features supported in this
particular instance of the pointer. The rest of the data is interpreted
depending on which bits are set. This will allow us to extend the
pointers indefinitely.
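
A minimal sketch of such a variable-length bitmask follows. Note the
assumptions: it uses a LEB128-style encoding (7 payload bits per byte,
high bit as a continuation flag) rather than a true UTF-8 length
prefix, since the exact encoding is not specified here. The feature bit
names and the functions ext_mask_encode / ext_mask_decode are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the names are illustrative only. */
#define TOAST_EXT_FEATURE_ZSTD       (UINT64_C(1) << 0)
#define TOAST_EXT_FEATURE_DICTIONARY (UINT64_C(1) << 1)

/* Maximum bytes needed for a 64-bit mask at 7 bits per byte. */
#define EXT_MASK_MAX_BYTES 10

/*
 * Encode the mask, least significant 7 bits first. Every byte except
 * the last has its high bit set. Returns the number of bytes written;
 * out must have room for EXT_MASK_MAX_BYTES.
 */
static size_t
ext_mask_encode(uint64_t mask, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    do
    {
        uint8_t b = (uint8_t) (mask & 0x7F);

        mask >>= 7;
        if (mask != 0)
            b |= 0x80;          /* more bytes follow */
        out[n++] = b;
    } while (mask != 0);
    return n;
}

/*
 * Decode a mask from in[0..len). Returns the number of bytes consumed,
 * or 0 if the input is truncated or longer than EXT_MASK_MAX_BYTES.
 */
static size_t
ext_mask_decode(const uint8_t *in, size_t len, uint64_t *mask)
{
    uint64_t result = 0;

    for (size_t i = 0; i < len && i < EXT_MASK_MAX_BYTES; i++)
    {
        result |= (uint64_t) (in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0)
        {
            *mask = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding a pointer that uses only the first seven feature
bits pays one extra byte, and later features can claim higher bits
without changing how existing pointers are read.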

Note that the proposed approach doesn't require running any
migrations. Note also that I described only the on-disk
representation. We can tweak the in-memory representation as we want
without affecting the end user.

Thoughts?

[1]: https://commitfest.postgresql.org/43/3626/
[2]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/postgres.h;h=0446daa0e61722067bb75aa693a92b38736e12df;hb=164d174bbf9a3aba719c845497863cd3c49a3ad0#l178


-- 
Best regards,
Aleksander Alekseev


