Discussion: Memory-comparable Serialization of Data Types

Memory-comparable Serialization of Data Types

From:
Shichao Jin
Date:
Hi Postgres Developers,

We are currently integrating the LSM-tree based storage engine RocksDB into Postgres. I am wondering whether there is any function that serializes data types in a memory-comparable format, similar to what MySQL and MariaDB provide. With such a function, we could store the serialized form directly in the storage engine and compare keys at the storage-engine level instead of deserializing the data and comparing it at the upper level. I know PostgreSQL is moving towards supporting pluggable storage engines, so I think this feature would be particularly useful.

Best,
Shichao 
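
For illustration, a memory-comparable (memcmp-comparable) encoding maps each value to a byte string whose unsigned bytewise order matches the type's logical order, which is what MyRocks-style storage engines rely on. A minimal sketch in C for signed 32-bit integers, assuming big-endian output with the sign bit flipped (the function name encode_int32_memcmp is hypothetical, not an existing PostgreSQL or RocksDB API):

#include <stdint.h>

/*
 * Hypothetical helper: encode a signed 32-bit integer so that an unsigned
 * bytewise memcmp() of the outputs matches the integers' natural order.
 * Flipping the sign bit makes negative values sort before positive ones;
 * writing big-endian makes the most significant byte compare first.
 */
static void
encode_int32_memcmp(int32_t value, unsigned char out[4])
{
    uint32_t    u = (uint32_t) value ^ UINT32_C(0x80000000);

    out[0] = (unsigned char) (u >> 24);
    out[1] = (unsigned char) (u >> 16);
    out[2] = (unsigned char) (u >> 8);
    out[3] = (unsigned char) u;
}

/* After encoding, memcmp(a, b, 4) orders keys as -5 < 3 < 1000 would. */

With this kind of encoding, per-column byte strings can be concatenated (with care for NULLs and variable-length types) into a composite key that the storage engine can compare without knowing the types.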

Re: Memory-comparable Serialization of Data Types

From:
Peter Geoghegan
Date:
On Tue, Feb 11, 2020 at 11:53 AM Shichao Jin <jsc0218@gmail.com> wrote:
> We are currently integrating the LSM-tree based storage engine RocksDB into Postgres. I am wondering whether there
> is any function that serializes data types in a memory-comparable format, similar to what MySQL and MariaDB provide.
> With such a function, we could store the serialized form directly in the storage engine and compare keys at the
> storage-engine level instead of deserializing the data and comparing it at the upper level.

Do you mean a format that allows index comparisons to be performed using
memcmp() rather than per-datatype comparison code?



--
Peter Geoghegan



Re: Memory-comparable Serialization of Data Types

From:
Shichao Jin
Date:
Yes, this is exactly what I mean.

On Tue, 11 Feb 2020 at 15:01, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Feb 11, 2020 at 11:53 AM Shichao Jin <jsc0218@gmail.com> wrote:
> We are currently integrating the LSM-tree based storage engine RocksDB into Postgres. I am wondering whether there is any function that serializes data types in a memory-comparable format, similar to what MySQL and MariaDB provide. With such a function, we could store the serialized form directly in the storage engine and compare keys at the storage-engine level instead of deserializing the data and comparing it at the upper level.

Do you mean a format that allows index comparisons to be performed using
memcmp() rather than per-datatype comparison code?



--
Peter Geoghegan

Re: Memory-comparable Serialization of Data Types

From:
Peter Geoghegan
Date:
On Tue, Feb 11, 2020 at 12:19 PM Shichao Jin <jsc0218@gmail.com> wrote:
> Yes, this is exactly what I mean.

PostgreSQL doesn't have this capability. It might make sense to have
it for some specific data structures, such as tuples on internal
B-Tree pages -- these merely guide index scans, so some information
loss may be acceptable there compared to the native/base
representation. However, that would only be faster because memcmp() is
generally faster than the underlying datatype's native comparator, not
because comparisons have to take place in "the upper levels". There is
some indirection/overhead involved in using SQL-callable operators,
but not that much.

Note that such a representation has to lose information in at least
some cases. For example, case-insensitive collations would have to
lose information about the original case used (or store the original
alongside the conditioned binary string). Note also that a "one pass"
representation that we can just memcmp() will have to be significantly
larger in some cases, especially when collatable text is used. A
strxfrm() blob is typically about 3.3x larger than the original string
IIRC.

-- 
Peter Geoghegan
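
As a rough illustration of the size penalty Peter mentions, the C library's strxfrm() already produces exactly this kind of memcmp-comparable blob for collated text, and printing the lengths shows the blow-up. A minimal standalone sketch (the sample string and buffer size are arbitrary):

#include <locale.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
    const char *s = "straße";
    char        blob[256];
    size_t      n;

    /* Use the collation from the environment (LC_COLLATE). */
    setlocale(LC_COLLATE, "");

    /*
     * strxfrm() transforms the string so that memcmp()/strcmp() on the
     * results matches strcoll() on the originals.  The return value is the
     * length the transformed string requires, excluding the terminating NUL.
     */
    n = strxfrm(blob, s, sizeof(blob));
    if (n >= sizeof(blob))
        return 1;               /* buffer too small; result unusable */

    printf("original: %zu bytes, transformed: %zu bytes\n", strlen(s), n);
    return 0;
}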



Re: Memory-comparable Serialization of Data Types

From:
Alvaro Herrera
Date:
On 2020-Feb-11, Peter Geoghegan wrote:

> On Tue, Feb 11, 2020 at 12:19 PM Shichao Jin <jsc0218@gmail.com> wrote:
> > Yes, this is exactly what I mean.
> 
> PostgreSQL doesn't have this capability. It might make sense to have
> it for some specific data structures,

I think adding that would be too much of a burden, both for the project
itself and for third-party type definitions; I think we'd rather rely on
calling the BTORDER_PROC btree support function for the type.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
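
For context, relying on BTORDER_PROC along these lines means a storage-engine layer would look up the type's default btree comparison support function and call it through the fmgr rather than comparing raw bytes. A hedged sketch of what that could look like inside the backend (compare_datums_btree is a made-up helper, not an existing API):

#include "postgres.h"

#include "fmgr.h"
#include "utils/typcache.h"

/*
 * Hypothetical helper: compare two Datums of the same type with the type's
 * default btree comparison ("ORDER") support function -- the comparator
 * that BTORDER_PROC designates in a btree opclass.  Returns <0, 0, or >0.
 */
static int
compare_datums_btree(Oid typid, Oid collation, Datum a, Datum b)
{
    TypeCacheEntry *typentry;

    typentry = lookup_type_cache(typid, TYPECACHE_CMP_PROC_FINFO);
    if (!OidIsValid(typentry->cmp_proc_finfo.fn_oid))
        elog(ERROR, "no btree comparison function for type %u", typid);

    return DatumGetInt32(FunctionCall2Coll(&typentry->cmp_proc_finfo,
                                           collation, a, b));
}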



Re: Memory-comparable Serialization of Data Types

From:
Peter Geoghegan
Date:
On Tue, Feb 11, 2020 at 1:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think adding that would be too much of a burden, both for the project
> itself and for third-party type definitions; I think we'd rather rely on
> calling the BTORDER_PROC btree support function for the type.

An operator class would still need to provide a BTORDER_PROC. What I
describe would be an optional capability. This is something that I
have referred to as key normalization in the past:

https://wiki.postgresql.org/wiki/Key_normalization

I think that it would only make sense as an enabler of multiple
optimizations -- not just the memcmp()/strcmp() thing. A common
strcmp()'able binary string format can be used in many different ways.
Note that this has nothing to do with changing the representation used
by the vast majority of all tuples -- just the pivot tuples, which are
mostly located in internal pages. They only make up less than 1% of
all pages in almost all cases.

I intend to prototype this technique within the next year. It's
possible that it isn't worth the trouble, but there is only one way to
find out. I might just work on the "abbreviated keys in internal
pages" thing, for example. Though you really need some kind of prefix
compression to make that effective.

-- 
Peter Geoghegan
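
To make the "abbreviated keys" idea a bit more concrete: the usual pattern is to compare a small fixed-size normalized prefix with memcmp() and fall back to the authoritative comparator only when the prefixes tie. A rough sketch, where abbrev_cmp and full_cmp are placeholders rather than PostgreSQL functions:

#include <string.h>

#define ABBREV_LEN 8            /* e.g. the first 8 bytes of a normalized key */

/*
 * Placeholder comparison: try the cheap fixed-size prefix first, and only
 * call the full datatype comparator when the prefixes are equal, because an
 * equal prefix cannot decide the order on its own.
 */
static int
abbrev_cmp(const unsigned char *abbrev_a, const void *full_a,
           const unsigned char *abbrev_b, const void *full_b,
           int (*full_cmp) (const void *, const void *))
{
    int         c = memcmp(abbrev_a, abbrev_b, ABBREV_LEN);

    if (c != 0)
        return c;               /* prefix already decides the order */
    return full_cmp(full_a, full_b);    /* tie: ask the real comparator */
}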



Re: Memory-comparable Serialization of Data Types

From:
Shichao Jin
Date:
Thank you both for your feedback. Yes, as Peter indicated, we do use that technique for comparisons in the index, and we will now try passing the comparator down to the storage engine, as Alvaro suggested.

Best,
Shichao



On Tue, 11 Feb 2020 at 17:16, Peter Geoghegan <pg@bowt.ie> wrote:
On Tue, Feb 11, 2020 at 1:40 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> I think adding that would be too much of a burden, both for the project
> itself and for third-party type definitions; I think we'd rather rely on
> calling the BTORDER_PROC btree support function for the type.

An operator class would still need to provide a BTORDER_PROC. What I
describe would be an optional capability. This is something that I
have referred to as key normalization in the past:

https://wiki.postgresql.org/wiki/Key_normalization

I think that it would only make sense as an enabler of multiple
optimizations -- not just the memcmp()/strcmp() thing. A common
strcmp()'able binary string format can be used in many different ways.
Note that this has nothing to do with changing the representation used
by the vast majority of all tuples -- just the pivot tuples, which are
mostly located in internal pages. They only make up less than 1% of
all pages in almost all cases.

I intend to prototype this technique within the next year. It's
possible that it isn't worth the trouble, but there is only one way to
find out. I might just work on the "abbreviated keys in internal
pages" thing, for example. Though you really need some kind of prefix
compression to make that effective.

--
Peter Geoghegan


--
Shichao Jin
PhD Student at University of Waterloo, Canada