Обсуждение: Btree vs. GIN

Поиск
Список
Период
Сортировка

Btree vs. GIN

От
Jack Orenstein
Дата:
I am working on a research project involving a datatype, D, with the following characteristics:

- A value of D is a variable-length binary string.

- A value of D usually gives rise to a single index term, but there could occasionally be more, (up to some configurable maximum).

- The index terms are int64.

GIN looks like a good fit for my requirements, and I've done a little experimentation (using btree_gin) to determine that the optimizer is working as I'd hoped.

So I'm going to proceed with GIN indexes, writing an extension for my datatype. But I'm wondering about btrees and GIN, in general. This discussion was pretty interesting: https://www.postgresql-archive.org/Questions-about-btree-gin-vs-btree-gist-for-low-cardinality-columns-td6088041.html.

My main questions are about the remaining differences between btree and GIN indexes. With btree key deduplication in Postgres 12, one of the main differences between btrees and GIN is gone, (my understanding is that GIN has done this optimization for a long time).

I think that one other major difference between the two is that GIN can handle multiple keys per row, while a btree cannot. Is there some fundamental reason why the btree cannot accommodate multiple keys per row? I think that this would have no impact on the btree structure itself. The only difference would be that the same row (ctid) could be associated with multiple keys.

I guess the top-level question is this: Is it possible in principle for the btree index to subsume GIN index capabilities?

Jack Orenstein

Re: Btree vs. GIN

От
Tom Lane
Дата:
Jack Orenstein <jao@geophile.com> writes:
> I guess the top-level question is this: Is it possible in principle for the
> btree index to subsume GIN index capabilities?

Well, you could write a completely different index AM that happened to
share the same on-disk representation, perhaps.  But it would be a
different AM.  btree is built for scalar values and has none of the
mechanisms that GIN does for breaking down input values into
components-to-be-indexed, nor for analyzing complex query operators
to find out what indexable search conditions are implied.

            regards, tom lane