Re: Questions about btree_gin vs btree_gist for low cardinality columns

Поиск
Список
Период
Сортировка
От Jeremy Finzel
Тема Re: Questions about btree_gin vs btree_gist for low cardinality columns
Дата
Msg-id CAMa1XUixPDxyYSy8BN4_spUc518H4WUO-URYYf6DV0=pLmV6Fg@mail.gmail.com
обсуждение исходный текст
Ответ на Questions about btree_gin vs btree_gist for low cardinality columns  (Jeremy Finzel <finzelj@gmail.com>)
Ответы Re: Questions about btree_gin vs btree_gist for low cardinality columns  (Will Hartung <willhartung@gmail.com>)
Список pgsql-general


On Fri, May 24, 2019 at 10:25 AM Jeremy Finzel <finzelj@gmail.com> wrote:
I have been hoping for clearer direction from the community about specifically btree_gin indexes for low cardinality columns (as well as low cardinality multi-column indexes).  In general there is very little discussion about this both online and in the docs.  Rather, the emphasis for GIN indexes discussed is always on full text search of JSON indexing, not btree_gin indexes.

However, I have never been happy with the options open to me for indexing low cardinality columns and was hoping this could be a big win.  Often I use partial indexes as a solution, but I really want to know how many use cases btree_gin could solve better than either a normal btree or a partial index.

Here are my main questions:

1.

"The docs say regarding *index only scans*: The index type must support index-only scans. B-tree indexes always do. GiST and SP-GiST indexes support index-only scans for some operator classes but not others. Other index types have no support. The underlying requirement is that the index must physically store, or else be able to reconstruct, the original data value for each index entry. As a counterexample, GIN indexes cannot support index-only scans because each index entry typically holds only part of the original data value."

This is confusing to say "B-tree indexes always do" and "GIN indexes cannot support index-only scans", when we have a btree_gin index type.  Explanation please ???

Is it true that for a btree_gin index on a regular column, "each index entry typically holds only part of the original data value"?  Do these still not support index only scans?  Could they?  I can't see why they shouldn't be able to for a single indexed non-expression field?

2. 

Lack of index only scans is definitely a downside.  However, I see basically identical performance, but way less memory and space usage, for gin indexes.  In terms of read-only performance, if index only scans are not a factor, why not always recommend btree_gin indexes instead of regular btree for low cardinality fields, which will yield similar performance but use far, far less space and resources?

3.

This relates to 2.  I understand the write overhead can be much greater for GIN indexes, which is why the fastupdate feature exists.  But again, in those discussions in the docs, it appears to me they are emphasizing that penalty more for full text or json GIN indexes.  Does the same overhead apply to a btree_gin index on a regular column with no expressions?

Those are my questions.

FYI, I can see an earlier thread about this topic (https://www.postgresql.org/message-id/flat/E260AEE7-95B3-4142-9A4B-A4416F1701F0%40aol.com#5def5ce1864298a3c0ba2d4881a660c2), but a few questions were left unanswered and unclear there.

I first started seriously considering using btree_gin indexes for low cardinality columns, for example some text field with 30 unique values across 100 million rows, after reading a summary of index types from Bruce's article: https://momjian.us/main/writings/pgsql/indexing.pdf

This article was also helpful but yet again I wonder it's broader viability: http://hlinnaka.iki.fi/2014/03/28/gin-as-a-substitute-for-bitmap-indexes/


Thank you!
Jeremy

Could anyone shed any light on these questions?  I appreciate it.

Thanks,
Jeremy 

В списке pgsql-general по дате отправления:

Предыдущее
От: Julie Nishimura
Дата:
Сообщение: Re: POSTGRES_FSM_RELATIONS CRITICAL: DB control fsm relations used:79569 of 80000 (99%)
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: POSTGRES_FSM_RELATIONS CRITICAL: DB control fsm relations used:79569 of 80000 (99%)