Re: index fragmentation on insert-only table with non-unique column

Поиск
Список
Период
Сортировка
От Stephen Frost
Тема Re: index fragmentation on insert-only table with non-unique column
Дата
Msg-id 20160525042625.GX21416@tamriel.snowman.net
обсуждение исходный текст
Ответ на Re: index fragmentation on insert-only table with non-unique column  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-performance
* Peter Geoghegan (pg@bowt.ie) wrote:
> On Tue, May 24, 2016 at 10:39 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > I was able to see great improvement without planner parameters by REINDEX the
> > timestamp index.  My theory is that the index/planner doesn't handle well the
> > case of many tuples with same column value, and returns pages out of logical
> > order.  Reindex fixes that, rewriting the index data with pages in order
> > (confirmed with pageinspect), which causes index scans to fetch heap data more
> > or less monotonically (if not consecutively).  strace shows that consecutive
> > read()s are common (without intervening seeks).  I gather this allows the OS
> > readahead to kick in.
>
> The basic problem is that the B-Tree code doesn't maintain this
> property. However, B-Tree index builds will create an index that
> initially has this property, because the tuplesort.c code happens to
> sort index tuples with a CTID tie-breaker.
>
> > Postgres seems to assume that the high degree of correlation of the table
> > column seen in pg_stats is how it will get data from the index scan, which
> > assumption seems to be very poor on what turns out to be a higly fragmented
> > index.  Is there a way to help it to understand otherwise??
>
> Your complaint is vague. Are you complaining about the planner making
> a poor choice? I don't think that's the issue here, because you never
> made any firm statement about the planner making a choice that was
> worth than an alternative that it had available.
>
> If you're arguing for the idea that B-Trees should reliably keep
> tuples in order by a tie-break condition, that seems difficult to
> implement, and likely not worth it in practice.

The ongoing discussion around how to effectively handle duplicate values
in B-Tree seems relevant to this.  In particular, if we're able to store
duplicate values efficiently and all the locations under a single key
are easily available then we could certainly sort those prior to going
and visiting them.

That's not quite the same as keeping the tuples in order in the heap,
but would more-or-less achieve the effect desired, I believe?

Thanks!

Stephen

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: index fragmentation on insert-only table with non-unique column
Следующее
От: Tom Lane
Дата:
Сообщение: Re: index fragmentation on insert-only table with non-unique column