Re: GiST index performance

From	Yeb Havinga
Subject	Re: GiST index performance
Date	17 March 2010, 09:26:29
Msg-id	4BA0A036.1070508@gmail.com
In reply to	Re: GiST index performance (Yeb Havinga <yebhavinga@gmail.com>)
Responses	Re: GiST index performance (Yeb Havinga <yebhavinga@gmail.com>)
List	pgsql-performance

Yeb Havinga wrote:
> Matthew Wakeling wrote:
>>> Matthew Wakeling wrote:
>>>> A second quite distinct issue is the general performance of GiST
>>>> indexes which is also mentioned in the old thread linked from Open
>>>> Items. For that, we have a test case at
>>>> http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
>>>> for btree_gist indexes. I have a similar example with the bioseg
>>>> GiST index. I have completely reimplemented the same algorithms in
>>>> Java for algorithm investigation and instrumentation purposes, and
>>>> it runs about a hundred times faster than in Postgres. I think this
>>>> is a problem, and I'm willing to do some investigation to try and
>>>> solve it.
>> I have not made any progress on this issue. I think Oleg and Teodor
>> would be better placed working it out. All I can say is that I
>> implemented the exact same indexing algorithm in Java, and it
>> performed 100 times faster than Postgres. Now, Postgres has to do a
>> lot of additional work, like mapping the index onto disc, locking
>> pages, and abstracting to plugin user functions, so I would expect
>> some difference - I'm not sure 100 times is reasonable though. I
>> tried to do some profiling, but couldn't see any one section of code
>> that was taking too much time. Not sure what I can further do.
> Hello Matthew and list,
>
> A lot of time spent in gistget.c code and a lot of functioncall5's to
> the gist's consistent function which is out of sight for gprof.
> Something different but related since also gist: we noticed before
> that gist indexes that use a compressed form for index entries suffer
> from repeated compress calls on query operands (see
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).
>
> The btree_gist int4 compress function calls the generic
> gbt_num_compress, which does a palloc. Maybe this palloc is also hit
> a lot when scanning the index, because the constants that are queried
> with are repeatedly compressed and palloced.
Looked in the code a bit more - only the index nodes are compressed at
index creation; the consistent functions do not compress queries, so
no pallocs there. However, when running Matthew's example from
http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
with the gist index, the coverage shows in gistget.c: 1000000 palloc0's
of gistsearchstack at line 152 and 2010982 pallocs, also of the
gistsearchstack, on line 342. Two pfrees are also hit a lot: line 195:
1010926 of a stackentry and line 293: 200056 times. My $0.02 is that
the pain is here. My knowledge of gistget or the other sources in
access/gist is zero, but wouldn't it be possible to determine the
maximum needed size of the stack, allocate it at once, and use a
pop/push kind of API?
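
To make the pop/push idea concrete, here is a minimal sketch of an
array-backed stack that is allocated once and reused. It is not taken
from gistget.c: the names StackEntry, SearchStack and the stack_*
functions are hypothetical, the entry fields are only a guess at what a
search stack entry might carry, and plain malloc/realloc stand in for
palloc. Since it is not clear that a maximum size can be known up
front, the sketch doubles the array on overflow instead of assuming a
hard bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a search stack entry; not the real struct. */
typedef struct StackEntry {
    uint32_t block;   /* page number still to be visited */
    uint64_t lsn;     /* LSN recorded when the entry was queued */
} StackEntry;

/* Array-backed stack: one allocation, reused for the whole scan. */
typedef struct SearchStack {
    StackEntry *items;
    size_t      depth;     /* number of entries currently on the stack */
    size_t      capacity;  /* number of entries the array can hold */
} SearchStack;

/* Allocate the array once. Returns 0 on success, -1 on failure. */
static int stack_init(SearchStack *s, size_t capacity)
{
    if (capacity == 0)
        capacity = 1;
    s->items = malloc(capacity * sizeof(StackEntry));
    if (s->items == NULL)
        return -1;
    s->depth = 0;
    s->capacity = capacity;
    return 0;
}

/* Push by value; grows by doubling only if the initial guess was too small. */
static int stack_push(SearchStack *s, StackEntry e)
{
    assert(s->items != NULL);
    if (s->depth == s->capacity) {
        size_t      ncap = s->capacity * 2;
        StackEntry *n = realloc(s->items, ncap * sizeof(StackEntry));
        if (n == NULL)
            return -1;
        s->items = n;
        s->capacity = ncap;
    }
    s->items[s->depth++] = e;
    return 0;
}

/* Pop into *out. Returns 1 if an entry was popped, 0 if the stack is empty. */
static int stack_pop(SearchStack *s, StackEntry *out)
{
    assert(s->items != NULL);
    if (s->depth == 0)
        return 0;
    *out = s->items[--s->depth];
    return 1;
}

/* Release the array; the stack must be re-initialised before reuse. */
static void stack_free(SearchStack *s)
{
    free(s->items);
    s->items = NULL;
    s->depth = 0;
    s->capacity = 0;
}
```

With this shape, a push or pop is an index increment or decrement, and
the allocator is only touched when the array has to grow. Whether that
would actually help in gistget.c would need to be measured.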

regards,
Yeb Havinga





