Re: WIP: Fast GiST index build

From	Heikki Linnakangas
Subject	Re: WIP: Fast GiST index build
Date	June 6, 2011, 10:52:00
Msg-id	4DECB141.2040707@enterprisedb.com
In reply to	Re: WIP: Fast GiST index build (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses	Re: WIP: Fast GiST index build (Alexander Korotkov <aekorotkov@gmail.com>) Re: WIP: Fast GiST index build (Alexander Korotkov <aekorotkov@gmail.com>)
List	pgsql-hackers

On 06.06.2011 10:42, Heikki Linnakangas wrote:
> On 03.06.2011 14:02, Alexander Korotkov wrote:
>> Hackers,
>>
>> WIP patch of fast GiST index build is attached. Code is dirty and comments
>> are lacking, but it works. Now it is ready for first benchmarks, which
>> should prove efficiency of selected technique. It's time to compare fast
>> GiST index build with repeat insert build on large enough datasets (datasets
>> which don't fit to cache). There are following aims of testing:
>> 1) Measure acceleration of index build.
>> 2) Measure change in index quality.
>> I'm going to do first testing using synthetic datasets. Everybody who have
>> interesting real-life datasets for testing are welcome.
>
> I ran another test with a simple table generated with:
>
> CREATE TABLE pointtest (p point);
> INSERT INTO pointtest SELECT point(random(), random()) FROM
> generate_series(1,50000000);
>
> Generating a gist index with:
>
> CREATE INDEX i_pointtest ON pointtest USING gist (p);
>
> took about 15 hours without the patch, and 2 hours with it. That's quite
> dramatic.

Oops, that was a rounding error, sorry. The run took about 2.7 hours 
with the patch, which of course should be rounded to 3 hours, not 2. 
Anyway, it is still a very impressive improvement.

I'm glad you could get the patch ready for benchmarking this quickly. 
Now you just need to get the patch into shape so that it can be 
committed. That is always the more time-consuming part, so I'm glad you 
have plenty of time left for it.

Could you please create a TODO list on the wiki page, listing all the 
missing features, known bugs etc. that will need to be fixed? That'll 
make it easier to see how much work there is left. It'll also help 
anyone looking at the patch to know which issues are known issues.

Meanwhile, it would still be very valuable if others could test this 
with different workloads. And Alexander, it would be good if at some 
point you could write some benchmark scripts too, and put them on the 
wiki page, just to see what kind of workloads have been taken into 
consideration and tested already. Do you think there are any worst-case
data distributions where this algorithm would perform particularly badly?
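
As a starting point, a benchmark script of the kind asked for above might look roughly like the psql sketch below. It reuses the pointtest table and i_pointtest index from the quoted test. Everything else is an assumption made for illustration: the nrows variable, the pointtest_sorted and pointtest_clustered tables, and the choice of distributions, which are only candidates to try and are not known to be worst cases for this algorithm. The same script would be run once on an unpatched build and once on a patched one.

```sql
-- Sketch of a GiST build benchmark for psql. Names other than
-- pointtest / i_pointtest are hypothetical.
\timing on

-- Assumed knob: raise this until the data no longer fits in cache
-- (the test quoted above used 50000000 rows).
\set nrows 1000000

-- Case 1: uniformly distributed points, as in the quoted test.
DROP TABLE IF EXISTS pointtest;
CREATE TABLE pointtest (p point);
INSERT INTO pointtest
SELECT point(random(), random())
FROM generate_series(1, :nrows);
CREATE INDEX i_pointtest ON pointtest USING gist (p);

-- Case 2 (candidate stress case): x grows monotonically with
-- insertion order, so the input arrives pre-ordered along one axis.
DROP TABLE IF EXISTS pointtest_sorted;
CREATE TABLE pointtest_sorted (p point);
INSERT INTO pointtest_sorted
SELECT point(i::float8 / :nrows, random())
FROM generate_series(1, :nrows) AS i;
CREATE INDEX i_pointtest_sorted ON pointtest_sorted USING gist (p);

-- Case 3 (candidate stress case): all points packed into a tiny
-- square, so bounding boxes overlap heavily.
DROP TABLE IF EXISTS pointtest_clustered;
CREATE TABLE pointtest_clustered (p point);
INSERT INTO pointtest_clustered
SELECT point(0.5 + random() * 0.001, 0.5 + random() * 0.001)
FROM generate_series(1, :nrows);
CREATE INDEX i_pointtest_clustered ON pointtest_clustered USING gist (p);

-- Rough index-quality checks: on-disk size, and the buffers touched
-- by a small window query that can use the index.
SELECT pg_size_pretty(pg_relation_size('i_pointtest'));
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM pointtest
WHERE p <@ box '((0.4,0.4),(0.5,0.5))';
```

The CREATE INDEX timings reported by \timing would cover the first aim (build acceleration), and the size and buffer counts give a crude view of the second (index quality). Comparing them between the patched and unpatched builds is the point of the exercise.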

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
