Re: Performance on inserts

From	Alfred Perlstein
Subject	Re: Performance on inserts
Date
Msg-id	20000825085219.V1209@fw.wintelcom.net
In reply to	Re: Performance on inserts  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Performance on inserts  (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers
* Tom Lane <tgl@sss.pgh.pa.us> [000825 08:25] wrote:
> 
> In other words, we do a linear scan over all the pages containing equal
> key values, in the hope of finding one where we can shoehorn in the new
> entry.  If the keys are roughly equal-sized, that hope is usually vain,
> and we end up splitting the rightmost page that contains any keys
> equal to the target key.  Subsequent inserts of the same key value fill
> first the left-half split page, then the right-half, then split the
> right-half page again and repeat --- but for each insert we first had
> to scan over all the preceding pages containing equal keys.  That means
> inserting N equal keys takes O(N^2) time, for large enough N.
> 
> After thinking about this for a while, I propose a probabilistic
> solution.  Suppose that we add to the above loop a random-number check
> that causes us to fall out of the loop with probability 1/3 or so;
> that is, if the current page is full of the target key, then we move
> right to check the next page with probability 2/3, but with probability
> 1/3 we give up and split the current page to insert the new key.
> 
> With this approach we'd hardly ever move more than, say, 10 pages before
> splitting, so the time to insert a new key is bounded no matter how many
> duplicates it has.
> 
> The reason for using a random decision is that if we always stopped
> after scanning a fixed number of pages, then the space freed up by the
> previous decision to split would always be just out of reach, and we'd
> end up with lots of half-full pages of the same key.  With a random
> decision we will visit and fill both the left and right halves of a
> previously split page.
> 
> Comments?  Is there a better way?  What's the best probability to use?


I'm unsure if it's possible, but somehow storing the last place one
'gave up' and decided to split the page could offer a useful starting
point for the next insert -- sort of amortizing the split work amongst
multiple requests.  For some reason it looks like your algorithm might
cause problems because it simply gives up after 10 pages?

hope this helps,
-Alfred
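The probabilistic move-right-or-split loop Tom describes can be sketched as a small simulation. This is a hypothetical model, not the actual btree code: pages are just entry counts, `PAGE_CAPACITY` and the 2/3 move-right probability are the illustrative parameters from the post, and each insert of a duplicate key starts its scan at the leftmost page holding that key.

```python
import random

PAGE_CAPACITY = 10   # hypothetical number of entries per page
MOVE_RIGHT_P = 2/3   # probability of moving right past a full page

def insert_equal_key(pages, rng):
    """Insert one duplicate key; return how many pages were scanned."""
    i = 0
    scanned = 1
    while pages[i] >= PAGE_CAPACITY:
        # Current page is full of the target key: move right with
        # probability 2/3, otherwise give up and split this page.
        if i + 1 < len(pages) and rng.random() < MOVE_RIGHT_P:
            i += 1
            scanned += 1
        else:
            # Split: half the entries stay, half go to a new right sibling;
            # the new key then fits in the left half.
            left = pages[i] // 2
            pages.insert(i + 1, pages[i] - left)
            pages[i] = left
            break
    pages[i] += 1
    return scanned

rng = random.Random(42)
pages = [0]
scans = [insert_equal_key(pages, rng) for _ in range(10000)]
print(sum(pages))                  # all 10000 inserts landed somewhere
print(sum(scans) / len(scans))     # expected scan length stays small
```

Because the stop decision is geometric with probability 1/3 per full page, the expected number of pages visited per insert is a small constant (about 3) regardless of how many duplicates already exist, which is the O(N^2)-to-O(N) improvement claimed above.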

