Re: Frequent Update Project: Design Overview of HOT Updates

From: Gregory Stark
Subject: Re: Frequent Update Project: Design Overview of HOT Updates
Date:
Msg-id: 87ac2zpj1a.fsf@enterprisedb.com
In reply to: Re: Frequent Update Project: Design Overview of HOT Updates  ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
Responses: Re: Frequent Update Project: Design Overview of HOT Updates  (NikhilS <nikkhils@gmail.com>)
Re: Frequent Update Project: Design Overview of HOT Updates  ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
List: pgsql-hackers
"Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at> writes:

> 1. It doubles the IO (original page + hot page), if the new row would 
>     have fit into the original page.

That's an awfully big IF there. Even if you use a fillfactor of 50%, in which
case you're paying a 100% performance penalty *all* the time rather than just
when dealing with a table that's been bloated by multiple versions, you still
have no guarantee that the extra versions will fit on the same page.
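
To put rough numbers on that, here's a little back-of-the-envelope C sketch
(not PostgreSQL source; the 8kB page size and the tuple sizes are just
illustrative assumptions) showing that reserving half the page with
fillfactor 50 still can't guarantee a third version of a largish row fits:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE 8192                /* assumed 8kB block size */

/* Free space reserved on a freshly filled page for a given fillfactor. */
static int reserved_space(int fillfactor_pct)
{
    return PAGE_SIZE * (100 - fillfactor_pct) / 100;
}

/* Does another version of new_len bytes still fit in what's left of the reserve? */
static bool new_version_fits(int fillfactor_pct, int space_used, int new_len)
{
    return new_len <= reserved_space(fillfactor_pct) - space_used;
}

int main(void)
{
    /* fillfactor 50 reserves 4096 bytes on every page, all the time,    */
    /* but two earlier 1536-byte versions leave only 1024 bytes, so a    */
    /* third version of the same row has to go somewhere else anyway.    */
    printf("reserved: %d bytes\n", reserved_space(50));
    printf("third 1536-byte version fits? %s\n",
           new_version_fits(50, 2 * 1536, 1536) ? "yes" : "no");
    return 0;
}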

> 4. although at first it might seem so I see no advantage for vacuum with
> overflow 

The main problem with vacuum now is that it must scan the entire table (and
the entire index) even if only a few records are garbage. If we isolate the
garbage in a separate area then vacuum doesn't have to scan unrelated tuples.

I'm not sure this really solves that problem, because there are still DELETEs
to consider, but it does remove one factor that exacerbates it unnecessarily.
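
To make the cost asymmetry concrete, here's a back-of-the-envelope sketch
(the sizes are invented, not measurements) of what vacuum has to read today
versus what it would read if the garbage were confined to a small overflow
area:

#include <stdio.h>

int main(void)
{
    /* Invented figures: a 10GB heap at 8kB/page versus a small overflow */
    /* area holding only the recently dead row versions.                 */
    long long heap_pages     = 10LL * 1024 * 1024 * 1024 / 8192;
    long long overflow_pages = 50LL * 1024 * 1024 / 8192;

    /* Today vacuum reads every heap page even if only a handful of      */
    /* tuples are dead; a vacuum confined to the overflow area reads     */
    /* only pages that can actually contain garbage.                     */
    printf("pages read, whole-heap vacuum:    %lld\n", heap_pages);
    printf("pages read, overflow-only vacuum: %lld\n", overflow_pages);
    printf("roughly %.0fx less I/O\n", (double)heap_pages / overflow_pages);
    return 0;
}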

I think the vision is that the overflow table would never be very large
because it can be vacuumed very aggressively. It contains only tuples that are
busy and will need vacuuming as soon as a transaction ends, unlike the main
table, which is mostly tuples that don't need vacuuming.
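
One way to see why it would stay small: in steady state the overflow area is
roughly the update rate times how long a dead version lingers before vacuum
reclaims it. A quick sketch with made-up workload numbers (a Little's-law
style estimate, not a claim about any real workload):

#include <stdio.h>

int main(void)
{
    /* Made-up workload parameters. */
    double updates_per_sec   = 2000.0;
    double avg_version_bytes = 200.0;
    double reclaim_delay_sec = 5.0;    /* aggressive vacuum: versions are
                                          reclaimed seconds after the last
                                          interested transaction ends     */

    /* steady-state size ~= arrival rate * residence time */
    double overflow_bytes = updates_per_sec * avg_version_bytes * reclaim_delay_sec;
    printf("steady-state overflow size: ~%.1f MB\n", overflow_bytes / (1024 * 1024));
    return 0;
}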

> 5. the size reduction of heap is imho moot because you trade it for a
> growing overflow
>     (size reduction only comes from reusing dead tuples and not
> adding index tuples --> SITC)

I think you're comparing the wrong thing. Size isn't a problem in itself; size
is a problem because it causes extra I/O. So a heap that's double in size
necessarily takes twice as long as necessary to scan. The fact that the
overflow tables take up space isn't interesting if they don't have to be
scanned. Hitting the overflow tables should be quite rare; it only comes into
play when looking at concurrently updated tuples. That certainly happens, but
most tuples in the table will be committed and not being concurrently updated
by anyone else.
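
As a rough illustration (again with invented numbers, and assuming each
overflow lookup costs one extra page fetch), even charging a page read for
every concurrently updated row leaves a compact heap plus overflow well ahead
of a heap kept at twice the size:

#include <stdio.h>

int main(void)
{
    /* Invented workload: a sequential scan where only a tiny fraction   */
    /* of rows have a concurrently updated version living in overflow.   */
    double heap_pages         = 1e6;
    double bloated_heap_pages = 2.0 * heap_pages;   /* heap left at 2x size */
    double rows_scanned       = 1e8;
    double hot_fraction       = 0.001;              /* 0.1% concurrently updated */

    double cost_bloated  = bloated_heap_pages;
    double cost_overflow = heap_pages + rows_scanned * hot_fraction;

    printf("page reads, 2x-size heap:            %.0f\n", cost_bloated);
    printf("page reads, compact heap + overflow: %.0f\n", cost_overflow);
    return 0;
}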

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

