Re: More thoughts about planner's cost estimates

Поиск
Список
Период
Сортировка
От Greg Stark
Тема Re: More thoughts about planner's cost estimates
Дата
Msg-id 87k67zdymk.fsf@stark.xeocode.com
обсуждение исходный текст
Ответ на Re: More thoughts about planner's cost estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: More thoughts about planner's cost estimates  (Josh Berkus <josh@agliodbs.com>)
Список pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > And a 5% sample is a pretty big. In fact my tests earlier showed the i/o from
> > 5% block sampling took just as long as reading all the blocks. Even if we
> > figure out what's causing that (IMHO surprising) result and improve matters I
> > would only expect it to be 3-4x faster than a full scan.
> 
> One way to reduce the I/O pain from extensive sampling would be to turn
> VACUUM ANALYZE into a genuine combined operation instead of a mere
> notational shorthand for two separate scans.

Except that we're also looking for every way we can to avoid having vacuum
have to touch every page so large tables with narrow hot spots can still
afford to be vacuumed regularly during peak hours.

But for most users analyze doesn't really have to run as often as vacuum. One
sequential scan per night doesn't seem like that big a deal to me.

> I'd still be worried about the CPU pain though.  ANALYZE can afford to
> expend a pretty fair number of cycles per sampled tuple, but with a
> whole-table sample that's going to add up.

That is a concern. Though the algorithm is pretty efficient, it basically
amounts to hashing all the tuples keeping only a limited number of hash
buckets and just throwing away the rest of the data. 

Things could get more complicated if we get to more complex stats than simple
n_distinct estimates and performing dependence tests on the data for
multi-column statistics.

-- 
greg



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Fetter
Дата:
Сообщение: Re: COPY (query) TO file
Следующее
От: Greg Stark
Дата:
Сообщение: Re: COPY (query) TO file