Re: multivariate statistics v14

From Tomas Vondra
Subject Re: multivariate statistics v14
Date
Msg-id 89341a68-4729-ad28-bb39-cef31849aedb@2ndquadrant.com
In reply to Re: multivariate statistics v14  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: multivariate statistics v14  (Tatsuo Ishii <ishii@postgresql.org>)
List pgsql-hackers
Hello,

On 03/22/2016 09:13 AM, Tatsuo Ishii wrote:
>>> Do you have any other missing parts in this work? I am asking
>>> because I wonder if you want to push this into 9.6 or rather 9.7.
>>
>> I think the first few parts of the patch series, namely:
>>
>>   * shared infrastructure (0002)
>>   * functional dependencies (0003)
>>   * MCV lists (0004)
>>   * histograms (0005)
>>
>> might make it into 9.6. I believe the code for building and storing
>> the different kinds of stats is reasonably solid. What probably needs
>> more thorough review are the changes in clauselist_selectivity(), but
>> the code in these parts is reasonably simple as it only supports using
>> a single multi-variate statistics per relation.
>>
>> The part (0006) that allows using multiple statistics (i.e. selects
>> which of the available stats to use and in what order) is probably the
>> most complex part of the whole patch, and I myself do have some
>> questions about some aspects of it. I don't think this part might get
>> into 9.6 at this point (although it'd be nice if we managed to do
>> that).
>
> Hum. So without 0006 or beyond, there's not much benefit for the
> PostgreSQL users, and you are not too confident about 0006 or
> beyond. Then I would think it is a little bit hard to justify in
> putting 000[2-5] into 9.6. I really like this feature and would like
> to see in PostgreSQL someday, but I'm not sure if we should put the
> patches (0002-0005) into PostgreSQL now. Please let me know if there's
> some reasons we should put the patches into PostgreSQL now.

I don't think so. While being able to combine multiple statistics is
certainly useful, I'm convinced that the initial patches add enough
value on their own, even if the 0006 patch gets committed later.

A lot of queries will be just fine with the "single multivariate
statistics" limitation, either because they use fewer than 8 columns,
or because no more than 8 of the columns are actually correlated. (FWIW
the 8 column limit is mostly arbitrary, it may get increased if needed.)

I haven't really mentioned the aspects of 0006 that I think need more 
discussion, but it's mostly about the question whether combining the 
statistics by using the overlapping clauses as "conditions" is the right 
thing to do (or whether a more expensive approach is needed). None of 
that however invalidates the preceding patches.
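
To make the "overlapping clauses as conditions" idea a bit more concrete, here is a minimal sketch. It is not code from the patch. It assumes two statistics, one on (a,b) and one on (b,c), and assumes the combination is done as P(a,b,c) ~ P(a,b) * P(c|b), with the shared column b acting as the condition. The toy rows and all function names are made up for illustration.

```python
from fractions import Fraction

# Hypothetical toy table with columns (a, b, c); illustration only.
ROWS = [
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1),
]
COLUMN_INDEX = {"a": 0, "b": 1, "c": 2}


def selectivity(rows, **conds):
    """Exact fraction of rows matching every column=value condition."""
    hits = sum(
        all(row[COLUMN_INDEX[col]] == val for col, val in conds.items())
        for row in rows
    )
    return Fraction(hits, len(rows))


def independent_estimate(rows):
    """Per-column estimate: multiply the three single-column selectivities."""
    return (selectivity(rows, a=1)
            * selectivity(rows, b=1)
            * selectivity(rows, c=1))


def conditioned_estimate(rows):
    """Assumed combination rule: P(a,b) * P(c|b).

    P(a,b) stands in for a lookup in the (a,b) statistics, and
    P(c|b) = P(b,c) / P(b) for a lookup in the (b,c) statistics.
    """
    p_ab = selectivity(rows, a=1, b=1)
    p_bc = selectivity(rows, b=1, c=1)
    p_b = selectivity(rows, b=1)
    return p_ab * p_bc / p_b


if __name__ == "__main__":
    print("actual      :", selectivity(ROWS, a=1, b=1, c=1))
    print("independent :", independent_estimate(ROWS))
    print("conditioned :", conditioned_estimate(ROWS))
```

On this toy data the conditioned estimate is much closer to the actual selectivity than the independence-based one, but it is still not exact, which is where the question of a more expensive approach comes from.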

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


