Re: [HACKERS] CTE inlining

Поиск
Список
Период
Сортировка
От Serge Rielau
Тема Re: [HACKERS] CTE inlining
Дата
Msg-id 651A1657-3645-40B2-8202-3BE0B2A35FD1@rielau.com
обсуждение исходный текст
Ответ на Re: [HACKERS] CTE inlining  (Gavin Flower <GavinFlower@archidevsys.co.nz>)
Ответы Re: [HACKERS] CTE inlining  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
> On May 4, 2017, at 3:02 AM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
>
> On 30/04/17 16:28, Tom Lane wrote:
>> Craig Ringer <craig.ringer@2ndquadrant.com> writes:
>>> - as you noted, it is hard to decide when it's worth inlining vs
>>> materializing for CTE terms referenced more than once.
>> [ raised eyebrow... ]  Please explain why the answer isn't trivially
>> "never".
>>
>> There's already a pretty large hill to climb here in the way of
>> breaking peoples' expectations about CTEs being optimization
>> fences.  Breaking the documented semantics about CTEs being
>> single-evaluation seems to me to be an absolute non-starter.
>>
>>             regards, tom lane
>>
>>
> Could not each CTE be only evaluated once, but restricted (as far as is practicable) to the rows actually needed by
thebody of the SELECT? 
>
Tom,

Are you worried about semantics or performance?
With proper detection of mutating functions and snapshot isolation I do not see how a user would detect “lack of”
singleevaluation. 
As for performance we’d be talking about what? An uncorrelated inner of a nested loop join?

Anyway it seems to me that there a multiple properties at play here which are quite orthogonal.

1. Full materialization/Slow materialization/pipelining I cannot come up with any scenario where full materialization
wouldbe beneficial from a performance point of view (which speaks to Gavin’s view). I can see it from a semantic point
ofview when order of execution may matter (for example with embedded DML and triggers present). As soon as semantics
areat play having syntax is absolutely the right thing: +1 for MATERIALIZE 
2.Pushing predicates (or other operators) into the CTE.  All this can ever do is reduce the number of rows being looked
at. As long as the optimizer is careful, not to do what it isn’t supposed to do in a nested query (push past a mutating
function)I don’t see the harm 
3. Replicating the CTE to push distinct operators from different consumers.  Again this can only be done if there are
nomutators or non deterministic operators. 

To summarize: big +1 to preserve the existing behavior with MATERIALIZE and to set CTE’s free by default with the onus
onthe optimizer to prove semantic equivalence. 

Cheers
Serge Rielau
Salesforce.com


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Chapman Flack
Дата:
Сообщение: [HACKERS] idea: custom log_line_prefix components besides application_name
Следующее
От: David Rowley
Дата:
Сообщение: Re: [HACKERS] Should pg_current_wal_location() become pg_current_wal_lsn()