Re: [HACKERS] CTE inlining

Поиск
Список
Период
Сортировка
От Ilya Shkuratov
Тема Re: [HACKERS] CTE inlining
Дата
Msg-id 8550391494354384@web15m.yandex.ru
обсуждение исходный текст
Ответ на Re: [HACKERS] CTE inlining  (Vik Fearing <vik.fearing@2ndquadrant.com>)
Ответы Re: [HACKERS] CTE inlining  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Ok, it seems that most people in discussion are agree that removing optimization
fence is a right thing to do. But so far the main topic was whether it worth to 
make "inlining" by default, and how we should enable it.

Nonetheless I still hoping to discuss the algorithm and its implementation. 

I suppose, in case of a single reference we can validate CTE subquery and inline it
just before SS_process_ctes() in subquery_planner() and then process remaining
CTEs as before. 

The case of multiple reference is more interesting.
Ideally, we would decide whether to inline just before pull_up_sublinks(), so all 
the optimizations can be applied to inlined subquery. But It is impossible as we 
have no information to build subquery paths and estimate they costs at this point. 
All necessary initialization is performed in query_planner(), that invoked far
later in grouping_planner(). (As far as I understand.)

The most straighforward way is to compare CTE scan cost with subquery execution
and result scan cost in set_rel_size(), just after set_cte_pathlist(), and alter 
RelOptInfo, if we choose to inline.
(e.g (CTE scan) < (cheapest_path(subquery) + subquery scan))
This way we still can push down predicates as it is performed in 
set_subquery_pathlist(), but we missed pull_up_subquery().
Besides, it seems like a dirty quick solution.

Maybe it possible to add subquery scan to RTE_CTE RelOptInfo, but I'm not sure.

So what is a right way to conduct comparison between CTE scan and subquery 
execution with subsequent scan?

I am new to PostgreSQL development, so I need a guidance from someone who 
familiar with optimizer infrastructure to ensure that I moving in a right 
direction and not making something weird.


P.S. There is a paper [1] describing implementation of CTE optimization in Orca 
optimizer. It may be useful, though architecture is completely different.

[1] Optimization of Common Table Expressions in MPP Database Systems 
(http://www.vldb.org/pvldb/vol8/p1704-elhelw.pdf)


Ilya Shkuratov



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Erez Segal
Дата:
Сообщение: [HACKERS] COMPRESS VALUES feature request
Следующее
От: Marko Tiikkaja
Дата:
Сообщение: Re: [HACKERS] COMPRESS VALUES feature request