Re: Performance improvement for joins where outer side is unique

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Performance improvement for joins where outer side is unique
Дата
Msg-id 20150316.172855.213501931.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Performance improvement for joins where outer side is unique  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Performance improvement for joins where outer side is unique  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-hackers
Hello, I don't have enough time for now but made some
considerations on this.

It might be a way that marking unique join peer at bottom and
propagate up, not searching from top of join list.
Around create_join_clause might be a candidate for it.
I'll investigate that later.

regards,


At Sat, 14 Mar 2015 23:05:24 +1300, David Rowley <dgrowleyml@gmail.com> wrote in
<CAApHDvoh-EKF51QQyNoJUe0eHYMZw6OzJjjgYP63Cmw7QfebjA@mail.gmail.com>
dgrowleyml> On 14 March 2015 at 14:51, David Rowley <dgrowleyml@gmail.com> wrote:
dgrowleyml> 
dgrowleyml> > On 13 March 2015 at 20:34, Kyotaro HORIGUCHI <
dgrowleyml> > horiguchi.kyotaro@lab.ntt.co.jp> wrote:
dgrowleyml> >
dgrowleyml> >> Unfortunately I can't decide this to be 'ready for commiter' for
dgrowleyml> >>
dgrowleyml> > now. I think we should have this on smaller footprint, in a
dgrowleyml> >> method without separate exhauxtive searching. I also had very
dgrowleyml> >> similar problem in the past but I haven't find such a way for my
dgrowleyml> >> problem..
dgrowleyml> >>
dgrowleyml> >>
dgrowleyml> > I don't think it's ready yet either. I've just been going over a few
dgrowleyml> > things and looking at Tom's recent commit b557226 in pathnode.c I've got a
dgrowleyml> > feeling that this patch would need to re-factor some code that's been
dgrowleyml> > modified around the usage of relation_has_unique_index_for() as when this
dgrowleyml> > code is called, the semi joins have already been analysed to see if they're
dgrowleyml> > unique, so it could be just a case of ripping all of that out
dgrowleyml> > of create_unique_path() and just putting a check to say rel->is_unique_join
dgrowleyml> > in there. But if I do that then I'm not quite sure if
dgrowleyml> > SpecialJoinInfo->semi_rhs_exprs and SpecialJoinInfo->semi_operators would
dgrowleyml> > still be needed at all. These were only added in b557226. Changing this
dgrowleyml> > would help reduce the extra planning time when the query contains
dgrowleyml> > semi-joins. To be quite honest, this type of analysis belongs in
dgrowleyml> > analyzejoin.c anyway. I'm tempted to hack at this part some more, but I'd
dgrowleyml> > rather Tom had a quick glance at what I'm trying to do here first.
dgrowleyml> >
dgrowleyml> >
dgrowleyml> 
dgrowleyml> I decided to hack away any change the code Tom added in b557226. I've
dgrowleyml> changed it so that create_unique_path() now simply just uses if
dgrowleyml> (rel->is_unique_join), instead off all the calls to
dgrowleyml> relation_has_unique_index_for() and query_is_distinct_for().  This vastly
dgrowleyml> simplifies that code. One small change is that Tom's checks for uniqueness
dgrowleyml> on semi joins included checks for volatile functions, this check didn't
dgrowleyml> exist in the original join removal code, so I've left it out. We'll never
dgrowleyml> match a expression with a volatile function to a unique index as indexes
dgrowleyml> don't allow volatile function expressions anyway. So as I understand it
dgrowleyml> this only serves as a fast path out if the join condition has a volatile
dgrowleyml> function... But I'd assume that check is not all that cheap.
dgrowleyml> 
dgrowleyml> I ended up making query_supports_distinctness() and query_is_distinct_for()
dgrowleyml> static in analyzejoins.c as they're not used in any other files.
dgrowleyml> relation_has_unique_index_for() is also now only used in analyzejoins.c,
dgrowleyml> but I've not moved it into that file yet as I don't want to bloat the
dgrowleyml> patch. I just added a comment to say it needs moved.
dgrowleyml> 
dgrowleyml> I've also added a small amount of code to query_is_distinct_for() which
dgrowleyml> allows subqueries such as (select 1 a offset 0) to be marked as unique. I
dgrowleyml> thought it was a little silly that these were not being detected as unique,
dgrowleyml> so I fixed it. This has the side effect of allowing left join removals for
dgrowleyml> queries such as: select t1.* from t1 left join (select 1 a offset 0) a on
dgrowleyml> t1.id=a.a;
dgrowleyml> 
dgrowleyml> Updated patch attached.
dgrowleyml> 
dgrowleyml> Regards
dgrowleyml> 
dgrowleyml> David Rowley




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Re: Install shared libs in lib/ and bin/ with MSVC (Was: install libpq.dll in bin directory on Windows / Cygwin)
Следующее
От: Dmitry Voronin
Дата:
Сообщение: Question about TEMP tables