Обсуждение: Re: [COMMITTERS] pgsql: Collect and use multi-columndependency stats
At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com> > On 6 April 2017 at 18:03, Kyotaro HORIGUCHI > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com> > >> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote: > I'm not all that sure why the number of columns in the relation has > relevance to the performance of find_relation_from_clauses(). The > bms_get_singleton_member() is checking which relations are part of the > RestrictInfo, nothing related to columns in relations. > Perhaps you meant clauses in the clauses list? Which does not really > have all that much to do with the number of columns in the relation > either. Sorry, it's number of relations, not columns. I'm not sure up to how many relations we practically should consider but anyway it is extra burden to every call to clauselist_selectivity. We should avoid calling find_relation_from_clauses as far as possible or do the same in more simple way. However I'm not sure more precise exclusion is possible or not, I thinks that the case of jointype != JOIN_INNER can be exluded. > > At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA@mail.gmail.com> > >> > And you measured the overhead of doing it the other way to be ... ? > >> > Premature optimization and all that. > >> > >> I tested with the attached, and it does not seem to hurt planner > >> performance executing: > > > > Here, bms_singleton_member takes longer time if the relation has > > many columns and there's a functional dependency covering the > > columns at the very tail. Maybe only two are not practical for > > testing. > > Can you explain why you think this? And confirm you're speaking about > the bms_get_singleton() member in find_relation_from_clauses() I mentioned dependency_is_compatible_clause here, but I saw that it has been simplified enough in the committed shape. > > Even if it doesn't impact performance detectably, if only one > > attribute is needed, an AttrNumber member in context will be > > sufficient. No bitmap operation seems required in > > dependency_compatible_walker and it can bail out by the second > > attribute. > > Are you looking at an old patch? That function no longer exists. Yes! Sorry for the noise. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On 6 April 2017 at 19:50, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com> >> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote: >> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com> >> >> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote: >> I'm not all that sure why the number of columns in the relation has >> relevance to the performance of find_relation_from_clauses(). The >> bms_get_singleton_member() is checking which relations are part of the >> RestrictInfo, nothing related to columns in relations. >> Perhaps you meant clauses in the clauses list? Which does not really >> have all that much to do with the number of columns in the relation >> either. > > Sorry, it's number of relations, not columns. I'm not sure up to > how many relations we practically should consider but anyway it > is extra burden to every call to clauselist_selectivity. We > should avoid calling find_relation_from_clauses as far as > possible or do the same in more simple way. However I'm not sure > more precise exclusion is possible or not, I thinks that the case > of jointype != JOIN_INNER can be exluded. Well, I imagine queries with >= 32 relations are not planning very quickly as of today already. I understand what you mean when you speak of attributes, as we could constantly be looking for the 1400's attribute which is many loops into a bms_get_singleton_member() call. I can't imagine we'll even flow over the first word in a bitmap set in 99% of cases with clause_relids. In any case, even if there's a giant chain of clauses in the the 'clauses' list, we'll bail out on the first join qual anyway, since it won't be a singleton clause_relid. I'd say if you can come up with a test case where you can measure the impact of this, then let's discuss more. Otherwise we're stepping back into the territory that Tom warned me about a few emails up.... Premature Optimisation. I'm not walking down there again, I only just got back. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services