Re: [HACKERS] Discussion on missing optimizations
От | Andres Freund |
---|---|
Тема | Re: [HACKERS] Discussion on missing optimizations |
Дата | |
Msg-id | 20171007023855.dl3uhfskmhi2wtad@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: [HACKERS] Discussion on missing optimizations (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
On 2017-10-06 22:19:54 -0400, Tom Lane wrote: > The impression I have in a quick scan is that probably hardly any of these > are cases that any of the DB designers think are important in > themselves. > In the end, what the article fails to consider is that all of these are > tradeoffs, not unalloyed goods. If you spend planner cycles on every > query to look for cases that only the most unabashedly brain-dead ORMs > ever generate, you're not really doing your users a favor on balance. Partially agreed. A comment to the article also mentions that some other database performs more optimizations depending on the cost of the plan. That's not easy to do in our current plan structure, but I think it's quite a worthwhile concept. > Rather, they fall out of more general optimization attempts, or not, > depending on the optimization mechanisms in use in a particular DB. > For example, reducing "WHERE 1=1" to "WHERE TRUE" and then to nothing > comes out of a constant-subexpression-precalculation mechanism for us, > whereas "WHERE column=column" doesn't fall to that approach. ISTM it > would be really dumb to expend planner cycles looking specifically for > that case, so I guess that DB2 et al are finding it as a side-effect of > some more general optimization ... I wonder what that is? > > (edit: a few minutes later, I seem to remember that equivclass.c has > to do something special with the X=X case, so maybe it could do > something else special instead, with little new overhead.) Yea, I think this should be inferrable from information we essentially already compute for equivclasses. > > (i.e. combining a IN (a,b,c) and a IN (c,d,f) into a IN (c), similar > > with OR) > > > I can't remember proposals about adding this, but it seems worthwhile to > > consider. I think we should be able to check for this without a lot of > > planner overhead. > > It would be enormously expensive to check that in the general case with > a bunch of entries in each IN list. Perhaps it would be OK to add on > the presumption that few queries would contain multiple IN's on the same > column in the first place, though. Not sure. Merging of ORs should be near free if you leave duplicates in there, and the duplicates aren't a huge concern if your alternative is is to have them, but also have two clauses to evaluate. I think the IN AND IN case should usually end up a clear win as well. Both lists are going to be evaluated for each row anyway - you don't need a lot of rows where clauses are evaluated to make it worth the O(n*log(n)) of sorting the lists. Sorting them would be beneficial for other reasons as well, e.g. it improves access patterns for SAO index scans. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
В списке pgsql-hackers по дате отправления: