Обсуждение: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE
Hi, When analyzing the plan of a query I often find myself questioning whether an additional index may be sensible, or if it is sensible that a SeqScan is used if an index is available. The current EXPLAIN ANALYZE only shows the number of tuples matching the qualifier of an SeqScan Node - for analyzing the above situation it is at least equally interesting how many tuples were read and discarded. Therefore I produced a patch which adds a 'discarded=%f' part to the analyze output. As this is only a RFD the implementation is a bit hackish at the moment - the discarded counter is increased in execScan directly instead of a helper routine in instrument.c. Also the discarded count is displayed in other node types as well - for some there might be a sensible semantic meaning to it... Good idea - Bad idea? Greetings, Andres
Re: RFD: Discarded tuple count for SeqScan nodes in EXPLAIN ANALYZE - Patch v1
От
Andres Freund
Дата:
Andres Freund wrote: > > When analyzing the plan of a query I often find myself questioning > whether an additional index may be sensible, or if it is sensible that a > SeqScan is used if an index is available. > > The current EXPLAIN ANALYZE only shows the number of tuples matching the > qualifier of an SeqScan Node - for analyzing the above situation it is > at least equally interesting how many tuples were read and discarded. > > Therefore I produced a patch which adds a 'discarded=%f' part to the > analyze output. > As this is only a RFD the implementation is a bit hackish at the moment > - the discarded counter is increased in execScan directly instead of a > helper routine in instrument.c. > Also the discarded count is displayed in other node types as well - for > some there might be a sensible semantic meaning to it... > > Good idea - Bad idea? Isn't the discarded count always equal to (# of rows in table - matched tuples)? Seems pretty redundant to me. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Hi, On 05/22/2009 03:42 PM, Heikki Linnakangas wrote: > Andres Freund wrote: >> When analyzing the plan of a query I often find myself questioning >> whether an additional index may be sensible, or if it is sensible that >> a SeqScan is used if an index is available. >> >> The current EXPLAIN ANALYZE only shows the number of tuples matching >> the qualifier of an SeqScan Node - for analyzing the above situation >> it is at least equally interesting how many tuples were read and >> discarded. >> Good idea - Bad idea? > Isn't the discarded count always equal to (# of rows in table - matched > tuples)? Seems pretty redundant to me. Not for EXISTS(), LIMIT and similar. Also when looking at more complex plans its quite a nuisance to go through all participating tables and do a separate count(*). Especially its not your plan but some clients plan etc. Andres
Andres Freund <andres@anarazel.de> writes: > On 05/22/2009 03:42 PM, Heikki Linnakangas wrote: >> Isn't the discarded count always equal to (# of rows in table - matched >> tuples)? Seems pretty redundant to me. > Not for EXISTS(), LIMIT and similar. It doesn't really seem useful enough to justify breaking client-side code that looks at EXPLAIN output. This sort of ties into the discussions we have periodically about allowing EXPLAIN to output XML or some other more-machine-friendly data format. The barrier for adding additional output fields would be a lot lower in such a format. regards, tom lane
On Fri, May 22, 2009 at 4:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > It doesn't really seem useful enough to justify breaking client-side > code that looks at EXPLAIN output. Fwiw at least pgadmin I don't think would be confused by this. These tool authors aren't enamoured of fragile assumptions and the maintenance headaches they cause either. > This sort of ties into the discussions we have periodically about > allowing EXPLAIN to output XML or some other more-machine-friendly > data format. The barrier for adding additional output fields would > be a lot lower in such a format. This is still pretty much true if only for the sheer unscalability of the amount of data being presented for users to sift through. I do want us to add a ton more instrumentation into the explain plan and this is only one small addition. If we add number of hard and soft i/os, time spent in user and system space, etc the result would be pretty unreadable and they're at least as important as things like this. -- greg
Hi, On 05/22/2009 05:54 PM, Tom Lane wrote: > This sort of ties into the discussions we have periodically about > allowing EXPLAIN to output XML or some other more-machine-friendly > data format. The barrier for adding additional output fields would > be a lot lower in such a format. So the best thing would be to work on that front... Tom (Raney), did you further work on your XML explain patch? Could you use help? Greetings, Andres