Thread: Re: [SQL] bad select performance fixed by forbidding hash joins

Re: [SQL] bad select performance fixed by forbidding hash joins

From: George Young
Date:
[PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66]
table opset_steps
      (name text, id int2, ver int2) [1400 rows]
      non-unique index is on (id, ver)

table run_opsets
      (status int2, id int2, ver int2, run_id int2, seq int2) [17000 rows]
      pkey is (id, seq), second index on (status, id, ver, run_id)
      select count(*) from run_opsets where status=1; --> 187
      select count(*) from run_opsets where status=3; --> 10564

table runs
      (run_name text, run_id int2, status int2) [900 rows]
      pkey is run_name, second index (run_id, status)
      select count(*) from runs where status=1; --> 68

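A minimal sketch of the schema as described above, for anyone who wants to
reproduce the shape of the problem (the listing abbreviates the column names
to id/ver while the queries below use opset_id/opset_ver, so the sketch uses
the latter; index names are guesses except where they appear in the plans):

      create table opset_steps (name text, opset_id int2, opset_ver int2);
      create index opset_steps_idx_ver_id on opset_steps (opset_id, opset_ver);

      create table run_opsets (status int2, opset_id int2, opset_ver int2,
                               run_id int2, seq int2,
                               primary key (opset_id, seq));
      create index run_opsets_idx on run_opsets (status, opset_id, opset_ver, run_id);

      create table runs (run_name text primary key, run_id int2, status int2);
      create index runs_idx on runs (run_id, status);
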
I have vacuum analyzed all relevant tables.

explain select os.name, r.run_name, ro.status from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1) and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Hash Join  (cost=1793.58 rows=14560 width=38)
  ->  Hash Join  (cost=1266.98 rows=14086 width=24)
        ->  Seq Scan on run_opsets ro  (cost=685.51 rows=13903 width=8)
        ->  Hash  (cost=70.84 rows=1389 width=16)
              ->  Seq Scan on opset_steps os  (cost=70.84 rows=1389 width=16)
  ->  Hash  (cost=47.43 rows=374 width=14)
        ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)

This query takes 16 seconds. [returns 3126 rows]

On Tue, Jul 20, 1999 at 05:42:20PM -0400, Tom Lane wrote:
> On Tue, 20 Jul 1999 14:56:46 -0400 George Young wrote:
> > ... Is this then
> > the best that postgres can do?  Is there some rephrasing/restructuring of
> > this query that would make it faster?
>
> Hard to say.  The query looks reasonable as it stands ---
> ...  You have no restriction
> clause on opset_steps so all of those entries get loaded for hashing;
> can you provide one?
No.
> The system's plan looks pretty reasonable as well.  It might be that
> a merge join would be faster than a hash join but I wouldn't assume
> so.  If you want, you can try forcing the system not to use hashes;
> start psql with environment variable
>     PGOPTIONS="-fh"
> and see what sort of plan and execution time you get.  If that does
> turn out to be a big win it would indicate that the planner is using
> poor cost estimates, which is certainly possible...

Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
Is this a safe thing to leave on permanently, or is there some way to set
PGOPTIONS for just this query?
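
One way to keep the setting narrow, assuming a Unix shell and a placeholder
database name, is to set the variable for a single psql invocation only; a
minimal sketch:

    PGOPTIONS="-fh" psql mydb                # hash joins forbidden for this session only
    PGOPTIONS="-fh" psql -f query.sql mydb   # or for a single script

Sessions started without the variable keep the normal planner behavior.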

explain select os.name, r.run_name, ro.status from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1) and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Merge Join  (cost=9295.54 rows=14560 width=38)
  ->  Seq Scan  (cost=8676.01 rows=14371 width=22)
        ->  Sort  (cost=8676.01 rows=14371 width=22)
              ->  Merge Join  (cost=1657.30 rows=14371 width=22)
                    ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=1031.25 rows=13903 width=8)
                    ->  Seq Scan  (cost=154.91 rows=374 width=14)
                          ->  Sort  (cost=154.91 rows=374 width=14)
                                ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)
  ->  Index Scan using opset_steps_idx_ver_id on opset_steps os  (cost=99.45 rows=1389 width=16)


With PGOPTIONS=-fh, this query takes ~ 2 seconds! [returns 3126 rows]

--
George Young,  Rm. L-204        gry@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St.
Lexington, Massachusetts  02420-9108    (781) 981-2756

Re: [SQL] bad select performance fixed by forbidding hash joins

From: Tom Lane
Date:
George Young <gry@ll.mit.edu> writes:
> Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> Is this a safe thing to leave on permanently, or is there some way to set
> PGOPTIONS for just this query?

I wouldn't recommend leaving it on as a long-term solution, because
you're hobbling the system for cases where hashjoin *is* the best
method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
plan types on-the-fly, though perhaps one should be added.
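
(Such a per-session switch did appear in later PostgreSQL releases; on those
versions, not on 6.5, the equivalent per-query experiment looks roughly like
the following sketch:

    set enable_hashjoin = off;   -- later releases only: disable hash joins for this session
    explain select ... ;         -- rerun the query or its EXPLAIN here
    reset enable_hashjoin;       -- restore the default

which avoids hobbling every session the way a permanent PGOPTIONS setting would.)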

The right long-term solution is to figure out why the system is
misestimating the relative costs of the two plans, and fix the cost
estimates.  (The system is estimating that the mergejoin is about 4x
slower than hash; if it's really 8x faster, there is something pretty
broken about the estimate...)

I am interested in looking into this.  If your data is not proprietary,
perhaps you would be willing to send me a database dump so that I can
reproduce the problem exactly?  (If the dump is no more than a few
megabytes, emailing it should be OK.)  No big hurry, since I probably
won't be able to get to it for a week or so anyway.

            regards, tom lane

Re: [GENERAL] Re: [SQL] bad select performance fixed by forbidding hash joins

From: Bruce Momjian
Date:
> George Young <gry@ll.mit.edu> writes:
> > Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> > Is this a safe thing to leave on permanently, or is there some way to set
> > PGOPTIONS for just this query?
>
> I wouldn't recommend leaving it on as a long-term solution, because
> you're hobbling the system for cases where hashjoin *is* the best
> method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
> plan types on-the-fly, though perhaps one should be added.

Postgres does have options to prohibit certain join types, so you could
try forcing a certain join type and see what happens.
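
The same PGOPTIONS mechanism used above takes other letters for the backend's
-f switch; a rough sketch (flag letters as documented for the backend, database
name is a placeholder):

    PGOPTIONS="-fh" psql mydb   # forbid hash joins (what was tried above)
    PGOPTIONS="-fm" psql mydb   # forbid merge joins
    PGOPTIONS="-fn" psql mydb   # forbid nested-loop joins

Disabling one method at a time and comparing the EXPLAIN output shows which
join type the planner falls back to and how each one actually performs.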

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: bad select performance fixed by forbidding hash joins

From: Tom Lane
Date:
A few weeks ago, George Young <gry@ll.mit.edu> complained that the
following query:

> select os.name,r.run_name,ro.status from opset_steps os,runs r,
> run_opsets ro where (ro.status=3 or ro.status=1) and
> ro.opset_id=os.opset_id and ro.run_id=r.run_id and
> ro.opset_ver=os.opset_ver and r.status=1;

had horrible performance when executed via the system's preferred plan,

> Hash Join  (cost=1793.58 rows=14560 width=38)
>   ->  Hash Join  (cost=1266.98 rows=14086 width=24)
>         ->  Seq Scan on run_opsets ro  (cost=685.51 rows=13903 width=8)
>         ->  Hash  (cost=70.84 rows=1389 width=16)
>               ->  Seq Scan on opset_steps os  (cost=70.84 rows=1389 width=16)
>   ->  Hash  (cost=47.43 rows=374 width=14)
>         ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)

I have looked into this, and it seems that the problem is this: the
innermost hash join between run_opsets and opset_steps is being done
on the join clause ro.opset_ver=os.opset_ver.  In George's data,
the opset_ver columns only have about 14 distinct values, with a
very strong bias towards the values 1,2,3.  This means that the vast
majority of the opset_steps entries go into only three hash buckets,
and the vast majority of the probes from run_opsets search one of those
same three buckets, so that most of the run_opsets rows are being
compared to almost a third of the opset_steps rows, not just a small
fraction of them.  Almost all of the runtime of the query is going into
the tuple comparison tests :-(
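
The skew is easy to confirm directly; a minimal sketch against the schema
above (column name as used in the query):

    select opset_ver, count(*)
    from opset_steps
    group by opset_ver
    order by 2 desc;

If the top two or three rows account for most of the table, those few hash
buckets end up doing nearly all of the comparison work described above.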

It seems clear that we want the system not to risk using a hashjoin
unless it has good evidence that the inner table's column has a fairly
flat distribution.  I'm thinking that the right sort of check would be
to check whether the "disbursion" statistic set by VACUUM ANALYZE is
fairly small, maybe 0.01 or less (but not zero, which would suggest
that VACUUM ANALYZE has never been run).  This would roughly
correspond to the most common value appearing not more than 1% of the
time, so that we can be sure at least 100 different hashbuckets will
be used.  Comments?  Is that value too small?

This change is likely to reduce the optimizer's willingness to use
hashjoins by a *lot*, especially if we make the threshold too small.
If you'd like to see what kind of disbursion numbers you get on your
own data, try something like

    select relname, attname, attdisbursion
    from pg_class, pg_attribute
    where attrelid = pg_class.oid and relkind = 'r' and attnum > 0
    order by relname, attname;

after a vacuum analyze.
        regards, tom lane

PS: George, in the meantime I bet your query would run fine if the
system would only choose the opset_id clause instead of opset_ver
to do the hashjoin with --- opset_id has far better distribution.
I'm guessing that it thinks the two clauses are equally attractive
and is just choosing whichever one it happens to process first (or
last?).  You might try rearranging the order of the WHERE clauses
as a stopgap solution...
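
A sketch of that rearrangement: the same query with the two join clauses
swapped, so the opset_id clause comes last instead of first (since it is not
clear whether the planner latches onto the first or the last clause, both
orderings are worth trying):

    explain select os.name, r.run_name, ro.status
    from opset_steps os, runs r, run_opsets ro
    where (ro.status=3 or ro.status=1)
      and ro.opset_ver=os.opset_ver and ro.run_id=r.run_id
      and ro.opset_id=os.opset_id and r.status=1;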