Обсуждение: performance tuning

Поиск
Список
Период
Сортировка

performance tuning

От
Joseph Shraibman
Дата:
I have a query where postgres (7.2.1) seriously overestimates the cost of using an index.
   When I do a  set enable_seqscan = false; The query goes from:

                           ->  Aggregate  (cost=49656.10..49656.10 rows=1 width=12)
                                 ->  Merge Join  (cost=49062.25..49655.18 rows=367 width=12)
                                       ->  Sort  (cost=11794.87..11794.87 rows=15220 width=6)
                                             ->  Seq Scan on u  (cost=0.00..10737.55
rows=15220 width=6)
                                       ->  Sort  (cost=37267.38..37267.38 rows=136643 width=6)
                                             ->  Seq Scan on d  (cost=0.00..24391.43
rows=136643 width=6)

--------------------- to -
                                ->  Nested Loop  (cost=0.00..102204.91 rows=367 width=12)
                                       ->  Index Scan using u_pkey_key on u
(cost=0.00..43167.33 rows=15220 width=6)
                                       ->  Index Scan using d_pkey on d  (cost=0.00..3.86
rows=1 width=6)
- to -

The first query takes three times as long as the second. Since postgres seems to think
that the nested loop takes so long do I have to lower cpu_operator_cost to get postgres to
use the nested loop?

And does 7.3 have any improvements in this area?


Re: performance tuning

От
Joseph Shraibman
Дата:
Joseph Shraibman wrote:
> Since postgres
> seems to think that the nested loop takes so long do I have to lower
> cpu_operator_cost to get postgres to use the nested loop?

To answer my own question that doesn't work. I've kept playing around with different
paramaters with different variables but I can't find anything except disabling seqscans.

This is really annoying, because *all* of my queries suddenly slowed down at the same
time. What can I do?  Is there something I can change in the source to have nested loops
seem cheaper? I haven't found anything.


Re: performance tuning

От
Martijn van Oosterhout
Дата:
On Tue, Dec 03, 2002 at 09:48:46PM -0500, Joseph Shraibman wrote:
> Joseph Shraibman wrote:
> >Since postgres
> >seems to think that the nested loop takes so long do I have to lower
> >cpu_operator_cost to get postgres to use the nested loop?
>
> To answer my own question that doesn't work. I've kept playing around with
> different paramaters with different variables but I can't find anything
> except disabling seqscans.
>
> This is really annoying, because *all* of my queries suddenly slowed down
> at the same time. What can I do?  Is there something I can change in the
> source to have nested loops seem cheaper? I haven't found anything.

What does explain analyze tell you?
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.

Вложения

Re: performance tuning

От
Joseph Shraibman
Дата:
Martijn van Oosterhout wrote:
> On Tue, Dec 03, 2002 at 09:48:46PM -0500, Joseph Shraibman wrote:
>
>>Joseph Shraibman wrote:
>>
>>>Since postgres
>>>seems to think that the nested loop takes so long do I have to lower
>>>cpu_operator_cost to get postgres to use the nested loop?
>>
>>To answer my own question that doesn't work. I've kept playing around with
>>different paramaters with different variables but I can't find anything
>>except disabling seqscans.
>>
>>This is really annoying, because *all* of my queries suddenly slowed down
>>at the same time. What can I do?  Is there something I can change in the
>>source to have nested loops seem cheaper? I haven't found anything.
>
>
> What does explain analyze tell you?


NOTICE:  QUERY PLAN:

Aggregate  (cost=102546.41..102546.41 rows=1 width=12) (actual time=16863.09..16863.09
rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..102545.49 rows=367 width=12) (actual
time=1034.46..16861.51 rows=254 loops=1)
         ->  Index Scan using u_p_key on u  (cost=0.00..43483.93 rows=15223 width=6)
(actual time=0.29..495.12 rows=17912 loops=1)
         ->  Index Scan using d_pkey on directory d  (cost=0.00..3.86 rows=1 width=6)
(actual time=0.90..0.91 rows=1 loops=17912)
Total runtime: 16863.26 msec

-------------------

NOTICE:  QUERY PLAN:

Aggregate  (cost=51432.61..51432.61 rows=1 width=12) (actual time=22158.72..22158.72
rows=1 loops=1)
   ->  Merge Join  (cost=50838.66..51431.69 rows=367 width=12) (actual
time=21266.19..22156.59 rows=254 loops=1)
         ->  Sort  (cost=12208.53..12208.53 rows=15223 width=6) (actual
time=3297.82..3395.68 rows=17912 loops=1)
               ->  Seq Scan on u  (cost=0.00..11151.01 rows=15223 width=6) (actual
time=0.08..3060.66 rows=17912 loops=1)
         ->  Sort  (cost=38630.13..38630.13 rows=136667 width=6) (actual
time=17967.08..18383.46 rows=140492 loops=1)
               ->  Seq Scan on d  (cost=0.00..25751.95 rows=136667 width=6) (actual
time=0.06..14766.69 rows=140492 loops=1)
Total runtime: 22285.74 msec

These are simplified versions of my query designed to highlight that particular join. In
my real query the results are even more out of balance:

Sort  (cost=95409.39..95409.39 rows=3 width=641) (actual time=47092.77..47092.78 rows=26
loops=1)
  <snip>
vs.
Sort  (cost=205121.27..205121.27 rows=3 width=641) (actual time=6461.65..6461.66 rows=26
loops=1)
  <snip>



Re: performance tuning

От
"scott.marlowe"
Дата:
On Tue, 3 Dec 2002, Joseph Shraibman wrote:

> Joseph Shraibman wrote:
> > Since postgres
> > seems to think that the nested loop takes so long do I have to lower
> > cpu_operator_cost to get postgres to use the nested loop?
>
> To answer my own question that doesn't work. I've kept playing around with different
> paramaters with different variables but I can't find anything except disabling seqscans.
>
> This is really annoying, because *all* of my queries suddenly slowed down at the same
> time. What can I do?  Is there something I can change in the source to have nested loops
> seem cheaper? I haven't found anything.

Have you tried lowering random page cost?

I've often found that I need to set it to somewhere from 0.5 to 2.0 to get
a sane response from the planner on some queries.

Of course, I'm assuming you've analyzed your data, and that the guesses
the analyzer made were close to correct (i.e. number of rows estimated for
a query and the number actually returned are pretty close to each other.)


Re: performance tuning

От
Martijn van Oosterhout
Дата:
On Wed, Dec 04, 2002 at 02:15:35PM -0500, Joseph Shraibman wrote:
> Martijn van Oosterhout wrote:
> >On Tue, Dec 03, 2002 at 09:48:46PM -0500, Joseph Shraibman wrote:
> >
> >>Joseph Shraibman wrote:
> >>
> >>>Since postgres
> >>>seems to think that the nested loop takes so long do I have to lower
> >>>cpu_operator_cost to get postgres to use the nested loop?
> >>
> >>To answer my own question that doesn't work. I've kept playing around
> >>with different paramaters with different variables but I can't find
> >>anything except disabling seqscans.
> >>
> >>This is really annoying, because *all* of my queries suddenly slowed down
> >>at the same time. What can I do?  Is there something I can change in the
> >>source to have nested loops seem cheaper? I haven't found anything.
> >
> >
> >What does explain analyze tell you?

Hmm, the row counts don't seem to be too far off but it's overestimating the
cost of your index scans. As the other poster mentioned try:

set seq_scan=[on|off]
set random_page_cost = 0.5..2.0

Hope this helps,

> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=102546.41..102546.41 rows=1 width=12) (actual
> time=16863.09..16863.09 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..102545.49 rows=367 width=12) (actual
> time=1034.46..16861.51 rows=254 loops=1)
>         ->  Index Scan using u_p_key on u  (cost=0.00..43483.93 rows=15223
>         width=6) (actual time=0.29..495.12 rows=17912 loops=1)
>         ->  Index Scan using d_pkey on directory d  (cost=0.00..3.86 rows=1
>         width=6) (actual time=0.90..0.91 rows=1 loops=17912)
> Total runtime: 16863.26 msec
>
> -------------------
>
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=51432.61..51432.61 rows=1 width=12) (actual
> time=22158.72..22158.72 rows=1 loops=1)
>   ->  Merge Join  (cost=50838.66..51431.69 rows=367 width=12) (actual
> time=21266.19..22156.59 rows=254 loops=1)
>         ->  Sort  (cost=12208.53..12208.53 rows=15223 width=6) (actual
> time=3297.82..3395.68 rows=17912 loops=1)
>               ->  Seq Scan on u  (cost=0.00..11151.01 rows=15223 width=6)
>               (actual time=0.08..3060.66 rows=17912 loops=1)
>         ->  Sort  (cost=38630.13..38630.13 rows=136667 width=6) (actual
> time=17967.08..18383.46 rows=140492 loops=1)
>               ->  Seq Scan on d  (cost=0.00..25751.95 rows=136667 width=6)
>               (actual time=0.06..14766.69 rows=140492 loops=1)
> Total runtime: 22285.74 msec
>
> These are simplified versions of my query designed to highlight that
> particular join. In my real query the results are even more out of balance:
>
> Sort  (cost=95409.39..95409.39 rows=3 width=641) (actual
> time=47092.77..47092.78 rows=26 loops=1)
>  <snip>
> vs.
> Sort  (cost=205121.27..205121.27 rows=3 width=641) (actual
> time=6461.65..6461.66 rows=26 loops=1)
>  <snip>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.

Вложения

Re: performance tuning

От
Joseph Shraibman
Дата:
Martijn van Oosterhout wrote:
> On Wed, Dec 04, 2002 at 02:15:35PM -0500, Joseph Shraibman wrote:
>
>>Martijn van Oosterhout wrote:
>>
>>>On Tue, Dec 03, 2002 at 09:48:46PM -0500, Joseph Shraibman wrote:
>>>
>>>
>>>>Joseph Shraibman wrote:
>>>>
>>>>
>>>>>Since postgres
>>>>>seems to think that the nested loop takes so long do I have to lower
>>>>>cpu_operator_cost to get postgres to use the nested loop?
>>>>
>>>>To answer my own question that doesn't work. I've kept playing around
>>>>with different paramaters with different variables but I can't find
>>>>anything except disabling seqscans.
>>>>
>>>>This is really annoying, because *all* of my queries suddenly slowed down
>>>>at the same time. What can I do?  Is there something I can change in the
>>>>source to have nested loops seem cheaper? I haven't found anything.
>>>
>>>
>>>What does explain analyze tell you?
>
>
> Hmm, the row counts don't seem to be too far off but it's overestimating the
> cost of your index scans. As the other poster mentioned try:
>
> set seq_scan=[on|off]
> set random_page_cost = 0.5..2.0
>
Setting it to .2 got it to use the index on the d table, setting it to .02 got it to use
an index on both. But now it is using a merge join instead of a nested loop.

Aggregate  (cost=23244.99..23244.99 rows=1 width=12)
   ->  Merge Join  (cost=1645.39..23244.08 rows=367 width=12)
         ->  Sort  (cost=1645.39..1645.39 rows=15223 width=6)
               ->  Index Scan using u_p_key on u  (cost=0.00..587.86 rows=15223 width=6)
         ->  Index Scan using d_pkey on d  (cost=0.00..21005.66 rows=136667 width=6)

where before just setting enable_seqscan = false  gave a nested loop:

Aggregate  (cost=102546.41..102546.41 rows=1 width=12) (actual time=16863.09..16863.09
rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..102545.49 rows=367 width=12) (actual
time=1034.46..16861.51 rows=254 loops=1)
         ->  Index Scan using u_p_key on u  (cost=0.00..43483.93 rows=15223 width=6)
(actual time=0.29..495.12 rows=17912 loops=1)
         ->  Index Scan using d_pkey on directory d  (cost=0.00..3.86 rows=1 width=6)
(actual time=0.90..0.91 rows=1 loops=17912)
Total runtime: 16863.26 msec

What makes postgres choose one or the other, and is it likely to hurt performance? I can't
to an explain analyse right now because the database is being used heavliy right now. I'll
do one later.



Re: performance tuning

От
Martijn van Oosterhout
Дата:
On Wed, Dec 04, 2002 at 05:00:37PM -0500, Joseph Shraibman wrote:
> Martijn van Oosterhout wrote:
> >Hmm, the row counts don't seem to be too far off but it's overestimating
> >the
> >cost of your index scans. As the other poster mentioned try:
> >
> >set seq_scan=[on|off]
> >set random_page_cost = 0.5..2.0
> >
> Setting it to .2 got it to use the index on the d table, setting it to .02
> got it to use an index on both. But now it is using a merge join instead of
> a nested loop.

Odd how it is doing a sort after the index scan. Perhaps you need to look at
your query and see if you can encourage the right join order.

> Aggregate  (cost=23244.99..23244.99 rows=1 width=12)
>   ->  Merge Join  (cost=1645.39..23244.08 rows=367 width=12)
>         ->  Sort  (cost=1645.39..1645.39 rows=15223 width=6)
>               ->  Index Scan using u_p_key on u  (cost=0.00..587.86
>               rows=15223 width=6)
>         ->  Index Scan using d_pkey on d  (cost=0.00..21005.66 rows=136667
>         width=6)
>
> where before just setting enable_seqscan = false  gave a nested loop:
>
> Aggregate  (cost=102546.41..102546.41 rows=1 width=12) (actual
> time=16863.09..16863.09 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..102545.49 rows=367 width=12) (actual
> time=1034.46..16861.51 rows=254 loops=1)
>         ->  Index Scan using u_p_key on u  (cost=0.00..43483.93 rows=15223
>         width=6) (actual time=0.29..495.12 rows=17912 loops=1)
>         ->  Index Scan using d_pkey on directory d  (cost=0.00..3.86 rows=1
>         width=6) (actual time=0.90..0.91 rows=1 loops=17912)
> Total runtime: 16863.26 msec
>
> What makes postgres choose one or the other, and is it likely to hurt
> performance? I can't to an explain analyse right now because the database
> is being used heavliy right now. I'll do one later.
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Support bacteria! They're the only culture some people have.

Вложения

Re: performance tuning

От
Joseph Shraibman
Дата:
What do you mean join order? There are only two tables in this simplified query.
It looks like this:
  SELECT COUNT(distinct d.ukey) FROM u,  d WHERE d.ukey = u.ukey  AND <other conditions on
d and u>

Martijn van Oosterhout wrote:
> On Wed, Dec 04, 2002 at 05:00:37PM -0500, Joseph Shraibman wrote:
>
>>Martijn van Oosterhout wrote:
>>
>>>Hmm, the row counts don't seem to be too far off but it's overestimating
>>>the
>>>cost of your index scans. As the other poster mentioned try:
>>>
>>>set seq_scan=[on|off]
>>>set random_page_cost = 0.5..2.0
>>>
>>
>>Setting it to .2 got it to use the index on the d table, setting it to .02
>>got it to use an index on both. But now it is using a merge join instead of
>>a nested loop.
>
>
> Odd how it is doing a sort after the index scan. Perhaps you need to look at
> your query and see if you can encourage the right join order.
>
>
>>Aggregate  (cost=23244.99..23244.99 rows=1 width=12)
>>  ->  Merge Join  (cost=1645.39..23244.08 rows=367 width=12)
>>        ->  Sort  (cost=1645.39..1645.39 rows=15223 width=6)
>>              ->  Index Scan using u_p_key on u  (cost=0.00..587.86
>>              rows=15223 width=6)
>>        ->  Index Scan using d_pkey on d  (cost=0.00..21005.66 rows=136667
>>        width=6)
>>
>>where before just setting enable_seqscan = false  gave a nested loop:
>>
>>Aggregate  (cost=102546.41..102546.41 rows=1 width=12) (actual
>>time=16863.09..16863.09 rows=1 loops=1)
>>  ->  Nested Loop  (cost=0.00..102545.49 rows=367 width=12) (actual
>>time=1034.46..16861.51 rows=254 loops=1)
>>        ->  Index Scan using u_p_key on u  (cost=0.00..43483.93 rows=15223
>>        width=6) (actual time=0.29..495.12 rows=17912 loops=1)
>>        ->  Index Scan using d_pkey on directory d  (cost=0.00..3.86 rows=1
>>        width=6) (actual time=0.90..0.91 rows=1 loops=17912)
>>Total runtime: 16863.26 msec
>>
>>What makes postgres choose one or the other, and is it likely to hurt
>>performance? I can't to an explain analyse right now because the database
>>is being used heavliy right now. I'll do one later.
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 5: Have you checked our extensive FAQ?
>>
>>http://www.postgresql.org/users-lounge/docs/faq.html
>
>


--
Joseph Shraibman
joseph@xtenit.com
Increase signal to noise ratio.  http://xis.xtenit.com


Re: performance tuning

От
Joseph Shraibman
Дата:
OK here are the explain analyse results:

Aggregate  (cost=51432.61..51432.61 rows=1 width=12) (actual time=12384.43..12384.43
rows=1 loops=1)
   ->  Merge Join  (cost=50838.66..51431.69 rows=367 width=12) (actual
time=12150.42..12383.34 rows=254 loops=1)
         ->  Sort  (cost=12208.53..12208.53 rows=15223 width=6) (actual
time=3579.63..3592.50 rows=17918 loops=1)
               ->  Seq Scan on u  (cost=0.00..11151.01 rows=15223 width=6) (actual
time=0.12..3441.75 rows=17918 loops=1)
         ->  Sort  (cost=38630.13..38630.13 rows=136667 width=6) (actual
time=8569.49..8669.88 rows=140552 loops=1)
               ->  Seq Scan on d  (cost=0.00..25751.95 rows=136667 width=6) (actual
time=0.06..7205.61 rows=140553 loops=1)
Total runtime: 12635.24 msec

EXPLAIN
=> set random_page_cost = .2;

Aggregate  (cost=33423.86..33423.86 rows=1 width=12) (actual time=9678.22..9678.22 rows=1
loops=1)
   ->  Merge Join  (cost=3585.41..33422.94 rows=367 width=12) (actual
time=1060.41..9677.13 rows=254 loops=1)
         ->  Index Scan using d_pkey on d  (cost=0.00..29244.50 rows=136667 width=6)
(actual time=0.13..8821.37 rows=140552 loops=1)
         ->  Sort  (cost=3585.41..3585.41 rows=15223 width=6) (actual time=689.15..706.23
rows=17918 loops=1)
               ->  Index Scan using u_p_key on u  (cost=0.00..2527.89 rows=15223 width=6)
(actual time=0.18..550.03 rows=17918 loops=1)
Total runtime: 9681.19 msec

EXPLAIN
=> set random_page_cost = .02;

Aggregate  (cost=23244.99..23244.99 rows=1 width=12) (actual time=2724.45..2724.45 rows=1
loops=1)
   ->  Merge Join  (cost=1645.39..23244.08 rows=367 width=12) (actual time=254.97..2723.40
rows=254 loops=1)
         ->  Sort  (cost=1645.39..1645.39 rows=15223 width=6) (actual time=244.75..262.10
rows=17918 loops=1)
               ->  Index Scan using u_p_key on u  (cost=0.00..587.86 rows=15223 width=6)
(actual time=0.19..106.98 rows=17918 loops=1)
         ->  Index Scan using d_pkey on d  (cost=0.00..21005.66 rows=136667 width=6)
(actual time=0.28..2297.81 rows=140552 loops=1)
Total runtime: 2728.00 msec