Обсуждение: Slow SQL?

Поиск
Список
Период
Сортировка

Slow SQL?

От
Bjørn T Johansen
Дата:
I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs really slow compared
toon the Oracle database and 
I am not able to interpret why this is slow.

The SQL looks like this:


Select a.status, a.plass, a.navn, a.avlsnr,
           date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0)
From   sed_uttak s, sem_avlsverdi a, semin_vare v
where a.aktiv = 1
And    s.dato  = (Select Max(y.dato)
                          From sed_uttak y
                          Where y.avlsnr = s.avlsnr)
And    a.avlsnr = s.avlsnr
And    s.sedtype  = v.tallkode
And     a.avlsnr Not In (Select avlsnr
                                    From dyr_pause_mot)
Union
Select a.status, a.plass, a.navn, a.avlsnr,
           date_part('day',(now() - s.dato)) dato_diff, 'Tappe pause', COALESCE(a.avlsverdi,0)
From   sed_uttak s, sem_avlsverdi a,  dyr_pause_mot p
Where s.dato = (Select Max(x.dato)
                          From sed_uttak x
                          Where x.avlsnr = s.avlsnr)
And      a.avlsnr = s.avlsnr
And      a.avlsnr = p.avlsnr
Union
Select a.status, a.plass, a.navn, a.avlsnr, null dato_diff, 'IKKE TAPPET', COALESCE(a.avlsverdi,0)
From   sem_avlsverdi a
Where a.aktiv = 1
And   a.avlsnr Not In (Select avlsnr From sed_uttak)
And   a.avlsnr Not In (Select avlsnr From dyr_pause_mot)



And the explain result looks like this:

HashAggregate  (cost=7288068.92..7288092.10 rows=2318 width=36) (actual time=10740.366..10741.879 rows=6475 loops=1)
                                                       
  Group Key: a.status, a.plass, a.navn, a.avlsnr, (date_part('day'::text, (now() - (s.dato)::timestamp with time
zone))),v.tekst, (COALESCE((a.avlsverdi)::integer, 0))        
  ->  Append  (cost=1.46..7288028.35 rows=2318 width=36) (actual time=0.203..10730.906 rows=8915 loops=1)
                                                       
        ->  Nested Loop  (cost=1.46..7274678.41 rows=698 width=82) (actual time=0.203..10638.870 rows=8602 loops=1)
                                                       
              Join Filter: (s.sedtype = v.tallkode)
                                                       
              Rows Removed by Join Filter: 127006
                                                       
              ->  Nested Loop  (cost=1.46..7274438.07 rows=698 width=26) (actual time=0.189..10607.509 rows=6164
loops=1)                                                      
                    ->  Seq Scan on sem_avlsverdi a  (cost=1.04..153.19 rows=3238 width=16) (actual time=0.024..4.027
rows=6474loops=1)                                        
                          Filter: ((NOT (hashed SubPlan 5)) AND (aktiv = 1))
                                                       
                          Rows Removed by Filter: 3
                                                       
                          SubPlan 5
                                                       
                            ->  Seq Scan on dyr_pause_mot dyr_pause_mot_1  (cost=0.00..1.03 rows=3 width=4) (actual
time=0.002..0.003rows=3 loops=1)                           
                    ->  Index Scan using idx_seduttak_avlsnr on sed_uttak s  (cost=0.42..2246.53 rows=1 width=14)
(actualtime=1.404..1.637 rows=1 loops=6474)                  
                          Index Cond: (avlsnr = a.avlsnr)
                                                       
                          Filter: (dato = (SubPlan 4))
                                                       
                          Rows Removed by Filter: 42
                                                       
                          SubPlan 4
                                                       
                            ->  Aggregate  (cost=43.09..43.10 rows=1 width=8) (actual time=0.037..0.037 rows=1
loops=279035)                                                   
                                  ->  Index Scan using idx_seduttak_avlsnr on sed_uttak y  (cost=0.42..42.96 rows=52
width=8)(actual time=0.003..0.029 rows=76 loops=279035)   
                                        Index Cond: (avlsnr = s.avlsnr)
                                                       
              ->  Materialize  (cost=0.00..1.33 rows=22 width=60) (actual time=0.000..0.001 rows=22 loops=6164)
                                                       
                    ->  Seq Scan on semin_vare v  (cost=0.00..1.22 rows=22 width=60) (actual time=0.002..0.005 rows=22
loops=1)                                                
        ->  Nested Loop  (cost=0.70..6761.33 rows=1 width=24) (actual time=0.342..1.801 rows=3 loops=1)
                                                       
              ->  Nested Loop  (cost=0.28..25.96 rows=3 width=20) (actual time=0.008..0.015 rows=3 loops=1)
                                                       
                    ->  Seq Scan on dyr_pause_mot p  (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.002 rows=3
loops=1)                                                
                    ->  Index Scan using idx_avlsverdi_avlsnr on sem_avlsverdi a_1  (cost=0.28..8.30 rows=1 width=16)
(actualtime=0.003..0.003 rows=1 loops=3)                 
                          Index Cond: (avlsnr = p.avlsnr)
                                                       
              ->  Index Scan using idx_seduttak_avlsnr on sed_uttak s_1  (cost=0.42..2245.11 rows=1 width=12) (actual
time=0.377..0.589rows=1 loops=3)                         
                    Index Cond: (avlsnr = a_1.avlsnr)
                                                       
                    Filter: (dato = (SubPlan 3))
                                                       
                    Rows Removed by Filter: 27
                                                       
                    SubPlan 3
                                                       
                      ->  Aggregate  (cost=43.09..43.10 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=85)
                                                       
                            ->  Index Scan using idx_seduttak_avlsnr on sed_uttak x  (cost=0.42..42.96 rows=52 width=8)
(actualtime=0.002..0.015 rows=37 loops=85)             
                                  Index Cond: (avlsnr = s_1.avlsnr)
                                                       
        ->  Seq Scan on sem_avlsverdi a_2  (cost=6393.04..6565.43 rows=1619 width=16) (actual time=87.448..89.059
rows=310loops=1)                                             
              Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (aktiv = 1))
                                                       
              Rows Removed by Filter: 6167
                                                       
              SubPlan 1
                                                       
                ->  Seq Scan on sed_uttak  (cost=0.00..5694.20 rows=279120 width=4) (actual time=0.005..36.384
rows=279120loops=1)                                             
              SubPlan 2
                                                       
                ->  Seq Scan on dyr_pause_mot  (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003 rows=3
loops=1)                                                      
Planning time: 0.927 ms
                                                       
Execution time: 10742.300ms
                                                      


Appreciate any pointers on where to look... :)


Regards,

BTJ

--
-----------------------------------------------------------------------------------------------
Bjørn T Johansen

btj@havleik.no
-----------------------------------------------------------------------------------------------
Someone wrote:
"I understand that if you play a Windows CD backwards you hear strange Satanic messages"
To which someone replied:
"It's even worse than that; play it forwards and it installs Windows"
-----------------------------------------------------------------------------------------------


Re: Slow SQL?

От
hamann.w@t-online.de
Дата:
Hi Bjorn,

I have experienced that some subqueries can be quite slow, and would suspect the NOT IN
clause. I occasionally rewrite
NOT IN (select key from unwanted_candidates)
as
IN (select key from possible_candidates except select key from unwanted_candidates)

Admittedly, I am not running latest version

Regards
Wolfgang Hamann

>> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs really slow
comparedto on the Oracle database and 
>> I am not able to interpret why this is slow.
>>
>> The SQL looks like this:
>>
>>
>> Select a.status, a.plass, a.navn, a.avlsnr,
>>            date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0)
>> From   sed_uttak s, sem_avlsverdi a, semin_vare v
>> where a.aktiv = 1
>> And    s.dato  = (Select Max(y.dato)
>>                           From sed_uttak y
>>                           Where y.avlsnr = s.avlsnr)
>> And    a.avlsnr = s.avlsnr
>> And    s.sedtype  = v.tallkode
>> And     a.avlsnr Not In (Select avlsnr
>>                                     From dyr_pause_mot)
>> Union
>> Select a.status, a.plass, a.navn, a.avlsnr,
>>            date_part('day',(now() - s.dato)) dato_diff, 'Tappe pause', COALESCE(a.avlsverdi,0)
>> From   sed_uttak s, sem_avlsverdi a,  dyr_pause_mot p
>> Where s.dato = (Select Max(x.dato)
>>                           From sed_uttak x
>>                           Where x.avlsnr = s.avlsnr)
>> And      a.avlsnr = s.avlsnr
>> And      a.avlsnr = p.avlsnr
>> Union
>> Select a.status, a.plass, a.navn, a.avlsnr, null dato_diff, 'IKKE TAPPET', COALESCE(a.avlsverdi,0)
>> From   sem_avlsverdi a
>> Where a.aktiv = 1
>> And   a.avlsnr Not In (Select avlsnr From sed_uttak)
>> And   a.avlsnr Not In (Select avlsnr From dyr_pause_mot)
>>
>>
>>
>> And the explain result looks like this:
>>
>> HashAggregate  (cost=7288068.92..7288092.10 rows=2318 width=36) (actual time=10740.366..10741.879 rows=6475 loops=1)
                                                          
>>   Group Key: a.status, a.plass, a.navn, a.avlsnr, (date_part('day'::text, (now() - (s.dato)::timestamp with time
zone))),v.tekst, (COALESCE((a.avlsverdi)::integer, 0))        
>>   ->  Append  (cost=1.46..7288028.35 rows=2318 width=36) (actual time=0.203..10730.906 rows=8915 loops=1)
                                                          
>>         ->  Nested Loop  (cost=1.46..7274678.41 rows=698 width=82) (actual time=0.203..10638.870 rows=8602 loops=1)
                                                          
>>               Join Filter: (s.sedtype = v.tallkode)
                                                          
>>               Rows Removed by Join Filter: 127006
                                                          
>>               ->  Nested Loop  (cost=1.46..7274438.07 rows=698 width=26) (actual time=0.189..10607.509 rows=6164
loops=1)                                                      
>>                     ->  Seq Scan on sem_avlsverdi a  (cost=1.04..153.19 rows=3238 width=16) (actual
time=0.024..4.027rows=6474 loops=1)                                        
>>                           Filter: ((NOT (hashed SubPlan 5)) AND (aktiv = 1))
                                                          
>>                           Rows Removed by Filter: 3
                                                          
>>                           SubPlan 5
                                                          
>>                             ->  Seq Scan on dyr_pause_mot dyr_pause_mot_1  (cost=0.00..1.03 rows=3 width=4) (actual
time=0.002..0.003rows=3 loops=1)                           
>>                     ->  Index Scan using idx_seduttak_avlsnr on sed_uttak s  (cost=0.42..2246.53 rows=1 width=14)
(actualtime=1.404..1.637 rows=1 loops=6474)                  
>>                           Index Cond: (avlsnr = a.avlsnr)
                                                          
>>                           Filter: (dato = (SubPlan 4))
                                                          
>>                           Rows Removed by Filter: 42
                                                          
>>                           SubPlan 4
                                                          
>>                             ->  Aggregate  (cost=43.09..43.10 rows=1 width=8) (actual time=0.037..0.037 rows=1
loops=279035)                                                   
>>                                   ->  Index Scan using idx_seduttak_avlsnr on sed_uttak y  (cost=0.42..42.96 rows=52
width=8)(actual time=0.003..0.029 rows=76 loops=279035)   
>>                                         Index Cond: (avlsnr = s.avlsnr)
                                                          
>>               ->  Materialize  (cost=0.00..1.33 rows=22 width=60) (actual time=0.000..0.001 rows=22 loops=6164)
                                                          
>>                     ->  Seq Scan on semin_vare v  (cost=0.00..1.22 rows=22 width=60) (actual time=0.002..0.005
rows=22loops=1)                                                 
>>         ->  Nested Loop  (cost=0.70..6761.33 rows=1 width=24) (actual time=0.342..1.801 rows=3 loops=1)
                                                          
>>               ->  Nested Loop  (cost=0.28..25.96 rows=3 width=20) (actual time=0.008..0.015 rows=3 loops=1)
                                                          
>>                     ->  Seq Scan on dyr_pause_mot p  (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.002
rows=3loops=1)                                                 
>>                     ->  Index Scan using idx_avlsverdi_avlsnr on sem_avlsverdi a_1  (cost=0.28..8.30 rows=1
width=16)(actual time=0.003..0.003 rows=1 loops=3)                 
>>                           Index Cond: (avlsnr = p.avlsnr)
                                                          
>>               ->  Index Scan using idx_seduttak_avlsnr on sed_uttak s_1  (cost=0.42..2245.11 rows=1 width=12)
(actualtime=0.377..0.589 rows=1 loops=3)                         
>>                     Index Cond: (avlsnr = a_1.avlsnr)
                                                          
>>                     Filter: (dato = (SubPlan 3))
                                                          
>>                     Rows Removed by Filter: 27
                                                          
>>                     SubPlan 3
                                                          
>>                       ->  Aggregate  (cost=43.09..43.10 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=85)
                                                          
>>                             ->  Index Scan using idx_seduttak_avlsnr on sed_uttak x  (cost=0.42..42.96 rows=52
width=8)(actual time=0.002..0.015 rows=37 loops=85)             
>>                                   Index Cond: (avlsnr = s_1.avlsnr)
                                                          
>>         ->  Seq Scan on sem_avlsverdi a_2  (cost=6393.04..6565.43 rows=1619 width=16) (actual time=87.448..89.059
rows=310loops=1)                                             
>>               Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (aktiv = 1))
                                                          
>>               Rows Removed by Filter: 6167
                                                          
>>               SubPlan 1
                                                          
>>                 ->  Seq Scan on sed_uttak  (cost=0.00..5694.20 rows=279120 width=4) (actual time=0.005..36.384
rows=279120loops=1)                                             
>>               SubPlan 2
                                                          
>>                 ->  Seq Scan on dyr_pause_mot  (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003 rows=3
loops=1)                                                      
>> Planning time: 0.927 ms
                                                          
>> Execution time: 10742.300ms
                                                         
>>
>>
>> Appreciate any pointers on where to look... :)
>>
>>
>> Regards,
>>
>> BTJ
>>
>> --
>> -----------------------------------------------------------------------------------------------
>> Bjørn T Johansen
>>
>> btj@havleik.no
>> -----------------------------------------------------------------------------------------------
>> Someone wrote:
>> "I understand that if you play a Windows CD backwards you hear strange Satanic messages"
>> To which someone replied:
>> "It's even worse than that; play it forwards and it installs Windows"
>> -----------------------------------------------------------------------------------------------
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general






Re: Slow SQL?

От
Albe Laurenz
Дата:
hamann.w@t-online.de wrote:
Bjørn T Johansen wrote:
>> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs
>> really slow compared to on the Oracle database and I am not able to interpret why this is slow.

> I have experienced that some subqueries can be quite slow, and would suspect the NOT IN
> clause. I occasionally rewrite
> NOT IN (select key from unwanted_candidates)
> as
> IN (select key from possible_candidates except select key from unwanted_candidates)

I would try to rewrite these clauses to NOT EXISTS, for example

   a.avlsnr Not In (Select avlsnr From dyr_pause_mot)

could be

   NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr)

This can be executed as anti-join and is often more efficient.

Yours,
Laurenz Albe

Re: Slow SQL?

От
Bjørn T Johansen
Дата:
On Tue, 12 Jul 2016 09:12:23 +0000
Albe Laurenz <laurenz.albe@wien.gv.at> wrote:

> hamann.w@t-online.de wrote:
> Bjørn T Johansen wrote:
> >> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs
> >> really slow compared to on the Oracle database and I am not able to interpret why this is slow.
>
> > I have experienced that some subqueries can be quite slow, and would suspect the NOT IN
> > clause. I occasionally rewrite
> > NOT IN (select key from unwanted_candidates)
> > as
> > IN (select key from possible_candidates except select key from unwanted_candidates)
>
> I would try to rewrite these clauses to NOT EXISTS, for example
>
>    a.avlsnr Not In (Select avlsnr From dyr_pause_mot)
>
> could be
>
>    NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr)
>
> This can be executed as anti-join and is often more efficient.
>
> Yours,
> Laurenz Albe
>
Thx for your suggestions. Tried to use NOT EXISTS and the query was about half a second quicker so not much
difference...
But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the third takes about
1,5seconds. The union query 
takes a little over 9 seconds, so I guess the union part is the bottleneck?


BTJ


Re: Slow SQL?

От
hubert depesz lubaczewski
Дата:
On Tue, Jul 12, 2016 at 10:23:24AM +0200, Bjørn T Johansen wrote:
> I am trying to move a small system from Oracle to PostgreSQL and
> I have come upon a sql that runs really slow compared to on the Oracle
> database and I am not able to interpret why this is slow.

I loaded your explain analyze to https://explain.depesz.com/, as:
https://explain.depesz.com/s/iXK

as you can see there, the problem is that you made 280 thousand checks
for "sed_uttak y", which seems to be related to this part:


> Select a.status, a.plass, a.navn, a.avlsnr,
>            date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0)
> From   sed_uttak s, sem_avlsverdi a, semin_vare v
> where a.aktiv = 1
> And    s.dato  = (Select Max(y.dato)
>                           From sed_uttak y
>                           Where y.avlsnr = s.avlsnr)

from what I understand, you're doing it to work on newest record from
sed_uttak, for each avlsnr.

What is rowcount in the table, and how many different avlsnr are there?

You might want to do something like:

with s as (
    select distinct on (avlsnr) *
    from sed_uttak
    order by avlsnr desc, dato desc
)

and then use "s" instead of set_uttak, and get rid of the s.dato
= (select max....) checks.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/


Re: Slow SQL?

От
Albe Laurenz
Дата:
Bjørn T Johansen wrote:
> Thx for your suggestions. Tried to use NOT EXISTS and the query was about half a second quicker so not
> much difference...
> But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the
> third takes about 1,5 seconds. The union query
> takes a little over 9 seconds, so I guess the union part is the bottleneck?

Looking at EXPLAIN (ANALYZE) output should tell you.

If yes, is UNION ALL an option for you?
That should be cheaper.

Yours,
Laurenz Albe

Re: Slow SQL?

От
Tom Lane
Дата:
=?UTF-8?B?QmrDuHJu?= T Johansen <btj@havleik.no> writes:
> But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the third takes
about1,5 seconds. The union query 
> takes a little over 9 seconds, so I guess the union part is the bottleneck?

No; it's clear from your EXPLAIN output that the first UNION arm is taking
the bulk of the runtime.  It's odd that it wouldn't be exactly the same
when run standalone.  What does EXPLAIN ANALYZE show if you run that first
arm by itself?

I concur with depesz's observation that the MAX() subselects seem to be
pretty expensive.  If you don't want to rewrite the query in a wholesale
fashion like he suggests, you might be able to make the MAX's cheaper by
providing an index on sed_uttak(avlsnr, dato); but I'm not sure how much
that will help.

            regards, tom lane


Re: Slow SQL?

От
Bjørn T Johansen
Дата:
On Tue, 12 Jul 2016 14:34:50 +0200
hubert depesz lubaczewski <depesz@depesz.com> wrote:

> On Tue, Jul 12, 2016 at 10:23:24AM +0200, Bjørn T Johansen wrote:
> > I am trying to move a small system from Oracle to PostgreSQL and
> > I have come upon a sql that runs really slow compared to on the Oracle
> > database and I am not able to interpret why this is slow.
>
> I loaded your explain analyze to https://explain.depesz.com/, as:
> https://explain.depesz.com/s/iXK
>
> as you can see there, the problem is that you made 280 thousand checks
> for "sed_uttak y", which seems to be related to this part:
>
>
> > Select a.status, a.plass, a.navn, a.avlsnr,
> >            date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0)
> > From   sed_uttak s, sem_avlsverdi a, semin_vare v
> > where a.aktiv = 1
> > And    s.dato  = (Select Max(y.dato)
> >                           From sed_uttak y
> >                           Where y.avlsnr = s.avlsnr)
>
> from what I understand, you're doing it to work on newest record from
> sed_uttak, for each avlsnr.
>
> What is rowcount in the table, and how many different avlsnr are there?
>
> You might want to do something like:
>
> with s as (
>     select distinct on (avlsnr) *
>     from sed_uttak
>     order by avlsnr desc, dato desc
> )
>
> and then use "s" instead of set_uttak, and get rid of the s.dato
> = (select max....) checks.
>
> Best regards,
>
> depesz
>

Rewrote using the with s as method and now  it's  faster than the oracle querytime, so I am happy...
And I learned something new, never used the with s as way before...

Thx! :)

BTJ