Re: Help on explain analyze

Поиск
Список
Период
Сортировка
От Marc Mamin
Тема Re: Help on explain analyze
Дата
Msg-id C4DAC901169B624F933534A26ED7DF31034BB940@JENMAIL01.ad.intershop.net
обсуждение исходный текст
Ответ на Help on explain analyze  (Leif Jensen <leif@crysberg.dk>)
Список pgsql-general
Hello,

did you also try defininig partial indexes?

e.g. 
CREATE INDEX xx on task_definitions (ctrlid) WHERE (name::text = 'UseWSData')
CREATE INDEX yy on ctrl_definitions (ctrlid) WHERE (name::text = 'IrrPeriodStart')

HTH,

Marc Mamin

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Leif Jensen
Sent: Freitag, 26. November 2010 06:04
To: pgsql-general
Subject: [GENERAL] Help on explain analyze

   Hi guys,

   I have a rather complex view that sometimes takes an awful long time to execute. I have tried to do an 'explain
analyze'on it. My intention was to try to optimize the tables involved by creating some indexes to help the lookup. I
lookedfor the "Seq Scan's and created appropriate indexes, I thought. However, in most cases the search got even
slower.I have "expanded" the view as follows:
 

cims=# explain analyze select * from (SELECT t.id AS oid, d.id AS devid, d.description AS devname, cd.value AS period,
upper(dt.typename::text)AS devtype, (date_part('epoch'::text, timezone('GMT'::text, t.firstrun))::bigint -
(z.timezone::integer- 
 
        CASE
            WHEN z.daylightsaving <> 'Y'::bpchar THEN 0
            ELSE 
            CASE
                WHEN cy.dl_start < now() AND now() < cy.dl_finish THEN 1
                ELSE 0
            END
        END) * 3600) % 86400::bigint AS firstrun, t."interval", t.id AS tid, ti.id AS tiid, t.runstatus, t.last,
tt.handler,td.value AS ctrlid, td.string AS alarm, z.timezone AS real_timezone, cy.dl_start < now() AND now() <
cy.dl_finishAS daylight, z.timezone::integer - 
 
        CASE
            WHEN z.daylightsaving <> 'Y'::bpchar THEN 0
            ELSE 
            CASE
                WHEN cy.dl_start < now() AND now() < cy.dl_finish THEN 1
                ELSE 0
            END
        END AS timezone
   FROM device d
   LEFT JOIN task_info ti ON ti.ctrlid = d.id
   LEFT JOIN task t ON t.id = ti.taskid
   LEFT JOIN ctrl_definitions cd ON d.id = cd.ctrlid AND cd.name::text = 'IrrPeriodStart'::text, task_type tt,
task_definitionstd, devtype dt, ctrl_definitions cd2, zip z, county cy
 
  WHERE td.name = 'UseWSData'::text AND ti.id = td.taskinfoid AND d.devtypeid = dt.id AND tt.id = t.tasktypeid AND
(tt.handler= 'modthcswi.so'::text OR tt.handler = 'modthcswb.so'::text) AND btrim(cd2.string) = z.zip::text AND
cd2.ctrlid= td.value AND cd2.name::text = 'ZIP'::text AND z.countyfips = cy.countyfips AND z.state = cy.state AND
date_part('year'::text,now()) = date_part('year'::text, cy.dl_start)) AS wstaskdist
 
  WHERE wstaskdist.ctrlid = 401 AND CAST( alarm AS boolean ) = 't';

  The view is actually the sub-SELECT which I have name 'wstaskdist', and my search criteria is the bottom WHERE. The
resultof the ANALYZE is:
 

                                                                            QUERY PLAN
                                         
 

------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=284.88..9767.82 rows=1 width=109) (actual time=2515.318..40073.432 rows=10 loops=1)
   ->  Nested Loop  (cost=284.88..9745.05 rows=70 width=102) (actual time=2515.184..40071.697 rows=10 loops=1)
         ->  Nested Loop  (cost=229.56..5692.38 rows=1 width=88) (actual time=2512.044..39401.729 rows=10 loops=1)
               ->  Nested Loop  (cost=229.56..5692.07 rows=1 width=80) (actual time=2511.999..39401.291 rows=10
loops=1)
                     ->  Nested Loop  (cost=229.56..5691.76 rows=1 width=77) (actual time=2511.943..39400.680 rows=10
loops=1)
                           Join Filter: (ti.id = td.taskinfoid)
                           ->  Seq Scan on task_definitions td  (cost=0.00..13.68 rows=1 width=22) (actual
time=0.204..0.322rows=10 loops=1)
 
                                 Filter: ((name = 'UseWSData'::text) AND (value = 401) AND (string)::boolean)
                           ->  Hash Left Join  (cost=229.56..5672.72 rows=429 width=59) (actual time=7.159..3939.536
rows=429loops=10)
 
                                 Hash Cond: (d.id = cd.ctrlid)
                                 ->  Nested Loop  (cost=24.66..5442.80 rows=429 width=55) (actual time=6.797..3937.349
rows=429loops=10)
 
                                       ->  Hash Join  (cost=16.65..282.84 rows=429 width=38) (actual time=0.078..6.587
rows=429loops=10)
 
                                             Hash Cond: (t.id = ti.taskid)
                                             ->  Seq Scan on task t  (cost=0.00..260.29 rows=429 width=30) (actual
time=0.022..5.089rows=429 loops=10)
 
                                             ->  Hash  (cost=11.29..11.29 rows=429 width=12) (actual time=0.514..0.514
rows=429loops=1)
 
                                                   ->  Seq Scan on task_info ti  (cost=0.00..11.29 rows=429 width=12)
(actualtime=0.020..0.302 rows=429 loops=1)
 
                                       ->  Bitmap Heap Scan on device d  (cost=8.01..12.02 rows=1 width=21) (actual
time=9.145..9.146rows=1 loops=4290)
 
                                             Recheck Cond: (d.id = ti.ctrlid)
                                             ->  Bitmap Index Scan on pk_device  (cost=0.00..8.01 rows=1 width=0)
(actualtime=0.463..0.463 rows=1569 loops=4290)
 
                                                   Index Cond: (d.id = ti.ctrlid)
                                 ->  Hash  (cost=202.61..202.61 rows=183 width=8) (actual time=3.534..3.534 rows=343
loops=1)
                                       ->  Seq Scan on ctrl_definitions cd  (cost=0.00..202.61 rows=183 width=8)
(actualtime=0.034..3.298 rows=343 loops=1)
 
                                             Filter: ((name)::text = 'IrrPeriodStart'::text)
                     ->  Index Scan using devtype_pkey on devtype dt  (cost=0.00..0.30 rows=1 width=11) (actual
time=0.053..0.055rows=1 loops=10)
 
                           Index Cond: (dt.id = d.devtypeid)
               ->  Index Scan using pk_task_type on task_type tt  (cost=0.00..0.30 rows=1 width=16) (actual
time=0.036..0.039rows=1 loops=10)
 
                     Index Cond: (tt.id = t.tasktypeid)
                     Filter: ((tt.handler = 'modthcswi.so'::text) OR (tt.handler = 'modthcswb.so'::text))
         ->  Hash Join  (cost=55.33..4050.56 rows=211 width=18) (actual time=3.000..66.988 rows=1 loops=10)
               Hash Cond: ((z.zip)::text = btrim(cd2.string))
               ->  Seq Scan on zip z  (cost=0.00..3729.23 rows=42223 width=20) (actual time=0.017..43.637 rows=42108
loops=10)
               ->  Hash  (cost=55.31..55.31 rows=1 width=10) (actual time=0.138..0.138 rows=1 loops=1)
                     ->  Bitmap Heap Scan on ctrl_definitions cd2  (cost=4.39..55.31 rows=1 width=10) (actual
time=0.089..0.128rows=1 loops=1)
 
                           Recheck Cond: (ctrlid = 401)
                           Filter: ((name)::text = 'ZIP'::text)
                           ->  Bitmap Index Scan on ctrl_def_ctrlid  (cost=0.00..4.39 rows=19 width=0) (actual
time=0.055..0.055rows=25 loops=1)
 
                                 Index Cond: (ctrlid = 401)
   ->  Index Scan using county_state_fips_inx on county cy  (cost=0.00..0.31 rows=1 width=25) (actual time=0.083..0.146
rows=1loops=10)
 
         Index Cond: ((cy.state = z.state) AND (cy.countyfips = z.countyfips))
         Filter: (date_part('year'::text, now()) = date_part('year'::text, cy.dl_start))
 Total runtime: 40073.738 ms
(41 rows)

  I have concentrate my effort on the (double) 'Seq Scan':

                                       ->  Hash Join  (cost=16.65..282.84 rows=429 width=38) (actual time=0.078..6.587
rows=429loops=10)
 
                                             Hash Cond: (t.id = ti.taskid)
                                             ->  Seq Scan on task t  (cost=0.00..260.29 rows=429 width=30) (actual
time=0.022..5.089rows=429 loops=10)
 
                                             ->  Hash  (cost=11.29..11.29 rows=429 width=12) (actual time=0.514..0.514
rows=429loops=1)
 
                                                   ->  Seq Scan on task_info ti  (cost=0.00..11.29 rows=429 width=12)
(actualtime=0.020..0.302 rows=429 loops=1)
 
 
   and tried various indexes on the two table 'task' and 'task_info' to help avoid the sequential scans, but at best it
givesthe same result, at worst it was 25% slower :-(. The tables aren't that big. I think the largest are the 'zip' and
the'county' tables with ~42000 and ~22000 rows, respectively.
 

   What is it that I'm missing,

 Leif

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

В списке pgsql-general по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: number of not null arguments
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: Terms advice.