Discussion: Interpreting query plan


Interpreting query plan

From:
"Chris Smith"
Date:
I've just noticed in the regular profiling information from our web
application that a particular query on a fairly small database is taking about
15 seconds.  The query is generated from software on the fly, hence its
quirkiness -- if any of that is the problem, then I'll go ahead and fix it,
but I can't imagine a few repeated WHERE conditions fooling the query
optimizer.

Anyway, I don't know how to interpret query plans.  Can anyone give me a hand?
To get the plan, I just plugged in various values -- The actual query is run
with various different values, and even a few different lengths for the IN
clause.

miqweb=> explain select distinct t0.* from UserAccount t0, UserMapping t1 where
 (t0.companyid = 123) and ((t0.companyid = 123) and (t0.userid = t1.userid) and
 (t1.groupid in (123, 234, 345, 456))) and (t0.companyid = 123);
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Unique  (cost=133.78..133.81 rows=1 width=55)
   ->  Sort  (cost=133.78..133.79 rows=1 width=55)
         Sort Key: t0.userid, t0.companyid, t0.username, t0."password", t0.isactive, t0.isregistered, t0.lastlogin, t0.firstname, t0.lastname
         ->  Hash Join  (cost=13.44..133.77 rows=1 width=55)
               Hash Cond: ("outer".userid = "inner".userid)
               ->  Seq Scan on usermapping t1  (cost=0.00..120.26 rows=13 width=4)
                     Filter: ((groupid = 123) OR (groupid = 234) OR (groupid = 345) OR (groupid = 456))
               ->  Hash  (cost=13.43..13.43 rows=4 width=55)
                     ->  Index Scan using useraccount_lookup on useraccount t0  (cost=0.00..13.43 rows=4 width=55)
                           Index Cond: (companyid = 123)
(10 rows)


And relevant tables (apparently a little messed up by prior database version
upgrades, so that some of the foreign keys show up directly as triggers):

miqweb=> \d useraccount
     Table "public.useraccount"
    Column    |  Type   | Modifiers
--------------+---------+-----------
 userid       | integer | not null
 companyid    | integer | not null
 username     | text    | not null
 password     | text    | not null
 isactive     | boolean | not null
 isregistered | boolean | not null
 lastlogin    | date    |
 firstname    | text    |
 lastname     | text    |
Indexes:
    "useraccount_pkey" primary key, btree (userid)
    "useraccount_lookup" unique, btree (companyid, username)
Triggers:
    "RI_ConstraintTrigger_255906" AFTER INSERT OR UPDATE ON useraccount FROM company NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins"('useraccount_fk1', 'useraccount', 'company', 'UNSPECIFIED', 'companyid', 'companyid')
    "RI_ConstraintTrigger_255916" AFTER DELETE ON useraccount FROM registrationfield NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('registrationfield_fk2', 'registrationfield', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255917" AFTER UPDATE ON useraccount FROM registrationfield NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('registrationfield_fk2', 'registrationfield', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255919" AFTER DELETE ON useraccount FROM userrole NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('userrole_fk1', 'userrole', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255920" AFTER UPDATE ON useraccount FROM userrole NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('userrole_fk1', 'userrole', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255928" AFTER DELETE ON useraccount FROM visit NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('visit_fk1', 'visit', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255929" AFTER UPDATE ON useraccount FROM visit NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('visit_fk1', 'visit', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255940" AFTER DELETE ON useraccount FROM adminvisit NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_del"('adminvisit_fk1', 'adminvisit', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')
    "RI_ConstraintTrigger_255941" AFTER UPDATE ON useraccount FROM adminvisit NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd"('adminvisit_fk1', 'adminvisit', 'useraccount', 'UNSPECIFIED', 'userid', 'userid')


miqweb=> \d usermapping
  Table "public.usermapping"
 Column  |  Type   | Modifiers
---------+---------+-----------
 userid  | integer | not null
 groupid | integer | not null
Foreign-key constraints:
    "$1" FOREIGN KEY (userid) REFERENCES useraccount(userid)
    "$2" FOREIGN KEY (groupid) REFERENCES groups(groupid)

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation


Re: Interpreting query plan

From:
Stephan Szabo
Date:
On Fri, 2 Jul 2004, Chris Smith wrote:

> I've just noticed in the regular profiling information from our web
> application that a particular query on a fairly small database is taking about
> 15 seconds.  The query is generated from software on the fly, hence its
> quirkiness -- if any of that is the problem, then I'll go ahead and fix it,
> but I can't imagine a few repeated WHERE conditions fooling the query
> optimizer.
>
> Anyway, I don't know how to interpret query plans.  Can anyone give me a hand?
> To get the plan, I just plugged in various values -- The actual query is run
> with various different values, and even a few different lengths for the IN
> clause.
>
> miqweb=> explain select distinct t0.* from UserAccount t0, UserMapping t1 where
>  (t0.companyid = 123) and ((t0.companyid = 123) and (t0.userid = t1.userid) and
>  (t1.groupid in (123, 234, 345, 456))) and (t0.companyid = 123);

Plain explain output is useful for finding what the plan is, but not as
useful for determining why a query takes a particular amount of time.
You might want to use "explain analyze" and send that result (which gives
the real time and real number of rows for different steps).


Re: Interpreting query plan

From:
"Chris Smith"
Date:
Stephan,

Thanks for your reply.  Here is the output of "explain analyze".  I also
replaced my simple values with a real query that gets run and (according to
our profiling) takes a long time.  However, the query is now taking a much
shorter period of time than it was profiled at by the application.  I can only
guess the difference has something to do with system load.  I could try to run
this again during a high-load period, if that is necessary.  Let me know.

miqweb=> explain analyze select distinct t0.* from UserAccount t0, UserMapping t1
where (t0.companyid = 628) and ((t0.companyid = 628) and (t0.userid = t1.userid)
and (t1.groupid in (628,948,949,950,951,953,954,1272,1279,1296,1299,1300,1363,
1423,1446,1467,1526,1724,1735,1759,1763,1772,1785,1841,1862,1975,2721,2800,2801,
2802,2803,1264,1394,1525,1662,1843,1844,1845,1396,1528,1860,1846,1762,2422,1271,
1847,1848,1281,1849,1850,1851,1266,1809,1852,1853,2421,1854,1855,1913,1856,1857,
1269,1268,1858,1859,2804))) and (t0.companyid = 628);

[...]

 Unique  (cost=952.15..959.37 rows=289 width=55) (actual time=137.130..143.363 rows=752 loops=1)
   ->  Sort  (cost=952.15..952.87 rows=289 width=55) (actual time=137.123..138.004 rows=1328 loops=1)
         Sort Key: t0.userid, t0.companyid, t0.username, t0."password", t0.isactive, t0.isregistered, t0.lastlogin, t0.firstname, t0.lastname
         ->  Hash Join  (cost=869.15..940.34 rows=289 width=55) (actual time=112.112..130.948 rows=1328 loops=1)
               Hash Cond: ("outer".userid = "inner".userid)
               ->  Seq Scan on useraccount t0  (cost=0.00..55.71 rows=629 width=55) (actual time=0.239..8.501 rows=753 loops=1)
                     Filter: (companyid = 628)
               ->  Hash  (cost=866.28..866.28 rows=1151 width=4) (actual time=111.762..111.762 rows=0 loops=1)
                     ->  Seq Scan on usermapping t1  (cost=0.00..866.28 rows=1151 width=4) (actual time=4.251..109.563 rows=1328 loops=1)
                           Filter: ((groupid = 628) OR (groupid = 948) OR (groupid = 949) OR (groupid = 950) OR
                                   (groupid = 951) OR (groupid = 953) OR (groupid = 954) OR (groupid = 1272) OR
                                   (groupid = 1279) OR (groupid = 1296) OR (groupid = 1299) OR (groupid = 1300) OR
                                   (groupid = 1363) OR (groupid = 1423) OR (groupid = 1446) OR (groupid = 1467) OR
                                   (groupid = 1526) OR (groupid = 1724) OR (groupid = 1735) OR (groupid = 1759) OR
                                   (groupid = 1763) OR (groupid = 1772) OR (groupid = 1785) OR (groupid = 1841) OR
                                   (groupid = 1862) OR (groupid = 1975) OR (groupid = 2721) OR (groupid = 2800) OR
                                   (groupid = 2801) OR (groupid = 2802) OR (groupid = 2803) OR (groupid = 1264) OR
                                   (groupid = 1394) OR (groupid = 1525) OR (groupid = 1662) OR (groupid = 1843) OR
                                   (groupid = 1844) OR (groupid = 1845) OR (groupid = 1396) OR (groupid = 1528) OR
                                   (groupid = 1860) OR (groupid = 1846) OR (groupid = 1762) OR (groupid = 2422) OR
                                   (groupid = 1271) OR (groupid = 1847) OR (groupid = 1848) OR (groupid = 1281) OR
                                   (groupid = 1849) OR (groupid = 1850) OR (groupid = 1851) OR (groupid = 1266) OR
                                   (groupid = 1809) OR (groupid = 1852) OR (groupid = 1853) OR (groupid = 2421) OR
                                   (groupid = 1854) OR (groupid = 1855) OR (groupid = 1913) OR (groupid = 1856) OR
                                   (groupid = 1857) OR (groupid = 1269) OR (groupid = 1268) OR (groupid = 1858) OR
                                   (groupid = 1859) OR (groupid = 2804))
 Total runtime: 144.690 ms
(11 rows)


--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation


Re: Interpreting query plan

From:
Stephan Szabo
Date:
On Tue, 6 Jul 2004, Chris Smith wrote:

> Thanks for your reply.  Here is the output of "explain analyze".  I also
> replaced my simple values with a real query that gets run and (according to
> our profiling) takes a long time.  However, the query is now taking a much
> shorter period of time than it was profiled at by the application.  I can only
> guess the difference has something to do with system load.  I could try to run
> this again during a high-load period, if that is necessary.  Let me know.

It might help.

Notes and questions about the query and the plan below inline.

> miqweb=> explain analyze select distinct t0.* from UserAccount t0, UserMapping t1
> where (t0.companyid = 628) and ((t0.companyid = 628) and (t0.userid = t1.userid)
> and (t1.groupid in (628,948,949,950,951,953,954,1272,1279,1296,1299,1300,1363,
> 1423,1446,1467,1526,1724,1735,1759,1763,1772,1785,1841,1862,1975,2721,2800,2801,
> 2802,2803,1264,1394,1525,1662,1843,1844,1845,1396,1528,1860,1846,1762,2422,1271,
> 1847,1848,1281,1849,1850,1851,1266,1809,1852,1853,2421,1854,1855,1913,1856,1857,
> 1269,1268,1858,1859,2804))) and (t0.companyid = 628);

I wonder if in practice this query uses distinct to get around a problem
with subqueries.  I'd think that a query of the general form:
select t0.* from UserAccount t0 where t0.companyid = 628 and t0.userid in
 (select userid from UserMapping t1 where t1.groupid in (...));
might work better (although it doesn't look like the extra steps are that
big in practice).  For systems that don't handle IN well (older PostgreSQL
versions included), perhaps an EXISTS would be better.
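
Untested, but the EXISTS form would look something like this (same tables as
yours; the three groupids are just samples standing in for the full list):

select t0.*
  from UserAccount t0
 where t0.companyid = 628
   and exists (select 1
                 from UserMapping t1
                where t1.userid = t0.userid
                  and t1.groupid in (628, 948, 949));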

>  Unique  (cost=952.15..959.37 rows=289 width=55) (actual time=137.130..143.363 rows=752 loops=1)
>    ->  Sort  (cost=952.15..952.87 rows=289 width=55) (actual time=137.123..138.004 rows=1328 loops=1)
>          Sort Key: t0.userid, t0.companyid, t0.username, t0."password", t0.isactive, t0.isregistered, t0.lastlogin, t0.firstname, t0.lastname

These steps are for the distinct.  It's not a lot of actual time, but if
the row set returned was large enough to exceed sort_mem the sort might
start going off to disk and be slower.
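
You can check what you're currently running with:

miqweb=> SHOW sort_mem;

(sort_mem is in kilobytes; the 7.4 default is 1024 unless postgresql.conf
has been changed.)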

>          ->  Hash Join  (cost=869.15..940.34 rows=289 width=55) (actual time=112.112..130.948 rows=1328 loops=1)
>                Hash Cond: ("outer".userid = "inner".userid)
>                ->  Seq Scan on useraccount t0  (cost=0.00..55.71 rows=629 width=55) (actual time=0.239..8.501 rows=753 loops=1)
>                      Filter: (companyid = 628)

The row estimate is pretty reasonable, estimated 629 versus actual 753.
How many rows are in useraccount?  I'm wondering if 629 is a reasonable
percentage of the rows to see if seq scan is reasonable here.

>                ->  Hash  (cost=866.28..866.28 rows=1151 width=4) (actual time=111.762..111.762 rows=0 loops=1)
>                      ->  Seq Scan on usermapping t1  (cost=0.00..866.28 rows=1151 width=4) (actual time=4.251..109.563 rows=1328 loops=1)
> [lots of filter conditions on groupid]

Here the estimate isn't so good, estimated 1151 vs actual 1328.  You might
want to consider raising the groupid column's statistics target and
re-analyzing to see if you can get a better estimate.  It probably won't
directly affect this plan entry, but it might affect the layers above.
Also, how many rows are in usermapping?

I didn't see any indexes on usermapping.  Perhaps an index on (userid,
groupid) would help as well.
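
Something like this (the index name is just a suggestion, use whatever fits
your naming scheme):

miqweb=> create index usermapping_userid_groupid on usermapping (userid, groupid);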

Re: Interpreting query plan

From:
"Chris Smith"
Date:
Stephan Szabo wrote:
> > [running at higher load]

> It might help.

Okay, I'll give it a shot.  Unfortunately, load isn't so predictable with our
current users, so I'll have to give it a few shots and see if I come up with
something substantially different.

> I wonder if in practice this query uses distinct to get around a
> problem with subqueries.

Yes, it does exactly that.  The database was recently moved over to PostgreSQL
7.4.  The thought of switching over to IN-subqueries is a bit scary, since the
query generator is a really over-generalized mess of about 7000 LOC in 17
different source files; but I'll certainly look at that.

> These steps are for the distinct.  It's not a lot of actual time, but
> if the row set returned was large enough to exceed sort_mem the sort
> might start going off to disk and be slower.

Indeed, that looks like it could be a problem for some of our larger customers
who have up to tens of thousands of users.  The IN form would avoid this sort?

> The row estimate is pretty reasonable, estimated 629 versus actual
> 753. How many rows are in useraccount?  I'm wondering if 629 is a
> reasonable percentage of the rows to see if seq scan is reasonable
> here.

Total?  On this server, it's currently 2566.

> Here the estimate isn't so good, estimated 1151 vs actual 1328.  You
> might want to consider raising the groupid column's statistics target
> and re-analyzing to see if you can get a better estimate.

Alright.  So how exactly do I raise the groupid column's statistics target?

> Also, how many rows are in usermapping?

4120

> I didn't see any indexes on usermapping. Perhaps an index on (userid,
> groupid) would help as well.

Done.  I'll watch for more profiling results to see if we run into this same
problem again.

Thanks for your help,

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation


Re: Interpreting query plan

From:
Stephan Szabo
Date:
On Tue, 6 Jul 2004, Chris Smith wrote:

> Stephan Szabo wrote:
>
> > I wonder if in practice this query uses distinct to get around a
> > problem with subqueries.
>
> Yes, it does exactly that.  The database was recently moved over to PostgreSQL
> 7.4.  The thought of switching over to IN-subqueries is a bit scary, since the
> query generator is a really over-generalized mess of about 7000 LOC in 17
> different source files; but I'll certainly look at that.

It might be worth just converting some by hand to see what explain analyze
says about them in contrast to the original.

> > These steps are for the distinct.  It's not a lot of actual time, but
> > if the row set returned was large enough to exceed sort_mem the sort
> > might start going off to disk and be slower.
>
> Indeed, that looks like it could be a problem for some of our larger customers
> who have up to tens of thousands of users.  The IN form would avoid this sort?

The IN form would potentially use a generally different plan.  It still
uses sort_mem for some of its determinations, but I *think* you'd fit more
in memory that way than you would for the sort.  If you have the RAM and
are doing queries like this a lot, you might want to raise sort_mem if you
haven't changed it from the default.
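
For example, to experiment in a single session (sort_mem is in kilobytes,
so this is 16MB; the value is just one to try, not a recommendation):

miqweb=> SET sort_mem = 16384;

or change it for all sessions in postgresql.conf.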

> > The row estimate is pretty reasonable, estimated 629 versus actual
> > 753. How many rows are in useraccount?  I'm wondering if 629 is a
> > reasonable percentage of the rows to see if seq scan is reasonable
> > here.
>
> Total?  On this server, it's currently 2566.

Okay, so 629 doing a seq scan is pretty reasonable if the table doesn't
have a lot of empty space.

> > Here the estimate isn't so good, estimated 1151 vs actual 1328.  You
> > might want to consider raising the groupid column's statistics target
> > and re-analyzing to see if you can get a better estimate.
>
> Alright.  So how exactly do I raise the groupid column's statistics target?

Oops, I mentioned it in a previous version of that paragraph and
apparently removed it upon editing.

ALTER TABLE <table> ALTER COLUMN <column> SET STATISTICS <integer>.
Maybe try 100 to see what it gives you.
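
So for the estimate above, that would be something like:

miqweb=> ALTER TABLE usermapping ALTER COLUMN groupid SET STATISTICS 100;
miqweb=> ANALYZE usermapping;

The ANALYZE afterwards is what actually rebuilds the stored statistics with
the new target.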