Обсуждение: Poor query performance on one of two "like" databases in production.

Поиск
Список
Период
Сортировка

Poor query performance on one of two "like" databases in production.

От
Дата:
Hello,

I hope you can provide some answers to a strange problem.  This is in production and is a Severity #1 issue we are
having,so any help you can provide would be appreciated. 

PG: PostgreSQL 8.3.7
OS: RHEL 5 64 bit

We have two databases with the same DB schema managing different sets of users.  A query that was running fine for some
timehas started to run very poorly (basically never finishes) in one of the two databases.   The EXPLAIN output from
bothDBs is shown below.  The Merge Join approach will run in 24 seconds.  The Nested Loop / Hash Join is taking a
totallydifferent approach and is never finishing. 

I did run an ANALYZE, then vacuum full, then another ANALYZE on all tables involved in the query.  That didn't change
things. I compared QUERY TUNING settings in both postgresql.conf files and they are identical.  There is a difference
inrow counts in each table between the databases, but not by much. 


Bad plan:

postgres@p01c06d130> psql -Upostgres -dmxl -fbadquery.sql
                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=14310.81..2306550343.42 rows=244089 width=1564)
   ->  Hash Join  (cost=14310.81..2305727555.65 rows=1988818 width=1564)
         Hash Cond: ((u.customer_id = d.customer_id) AND (u.domain_id = d.domain_id))
         ->  Append  (cost=2937.06..2285528361.02 rows=1026394 width=1044)
               ->  Index Scan using mxl_user_domain_id_idx on mxl_user u  (cost=2937.06..2013937900.16 rows=900481
width=52)
                     Filter: ((active <> 0) AND (NOT (subplan)))
                     SubPlan
                       ->  Materialize  (cost=2937.06..4764.75 rows=131369 width=4)
                             ->  Seq Scan on mxl_user_group  (cost=0.00..2291.69 rows=131369 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=2945.94..271590460.85 rows=125913 width=1044)
                     Filter: (NOT (subplan))
                     ->  Merge Join  (cost=8.88..101782.76 rows=251826 width=55)
                           Merge Cond: (u.user_id = ua.user_id)
                           ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..84078.47 rows=1868036
width=28)
                           ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..10224.50
rows=251826width=31) 
                                 Filter: (ua.active <> 0)
                     SubPlan
                       ->  Materialize  (cost=2937.06..4764.75 rows=131369 width=4)
                             ->  Seq Scan on mxl_user_group  (cost=0.00..2291.69 rows=131369 width=4)
         ->  Hash  (cost=4988.15..4988.15 rows=77507 width=528)
               ->  Append  (cost=0.00..4988.15 rows=77507 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.59 rows=64367 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.26..3177.56 rows=13140 width=528)
                           ->  Hash Join  (cost=2454.26..3046.16 rows=13140 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..296.25 rows=13140 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.67..1649.67 rows=64367 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..1649.67 rows=64367 width=12)
   ->  Index Scan using wds_policy_set_id_idx on wds_policy_set p  (cost=0.00..0.39 rows=2 width=8)
         Index Cond: (p.id = u.customer_id)
         Filter: ((p.default_flag = 1) AND (p.web_access_flag = 1) AND (p.active = 1) AND (p.scope = 3))
(33 rows)



Good plan:

kadams@p02c06d130> psql -Upostgres -dmxl -fbadquery.sql
                                                                 QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=741554.63..751762.07 rows=202281 width=1564)
   Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
   ->  Sort  (cost=22143.37..22332.37 rows=75599 width=536)
         Sort Key: p.id, d.domain_id
         ->  Hash Join  (cost=362.69..5423.15 rows=75599 width=536)
               Hash Cond: (d.customer_id = p.id)
               ->  Append  (cost=0.00..3008.78 rows=47116 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1003.92 rows=35674 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=1360.66..2004.86 rows=11442 width=528)
                           ->  Hash Join  (cost=1360.66..1890.44 rows=11442 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..258.02 rows=11442 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=914.74..914.74 rows=35674 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..914.74 rows=35674 width=12)
               ->  Hash  (cost=329.68..329.68 rows=2641 width=8)
                     ->  Seq Scan on wds_policy_set p  (cost=0.00..329.68 rows=2641 width=8)
                           Filter: ((default_flag = 1) AND (web_access_flag = 1) AND (active = 1) AND (scope = 3))
   ->  Materialize  (cost=719411.26..730421.65 rows=880831 width=1044)
         ->  Sort  (cost=719411.26..721613.34 rows=880831 width=1044)
               Sort Key: u.customer_id, u.domain_id
               ->  Append  (cost=2068.96..159849.28 rows=880831 width=1044)
                     ->  Seq Scan on mxl_user u  (cost=2068.96..66848.58 rows=806426 width=50)
                           Filter: ((active <> 0) AND (NOT (hashed subplan)))
                           SubPlan
                             ->  Seq Scan on mxl_user_group  (cost=0.00..1813.57 rows=102157 width=4)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2077.18..93000.70 rows=74405 width=1044)
                           Filter: (NOT (hashed subplan))
                           ->  Merge Join  (cost=8.22..89071.62 rows=148810 width=53)
                                 Merge Cond: (u.user_id = ua.user_id)
                                 ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..77888.20 rows=1656041
width=26)
                                 ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..5227.92
rows=148810width=31) 
                                       Filter: (ua.active <> 0)
                           SubPlan
                             ->  Seq Scan on mxl_user_group  (cost=0.00..1813.57 rows=102157 width=4)
(36 rows)


Query that runs on both DBs:

EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
                users.customer_id,
                users.password,
                p.policy_set_id,
                users.user_id,
                domains.auth_type
         FROM
               (
                 SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, d.active from mxl_domain d
                 UNION ALL
                 SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, da.active from mxl_domain d,
mxl_domain_aliasda WHERE da.domain_id = d.domain_id 
               ) as domains,
               (
                 SELECT u.email, u.customer_id, u.user_id, u.domain_id, u.password, u.active from mxl_user u
                 UNION ALL
                 SELECT ua.email, u.customer_id, u.user_id, u.domain_id, u.password, ua.active from mxl_user u,
mxl_user_aliasua WHERE ua.user_id = u.user_id 
               ) as users,
                wds_policy_set p
         WHERE  users.customer_id  = p.id
         AND    users.customer_id  = domains.customer_id
         AND    users.domain_id    = domains.domain_id
         AND    p.default_flag = 1
         AND    p.web_access_flag = 1
         AND    p.scope  = 3
         AND    domains.active = 1
         AND    users.active != 0
         AND    p.active = 1
         AND    users.user_id NOT IN (SELECT user_id FROM mxl_user_group);



#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
enable_hashagg = off
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1.0                    # measured on an arbitrary scale
random_page_cost = 1.1                  # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
effective_cache_size = 2500MB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 250         # range 1-1000
constraint_exclusion = on
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit
                                        # JOIN clauses


Row counts from DB where a Merge Join is selected:

mxl=# select count(*) from mxl_user;
  count
---------
 1655106
(1 row)

mxl=# select count(*) from mxl_user_alias;
 count
--------
 148864
(1 row)

mxl=# select count(*) from mxl_domain;;
 count
-------
 35487
(1 row)

mxl=# select count(*) from mxl_domain_alias;
 count
-------
 11445
(1 row)

mxl=# select count(*) from mxl_user_group;
 count
--------
 102277
(1 row)

mxl=# select count(*) from wds_policy_set;
 count
-------
  8434
(1 row)



Row counts from DB where Hash Join is selected:

mxl=# select count(*) from mxl_user;
  count
---------
 1869012
(1 row)

mxl=# select count(*) from mxl_user_alias;
 count
--------
 251827
(1 row)

mxl=# select count(*) from mxl_domain;
 count
-------
 64377
(1 row)

mxl=# select count(*) from mxl_domain_alias;
 count
-------
 13149
(1 row)

mxl=# select count(*) from mxl_user_group;
 count
--------
 131305
(1 row)

mxl=# select count(*) from wds_policy_set;
 count
-------
 13387
(1 row)


Thanks,

-K



Re: Poor query performance on one of two "like" databases in production.

От
Vick Khera
Дата:
On Fri, May 14, 2010 at 1:28 PM,  <Keaton_Adams@mcafee.com> wrote:
> I did run an ANALYZE, then vacuum full, then another ANALYZE on all tables involved in the query.  That didn't change
things. I compared QUERY TUNING settings in both postgresql.conf files and they are identical.  There is a difference
inrow counts in each table between the databases, but not by much. 
>

What's your default_statistics_target value?  ie, run "select
default_statistics_target;"

Try setting it to 100 if it is not already, re-run analyze, then try
the queries. The statistics may just be different.

Re: Poor query performance on one of two "like" databases in production.

От
Vick Khera
Дата:
On Fri, May 14, 2010 at 2:16 PM, Vick Khera <vivek@khera.org> wrote:
> What's your default_statistics_target value?  ie, run "select
> default_statistics_target;"
>

sorry... "show default_statistics_target;"

Re: Poor query performance on one of two "like" databases in production.

От
Дата:
No luck.  I set it in the postgresql.conf file and did a reload, ran analyze on the tables and the query plan isn't any
better.

mxl=# show default_statistics_target;
 default_statistics_target
---------------------------
 100
(1 row)

mxl=# analyze mxl_domain;
ANALYZE
mxl=# analyze mxl_domain_alias;
ANALYZE
mxl=# analyze mxl_user;
ANALYZE
mxl=# analyze mxl_user_alias;
ANALYZE
mxl=# analyze mxl_user_group;
ANALYZE
mxl=# analyze wds_policy_set;
ANALYZE
mxl=# \q
postgres@p01c06d130>
postgres@p01c06d130>
postgres@p01c06d130>
postgres@p01c06d130> set -o vi
postgres@p01c06d130>
postgres@p01c06d130> psql -Upostgres -dmxl -fbadquery.sql
                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=14312.76..2310264747.40 rows=244768 width=1564)
   ->  Hash Join  (cost=14312.76..2309439666.13 rows=1994364 width=1564)
         Hash Cond: ((u.customer_id = d.customer_id) AND (u.domain_id = d.domain_id))
         ->  Append  (cost=2935.45..2289184035.49 rows=1028871 width=1044)
               ->  Index Scan using mxl_user_domain_id_idx on mxl_user u  (cost=2935.45..2017782497.30 rows=902944
width=52)
                     Filter: ((active <> 0) AND (NOT (subplan)))
                     SubPlan
                       ->  Materialize  (cost=2935.45..4761.59 rows=131314 width=4)
                             ->  Seq Scan on mxl_user_group  (cost=0.00..2291.14 rows=131314 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=2943.64..271401538.19 rows=125927 width=1044)
                     Filter: (NOT (subplan))
                     ->  Merge Join  (cost=8.19..95177.59 rows=251854 width=55)
                           Merge Cond: (u.user_id = ua.user_id)
                           ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..77340.62 rows=1873068
width=28)
                           ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..10109.21
rows=251854width=31) 
                                 Filter: (ua.active <> 0)
                     SubPlan
                       ->  Materialize  (cost=2935.45..4761.59 rows=131314 width=4)
                             ->  Seq Scan on mxl_user_group  (cost=0.00..2291.14 rows=131314 width=4)
         ->  Hash  (cost=4989.27..4989.27 rows=77536 width=528)
               ->  Append  (cost=0.00..4989.27 rows=77536 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.81 rows=64385 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.66..3178.46 rows=13151 width=528)
                           ->  Hash Join  (cost=2454.66..3046.95 rows=13151 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..296.39 rows=13151 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.85..1649.85 rows=64385 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..1649.85 rows=64385 width=12)
   ->  Index Scan using wds_policy_set_id_idx on wds_policy_set p  (cost=0.00..0.39 rows=2 width=8)
         Index Cond: (p.id = u.customer_id)
         Filter: ((p.default_flag = 1) AND (p.web_access_flag = 1) AND (p.active = 1) AND (p.scope = 3))
(33 rows)


I also tried a REINDEX / ANALYZE on all of the involved tables and that didn't help either:


mxl=# reindex table mxl_user;
REINDEX
mxl=# reindex table mxl_user_alias;
REINDEX
mxl=# reindex table mxl_domain;
REINDEX
mxl=# reindex table mxl_domain_alias;
REINDEX
mxl=# reindex table mxl_user_group;
REINDEX
mxl=# reindex table wds_policy_set;
REINDEX

mxl=# analyze mxl_user;
ANALYZE
mxl=# analyze mxl_user_alias;
ANALYZE
mxl=# analyze mxl_domain;
ANALYZE
mxl=# analyze mxl_domain_alias;
ANALYZE
mxl=# analyze mxl_user_group;
ANALYZE
mxl=# analyze wds_policy_set;
ANALYZE


On 5/14/10 12:16 PM, "Vick Khera" <vivek@khera.org> wrote:

On Fri, May 14, 2010 at 2:16 PM, Vick Khera <vivek@khera.org> wrote:
> What's your default_statistics_target value?  ie, run "select
> default_statistics_target;"
>

sorry... "show default_statistics_target;"

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: Poor query performance on one of two "like" databases in production.

От
Stephen Frost
Дата:
* Keaton_Adams@McAfee.com (Keaton_Adams@McAfee.com) wrote:
> No luck.  I set it in the postgresql.conf file and did a reload, ran analyze on the tables and the query plan isn't
anybetter. 

Are you sure the database schemas are identical, including indexes, etc?
There's an index being used on the first query that isn't on the
second..  Perhaps it doesn't/shouldn't exist?  Just a thought.

    Stephen

Вложения

Re: Poor query performance on one of two "like" databases in production.

От
Дата:
Yes, I triple checked and the schemas, indexes, FKs, triggers all match.

-K


On 5/14/10 12:29 PM, "Stephen Frost" <sfrost@snowman.net> wrote:

> * Keaton_Adams@McAfee.com (Keaton_Adams@McAfee.com) wrote:
>> No luck.  I set it in the postgresql.conf file and did a reload, ran analyze
>> on the tables and the query plan isn't any better.
>
> Are you sure the database schemas are identical, including indexes, etc?
> There's an index being used on the first query that isn't on the
> second..  Perhaps it doesn't/shouldn't exist?  Just a thought.
>
> Stephen


Re: Poor query performance on one of two "like" databases in production.

От
Stephen Frost
Дата:
* Keaton_Adams@McAfee.com (Keaton_Adams@McAfee.com) wrote:
> Yes, I triple checked and the schemas, indexes, FKs, triggers all match.

Have you checked over for any enable_* settings that are off?  Identical
work_mem and maintenance_work_mem settings?

    Thanks,

        Stephen

Вложения

Re: Poor query performance on one of two "like" databases in production.

От
Josh Kupershmidt
Дата:
On Fri, May 14, 2010 at 1:28 PM,  <Keaton_Adams@mcafee.com> wrote:

> I did run an ANALYZE, then vacuum full, then another ANALYZE on all tables involved in the query.  That didn't change
things. I compared QUERY TUNING settings in both postgresql.conf files and they are identical.  There is a difference
inrow counts in each table between the databases, but not by much. 

Can we see EXPLAIN ANALYZE instead of just EXPLAIN output for the
queries on the two servers?

Also, can you try CLUSTER on the tables involved instead of VACUUM
FULL, to avoid introducing index bloat?

Josh

Re: Poor query performance on one of two "like" databases in production.

От
Дата:
Yes, PG settings are the same. Just checked again.

-K


On 5/14/10 12:54 PM, "Stephen Frost" <sfrost@snowman.net> wrote:

> * Keaton_Adams@McAfee.com (Keaton_Adams@McAfee.com) wrote:
>> Yes, I triple checked and the schemas, indexes, FKs, triggers all match.
>
> Have you checked over for any enable_* settings that are off?  Identical
> work_mem and maintenance_work_mem settings?
>
> Thanks,
>
> Stephen


Re: Poor query performance on one of two "like" databases in production.

От
Дата:
OK, getting closer.

If I comment out the last line ( AND users.user_id NOT IN (SELECT user_id FROM mxl_user_group)) the optimizer goes for
aMerge Join (yea!) and the query runs in 30 seconds.  So something with this NOT IN clause is throwing everything off. 


EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
                users.customer_id,
                users.password,
                p.policy_set_id,
                users.user_id,
                domains.auth_type
         FROM
               (
                 SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, d.active from mxl_domain d
                 UNION ALL
                 SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, da.active from mxl_domain d,
mxl_domain_aliasda WHERE da.domain_id = d.domain_id 
               ) as domains,
               (
                 SELECT u.email, u.customer_id, u.user_id, u.domain_id, u.password, u.active from mxl_user u
                 UNION ALL
                 SELECT ua.email, u.customer_id, u.user_id, u.domain_id, u.password, ua.active from mxl_user u,
mxl_user_aliasua WHERE ua.user_id = u.user_id 
               ) as users,
                wds_policy_set p
         WHERE  users.customer_id  = p.id
         AND    users.customer_id  = domains.customer_id
         AND    users.domain_id    = domains.domain_id
         AND    p.default_flag = 1
         AND    p.web_access_flag = 1
         AND    p.scope  = 3
         AND    domains.active = 1
         AND    users.active != 0
         AND    p.active = 1
         -- AND    users.user_id NOT IN (SELECT user_id FROM mxl_user_group);


postgres> psql -Upostgres -dmxl -fbadquery.sql
                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=2077204.59..2100972.23 rows=490819 width=1564)
   Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
   ->  Sort  (cost=38160.35..38471.30 rows=124381 width=536)
         Sort Key: p.id, d.domain_id
         ->  Hash Join  (cost=578.15..10203.83 rows=124381 width=536)
               Hash Cond: (d.customer_id = p.id)
               ->  Append  (cost=0.00..4989.45 rows=77541 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.88 rows=64390 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.78..3178.57 rows=13151 width=528)
                           ->  Hash Join  (cost=2454.78..3047.06 rows=13151 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..296.39 rows=13151 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.90..1649.90 rows=64390 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..1649.90 rows=64390 width=12)
               ->  Hash  (cost=525.74..525.74 rows=4193 width=8)
                     ->  Seq Scan on wds_policy_set p  (cost=0.00..525.74 rows=4193 width=8)
                           Filter: ((default_flag = 1) AND (web_access_flag = 1) AND (active = 1) AND (scope = 3))
   ->  Materialize  (cost=2039044.24..2064831.74 rows=2063000 width=1044)
         ->  Sort  (cost=2039044.24..2044201.74 rows=2063000 width=1044)
               Sort Key: u.customer_id, u.domain_id
               ->  Append  (cost=0.00..162401.13 rows=2063000 width=1044)
                     ->  Seq Scan on mxl_user u  (cost=0.00..64467.53 rows=1811146 width=52)
                           Filter: (active <> 0)
                     ->  Subquery Scan "*SELECT* 2"  (cost=8.23..97933.60 rows=251854 width=1044)
                           ->  Merge Join  (cost=8.23..95415.06 rows=251854 width=55)
                                 Merge Cond: (u.user_id = ua.user_id)
                                 ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..77564.77 rows=1878522
width=28)
                                 ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..10109.21
rows=251854width=31) 
                                       Filter: (ua.active <> 0)
(31 rows)



postgres> time psql -Upostgres -dmxl -fbadquery.sql -o kda.out

real    0m32.344s
user    0m2.101s
sys    0m0.314s


On 5/14/10 12:55 PM, "Josh Kupershmidt" <schmiddy@gmail.com> wrote:

On Fri, May 14, 2010 at 1:28 PM,  <Keaton_Adams@mcafee.com> wrote:

> I did run an ANALYZE, then vacuum full, then another ANALYZE on all tables involved in the query.  That didn't change
things. I compared QUERY TUNING settings in both postgresql.conf files and they are identical.  There is a difference
inrow counts in each table between the databases, but not by much. 

Can we see EXPLAIN ANALYZE instead of just EXPLAIN output for the
queries on the two servers?

Also, can you try CLUSTER on the tables involved instead of VACUUM
FULL, to avoid introducing index bloat?

Josh


Re: Poor query performance on one of two "like" databases in production.

От
Дата:
OK,

So if I change the NOT IN clause the query will run with a MERGE JOIN and complete in about 20 seconds.  Have a look at
thelogic I am following and see if it makes sense.  Might this just be a case where because there is more data in one
DBcompared to another (even though the counts are "close"), that's the difference in this whole thing? 

-K


postgres@> more badquery.sql
EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
                users.customer_id,
                users.password,
                p.policy_set_id,
                users.user_id,
                domains.auth_type
         FROM
               (
                 SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, d.active from mxl_domain d
                 UNION ALL
                 SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, da.active from mxl_domain d,
mxl_domain_aliasda WHERE da.domain_id = d.domain_id 
               ) as domains,
               (
                 SELECT u.email, u.customer_id, u.user_id, u.domain_id, u.password, u.active from mxl_user u
                 UNION ALL
                 SELECT ua.email, u.customer_id, u.user_id, u.domain_id, u.password, ua.active from mxl_user u,
mxl_user_aliasua WHERE ua.user_id = u.user_id 
               ) as users,
                wds_policy_set p
         WHERE  users.customer_id  = p.id
         AND    users.customer_id  = domains.customer_id
         AND    users.domain_id    = domains.domain_id
         AND    p.default_flag = 1
         AND    p.web_access_flag = 1
         AND    p.scope  = 3
         AND    domains.active = 1
         AND    users.active != 0
         AND    p.active = 1
         AND    users.user_id NOT IN (SELECT user_id FROM mxl_user_group mug WHERE mug.user_id = users.user_id);


postgres@ time psql -Upostgres -dmxl -fbadquery.sql
                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=3522288.96..3534654.13 rows=245730 width=1564)
   Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
   ->  Sort  (cost=38160.35..38471.30 rows=124381 width=536)
         Sort Key: p.id, d.domain_id
         ->  Hash Join  (cost=578.15..10203.83 rows=124381 width=536)
               Hash Cond: (d.customer_id = p.id)
               ->  Append  (cost=0.00..4989.45 rows=77541 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.88 rows=64390 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.78..3178.57 rows=13151 width=528)
                           ->  Hash Join  (cost=2454.78..3047.06 rows=13151 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..296.39 rows=13151 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.90..1649.90 rows=64390 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..1649.90 rows=64390 width=12)
               ->  Hash  (cost=525.74..525.74 rows=4193 width=8)
                     ->  Seq Scan on wds_policy_set p  (cost=0.00..525.74 rows=4193 width=8)
                           Filter: ((default_flag = 1) AND (web_access_flag = 1) AND (active = 1) AND (scope = 3))
   ->  Materialize  (cost=3484128.61..3497039.21 rows=1032848 width=1044)
         ->  Sort  (cost=3484128.61..3486710.73 rows=1032848 width=1044)
               Sort Key: u.customer_id, u.domain_id
               ->  Append  (cost=0.00..2826808.61 rows=1032848 width=1044)
                     ->  Index Scan using mxl_user_domain_id_idx on mxl_user u  (cost=0.00..2416377.66 rows=906921
width=52)
                           Filter: ((active <> 0) AND (NOT (subplan)))
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on mxl_user_group mug  (cost=0.00..2.47 rows=1
width=4)
                                   Index Cond: (user_id = $0)
                     ->  Subquery Scan "*SELECT* 2"  (cost=8.23..410430.95 rows=125927 width=1044)
                           Filter: (NOT (subplan))
                           ->  Merge Join  (cost=8.23..95536.63 rows=251854 width=55)
                                 Merge Cond: (u.user_id = ua.user_id)
                                 ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..77679.47 rows=1881318
width=28)
                                 ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..10109.21
rows=251854width=31) 
                                       Filter: (ua.active <> 0)
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on mxl_user_group mug  (cost=0.00..2.47 rows=1
width=4)
                                   Index Cond: (user_id = $0)
(38 rows)


postgres@ time psql -Upostgres -dmxl -fbadquery.sql -okda.out

real    0m22.645s
user    0m1.565s
sys    0m0.246s
postgres@> wc -l kda.out
285563 kda.out



Re: Poor query performance on one of two "like" databases in production.

От
Дата:
It looks like it is just a difference in data volume.  We are re-working the query to see what that will do.

Thanks for the suggestions.

-K


On 5/14/10 2:23 PM, "Adams, Keaton" <Keaton_Adams@McAfee.com> wrote:

OK,

So if I change the NOT IN clause the query will run with a MERGE JOIN and complete in about 20 seconds.  Have a look at
thelogic I am following and see if it makes sense.  Might this just be a case where because there is more data in one
DBcompared to another (even though the counts are "close"), that's the difference in this whole thing? 

-K


postgres@> more badquery.sql
EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
                users.customer_id,
                users.password,
                p.policy_set_id,
                users.user_id,
                domains.auth_type
         FROM
               (
                 SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, d.active from mxl_domain d
                 UNION ALL
                 SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, da.active from mxl_domain d,
mxl_domain_aliasda WHERE da.domain_id = d.domain_id 
               ) as domains,
               (
                 SELECT u.email, u.customer_id, u.user_id, u.domain_id, u.password, u.active from mxl_user u
                 UNION ALL
                 SELECT ua.email, u.customer_id, u.user_id, u.domain_id, u.password, ua.active from mxl_user u,
mxl_user_aliasua WHERE ua.user_id = u.user_id 
               ) as users,
                wds_policy_set p
         WHERE  users.customer_id  = p.id
         AND    users.customer_id  = domains.customer_id
         AND    users.domain_id    = domains.domain_id
         AND    p.default_flag = 1
         AND    p.web_access_flag = 1
         AND    p.scope  = 3
         AND    domains.active = 1
         AND    users.active != 0
         AND    p.active = 1
         AND    users.user_id NOT IN (SELECT user_id FROM mxl_user_group mug WHERE mug.user_id = users.user_id);


postgres@ time psql -Upostgres -dmxl -fbadquery.sql
                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=3522288.96..3534654.13 rows=245730 width=1564)
   Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
   ->  Sort  (cost=38160.35..38471.30 rows=124381 width=536)
         Sort Key: p.id, d.domain_id
         ->  Hash Join  (cost=578.15..10203.83 rows=124381 width=536)
               Hash Cond: (d.customer_id = p.id)
               ->  Append  (cost=0.00..4989.45 rows=77541 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.88 rows=64390 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.78..3178.57 rows=13151 width=528)
                           ->  Hash Join  (cost=2454.78..3047.06 rows=13151 width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  (cost=0.00..296.39 rows=13151 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.90..1649.90 rows=64390 width=12)
                                       ->  Seq Scan on mxl_domain d  (cost=0.00..1649.90 rows=64390 width=12)
               ->  Hash  (cost=525.74..525.74 rows=4193 width=8)
                     ->  Seq Scan on wds_policy_set p  (cost=0.00..525.74 rows=4193 width=8)
                           Filter: ((default_flag = 1) AND (web_access_flag = 1) AND (active = 1) AND (scope = 3))
   ->  Materialize  (cost=3484128.61..3497039.21 rows=1032848 width=1044)
         ->  Sort  (cost=3484128.61..3486710.73 rows=1032848 width=1044)
               Sort Key: u.customer_id, u.domain_id
               ->  Append  (cost=0.00..2826808.61 rows=1032848 width=1044)
                     ->  Index Scan using mxl_user_domain_id_idx on mxl_user u  (cost=0.00..2416377.66 rows=906921
width=52)
                           Filter: ((active <> 0) AND (NOT (subplan)))
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on mxl_user_group mug  (cost=0.00..2.47 rows=1
width=4)
                                   Index Cond: (user_id = $0)
                     ->  Subquery Scan "*SELECT* 2"  (cost=8.23..410430.95 rows=125927 width=1044)
                           Filter: (NOT (subplan))
                           ->  Merge Join  (cost=8.23..95536.63 rows=251854 width=55)
                                 Merge Cond: (u.user_id = ua.user_id)
                                 ->  Index Scan using mxl_user_pkey on mxl_user u  (cost=0.00..77679.47 rows=1881318
width=28)
                                 ->  Index Scan using mxl_user_alias_uid_idx on mxl_user_alias ua  (cost=0.00..10109.21
rows=251854width=31) 
                                       Filter: (ua.active <> 0)
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on mxl_user_group mug  (cost=0.00..2.47 rows=1
width=4)
                                   Index Cond: (user_id = $0)
(38 rows)


postgres@ time psql -Upostgres -dmxl -fbadquery.sql -okda.out

real    0m22.645s
user    0m1.565s
sys    0m0.246s
postgres@> wc -l kda.out
285563 kda.out




Re: Poor query performance on one of two "like" databases in production.

От
Stephen Frost
Дата:
* Keaton_Adams@McAfee.com (Keaton_Adams@McAfee.com) wrote:
> It looks like it is just a difference in data volume.  We are re-working the query to see what that will do.

Just my 2c, but I'd recommend using JOIN syntax instead of comma-joins.
eg:

select * from a JOIN b USING (col1,col2);

or:

select * from a JOIN b ON (a.col1 = b.col2);

Would make the query alot easier to read, imv.  Also, those UNION ALLs
probably aren't really helping this whole situation..  Perhaps you could
move them out of the FROM clause and just have a simpler query for each
set which is *then* UNION ALL'd together.

    THanks,

        Stephen

    Thanks,

        Stephen

Вложения