Discussion: Planner ignoring to use INDEX SCAN
Hello All,
I have two tables:
master_table:
pk_id bigint primary key,
name text
child_table :
chk_id bigint,
pk_id bigint
mob varchar(15)
foreign key pk_id to master_table
index on pk_id of child
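The description above translates to roughly this DDL (a sketch; the exact type of mob, the constraint syntax, and the index name are assumptions, since they aren't spelled out above):

```sql
-- Sketch of the schema as described; index name is illustrative.
CREATE TABLE master_table (
    pk_id bigint PRIMARY KEY,
    name  text
);

CREATE TABLE child_table (
    chk_id bigint,
    pk_id  bigint REFERENCES master_table (pk_id),
    mob    varchar(15)
);

-- A foreign key does NOT create an index on the referencing
-- column in PostgreSQL; it has to be created separately.
CREATE INDEX child_table_pk_id_idx ON child_table (pk_id);
```

Note that, conversely, dropping the foreign key constraint by itself does not remove a separately created index such as child_table_pk_id_idx.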
A query joining these two tables, which used to take seconds, suddenly started taking 20-25 minutes.
The child table has approximately 100M rows.
The only difference is that I dropped the foreign key pointing to the parent pk_id; after this the planner stopped using an index scan on the child table and is doing a seq scan instead.
Is it because I dropped the foreign key?
Thanks in advance
With regards
Ashish...
Ashish Karalkar wrote:
> query which was taking seconds on the join of these two tables
> suddenly started taking 20/25 min

Show the EXPLAIN ANALYSE of your problem query and someone will be able
to tell you why.

--
Richard Huxton
Archonet Ltd
Ashish Karalkar wrote:
>
> Richard Huxton <dev@archonet.com> wrote:
>> query which was taking seconds on the join of these two tables
>> suddenly started taking 20/25 min
>
> Show the EXPLAIN ANALYSE of your problem query and someone will be able
> to tell you why.
>
> Here is the output from explain analyse:

Actually, this is the output from EXPLAIN, not EXPLAIN ANALYSE. It
doesn't show what actually happened, just what the planner thought was
going to happen. Are the row estimates roughly accurate?

> table structures are more or less same with delivery being parent and
> sms_new being child having index on deliveryid in both tables.
>
> HashAggregate (cost=6153350.21..6153352.38 rows=174 width=32)
>   -> Hash Join (cost=218058.30..6153259.97 rows=6016 width=32)
>        Hash Cond: ("outer".deliveryid = "inner".deliveryid)
>        -> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
>              Filter: ((otid)::text !~~ 'ERROR%'::text)
>        -> Hash (cost=218057.87..218057.87 rows=174 width=32)

Well, it knows that it's going to be expensive (cost=5240444.80). Since
it thinks you'll only get 174 rows from the other side and 6016
matching, I can't see how an index could be calculated as more expensive.

Try issuing SET enable_seqscan = off and re-running the EXPLAIN; let's see
what cost that comes up with.

Oh, and I take it sms_new is recently vacuumed and analysed?

--
Richard Huxton
Archonet Ltd
Thanks Richard for your reply,
here is the output..
Richard Huxton <dev@archonet.com> wrote:
Ashish Karalkar wrote:
>
> Richard Huxton wrote: Ashish Karalkar wrote:
>> query which was taking seconds on the join of these two table
>> suddenly started taking 20/25 min
>
> Show the EXPLAIN ANALYSE of your problem query and someone will be able
> to tell you why.
>
> Here is the output from explain analyse:
Actually, this is the output from EXPLAIN not EXPLAIN ANALYSE. It
doesn't show what actually happened, just what the planner thought was
going to happen.
It's from EXPLAIN ANALYSE.
Are the row-estimates roughly accurate?
Yes, the row count of sms_new is approximately the same.
> table structures are more or less same with delivery being parent and sms_new being child having index on deliveryid in both tables.
>
> HashAggregate (cost=6153350.21..6153352.38 rows=174 width=32)
> -> Hash Join (cost=218058.30..6153259.97 rows=6016 width=32)
> Hash Cond: ("outer".deliveryid = "inner".deliveryid)
> -> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
> Filter: ((otid)::text !~~ 'ERROR%'::text)
> -> Hash (cost=218057.87..218057.87 rows=174 width=32)
Well, it knows that it's going to be expensive (cost=5240444.80). Since
it thinks you'll only get 174 rows from the other side and 6016
matching, I can't see how an index could be calculated as more expensive.
Try issuing SET enable_seqscan = off and re-running the EXPLAIN; let's see
what cost that comes up with.
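For reference, the suggested experiment only affects the current session, and the query below is a placeholder for the actual problem query (which is not shown in the thread):

```sql
-- Discourage sequential scans for this session only (this raises
-- their cost estimate rather than forbidding them outright).
SET enable_seqscan = off;

EXPLAIN ANALYSE
SELECT ...;  -- re-run the problem query here

-- Restore the default afterwards.
RESET enable_seqscan;
```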
Here is the EXPLAIN output after setting enable_seqscan = off:
HashAggregate (cost=27729224.21..27729226.21 rows=160 width=32)
-> Nested Loop (cost=2534.67..27729143.31 rows=5393 width=32)
-> Bitmap Heap Scan on delivery (cost=2094.41..216143.78 rows=160 width=32)
Recheck Cond: ((createddate >= '2007-12-10 00:00:00'::timestamp without time zone) AND (createddate <= '2007-12-11 00:00:00'::timestamp without time zone))
Filter: ((taskid = 14267) AND (((remoteip)::text = '192.168.3.26'::text) OR ((remoteip)::text = '202.162.231.230'::text) OR ((remoteip)::text = '202.162.231.2'::text) OR ((remoteip)::text = '192.168.4.3'::text) OR ((remoteip)::text = '192.168.3.3'::text) OR ((remoteip)::text = '202.162.231.105'::text) OR ((remoteip)::text = '202.162.231.5'::text) OR ((remoteip)::text = '202.162.231.1'::text) OR ((remoteip)::text = '192.168.4.6'::text) OR ((remoteip)::text = '192.168.3.6'::text) OR ((remoteip)::text = '202.162.231.107'::text) OR ((remoteip)::text = '202.162.231.7'::text)))
-> Bitmap Index Scan on createddate_idx (cost=0.00..2094.41 rows=197068 width=0)
Index Cond: ((createddate >= '2007-12-10 00:00:00'::timestamp without time zone) AND (createddate <= '2007-12-11 00:00:00'::timestamp without time zone))
-> Bitmap Heap Scan on sms_new (cost=440.26..171369.61 rows=46931 width=8)
Recheck Cond: (sms_new.deliveryid = "outer".deliveryid)
Filter: ((otid)::text !~~ 'ERROR%'::text)
-> Bitmap Index Scan on sms_new_deliveryid_idx (cost=0.00..440.26 rows=46931 width=0)
Index Cond: (sms_new.deliveryid = "outer".deliveryid)
Oh, and I take it sms_new is recently vacuumed and analysed?
Yes, it is vacuumed and analysed.
--
Richard Huxton
Archonet Ltd
Ashish Karalkar wrote:
> Thanks Richard for your reply,
>
> here is the output..
>
> Richard Huxton <dev@archonet.com> wrote:
>> Actually, this is the output from EXPLAIN not EXPLAIN ANALYSE. It
>> doesn't show what actually happened, just what the planner thought was
>> going to happen.
>
> Its from EXPLAIN ANALYSE

No it's not, because that shows an extra set of figures. If you see
below there is a cost for each stage and an actual time too.

EXPLAIN ANALYSE SELECT d.id, l.name FROM items.documents d JOIN lookups.document_class l ON d.class=l.id;
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.16..14.00 rows=162 width=19) (actual time=0.100..0.392 rows=162 loops=1)
   Hash Cond: ((d.class)::text = (l.id)::text)
   ->  Seq Scan on documents d  (cost=0.00..10.62 rows=162 width=9) (actual time=0.024..0.121 rows=162 loops=1)
   ->  Hash  (cost=1.07..1.07 rows=7 width=20) (actual time=0.029..0.029 rows=8 loops=1)
         ->  Seq Scan on document_class l  (cost=0.00..1.07 rows=7 width=20) (actual time=0.008..0.015 rows=8 loops=1)
 Total runtime: 0.506 ms
(6 rows)

> Are the row-estimates roughly accurate?
>
> Yes Row count of sms_new is approx. same

OK, that's good.

>> table structures are more or less same with delivery being parent and
>> sms_new being child having index on deliveryid in both tables.
>>
>> HashAggregate (cost=6153350.21..6153352.38 rows=174 width=32)
>>   -> Hash Join (cost=218058.30..6153259.97 rows=6016 width=32)
>>        Hash Cond: ("outer".deliveryid = "inner".deliveryid)
>>        -> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
>>              Filter: ((otid)::text !~~ 'ERROR%'::text)
>>        -> Hash (cost=218057.87..218057.87 rows=174 width=32)
>
> Well, it knows that it's going to be expensive (cost=5240444.80). Since
> it thinks you'll only get 174 rows from the other side and 6016
> matching, I can't see how an index could be calculated as more expensive.
>
> Try issuing SET enable_seqscan = off and re-running the EXPLAIN; let's see
> what cost that comes up with.
>
> here is explain output after setting enable_seqscan=off
>
> HashAggregate (cost=27729224.21..27729226.21 rows=160 width=32)
>   -> Nested Loop (cost=2534.67..27729143.31 rows=5393 width=32)
>        -> Bitmap Heap Scan on delivery (cost=2094.41..216143.78 rows=160 width=32)
>        -> Bitmap Heap Scan on sms_new (cost=440.26..171369.61 rows=46931 width=8)

Well, the estimated cost for this one is up to 27 million from the
previous 6 million. It's doing two bitmap scans and then the nested
loop, which is what's pushing the cost up.

Can you post the query too?

--
Richard Huxton
Archonet Ltd
"Ashish Karalkar" <ashish_postgre@yahoo.co.in> writes:
> Thanks Richard for your reply,
>
> Richard Huxton <dev@archonet.com> wrote:
>> Actually, this is the output from EXPLAIN not EXPLAIN ANALYSE. It
>> doesn't show what actually happened, just what the planner thought was
>> going to happen.
>
> Its from EXPLAIN ANALYSE

No, it's not. Not unless you've removed all the added info explain
analyse adds to the explain output.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
Richard Huxton <dev@archonet.com> wrote:
Ashish Karalkar wrote:
> query which was taking seconds on the join of these two table
> suddenly started taking 20/25 min
Show the EXPLAIN ANALYSE of your problem query and someone will be able
to tell you why.
Here is the output from explain analyse:
table structures are more or less same with delivery being parent and sms_new being child having index on deliveryid in both tables.
HashAggregate (cost=6153350.21..6153352.38 rows=174 width=32)
-> Hash Join (cost=218058.30..6153259.97 rows=6016 width=32)
Hash Cond: ("outer".deliveryid = "inner".deliveryid)
-> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
Filter: ((otid)::text !~~ 'ERROR%'::text)
-> Hash (cost=218057.87..218057.87 rows=174 width=32)
-> Bitmap Heap Scan on delivery (cost=2218.02..218057.87 rows=174 width=32)
Recheck Cond: ((createddate >= '2007-12-10 00:00:00'::timestamp without time zone) AND (createddate <= '2007-12-11 00:00:00'::timestamp without time zone))
Filter: ((taskid = 14267) AND (((remoteip)::text = '192.168.3.26'::text) OR ((remoteip)::text = '202.162.231.230'::text) OR ((remoteip)::text = '202.162.231.2'::text) OR ((remoteip)::text = '192.168.4.3'::text) OR ((remoteip)::text = '192.168.3.3'::text) OR ((remoteip)::text = '202.162.231.105'::text) OR ((remoteip)::text = '202.162.231.5'::text) OR ((remoteip)::text = '202.162.231.1'::text) OR ((remoteip)::text = '192.168.4.6'::text) OR ((remoteip)::text = '192.168.3.6'::text) OR ((remoteip)::text = '202.162.231.107'::text) OR ((remoteip)::text = '202.162.231.7'::text)))
-> Bitmap Index Scan on createddate_idx (cost=0.00..2218.02 rows=207004 width=0)
Index Cond: ((createddate >= '2007-12-10 00:00:00'::timestamp without time zone) AND (createddate <= '2007-12-11 00:00:00'::timestamp without time zone))
Richard Huxton
Archonet Ltd
Hi - I've been having a few problems like this myself. It's probably a
locale thing. Here's one of your problems:

-> Seq Scan on sms_new (cost=0.00..5240444.80 rows=138939341 width=8)
     Filter: ((otid)::text !~~ 'ERROR%'::text)

I assume you have an index on sms_new? Check the locale your database
cluster has been set up with:

SHOW lc_ctype;

If it doesn't come back with "C" as your locale, then you will have to
make sure your indexes are created with a pattern-matching operator
class, for example:

CREATE INDEX idx_sms_new_otid ON sms_new USING btree (otid varchar_pattern_ops);

Look at chapter 11.8 - Operator Classes for an explanation.
Unfortunately you'll need a second index if you want to use a
non-pattern-matching operator, e.g. =.

If your locale is C, then I don't know what the problem is.

Hope that helps.

Will Temperley
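The two-index setup suggested above can be sketched as follows (index names are illustrative; I'm assuming the filtered column is otid on sms_new, as in the plan's Filter line):

```sql
-- In a non-C locale, a default btree index cannot be used for
-- LIKE 'prefix%' patterns; an opclass index can:
CREATE INDEX sms_new_otid_pattern_idx
    ON sms_new USING btree (otid varchar_pattern_ops);

-- A second index with the default opclass is still needed for
-- ordinary comparisons such as = or <:
CREATE INDEX sms_new_otid_idx
    ON sms_new USING btree (otid);
```

Note that this query's filter is NOT LIKE (`!~~`), which matches most of the table, so even a pattern-ops index may not help the planner here.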