Обсуждение: Planner making wrong decisions 8.2.4. Insane cost calculations.
Hi list, I'm having a weird acting query which simply retrieves some files stored in a db which are related to a specific archive and also has a size lower than 1024 bytes. Explain analyze below. The first one is with seq-scan enabled and the other one with seq-scans disabled. The weird thing is the seq scan on tbl_file_structure and also the insane calculated cost of 100 000 000 on some tables. Explain analyze below with both seq scan on and off. Regards, Henrik EXPLAIN ANALYZE SELECT pk_file_id, file_name_in_tar, tar_name, file_suffix, fk_tar_id, tar_compressed FROM tbl_file INNER JOIN tbl_file_structure ON fk_file_id = pk_file_id INNER JOIN tbl_structure ON fk_structure_id = pk_structure_id LEFT OUTER JOIN tbl_tar ON fk_tar_id = pk_tar_id WHERE file_indexed IS FALSE AND file_copied IS TRUE AND file_size < (1024) AND LOWER (file_suffix) IN( SELECT LOWER (filetype_suffix) FROM tbl_filetype_suffix WHERE filetype_suffix_index IS TRUE ) AND fk_archive_id = 115 ORDER BY fk_tar_id "Sort (cost=238497.42..238499.49 rows=828 width=76) (actual time=65377.316..65377.321 rows=5 loops=1)" " Sort Key: tbl_file.fk_tar_id" " -> Hash Left Join (cost=39935.64..238457.29 rows=828 width=76) (actual time=61135.732..65377.246 rows=5 loops=1)" " Hash Cond: (tbl_file.fk_tar_id = tbl_tar.pk_tar_id)" " -> Hash Join (cost=39828.67..238336.86 rows=828 width=50) (actual time=60776.587..65018.062 rows=5 loops=1)" " Hash Cond: (tbl_file_structure.fk_structure_id = tbl_structure.pk_structure_id)" " -> Hash Join (cost=30975.39..228750.72 rows=72458 width=58) (actual time=14256.555..64577.950 rows=4650 loops=1)" " Hash Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id)" " -> Seq Scan on tbl_file_structure (cost=0.00..167417.09 rows=7902309 width=16) (actual time=9.581..33702.852 rows=7801334 loops=1)" " -> Hash (cost=30874.63..30874.63 rows=8061 width=50) (actual time=14058.396..14058.396 rows=486 loops=1)" " -> Hash Join (cost=3756.16..30874.63 rows=8061 width=50) (actual time=9373.992..14056.119 rows=486 loops=1)" " Hash Cond: (lower ((tbl_file.file_suffix)::text) = lower ((tbl_filetype_suffix.filetype_suffix)::text))" " -> Bitmap Heap Scan on tbl_file (cost=3754.47..29939.50 rows=136453 width=50) (actual time=9068.525..13654.235 rows=154605 loops=1)" " Recheck Cond: (file_size < 1024)" " Filter: ((file_indexed IS FALSE) AND (file_copied IS TRUE))" " -> Bitmap Index Scan on tbl_file_idx4 (cost=0.00..3720.36 rows=195202 width=0) (actual time=9002.683..9002.683 rows=205084 loops=1)" " Index Cond: (file_size < 1024)" " -> Hash (cost=1.52..1.52 rows=14 width=8) (actual time=0.557..0.557 rows=14 loops=1)" " -> HashAggregate (cost=1.38..1.52 rows=14 width=8) (actual time=0.484..0.507 rows=14 loops=1)" " -> Seq Scan on tbl_filetype_suffix (cost=0.00..1.34 rows=14 width=8) (actual time=0.383..0.423 rows=14 loops=1)" " Filter: (filetype_suffix_index IS TRUE)" " -> Hash (cost=8778.54..8778.54 rows=5979 width=8) (actual time=419.491..419.491 rows=11420 loops=1)" " -> Bitmap Heap Scan on tbl_structure (cost=617.08..8778.54 rows=5979 width=8) (actual time=114.501..393.685 rows=11420 loops=1)" " Recheck Cond: (fk_archive_id = 115)" " -> Bitmap Index Scan on tbl_structure_idx3 (cost=0.00..615.59 rows=5979 width=0) (actual time=100.939..100.939 rows=11420 loops=1)" " Index Cond: (fk_archive_id = 115)" " -> Hash (cost=64.21..64.21 rows=3421 width=34) (actual time=359.043..359.043 rows=3485 loops=1)" " -> Seq Scan on tbl_tar (cost=0.00..64.21 rows=3421 width=34) (actual time=19.287..348.237 rows=3485 loops=1)" "Total runtime: 65378.552 ms" Now I disabled seq scans. set enable_seqscan=false; "Merge Left Join (cost=100331398.53..100331526.36 rows=828 width=76) (actual time=36206.575..36291.847 rows=5 loops=1)" " Merge Cond: (tbl_file.fk_tar_id = tbl_tar.pk_tar_id)" " -> Sort (cost=100331398.53..100331400.60 rows=828 width=50) (actual time=36030.473..36030.487 rows=5 loops=1)" " Sort Key: tbl_file.fk_tar_id" " -> Hash Join (cost=100012609.44..100331358.40 rows=828 width=50) (actual time=27279.046..36030.399 rows=5 loops=1)" " Hash Cond: (tbl_file_structure.fk_structure_id = tbl_structure.pk_structure_id)" " -> Nested Loop (cost=100003756.16..100321772.87 rows=72397 width=58) (actual time=13225.815..35533.414 rows=4650 loops=1)" " -> Hash Join (cost=100003756.16..100030874.63 rows=8061 width=50) (actual time=12888.880..19845.110 rows=486 loops=1)" " Hash Cond: (lower ((tbl_file.file_suffix)::text) = lower ((tbl_filetype_suffix.filetype_suffix)::text))" " -> Bitmap Heap Scan on tbl_file (cost=3754.47..29939.50 rows=136453 width=50) (actual time=12747.478..19266.843 rows=154605 loops=1)" " Recheck Cond: (file_size < 1024)" " Filter: ((file_indexed IS FALSE) AND (file_copied IS TRUE))" " -> Bitmap Index Scan on tbl_file_idx4 (cost=0.00..3720.36 rows=195202 width=0) (actual time=12689.593..12689.593 rows=205084 loops=1)" " Index Cond: (file_size < 1024)" " -> Hash (cost=100000001.52..100000001.52 rows=14 width=8) (actual time=0.313..0.313 rows=14 loops=1)" " -> HashAggregate (cost=100000001.38..100000001.52 rows=14 width=8) (actual time=0.230..0.254 rows=14 loops=1)" " -> Seq Scan on tbl_filetype_suffix (cost=100000000.00..100000001.34 rows=14 width=8) (actual time=0.133..0.176 rows=14 loops=1)" " Filter: (filetype_suffix_index IS TRUE)" " -> Index Scan using tbl_file_structure_idx on tbl_file_structure (cost=0.00..35.82 rows=21 width=16) (actual time=7.031..32.178 rows=10 loops=486)" " Index Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id)" " -> Hash (cost=8778.54..8778.54 rows=5979 width=8) (actual time=445.799..445.799 rows=11420 loops=1)" " -> Bitmap Heap Scan on tbl_structure (cost=617.08..8778.54 rows=5979 width=8) (actual time=155.046..419.676 rows=11420 loops=1)" " Recheck Cond: (fk_archive_id = 115)" " -> Bitmap Index Scan on tbl_structure_idx3 (cost=0.00..615.59 rows=5979 width=0) (actual time=126.623..126.623 rows=11420 loops=1)" " Index Cond: (fk_archive_id = 115)" " -> Index Scan using tbl_tar_pkey on tbl_tar (cost=0.00..106.86 rows=3421 width=34) (actual time=22.218..251.830 rows=2491 loops=1)" "Total runtime: 36292.481 ms"
"Henrik Zagerholm" <henke@mac.se> writes: > Hi list, > > I'm having a weird acting query which simply retrieves some files stored in a db > which are related to a specific archive and also has a size lower than 1024 > bytes. > Explain analyze below. The first one is with seq-scan enabled and the other one > with seq-scans disabled. The weird thing is the seq scan on tbl_file_structure > and also the insane calculated cost of 100 000 000 on some tables. Well the way Postgres disables a plan node type is by giving it a cost of 100,000,000. What other way did you expect it to be able to scan tbl_filetype_suffix anyways? What indexes do you have on tbl_filetype_suffix? And any chance you could resend this stuff without the word-wrapping? It's pretty hard to read like this: " -> Seq Scan on tbl_filetype_suffix (cost=100000000.00..100000001.34 rows=14 width=8) (actual time=0.133..0.176 rows=14 loops=1)" " Filter: (filetype_suffix_index IS TRUE)" -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
6 aug 2007 kl. 15:07 skrev Gregory Stark: > "Henrik Zagerholm" <henke@mac.se> writes: > >> Hi list, >> >> I'm having a weird acting query which simply retrieves some files >> stored in a db >> which are related to a specific archive and also has a size lower >> than 1024 >> bytes. >> Explain analyze below. The first one is with seq-scan enabled and >> the other one >> with seq-scans disabled. The weird thing is the seq scan on >> tbl_file_structure >> and also the insane calculated cost of 100 000 000 on some tables. > > Well the way Postgres disables a plan node type is by giving it a > cost of > 100,000,000. What other way did you expect it to be able to scan > tbl_filetype_suffix anyways? What indexes do you have on > tbl_filetype_suffix? > Ahh, my bad. It is a very small table but I have an unique index. CREATE UNIQUE INDEX tbl_filetype_suffix_idx ON tbl_filetype_suffix USING btree (filetype_suffix); > And any chance you could resend this stuff without the word-wrapping? > It's pretty hard to read like this: > Resending and hopefully the line breaks are gone. I couldn't find any in my sent mail. If this doesn't work I'll paste it on pastie. The weird thing is that the seq scan on tbl_file_structure is soooo slow but when I force an index scan it is very fast. > " -> Seq Scan on > tbl_filetype_suffix > (cost=100000000.00..100000001.34 rows=14 width=8) (actual > time=0.133..0.176 > rows=14 loops=1)" > " Filter: > (filetype_suffix_index IS > TRUE)" > Cheers, Henrik EXPLAIN ANALYZE SELECT pk_file_id, file_name_in_tar, tar_name, file_suffix, fk_tar_id, tar_compressed FROM tbl_file INNER JOIN tbl_file_structure ON fk_file_id = pk_file_id INNER JOIN tbl_structure ON fk_structure_id = pk_structure_id LEFT OUTER JOIN tbl_tar ON fk_tar_id = pk_tar_id WHERE file_indexed IS FALSE AND file_copied IS TRUE AND file_size < (1024) AND LOWER (file_suffix) IN( SELECT LOWER (filetype_suffix) FROM tbl_filetype_suffix WHERE filetype_suffix_index IS TRUE ) AND fk_archive_id = 115 ORDER BY fk_tar_id Sort (cost=238497.42..238499.49 rows=828 width=76) (actual time=65377.316..65377.321 rows=5 loops=1) Sort Key: tbl_file.fk_tar_id -> Hash Left Join (cost=39935.64..238457.29 rows=828 width=76) (actual time=61135.732..65377.246 rows=5 loops=1) Hash Cond: (tbl_file.fk_tar_id = tbl_tar.pk_tar_id) -> Hash Join (cost=39828.67..238336.86 rows=828 width=50) (actual time=60776.587..65018.062 rows=5 loops=1) Hash Cond: (tbl_file_structure.fk_structure_id = tbl_structure.pk_structure_id) -> Hash Join (cost=30975.39..228750.72 rows=72458 width=58) (actual time=14256.555..64577.950 rows=4650 loops=1) Hash Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id) -> Seq Scan on tbl_file_structure (cost=0.00..167417.09 rows=7902309 width=16) (actual time=9.581..33702.852 rows=7801334 loops=1) -> Hash (cost=30874.63..30874.63 rows=8061 width=50) (actual time=14058.396..14058.396 rows=486 loops=1) -> Hash Join (cost=3756.16..30874.63 rows=8061 width=50) (actual time=9373.992..14056.119 rows=486 loops=1) Hash Cond: (lower ((tbl_file.file_suffix)::text) = lower ((tbl_filetype_suffix.filetype_suffix)::text)) -> Bitmap Heap Scan on tbl_file (cost=3754.47..29939.50 rows=136453 width=50) (actual time=9068.525..13654.235 rows=154605 loops=1) Recheck Cond: (file_size < 1024) Filter: ((file_indexed IS FALSE) AND (file_copied IS TRUE)) -> Bitmap Index Scan on tbl_file_idx4 (cost=0.00..3720.36 rows=195202 width=0) (actual time=9002.683..9002.683 rows=205084 loops=1) Index Cond: (file_size < 1024) -> Hash (cost=1.52..1.52 rows=14 width=8) (actual time=0.557..0.557 rows=14 loops=1) -> HashAggregate (cost=1.38..1.52 rows=14 width=8) (actual time=0.484..0.507 rows=14 loops=1) -> Seq Scan on tbl_filetype_suffix (cost=0.00..1.34 rows=14 width=8) (actual time=0.383..0.423 rows=14 loops=1) Filter: (filetype_suffix_index IS TRUE) -> Hash (cost=8778.54..8778.54 rows=5979 width=8) (actual time=419.491..419.491 rows=11420 loops=1) -> Bitmap Heap Scan on tbl_structure (cost=617.08..8778.54 rows=5979 width=8) (actual time=114.501..393.685 rows=11420 loops=1) Recheck Cond: (fk_archive_id = 115) -> Bitmap Index Scan on tbl_structure_idx3 (cost=0.00..615.59 rows=5979 width=0) (actual time=100.939..100.939 rows=11420 loops=1) Index Cond: (fk_archive_id = 115) -> Hash (cost=64.21..64.21 rows=3421 width=34) (actual time=359.043..359.043 rows=3485 loops=1) -> Seq Scan on tbl_tar (cost=0.00..64.21 rows=3421 width=34) (actual time=19.287..348.237 rows=3485 loops=1) Total runtime: 65378.552 ms Now I disabled seq scans. set enable_seqscan=false; Merge Left Join (cost=100331398.53..100331526.36 rows=828 width=76) (actual time=36206.575..36291.847 rows=5 loops=1) Merge Cond: (tbl_file.fk_tar_id = tbl_tar.pk_tar_id) -> Sort (cost=100331398.53..100331400.60 rows=828 width=50) (actual time=36030.473..36030.487 rows=5 loops=1) Sort Key: tbl_file.fk_tar_id -> Hash Join (cost=100012609.44..100331358.40 rows=828 width=50) (actual time=27279.046..36030.399 rows=5 loops=1) Hash Cond: (tbl_file_structure.fk_structure_id = tbl_structure.pk_structure_id) -> Nested Loop (cost=100003756.16..100321772.87 rows=72397 width=58) (actual time=13225.815..35533.414 rows=4650 loops=1) -> Hash Join (cost=100003756.16..100030874.63 rows=8061 width=50) (actual time=12888.880..19845.110 rows=486 loops=1) Hash Cond: (lower ((tbl_file.file_suffix)::text) = lower ((tbl_filetype_suffix.filetype_suffix)::text)) -> Bitmap Heap Scan on tbl_file (cost=3754.47..29939.50 rows=136453 width=50) (actual time=12747.478..19266.843 rows=154605 loops=1) Recheck Cond: (file_size < 1024) Filter: ((file_indexed IS FALSE) AND (file_copied IS TRUE)) -> Bitmap Index Scan on tbl_file_idx4 (cost=0.00..3720.36 rows=195202 width=0) (actual time=12689.593..12689.593 rows=205084 loops=1) Index Cond: (file_size < 1024) -> Hash (cost=100000001.52..100000001.52 rows=14 width=8) (actual time=0.313..0.313 rows=14 loops=1) -> HashAggregate (cost=100000001.38..100000001.52 rows=14 width=8) (actual time=0.230..0.254 rows=14 loops=1) -> Seq Scan on tbl_filetype_suffix (cost=100000000.00..100000001.34 rows=14 width=8) (actual time=0.133..0.176 rows=14 loops=1) Filter: (filetype_suffix_index IS TRUE) -> Index Scan using tbl_file_structure_idx on tbl_file_structure (cost=0.00..35.82 rows=21 width=16) (actual time=7.031..32.178 rows=10 loops=486) Index Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id) -> Hash (cost=8778.54..8778.54 rows=5979 width=8) (actual time=445.799..445.799 rows=11420 loops=1) -> Bitmap Heap Scan on tbl_structure (cost=617.08..8778.54 rows=5979 width=8) (actual time=155.046..419.676 rows=11420 loops=1) Recheck Cond: (fk_archive_id = 115) -> Bitmap Index Scan on tbl_structure_idx3 (cost=0.00..615.59 rows=5979 width=0) (actual time=126.623..126.623 rows=11420 loops=1) Index Cond: (fk_archive_id = 115) -> Index Scan using tbl_tar_pkey on tbl_tar (cost=0.00..106.86 rows=3421 width=34) (actual time=22.218..251.830 rows=2491 loops=1) Total runtime: 36292.481 ms > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that > your > message can get through to the mailing list cleanly
"Henrik Zagerholm" <henke@mac.se> writes: > Ahh, my bad. It is a very small table but I have an unique index. > CREATE UNIQUE INDEX tbl_filetype_suffix_idx ON tbl_filetype_suffix > USING btree (filetype_suffix); Well it can't use that to help with a join. If you had an index on lower(filetype_suffix) it might be able to use it. I'm not sure though, especially if it's a small table. >> And any chance you could resend this stuff without the word-wrapping? >> It's pretty hard to read like this: >> > Resending and hopefully the line breaks are gone. I couldn't find any > in my sent mail. No, the double-quotes are gone but the lines are still wrapped. It's become quite a hassle recently to get mailers to do anything reasonable with code. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Henrik Zagerholm <henke@mac.se> writes: > WHERE file_indexed IS FALSE > AND file_copied IS TRUE > AND file_size < (1024) > AND LOWER > (file_suffix) IN( > SELECT LOWER > (filetype_suffix) FROM tbl_filetype_suffix WHERE > filetype_suffix_index IS TRUE > ) AND fk_archive_id > = 115 ORDER BY fk_tar_id Do you really need the lower() calls there? The planner is getting the wrong estimate for the selectivity of the IN-clause, which is likely because it has no statistics about lower(file_suffix) or lower(filetype_suffix). If you don't want to constrain the data to be already lower'd, then setting up functional indexes on the two lower() expressions should prompt ANALYZE to track stats for them. regards, tom lane
Re: [GENERAL] Planner making wrong decisions 8.2.4. Insane cost calculations.
От
Henrik Zagerholm
Дата:
6 aug 2007 kl. 17:31 skrev Tom Lane: > Henrik Zagerholm <henke@mac.se> writes: >> WHERE file_indexed IS FALSE >> AND file_copied >> IS TRUE >> AND file_size < >> (1024) >> AND LOWER >> (file_suffix) IN( >> SELECT LOWER >> (filetype_suffix) FROM tbl_filetype_suffix WHERE >> filetype_suffix_index IS TRUE >> ) AND fk_archive_id >> = 115 ORDER BY fk_tar_id > > Do you really need the lower() calls there? The planner is getting > the > wrong estimate for the selectivity of the IN-clause, which is likely > because it has no statistics about lower(file_suffix) or > lower(filetype_suffix). > > If you don't want to constrain the data to be already lower'd, then > setting up functional indexes on the two lower() expressions should > prompt ANALYZE to track stats for them. > OK, thanx for the tip. I actually think that all the suffixes are lower case so the lower should go. But would this really impact the sequential scan on tbl_file_structure? ->Seq Scan on tbl_file_structure (cost=0.00..167417.09 rows=7902309 width=16) (actual time=9.581..33702.852 rows=7801334 loops=1) At what point does the planner choose seq scans? I've seen the planner use seq scan even though only 1% of the joining tables rows are selected. If the filter gives me 70k rows from tbl_file and tbl_file_structure has 8 million rows why does the planner choose seq scans? Cheers, Henrik > regards, tom lane > > ---------------------------(end of > broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate
Henrik Zagerholm <henke@mac.se> writes: > At what point does the planner choose seq scans? When it thinks it's cheaper than the other way. There's no hard and fast answer. The immediate problem you've got is that the estimated size of the tbl_file/tbl_filetype_suffix join is off by a factor of almost 20 (8061 vs 486). The plan that you think would be faster involves an inner indexscan on the larger table for each result row from that join, and therefore this error translates directly to a 20x overestimate of its cost, and therefore the planner avoids that in favor of a hash join that indeed is more efficient when there are lots of rows to be joined. It may well be that you also need to adjust random_page_cost and/or effective_cache_size so that the planner's estimates of indexscan vs seqscan costs are more in line with reality on your machine. But it's a capital error to tinker with those numbers on the basis of an example where the rowcount estimates are so far off. (Actually I'd not advise changing them on the basis of *any* single test case, you need to look at average behaviors. Get the count estimates fixed first and then see where you are.) It's also not impossible that the planner is right and the seqscan is better than a lot of index probes ... regards, tom lane