Thread: Bitmap Heap Scan anomaly
I have done the following test and I am unable to understand the results. I have tried debugging the code and I have reached down to the storage layer. I am playing with the optimizer etc.; I know very little about the internals of the executor.

If you could point out to me a possible explanation for such an anomaly I would be very glad.

Thanks,
Makarona

My Test:

Setup:
-------
I have created two very similar tables, mycorr_10 and mycorr_100; the attribute names are {key, a, b} for both tables.
I added 16M rows to both tables in the following fashion:
  I gave a random value to each attribute key (don't care)
  Values in a, b take a random value from [1-16M]
  In the case of mycorr_10 I set a random 10% of the rows to a=b
  In the case of mycorr_100 I set all a=b
  I created an index on {a,b} on both tables
  I ran VACUUM ANALYZE
p.s. I am trying to simulate an optimizer cardinality estimation error due to the independence assumption.

Query:
-------
SELECT count(key)
FROM mycorr_10 -- (or mycorr_100)
WHERE a>15900000 and b>15900000;

Explain:
----------
As expected, using the independence assumption the planner chooses to use the index for both tables:

 Aggregate ([4130.82][4130.83][1][94083.95][94083.96][1] width=4)
   ->  Bitmap Heap Scan on mycorr_100 ([1997.92][4129.41][566][2021.57][93846.00][95177] width=4)
         Recheck Cond: ((a > 15900000) AND (b > 15900000))
         ->  Bitmap Index Scan on ab_100 ([0.00][1997.77][566][0.00][1997.77][95177] width=0)
               Index Cond: ((a > 15900000) AND (b > 15900000))
(5 rows)

p.s. The EXPLAIN output may look weird as I have changed it a bit.

Explain Analyze:
---------------------

restart postgres
echo 1 > /proc/sys/vm/drop_caches  (drop file system caches)
explain analyze select count(key) from mycorr_10 where a>15900000 and b>15900000;
restart postgres
echo 1 > /proc/sys/vm/drop_caches
explain analyze select count(key) from mycorr_100 where a>15900000 and b>15900000;

Result for mycorr_100:
---------------------------
 Aggregate ([4130.82][4130.83][1][94083.95][94083.96][1] width=4) (actual time=11424.077..11424.078 rows=1 loops=1)
   ->  Bitmap Heap Scan on mycorr_100 ([1997.92][4129.41][566][2021.57][93846.00][95177] width=4) (actual time=167.979..11304.413 rows=100000 loops=1)
         Recheck Cond: ((a > 15900000) AND (b > 15900000))
         ->  Bitmap Index Scan on ab_100 ([0.00][1997.77][566][0.00][1997.77][95177] width=0) (actual time=120.127..120.127 rows=100000 loops=1)
               Index Cond: ((a > 15900000) AND (b > 15900000))
 Total runtime: 11426.329 ms
(6 rows)

Result for mycorr_10:
---------------------------
 Aggregate ([4608.36][4608.37][1][94197.91][94197.92][1] width=4) (actual time=24393.058..24393.058 rows=1 loops=1)
   ->  Bitmap Heap Scan on mycorr_10 ([2249.51][4606.79][629][2272.83][93963.14][93908] width=4) (actual time=108.219..24374.050 rows=10563 loops=1)
         Recheck Cond: ((a > 15900000) AND (b > 15900000))
         ->  Bitmap Index Scan on ab_10 ([0.00][2249.35][629][0.00][2249.35][93908] width=0) (actual time=89.432..89.432 rows=10563 loops=1)
               Index Cond: ((a > 15900000) AND (b > 15900000))
 Total runtime: 24393.555 ms
(6 rows)
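A rough hand calculation of those estimates, taking "16M" as 16,000,000: a > 15900000 matches about 100000 of the 16M rows, a per-clause selectivity of roughly 1/160, and the independence assumption simply multiplies the two clauses:

    sel(a > 15.9M AND b > 15.9M) ~ (1/160)^2 ~ 3.9e-5  ->  ~625 estimated rows

That is in line with the 566 and 629 row estimates visible in the (modified) plan output, while the true counts are 100000 for mycorr_100 (fully correlated) and about 10563 for mycorr_10 (~0.1 * 100000 correlated rows plus ~0.9 * 625 independent ones).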
-------------------------------------------------------------------------------------------------------------
Goodies:
-----------
pg_statio_all_tables ->
  heap_blks_read = 9931   (in the case of mycorr_10)
  heap_blks_read = 118693 (in the case of mycorr_100)

I have repeated the test more than 20 times up till now.
I have also made the same test with different table sizes and correlation levels, and the same anomaly persists.

Question:
------------
mycorr_100 took 11.4 s to run although it had to fetch 100000 rows from the base table.
mycorr_10 took 24.4 s to run although it had to fetch only 10563 rows from the base table.

Any explanation for that?

Thank you for your patience.
-------------------------------------------------------------------------------------------------------------
On Thu, 2007-05-03 at 14:33 -0700, jaba the mobzy wrote:
> mycorr_100 took 11.4 s to run although it had to fetch 100000 rows from
> the base table.
> mycorr_10 took 24.4 s to run although it had to fetch only 10563 rows from
> the base table.

This is because the physical distribution of data is different. The
mycorr_10 table has tuples in which a and b are > 15.9M spread all
throughout. mycorr_100 has them all collected together at the end of the
physical file. Less disk seeking.

You can test this by doing a CLUSTER on both tables and running the same
queries again.

Regards,
Jeff Davis
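For reference, the per-block arithmetic from the numbers in the original post is consistent with this explanation (rough figures):

    mycorr_10:  24.4 s / 9931 heap blocks read   ~ 2.5 ms/block  (a random seek per block)
    mycorr_100: 11.4 s / 118693 heap blocks read ~ 0.10 ms/block (mostly sequential I/O)

A concrete version of the suggested test could look like the following sketch; the index names ab_10 and ab_100 are taken from the plan output above, and the CLUSTER syntax is the pre-8.3 form:

    -- Rewrite each table in index order, then refresh statistics.
    CLUSTER ab_10 ON mycorr_10;
    CLUSTER ab_100 ON mycorr_100;
    VACUUM ANALYZE mycorr_10;
    VACUUM ANALYZE mycorr_100;

    -- The physical ordering can also be checked directly: a correlation
    -- near 1.0 means the column's values are laid out in roughly sorted
    -- order on disk.
    SELECT tablename, attname, correlation
      FROM pg_stats
     WHERE tablename IN ('mycorr_10', 'mycorr_100')
       AND attname IN ('a', 'b');

After clustering, the qualifying rows of mycorr_10 are packed together as well, so its runtime should drop to roughly the mycorr_100 level.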
Jeff Davis <pgsql@j-davis.com> writes:
> On Thu, 2007-05-03 at 14:33 -0700, jaba the mobzy wrote:
>> mycorr_100 took 11.4 s to run although it had to fetch 100000 rows from
>> the base table.
>> mycorr_10 took 24.4 s to run although it had to fetch only 10563 rows from
>> the base table.

> This is because the physical distribution of data is different. The
> mycorr_10 table has tuples in which a and b are > 15.9M spread all
> throughout. mycorr_100 has them all collected together at the end of the
> physical file. Less disk seeking.

If the OP had generated the data randomly, as claimed, the rows
shouldn't be particularly more clumped in one table than the other.
But I sure agree that it sounds like a nonrandom distribution in the
mycorr_100 table. FWIW, I tried to duplicate the behavior, and could
not, using tables made up like this:

create table src as
  select int4(16*1024*1024*random()) as key,
         int4(16*1024*1024*random()) as a,
         int4(16*1024*1024*random()) as b
  from generate_series(1,16*1024*1024);

create table mycorr_10 as
  select key, a,
         case when random() < 0.1 then a else b end as b
  from src;

create table mycorr_100 as
  select key, a, a as b
  from src;

create index mycorr_10i on mycorr_10(a,b);
create index mycorr_100i on mycorr_100(a,b);
vacuum analyze mycorr_10;
vacuum analyze mycorr_100;

			regards, tom lane
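A quick sanity check that the generated data matches the described distributions (a minimal follow-up sketch against the tables built above):

    -- Expect roughly 10% of the 16M rows, plus negligible chance collisions:
    select count(*) from mycorr_10 where a = b;
    -- Expect all 16M rows:
    select count(*) from mycorr_100 where a = b;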
Tom,
Did you restart Postgres and drop file system caches?
What I suspect is that some sort of prefetching is happening.
I know that Postgres does not do prefetching, but I understand very little about OS/file-system-level prefetching.
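For a repeatable cold-cache run, the restart-and-drop-caches steps can be scripted like this (a sketch for a typical Linux setup; the pg_ctl invocation and $PGDATA are assumptions, adjust for the local install):

    # Stop the server, flush dirty pages, drop the OS caches, restart.
    pg_ctl -D "$PGDATA" stop
    sync                                # drop_caches discards only clean pages
    echo 3 > /proc/sys/vm/drop_caches   # 3 = page cache + dentries/inodes (1 = page cache only)
    pg_ctl -D "$PGDATA" start

Even with a cold cache, Linux read-ahead still kicks in for sequential access patterns, which would favor the clustered mycorr_100 layout over scattered single-block reads.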
----- Original Message ----
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Jeff Davis <pgsql@j-davis.com>
Cc: jaba the mobzy <makaronaforna@yahoo.com>; pgsql-hackers@postgresql.org
Sent: Thursday, May 3, 2007 11:42:32 PM
Subject: Re: [HACKERS] Bitmap Heap Scan anomaly