Обсуждение: FW: Constraint exclusion in partitions

Поиск
Список
Период
Сортировка

FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:

Sent that on pgsql-novice list but did not get any answers yet.

Maybe someone could help me understand here J

 

 

Hi all,

 

I have split a large table (billions of records) into multiple partitions, hoping the access would be faster. I used an ID to make partitions check (check (id >= 100 AND id < 200)…) and created over 80 tables (children) that are now filled with data.  

 

However, after I did it, I read a second time the following sentence in the documentation and started wondering what it actually means …  “Constraint exclusion only works when the query's WHERE clause contains constants (or externally supplied parameters)”

 

I understand that the following query will use constraint exclusion and will run faster…  

a-      Select * from parent_table where id >=9999; -- using a constant

 

But how constraint exclusion would react with the following queries …

b-      Select * from parent_table where id between 2345 and 6789; -- using a range of ids

c-       Select * from parent_table where id in(select ids from anothertable); -- using a list of ids from a select

 

Since I mostly use queries of type b and c, I am wondering if partitioning the large table was appropriate and if the queries are going to be longer to run…

Thank in advance

 

Daniel

 

Doc: http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html

Re: FW: Constraint exclusion in partitions

От
"David G. Johnston"
Дата:
On Fri, May 22, 2015 at 10:21 AM, Daniel Begin <jfd553@hotmail.com> wrote:

But how constraint exclusion would react with the following queries …

b-      Select * from parent_table where id between 2345 and 6789; -- using a range of ids

​Not sure...

These are constants but I'm not sure how smart the planner is about figuring out inequalities of this form.

I would guess yes but it should be easy enough to confirm this yourself using explain...you already have the tables.

c-       Select * from parent_table where id in(select ids from anothertable); -- using a list of ids from a select

Definitely no...

Constraint exclusion is done by the planner before data is read so there is no possible way for data in a table to be used.​


As for performance the only way to know for sure is to test your usage patterns and data.  Even without constraint exclusion partitioning can provide benefits.

David J.

Re: FW: Constraint exclusion in partitions

От
Francisco Olarte
Дата:
Hi Daniel

On Fri, May 22, 2015 at 7:21 PM, Daniel Begin <jfd553@hotmail.com> wrote:
> I have split a large table (billions of records) into multiple partitions,
> hoping the access would be faster. I used an ID to make partitions check
> (check (id >= 100 AND id < 200)…) and created over 80 tables (children) that
> are now filled with data.
...
> I understand that the following query will use constraint exclusion and will
> run faster…
> a-      Select * from parent_table where id >=9999; -- using a constant
> But how constraint exclusion would react with the following queries …
> b-      Select * from parent_table where id between 2345 and 6789; -- using
> a range of ids
> c-       Select * from parent_table where id in(select ids from
> anothertable); -- using a list of ids from a select

Given you have already partitioned it, why don't you just use explain
[ analyze ] on the queries? I.e., in one of my tables, partitioned
monthly by a timestamp ( with time zone ) field I get ( even if I
never use between, a closed interval, for a continuous like type like
timestamp, the optimizer clearly shows it's transfroaming it to the
equivalent AND condition ):


explain select * from carrier_cdrs where setup between
'20150107T123456' and '20150322T222222';


QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..82202.41 rows=2346599 width=74)
   ->  Seq Scan on carrier_cdrs  (cost=0.00..0.00 rows=1 width=184)
         Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp with
time zone) AND (setup <= '2015-03-22 22:22:22+01'::timestamp with time
zone))
   ->  Seq Scan on carrier_cdrs_201501  (cost=0.00..30191.10
rows=816551 width=74)
         Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp with
time zone) AND (setup <= '2015-03-22 22:22:22+01'::timestamp with time
zone))
   ->  Seq Scan on carrier_cdrs_201502  (cost=0.00..25277.45
rows=872830 width=74)
         Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp with
time zone) AND (setup <= '2015-03-22 22:22:22+01'::timestamp with time
zone))
   ->  Seq Scan on carrier_cdrs_201503  (cost=0.00..26733.85
rows=657217 width=74)
         Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp with
time zone) AND (setup <= '2015-03-22 22:22:22+01'::timestamp with time
zone))
(9 rows)

> Since I mostly use queries of type b and c, I am wondering if partitioning
> the large table was appropriate and if the queries are going to be longer to
> run…

The problem is gonna be all the extra conditions, so I'll check real
queries. My bet is in a query EXACTLY like b it will use constraint
exclusion, and can be potentially faster if your interval are for just
100 ids, but why speculate when YOU can measure?

Also think if you touch 67-23+1=45 partitions and the DB has to check
other indexed fields it is a lot of work. As I said, it depend on your
actual data, actual indexes, and actual queries, just measure it. In
my excample table I partition the data by the TS, which is the single
indexed field and my tests showed it was faster that way ( but my
queries are normally big scans of date ranges or more selective ones
with narrow TS conditions, and I measured them ).

Francisco Olarte.


Re: FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:
Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process
completedand all the resulting tables analyzed. 

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested
themonly for a couple of values but in summary... 

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id
value1,id value2 ...) 

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key.
Usingan indexed field took so long compared to the old table that I cancelled the execution for the new one! 

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition
queriesare 
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause
-------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



Re: FW: Constraint exclusion in partitions

От
melvin6925
Дата:
Did you remember to set  constraint_exclusion = on and reload the .conf ?




Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone
-------- Original message --------
From: Daniel Begin <jfd553@hotmail.com>
Date: 05/23/2015 14:37 (GMT-05:00)
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions

Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed.

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary...

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...)

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one!

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause -------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: FW: Constraint exclusion in partitions

От
Bill Moran
Дата:
A large portion of why you describe below is the exact opposite of my
own testing (unfortunately, I don't have the actual test results any
more because I did the tests for a former employer).

In my tests, single lookups against the same column being used to
partition improved performance in direct proportion to the number of
partitions. I.e. if the tables are partitioned on id, and the lookup
is for id, and the table has 10 partitions, the query is 10x faster
on the partitioned version than the non-partitioned verison.

Queries against indexes not partitioned were slightly slower in my
tests, but nowhere near the degree that you're showing below.

I can't help but think that your testing methodology is flawed, but
since you're not showing us what you actually did, it's difficult to
be sure. See below for some specifics on what I'm concerned that you
might be doing wrong ...

On Sat, 23 May 2015 14:37:25 -0400
Daniel Begin <jfd553@hotmail.com> wrote:

> Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process
completedand all the resulting tables analyzed. 
>
> Here is what I got on both the original table and its partitioned counterpart while running the same queries. I
testedthem only for a couple of values but in summary... 
>
> Using a constant id:
> All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id
value1,id value2 ...) 
>
> Using a range of ids:
> Surprisingly again, all the queries I tried took longer on the partitioned table!
>
> Using a list of ids from a select clause:
> More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key.
Usingan indexed field took so long compared to the old table that I cancelled the execution for the new one! 
>
> Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
> Daniel
>
> Note: Tables/indexes description, queries and execution plans are below.
>
>
>
>
>
>
> Tables/indexes description ----------------------------------------------------------------------------------
> The original table has 3870130000 records. Primary key/index on each partition queries are
> ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
> CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);
>
> The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition
queriesare 
> ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
> CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
> Where xx is the partition's number suffix
>
> constant id -------------------------------------------------------------------------------------------------------
> select * from oldtable where id=123456789;
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
> "  Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 62 ms. 1 rows retrieved
>
>
> select * from newtable where id=123456789;
> "Append  (cost=0.00..20.19 rows=5 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = 123456789::bigint)"
> "  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
> "        Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 156 ms. 1 rows retrieved

Notice in these results that you're not showing the command that was executed.
The output is mostly likely from the command
"ANALYZE select * from newtable where id=123456789;" but that's not the command
you claim that you ran. In any event, the analyze output doesn't line up with
the times you claim: i.e. Analyze is showing that the first query should take
about 4437 time units to complete, and the second one should take about 20
time units, yet you claim the second one is slower. The other queries below
exhibit a similar pattern.

Are you sure you're not timing ANALYZE itself instead of the query? Because
timing "ANALYZE select * from newtable where id=123456789;" is not going to
be timing the actual time the query took to run. I would certainly expect the
_planning_ of a query against partitioned tables to take longer than non-
partitioned, but I would also expect the execution time to be the opposite.
Hence my theory that you've accidentally timed the ANALYZE instead of the
actual running of the query. Naturally, the total query time is planning +
execution, and my experience shows that the loss in planning speed is more
than made up for by the gain in execution speed.

Perhaps you should show us the exact output of one of your tests, without
editorializing.

> I got similar results for multiple records...
> select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
> "  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 187 ms. 4 rows retrieved
>
> select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
> "Append  (cost=0.00..933.40 rows=223 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> "  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
> "        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> ...
> "  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
> "        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 421 ms. 4 rows retrieved
>
>
> range of ids -------------------------------------------------------------------------------------------------------
> select * from oldtable where id between 1522999949 and 1523000049;
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
> "  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> Total query runtime: 47 ms. 53 rows retrieved.
>
> select * from newtable where id between 1522999949 and 1523000049;
> "Append  (cost=0.00..408.16 rows=104 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
> "  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
> "        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> "  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
> "        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> Total query runtime: 78 ms. 53 rows retrieved.
>
>
> list of ids from a select clause
-------------------------------------------------------------------------------------------------------
> --Subset provides 4 ids similar but not identical to the previous query
> select * from oldtable where id IN (select * from subset);
> "Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
> "        Index Cond: (id = subset.id)"
> Total query runtime: 171 ms. 4 rows retrieved.
>
> select * from newtable where id IN (select * from subset)
> "Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
> "        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "              Filter: (subset.id = id)"
> "        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
> "              Index Cond: (id = subset.id)"
> ...
> "        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
> "              Index Cond: (id = subset.id)"
> Total query runtime: 140 ms. 4 rows retrieved.
>
>
> Using an index, not the primary key ------------------------------------------------------------------------------
> --Subset provides 58 group_id pointing to 5978 records in the concerned tables
> select * from oldtable where group_id IN (select * from subset)
> "Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
> "        Index Cond: (group_id = subset.id)"
> Total query runtime: 3986 ms. 5978 rows retrieved.
>
>
> select * from newtable where group_id IN (select * from subset)
> "Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
> "  Hash Cond: (newtable.group_id = subset.id)"
> "  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
> "        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
> ...
> "        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
> "  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
> "        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> Execution Cancelled after 766702 ms !
>
> I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


--
Bill Moran


Re: FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:

I am working with postgresql 9.3 and I understand from the documentation that constraint_exclusion is set to “partition” by default. Looking at my postgres.conf file, the concerned line is “#constraint_exclusion = partition”.

 

Furthermore, the execution plan shows that constraint_exclusion was used at least for constant id and range of ids

 

Daniel

 

 

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of melvin6925
Sent: May-23-15 15:15
To: Daniel Begin; pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions

 

Did you remember to set  constraint_exclusion = on and reload the .conf ?

 

 

 

 

Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone

-------- Original message --------
From: Daniel Begin <jfd553@hotmail.com>
Date: 05/23/2015 14:37 (GMT-05:00)
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions

Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed.

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary...

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...)

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one!

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause -------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:
Hello Bill,
You wrote that my testing methodology is flawed - I hope you are right!

However, I am a bit confused about your comments. Yes, I did edited the name
of the tables for clarity but if I miss the point I, I will do it again as I
am writing without modifying anything. Here is the procedure I follow and
results...

I use pgadmin_III sql window. I write the following query (I have changed
the id to make sure it does not use previous results still in memory)...

Select * from nodes where id=345678912; -- nodes is the real partitioned
table name

Now I select "explain query" from the menu and I get the following result...
"Append  (cost=0.00..384.08 rows=99 width=66)"
"  ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 345678912)"
"  ->  Index Scan using nodes19_idversion_pk on nodes_19  (cost=0.56..384.08
rows=98 width=66)"
"        Index Cond: (id = 345678912)"

Now, I select "run" and I get one record as a result and the following
message in history tab...
-- Executing query:
Select * from nodes where id=345678912;
Total query runtime: 62 ms.
1 row retrieved.

Now, if I use the same query on the original table using the same procedure,
here is what I get...
Select * from old_nodes where id=345678912; -- old_nodes is the real
original table name

Explain gives me the following
"Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..4437.15
rows=1682 width=66)"
"  Index Cond: (id = 345678912)"

Running the query gives me the same record with the following message in
history tab...
-- Executing query:
select * from old_nodes where id=345678912;
Total query runtime: 62 ms.
1 row retrieved.

This time, the history tab shows that both took the same time to run (an
improvement!?)
Let's try this one using the same procedure...

Select * from old_nodes where id IN
(10050000,1000050000,2000050000,3000050000)
"Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18
rows=6726 width=66)"
"  Index Cond: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
-- Executing query:
Select * from old_nodes where id IN
(10050000,1000050000,2000050000,3000050000)
Total query runtime: 171 ms.
5 rows retrieved.

Select * from nodes where id IN (10050000,1000050000,2000050000,3000050000)
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
"  ->  Index Scan using nodes01_idversion_pk on nodes_01  (cost=0.57..622.78
rows=156 width=66)"
"        Index Cond: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
"  ->  Index Scan using nodes38_idversion_pk on nodes_38  (cost=0.57..138.25
rows=31 width=66)"
"        Index Cond: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
"  ->  Index Scan using nodes63_idversion_pk on nodes_63  (cost=0.57..119.01
rows=26 width=66)"
"        Index Cond: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
"  ->  Index Scan using nodes85_idversion_pk on nodes_85  (cost=0.57..53.37
rows=9 width=66)"
"        Index Cond: (id = ANY
('{10050000,1000050000,2000050000,3000050000}'::bigint[]))"
-- Executing query:
Select * from nodes where id IN (10050000,1000050000,2000050000,3000050000)
Total query runtime: 140 ms.
5 rows retrieved.

This time the history tab shows that the query was shorter to run on
partitioned table (a real improvement!?)

I know, this is different from what I referred to in my original email (at
least both shows similar running time) but I swear, I did not change the
times when editing table names!-)

Do you see any glitch/flaw in the procedure I am using?
Someone has an idea about the problem behind this unexpected behavior?

I really need to get much faster results with my queries on this large table
and partitioning was my last option...

Best regards,
Daniel

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Bill Moran
Sent: May-23-15 15:23
To: Daniel Begin
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions


A large portion of why you describe below is the exact opposite of my own
testing (unfortunately, I don't have the actual test results any more
because I did the tests for a former employer).

In my tests, single lookups against the same column being used to partition
improved performance in direct proportion to the number of partitions. I.e.
if the tables are partitioned on id, and the lookup is for id, and the table
has 10 partitions, the query is 10x faster on the partitioned version than
the non-partitioned verison.

Queries against indexes not partitioned were slightly slower in my tests,
but nowhere near the degree that you're showing below.

I can't help but think that your testing methodology is flawed, but since
you're not showing us what you actually did, it's difficult to be sure. See
below for some specifics on what I'm concerned that you might be doing wrong
...

On Sat, 23 May 2015 14:37:25 -0400
Daniel Begin <jfd553@hotmail.com> wrote:

> Following Francisco suggestion, I was able to do some tests earlier this
morning when the partitioning process completed and all the resulting tables
analyzed.
>
> Here is what I got on both the original table and its partitioned
counterpart while running the same queries. I tested them only for a couple
of values but in summary...
>
> Using a constant id:
> All the queries I tried took longer on the partitioned table! I got
> similar results for multiple records using IN (id value1, id value2
> ...)
>
> Using a range of ids:
> Surprisingly again, all the queries I tried took longer on the partitioned
table!
>
> Using a list of ids from a select clause:
> More surprisingly, the queries I tried took less time on the partitioned
table at least when using the primary key. Using an indexed field took so
long compared to the old table that I cancelled the execution for the new
one!
>
> Guess what, I will get back to my old fat table unless someone tells me I
missed something obvious!
> Daniel
>
> Note: Tables/indexes description, queries and execution plans are below.
>
>
>
>
>
>
> Tables/indexes description
> ----------------------------------------------------------------------
> ------------ The original table has 3870130000 records. Primary
> key/index on each partition queries are ALTER TABLE oldtable ADD
> CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version); CREATE
> INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);
>
> The partitioned table has 3870130000 records distributed over 87
> partitions. Primary key/index on each partition queries are ALTER
> TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY
> (id, version); CREATE INDEX newtablexx_groupid_idx ON newtable_xx
> USING btree (group_id); Where xx is the partition's number suffix
>
> constant id
> ----------------------------------------------------------------------
> --------------------------------- select * from oldtable where
> id=123456789; "Index Scan using oldtable_idversion_pk on oldtable
> (cost=0.70..4437.15 rows=1682 width=66)"
> "  Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 62 ms. 1 rows retrieved
>
>
> select * from newtable where id=123456789; "Append  (cost=0.00..20.19
> rows=5 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = 123456789::bigint)"
> "  ->  Index Scan using newtable72_idversion_pk on newtable_72
(cost=0.56..20.19 rows=4 width=66)"
> "        Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 156 ms. 1 rows retrieved

Notice in these results that you're not showing the command that was
executed.
The output is mostly likely from the command "ANALYZE select * from newtable
where id=123456789;" but that's not the command you claim that you ran. In
any event, the analyze output doesn't line up with the times you claim: i.e.
Analyze is showing that the first query should take about 4437 time units to
complete, and the second one should take about 20 time units, yet you claim
the second one is slower. The other queries below exhibit a similar pattern.

Are you sure you're not timing ANALYZE itself instead of the query? Because
timing "ANALYZE select * from newtable where id=123456789;" is not going to
be timing the actual time the query took to run. I would certainly expect
the _planning_ of a query against partitioned tables to take longer than
non- partitioned, but I would also expect the execution time to be the
opposite.
Hence my theory that you've accidentally timed the ANALYZE instead of the
actual running of the query. Naturally, the total query time is planning +
execution, and my experience shows that the loss in planning speed is more
than made up for by the gain in execution speed.

Perhaps you should show us the exact output of one of your tests, without
editorializing.

> I got similar results for multiple records...
> select * from oldtable where id
> IN(10000000,1000000000,2000000000,3000000000);
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18
rows=6726 width=66)"
> "  Index Cond: (id = ANY
('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 187 ms. 4 rows retrieved
>
> select * from newtable where id
> IN(10000000,1000000000,2000000000,3000000000);
> "Append  (cost=0.00..933.40 rows=223 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = ANY
('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> "  ->  Index Scan using newtable01_idversion_pk on newtable_01
(cost=0.57..622.78 rows=156 width=66)"
> "        Index Cond: (id = ANY
('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> ...
> "  ->  Index Scan using newtable85_idversion_pk on newtable_85
(cost=0.57..53.37 rows=9 width=66)"
> "        Index Cond: (id = ANY
('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 421 ms. 4 rows retrieved
>
>
> range of ids
> ----------------------------------------------------------------------
> --------------------------------- select * from oldtable where id
> between 1522999949 and 1523000049; "Index Scan using
> oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
> "  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> Total query runtime: 47 ms. 53 rows retrieved.
>
> select * from newtable where id between 1522999949 and 1523000049;
> "Append  (cost=0.00..408.16 rows=104 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
> "  ->  Index Scan using newtable51_idversion_pk on newtable_51
(cost=0.56..183.52 rows=46 width=66)"
> "        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> "  ->  Index Scan using newtable52_idversion_pk on newtable_52
(cost=0.56..224.64 rows=57 width=66)"
> "        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> Total query runtime: 78 ms. 53 rows retrieved.
>
>
> list of ids from a select clause
> ----------------------------------------------------------------------
> --------------------------------- --Subset provides 4 ids similar but
> not identical to the previous query select * from oldtable where id IN
> (select * from subset); "Nested Loop  (cost=37.45..886298.00
> rows=2028512050 width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Index Scan using oldtable_idversion_pk on oldtable
(cost=0.70..4414.37 rows=1693 width=66)"
> "        Index Cond: (id = subset.id)"
> Total query runtime: 171 ms. 4 rows retrieved.
>
> select * from newtable where id IN (select * from subset) "Nested Loop
> (cost=36.75..1407672.76 rows=1935067087 width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
> "        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "              Filter: (subset.id = id)"
> "        ->  Index Scan using newtable01_idversion_pk on newtable_01
(cost=0.56..151.97 rows=39 width=66)"
> "              Index Cond: (id = subset.id)"
> ...
> "        ->  Index Scan using newtable86_idversion_pk on newtable_86
(cost=0.56..12.42 rows=2 width=66)"
> "              Index Cond: (id = subset.id)"
> Total query runtime: 140 ms. 4 rows retrieved.
>
>
> Using an index, not the primary key
> ----------------------------------------------------------------------
> -------- --Subset provides 58 group_id pointing to 5978 records in the
> concerned tables select * from oldtable where group_id IN (select *
> from subset) "Nested Loop  (cost=37.33..21575715.89 rows=2028512050
> width=66)"
> "  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Index Scan using oldtable_groupid_idx on oldtable
(cost=0.58..107364.99 rows=51340 width=66)"
> "        Index Cond: (group_id = subset.id)"
> Total query runtime: 3986 ms. 5978 rows retrieved.
>
>
> select * from newtable where group_id IN (select * from subset) "Hash
> Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
> "  Hash Cond: (newtable.group_id = subset.id)"
> "  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
> "        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896
width=66)"
> ...
> "        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664
width=66)"
> "  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
> "        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140
width=8)"
> Execution Cancelled after 766702 ms !
>
> I tried the same with "SET enable_seqscan = OFF" and got an index scan
> of all tables;
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To
> make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


--
Bill Moran


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



Re: [NOVICE] Constraint exclusion in partitions

От
"David G. Johnston"
Дата:
On Saturday, May 23, 2015, Daniel Begin <jfd553@hotmail.com> wrote:

I am working with postgresql 9.3 and I understand from the documentation that constraint_exclusion is set to “partition” by default. Looking at my postgres.conf file, the concerned line is “#constraint_exclusion = partition”. 

Furthermore, the execution plan shows that constraint_exclusion was used at least for constant id and range of ids

What is your question/concern? 

Re: FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:

Oops, I was responding to the email below from melvin6925

 

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of David G. Johnston
Sent: May-23-15 19:32
To: Daniel Begin
Cc: melvin6925; pgsql-general@postgresql.org
Subject: Re: [GENERAL] [NOVICE] Constraint exclusion in partitions

 

On Saturday, May 23, 2015, Daniel Begin <jfd553@hotmail.com> wrote:

I am working with postgresql 9.3 and I understand from the documentation that constraint_exclusion is set to “partition” by default. Looking at my postgres.conf file, the concerned line is “#constraint_exclusion = partition”. 

Furthermore, the execution plan shows that constraint_exclusion was used at least for constant id and range of ids

What is your question/concern? 

 

 

Did you remember to set  constraint_exclusion = on and reload the .conf ?

 

 

 

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of melvin6925
Sent: May-23-15 15:15
To: Daniel Begin; pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions

 

Did you remember to set  constraint_exclusion = on and reload the .conf ?

 

 

 

 

Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone

-------- Original message --------
From: Daniel Begin <jfd553@hotmail.com>
Date: 05/23/2015 14:37 (GMT-05:00)
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] FW: Constraint exclusion in partitions

Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed.

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary...

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...)

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one!

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause -------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: FW: Constraint exclusion in partitions

От
Bill Moran
Дата:
On Sat, 23 May 2015 18:16:43 -0400
Daniel Begin <jfd553@hotmail.com> wrote:

> Hello Bill,
> You wrote that my testing methodology is flawed - I hope you are right!
>
> However, I am a bit confused about your comments. Yes, I did edited the name
> of the tables for clarity but if I miss the point I, I will do it again as I
> am writing without modifying anything. Here is the procedure I follow and
> results...
>
> I use pgadmin_III sql window. I write the following query (I have changed
> the id to make sure it does not use previous results still in memory)...

I didn't realize you were using PGAdmin ... that explains some of it ...
see below:

> Select * from nodes where id=345678912; -- nodes is the real partitioned
> table name
>
> Now I select "explain query" from the menu and I get the following result...
> "Append  (cost=0.00..384.08 rows=99 width=66)"
> "  ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = 345678912)"
> "  ->  Index Scan using nodes19_idversion_pk on nodes_19  (cost=0.56..384.08
> rows=98 width=66)"
> "        Index Cond: (id = 345678912)"
>
> Now, I select "run" and I get one record as a result and the following
> message in history tab...
> -- Executing query:
> Select * from nodes where id=345678912;
> Total query runtime: 62 ms.
> 1 row retrieved.
>
> Now, if I use the same query on the original table using the same procedure,
> here is what I get...
> Select * from old_nodes where id=345678912; -- old_nodes is the real
> original table name
>
> Explain gives me the following
> "Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..4437.15
> rows=1682 width=66)"
> "  Index Cond: (id = 345678912)"
>
> Running the query gives me the same record with the following message in
> history tab...
> -- Executing query:
> select * from old_nodes where id=345678912;
> Total query runtime: 62 ms.
> 1 row retrieved.
>
> This time, the history tab shows that both took the same time to run (an
> improvement!?)

If your environment is providing such wildly variant results, then
you need to start running multiple tests instead of assuming that a single
run of a query is indicative of a pattern.

--
Bill Moran


Re: FW: Constraint exclusion in partitions

От
Francisco Olarte
Дата:
Hi Daniel:


On Sat, May 23, 2015 at 8:37 PM, Daniel Begin <jfd553@hotmail.com> wrote:
> Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process
completedand all the resulting tables analyzed. 

> Here is what I got on both the original table and its partitioned counterpart while running the same queries. I
testedthem only for a couple of values but in summary... 
> Using a constant id:
> All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id
value1,id value2 ...) 
> Using a range of ids:
> Surprisingly again, all the queries I tried took longer on the partitioned table!
> Using a list of ids from a select clause:
> More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key.
Usingan indexed field took so long compared to the old table that I cancelled the execution for the new one! 

> Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
>
> Note: Tables/indexes description, queries and execution plans are below.
>
>
>
>
>
>
> Tables/indexes description ----------------------------------------------------------------------------------
> The original table has 3870130000 records. Primary key/index on each partition queries are
> ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
> CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);
>
> The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition
queriesare 
> ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
> CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
> Where xx is the partition's number suffix

It is missing here, but I supose you did not forget to add the
exclusion check constraint.


> constant id -------------------------------------------------------------------------------------------------------
> select * from oldtable where id=123456789;
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
> "  Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 62 ms. 1 rows retrieved
> select * from newtable where id=123456789;
> "Append  (cost=0.00..20.19 rows=5 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = 123456789::bigint)"
> "  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
> "        Index Cond: (id = 123456789::bigint)"
> --Total query runtime: 156 ms. 1 rows retrieved

This is not surprissing, partitions never help much against that type
of queries, with 87 partitions index are not going to be much
shallower and the optimizer has a lot more of work to do. Just a
couple points. When doing a lot of sequential queries on nearby ids (
on the same partition ) it will help ( better data locality,
possibility of having more relevant pages in the cache ). Also, how
did you do your timinngs? ( you should repeat each query 2 or 3 times,
to see if the times go down a lot due to caching and, if you can, time
a couple of explains ( plain, not analyze ) to see how much time the
planner takes. As an example, the explain I sent you in my previous
message takes between 20 and 30 milliseconds and I only have 17 live
partitions ( I routinely un-inherit and move to another schema
partitions older than a two years, to help the planner, and re-inherit
them if needed ) ). Also, your queries seem to be very, very slow for
a single indexed fetch, if I do this on my tables ( similar query to
what I sent, modified to hit just 1 partition ):
select count(*) from carrier_cdrs where setup between
'20150107T123456' and '20150107T222222';
It takes 40 ms in the first go, drops down to 27 after that, and I
have ~15 ms RTT to the server ( which is more or less the time
reported when I do a plain 'select 1' ). I mean, I suspect your
measurements are not good, as they seem too slow. Also, when I use
explain analyze on the previous query the server reports 13 ms, which
is more or less one RTT less ( as the server starts measuring after
receiving the query and ends before sending the reply ).

Another thing, how are you getting the explain results ? I would urge
you to use explain analyze, as, apart of seeming too slow, the
stimations seem to be way off ( explain analyze sends you the
estimated and actual results, so it is easy to see, and executes
everything on the server, so result transmission time, which is not of
use as it has to be the same for every method of calculating the same
correct result, is not reported ). For my query I get this:

            QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1229.36..1229.37 rows=1 width=0) (actual
time=13.794..13.794 rows=1 loops=1)
   ->  Append  (cost=0.00..1169.73 rows=23851 width=0) (actual
time=0.014..12.069 rows=24016 loops=1)
         ->  Seq Scan on carrier_cdrs  (cost=0.00..0.00 rows=1
width=0) (actual time=0.000..0.000 rows=0 loops=1)
               Filter: ((setup >= '2015-01-07 12:34:56+01'::timestamp
with time zone) AND (setup <= '2015-01-07 22:22:22+01'::timestamp with
time zone))
         ->  Index Only Scan using idx_carrier_cdrs_201501_setup on
carrier_cdrs_201501  (cost=0.42..1169.73 rows=23850 width=0) (actual
time=0.014..10.162 rows=24016 loops=1)
               Index Cond: ((setup >= '2015-01-07
12:34:56+01'::timestamp with time zone) AND (setup <= '2015-01-07
22:22:22+01'::timestamp with time zone))
               Heap Fetches: 24016
 Total runtime: 13.819 ms
(8 rows)

As you can see this reports the time for all the nodes, and the
estimated and actual numbers of rows in every step, so things are
easier to check ( like how the planner estimated 23850 rows which
where really 24016, a .69% error, indicating good estimation ).



> I got similar results for multiple records...
> select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
> "  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 187 ms. 4 rows retrieved

This seems to slow too, also, estimated rows 6726 for an unique index
with 4 values seems totally bad.

> select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
> "Append  (cost=0.00..933.40 rows=223 width=66)"
> "  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> "  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
> "        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"

> ...
********* How many partitions where omitted here? ( if they were two
the problem of redudant checking of id is minor, if they where 85 it
means constratint exclusion is not working for in )

> "  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
> "        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
> --Total query runtime: 421 ms. 4 rows retrieved

And here it is doing strange things ( fetching every value in every partition



> range of ids -------------------------------------------------------------------------------------------------------
> select * from oldtable where id between 1522999949 and 1523000049;
> "Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
> "  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
> Total query runtime: 47 ms. 53 rows retrieved.

As I was telling you above. It is not normal to take 62 ms for a
single id and just 47 for a 53 range. And this is referring to the old
table case. I keep thinking you are trashing the caches, and your
measurements are not good, you are measuring cache effects ( as I said
before, the easiest ways to discount them is to repeat every query
three or more times in sucession, using explain analyze to discard
network effects, and see where they stabilize )

> Using an index, not the primary key ------------------------------------------------------------------------------

Those are going to be specially bad. If you partition and then do
indexed queries against a non partitioned field WITHOUT extra
conditions which narrow the result to a couple of them, it will always
take a lot of time. The server needs to scan every index ( as your
query is basically a union all of every partition ). Think on it this
way, if I have one big phone call table, partitioned by months,
indexed by timestamp and number ( 2 indexes ) , queries for a TS range
go fast, as they go to the relevant partitions. Queries for a number
with a some-days timestamp range also go fast, as it has to index scan
one or two partitions for the number, but queries for just the number
need to index scan every partition, and they are going to be slow, as
they need to read some pages of every number index.


> --Subset provides 58 group_id pointing to 5978 records in the concerned tables
> select * from oldtable where group_id IN (select * from subset)

If you need to do this queries, and group id is uncorrelated wit id,
you are not going to get good results from partitioning.

Regards.
   Francisco Olarte.


Re: FW: Constraint exclusion in partitions

От
Francisco Olarte
Дата:
Hi Daniel:

Bill Moran already pointed out some things, I more or less agree with him....

On Sun, May 24, 2015 at 12:16 AM, Daniel Begin <jfd553@hotmail.com> wrote:
......
> I use pgadmin_III sql window.

I'll just point from my experience. At work, more than three fourths
of the problems I've had to help my pgadmin using colleagues solve
where due to their usage of pgadmin. Nearly every time they did dome
db administration task, they did it poorly and begun to do it
correctly when using plain postgresql. This may be anecdotal.


> I write the following query (I have changed
> the id to make sure it does not use previous results still in memory)...

That is exactly the opposite of what you should do, unless you are
able to thorougly clean the server caches ( not as easy as it sounds )
between runs. You want warm caches first, to account for the times in
planning, hashing and joining and so. Once you've got that part ready,
you can move on to real queries, busy server, etc for more fine
tuning.

....
> Now I select "explain query" from the menu and I get the following result...
...
> Now, I select "run" and I get one record as a result and the following
> message in history tab...
.....

As I never use pgadmin, I cannot tell you too much, but one thing I
know is NONE of MY users was able to tell me what pgadmin does when
you hit run and/or explain query. OTOH, I know what psql does. And, if
you use EXPLAIN ANALYZE, as suggested by my previous message, you'll
be better of ( as in this case you just get a single query result from
the server and you do not have to care about what your tool of choice
does in the middle ).

....lotta deletions,

> I really need to get much faster results with my queries on this large table
> and partitioning was my last option...

Make sure you are not having an XY-problem. Also, from the type of
queries you've shown us, I do not think partitioning is the way to go.
It will only give you marginal improvements when querying for single
or range of ids ( and small ones if your ranges of ids are somehow
clustered in your tables ).

Francisco Olarte.


Re: FW: Constraint exclusion in partitions

От
Daniel Begin
Дата:
Thank for your patience :-)

- About using PgAdmin, anecdotal problems or not, I did the whole tests again in plain postgresql.

- About running queries once or not, Bill and Francisco both pointed out somehow that I should run each query multiple
timesto get appropriate statistics. I did it for all queries - First trial always longer, all other stabilise around
thesame values. The EXPLAIN ANALYSE for first and second trial on each query I ran on original table and on the
partitionedone can be found below. 

However, in my case, I will have to run most of my queries only once since I simply need to extract sample data for a
researchtopic - there is no insert/update in the DB (that is why I thought looking at first trial was more
appropriate). 

- About adding the exclusion check constraint, thank for remembering me such a simple thing that could have caused all
this!-)but sadly, it was set to "partition", as expected. However, I have decided to run all the queries after having
setthe parameter to ON and restarted the database, just in case. 

Even after doing all this, I did not find any improvement in execution times between my original fat table and the
partitionedversion (sometime even worst). If partitioning the table has improved significantly queries running times, I
couldhave partitioned the tables differently to accommodate other query types I will have to run later in my research
(Ihave the same problem for half a dozen tables).  

Since it does not seem that partitioning will do the job, I will get back to the original table to run my queries...

However, just in case someone knows a magical trick that can improve significantly the speed of my queries (but haven't
toldme yet!-) here are the details about the concerned table/indexes  

(Using https://wiki.postgresql.org/wiki/Index_Maintenance query ...)
Number of records: 3870130000
Table size: 369GB
Indexes size: 425GB
 - nodes_idversion_pk: 125GB
 - nodes_changesetid_idx: 86GB
 - nodes_geom_idx: 241GB

Each record has 3 bigint, 2 boolean, 1 timestamp and 1 geography type.

I am running all this on my personal PC:  Windows 64b, i7 chip, 16GB ram.
I am using PostgreSQL 9.3 and the database cluster is spread over 2X3TB external drives with write caching.

Best regards,
Daniel


Results/explain/analyse follow...

--Constant
ids-------------------------------------------------------------------------------------------------------------------------------------------------
--Explain analyse on original table for a query that will look into different partitions on the new table
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual
time=52.226..288.700rows=6 loops=1) 
   Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
 Total runtime: 288.732 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual time=0.014..0.035
rows=6loops=1) 
   Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
 Total runtime: 0.056 ms

-- Explain analyse on partitioned table for a query that will look into different partitions
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
 Append  (cost=0.00..933.40 rows=223 width=66) (actual time=108.903..287.068 rows=6 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes01_idversion_pk on nodes_01  (cost=0.57..622.78 rows=156 width=66) (actual
time=108.900..108.916rows=1 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes38_idversion_pk on nodes_38  (cost=0.57..138.25 rows=31 width=66) (actual
time=89.523..89.543rows=1 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes63_idversion_pk on nodes_63  (cost=0.57..119.01 rows=26 width=66) (actual
time=49.978..49.998rows=3 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes85_idversion_pk on nodes_85  (cost=0.57..53.37 rows=9 width=66) (actual
time=38.600..38.603rows=1 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
 Total runtime: 287.144 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
 Append  (cost=0.00..933.40 rows=223 width=66) (actual time=0.012..0.065 rows=6 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes01_idversion_pk on nodes_01  (cost=0.57..622.78 rows=156 width=66) (actual
time=0.010..0.017rows=1 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes38_idversion_pk on nodes_38  (cost=0.57..138.25 rows=31 width=66) (actual
time=0.010..0.015rows=1 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes63_idversion_pk on nodes_63  (cost=0.57..119.01 rows=26 width=66) (actual
time=0.012..0.016rows=3 loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
   ->  Index Scan using nodes85_idversion_pk on nodes_85  (cost=0.57..53.37 rows=9 width=66) (actual time=0.013..0.013
rows=1loops=1) 
         Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
 Total runtime: 0.125 ms

--Explain analyse on original table for a query that will look into one partition on the new table
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(723005000,733005000,743005000,753005000);
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual
time=37.366..158.445rows=4 loops=1) 
   Index Cond: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
 Total runtime: 158.479 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(723005000,733005000,743005000,753005000);
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual time=0.014..0.032
rows=4loops=1) 
   Index Cond: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
 Total runtime: 0.054 ms

--Explain analyse on partitioned table for a query that will look into one partition
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(723005000,733005000,743005000,753005000);
 Append  (cost=0.00..196.84 rows=47 width=66) (actual time=163.898..441.497 rows=4 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
   ->  Index Scan using nodes31_idversion_pk on nodes_31  (cost=0.57..196.84 rows=46 width=66) (actual
time=163.894..441.491rows=4 loops=1) 
         Index Cond: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
 Total runtime: 441.549 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(723005000,733005000,743005000,753005000);
 Append  (cost=0.00..196.84 rows=47 width=66) (actual time=0.011..0.027 rows=4 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
   ->  Index Scan using nodes31_idversion_pk on nodes_31  (cost=0.57..196.84 rows=46 width=66) (actual
time=0.009..0.025rows=4 loops=1) 
         Index Cond: (id = ANY ('{723005000,733005000,743005000,753005000}'::bigint[]))
 Total runtime: 0.062 ms


--Range
ids-------------------------------------------------------------------------------------------------------------------------------------------------
--Explain analyse on original table for a query that will look into different partitions on the new table
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id BETWEEN 1522999949 AND 1523000049;
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..383.51 rows=144 width=66) (actual time=73.115..180.769
rows=53loops=1) 
   Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
 Total runtime: 180.820 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id BETWEEN 1522999949 AND 1523000049;
 Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..383.51 rows=144 width=66) (actual time=0.020..0.039
rows=53loops=1) 
   Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
 Total runtime: 0.060 ms

--Explain analyse on partitioned table for a query that will look into different partitions
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id BETWEEN 1522999949 AND 1523000049;
 Append  (cost=0.00..408.16 rows=104 width=66) (actual time=0.014..46.196 rows=53 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: ((id >= 1522999949) AND (id <= 1523000049))
   ->  Index Scan using nodes51_idversion_pk on nodes_51  (cost=0.56..183.52 rows=46 width=66) (actual
time=0.012..20.216rows=18 loops=1) 
         Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
   ->  Index Scan using nodes52_idversion_pk on nodes_52  (cost=0.56..224.64 rows=57 width=66) (actual
time=0.022..25.973rows=35 loops=1) 
         Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
 Total runtime: 46.254 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id BETWEEN 1522999949 AND 1523000049;
 Append  (cost=0.00..408.16 rows=104 width=66) (actual time=0.010..0.038 rows=53 loops=1)
   ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.001..0.001 rows=0 loops=1)
         Filter: ((id >= 1522999949) AND (id <= 1523000049))
   ->  Index Scan using nodes51_idversion_pk on nodes_51  (cost=0.56..183.52 rows=46 width=66) (actual
time=0.008..0.015rows=18 loops=1) 
         Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
   ->  Index Scan using nodes52_idversion_pk on nodes_52  (cost=0.56..224.64 rows=57 width=66) (actual
time=0.006..0.017rows=35 loops=1) 
         Index Cond: ((id >= 1522999949) AND (id <= 1523000049))
 Total runtime: 0.081 ms


--Select
ids-------------------------------------------------------------------------------------------------------------------------------------------------
--Explain analyse on original table for a query that will look into one partition on the new table but list of ids
providedthrough a select statement 
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(SELECT * FROM subset);
 Nested Loop  (cost=99.93..884823.94 rows=2028512050 width=66) (actual time=97.489..2289.772 rows=5979 loops=1)
   ->  HashAggregate  (cost=99.22..101.22 rows=200 width=8) (actual time=2.155..3.649 rows=5941 loops=1)
         ->  Seq Scan on subset  (cost=0.00..84.78 rows=5778 width=8) (actual time=0.018..0.581 rows=5978 loops=1)
   ->  Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..4406.68 rows=1693 width=66) (actual
time=0.384..0.384rows=1 loops=5941) 
         Index Cond: (id = subset.id)
 Total runtime: 2290.122 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(SELECT * FROM subset);
 Nested Loop  (cost=99.93..884823.94 rows=2028512050 width=66) (actual time=1.785..25.730 rows=5979 loops=1)
   ->  HashAggregate  (cost=99.22..101.22 rows=200 width=8) (actual time=1.767..2.661 rows=5941 loops=1)
         ->  Seq Scan on subset  (cost=0.00..84.78 rows=5778 width=8) (actual time=0.009..0.373 rows=5978 loops=1)
   ->  Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..4406.68 rows=1693 width=66) (actual
time=0.003..0.003rows=1 loops=5941) 
         Index Cond: (id = subset.id)
 Total runtime: 26.005 ms

--Explain analyse on partitioned table for a query that will look into one partition but list of ids provided through a
selectstatement 
--First attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(SELECT * FROM subset);
 Nested Loop  (cost=99.22..1403193.39 rows=1935067087 width=66) (actual time=12146.666..19140.901 rows=5979 loops=1)
   ->  HashAggregate  (cost=99.22..101.22 rows=200 width=8) (actual time=1.998..4.496 rows=5941 loops=1)
         ->  Seq Scan on subset  (cost=0.00..84.78 rows=5778 width=8) (actual time=0.010..0.390 rows=5978 loops=1)
   ->  Append  (cost=0.00..6997.97 rows=1749 width=66) (actual time=2.925..3.214 rows=1 loops=5941)
         ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.000..0.000 rows=0 loops=5941)
               Filter: (subset.id = id)
         ->  Index Scan using nodes01_idversion_pk on nodes_01  (cost=0.56..151.70 rows=39 width=66) (actual
time=0.013..0.013rows=0 loops=5941) 
               Index Cond: (id = subset.id)
         ->  Index Scan using nodes02_idversion_pk on nodes_02  (cost=0.56..219.02 rows=57 width=66) (actual
time=0.012..0.012rows=0 loops=5941) 
               Index Cond: (id = subset.id)
-- skipped for nodes_03 to nodes_84
         ->  Index Scan using nodes85_idversion_pk on nodes_85  (cost=0.56..12.33 rows=2 width=66) (actual
time=0.036..0.036rows=0 loops=5941) 
               Index Cond: (id = subset.id)
         ->  Index Scan using nodes86_idversion_pk on nodes_86  (cost=0.56..12.33 rows=2 width=66) (actual
time=0.688..0.704rows=1 loops=5941) 
               Index Cond: (id = subset.id)
 Total runtime: 19142.983 ms
--Second attempt;
db=# EXPLAIN ANALYSE SELECT * FROM nodes WHERE id IN(SELECT * FROM subset);
 Nested Loop  (cost=99.22..1403193.39 rows=1935067087 width=66) (actual time=2.282..1382.156 rows=5979 loops=1)
   ->  HashAggregate  (cost=99.22..101.22 rows=200 width=8) (actual time=1.834..4.327 rows=5941 loops=1)
         ->  Seq Scan on subset  (cost=0.00..84.78 rows=5778 width=8) (actual time=0.009..0.376 rows=5978 loops=1)
   ->  Append  (cost=0.00..6997.97 rows=1749 width=66) (actual time=0.225..0.226 rows=1 loops=5941)
         ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66) (actual time=0.000..0.000 rows=0 loops=5941)
               Filter: (subset.id = id)
         ->  Index Scan using nodes01_idversion_pk on nodes_01  (cost=0.56..151.70 rows=39 width=66) (actual
time=0.003..0.003rows=0 loops=5941 
               Index Cond: (id = subset.id)
         ->  Index Scan using nodes02_idversion_pk on nodes_02  (cost=0.56..219.02 rows=57 width=66) (actual
time=0.002..0.002rows=0 loops=5941 
               Index Cond: (id = subset.id)
-- skipped for nodes_03 to nodes_84
         ->  Index Scan using nodes85_idversion_pk on nodes_85  (cost=0.56..12.33 rows=2 width=66) (actual
time=0.002..0.002rows=0 loops=5941) 
               Index Cond: (id = subset.id)
         ->  Index Scan using nodes86_idversion_pk on nodes_86  (cost=0.56..12.33 rows=2 width=66) (actual
time=0.004..0.004rows=1 loops=5941) 
               Index Cond: (id = subset.id)
 Total runtime: 1383.929 ms



Re: FW: Constraint exclusion in partitions

От
Francisco Olarte
Дата:
Hi Daniel:

On Mon, May 25, 2015 at 10:03 PM, Daniel Begin <jfd553@hotmail.com> wrote:
...
> Even after doing all this, I did not find any improvement in execution times between my original fat table and the
partitionedversion (sometime even worst). If partitioning the table has improved significantly queries running times, I
couldhave partitioned the tables differently to accommodate other query types I will have to run later in my research
(Ihave the same problem for half a dozen tables). 
> Since it does not seem that partitioning will do the job, I will get back to the original table to run my queries...

Well, at least you've learnt some things about it and you can expect
to repeat the measurements faster shoudl you need it.

> However, just in case someone knows a magical trick that can improve significantly the speed of my queries (but
haven'ttold me yet!-) here are the details about the concerned table/indexes 
.....
> Table size: 369GB
> Indexes size: 425GB
> I am running all this on my personal PC:  Windows 64b, i7 chip, 16GB ram.

Supposing you can dedicate about 12 Gb to shared buffers / caches,
your caches are going to get trashed often with real work, that is why
we recommended repeating the queries.

Anyway, one last remark. Partition is not magic, it helps a lot
depending on the access records. For workloads like mine they help a
lot  ( call records, where I insert frequently ( so position
correlates strongly with indexes ), nearly never update ( and I work
with high fill factors ), and query frequently for unindexed
conditions plus partition-related ranges they work great ( a big table
forces index scans, which due to correlation are fast, but indexed
anyways, plus filters on extra conditions, partitions normally go to
sequential partition scans plus filters, and sequential scans are way
faster, plus the normal queries go normally to the last things
inserted, so partitions help to keep them cached ).

For queries like the ones you've timed/shown ( partition on an integer
PK of unknown origin, queries for single values ( IN queries are
normally just several single repeated ) or small ranges, big table is
normally gonna beat partition hands down ( except if you have
locality, like you are inserting serials and querying frequently in
the vicinity of the inserted ones, in this case partitions keeps used
tables small and cacheable and may give you a small edge ).

> I am using PostgreSQL 9.3 and the database cluster is spread over 2X3TB external drives with write caching.

Well, from your last measurements it seems your disk systems is awful
for database work. I do not know what you mean by external drives (
eSata? firewire? Usb2? usb3? also, any kind of volume management ) but
in your fist query:


> db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
>  Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual
time=52.226..288.700rows=6 loops=1) 
>    Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
>  Total runtime: 288.732 ms
> --Second attempt;
> db=# EXPLAIN ANALYSE SELECT * FROM old_nodes WHERE id IN(10005000,1000005000,2000005000,3000005000);
>  Index Scan using nodes_idversion_pk on old_nodes  (cost=0.71..17739.18 rows=6726 width=66) (actual time=0.014..0.035
rows=6loops=1) 
>    Index Cond: (id = ANY ('{10005000,1000005000,2000005000,3000005000}'::bigint[]))
>  Total runtime: 0.056 ms

288 ms for a query which should do 8-10 disk reads seems too slow to
me. And you get nearly the same on the second cache. I would normally
expect <100ms for any reasonable drive, and <50 for any db tuned
disks. I do not remember the exact parameters, but if your samples
reprensent your future workload you need to tune well for disk access
time. It's specially noticeable in the last example ( query with ID
from a subquery ), where you got the times:

> --Select
ids-------------------------------------------------------------------------------------------------------------------------------------------------
> --Explain analyse on original table for a query that will look into one partition on the new table but list of ids
providedthrough a select statement 
> --First attempt;
>  Total runtime: 2290.122 ms
> --Second attempt;
>  Total runtime: 26.005 ms

Warm caches help you a lot here.....

> --Explain analyse on partitioned table for a query that will look into one partition but list of ids provided through
aselect statement 
> --First attempt;
>  Total runtime: 19142.983 ms
> --Second attempt;
>  Total runtime: 1383.929 ms

And here too, and also, as parition means always hitting more blocks (
see it in reverse, if you collapse partitions maybe some data ends in
the same block and you save some reads, or not, but you will never be
worse ), slow disks hurt you more.

One last remark. I have not seen your data, but from what you've told
and the shown queries I would go for the single table approach hands
down ( maybe with an intermediate weekend / nigt time cluster/vacuum
full/analyze if it is insert a lot- select a lot, rinse, repeat )
unless you are appending / updating / deleting a lot. You are in a
case ( big data, medium machine, slow disks ) where the real access
patterns are what is going to determine your strategy, no boilerplate
solution is going to apply there.

Francisco Olarte.