Обсуждение: Partitioned Table Index Column Order
All,
I'm on PostgreSQL 13 and have a partitioned table with a primary key.
create table t( a integer, b integer, c varchar, d .. ) partitioned by range( a );
As a best practice is it better to create the primary key starting or ending with the partition column?
e.g.
1) t_pkey primary key (a, b, c)
or
2) t_pkey primary key (b, c, a)
Neither the PostgreSQL documentation nor Google have an answer - at least as far as I could find. I see examples in the Postgres documentation that use 2) but no explanation of why that choice was made.
Does it even make a difference?
Thoughts?
--
Rumpi Gravenstein
On 2021-Jun-23, Rumpi Gravenstein wrote: > As a best practice is it better to create the primary key starting or > ending with the partition column? It is not relevant from the partitioning point of view. Other factors can be used to decide the column order. -- Álvaro Herrera Valdivia, Chile
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2021-Jun-23, Rumpi Gravenstein wrote: >> As a best practice is it better to create the primary key starting or >> ending with the partition column? > It is not relevant from the partitioning point of view. Other factors > can be used to decide the column order. See in particular the hints in https://www.postgresql.org/docs/current/indexes-multicolumn.html The only thing that's different about partitioned situations is that any particular child partition might have only a few values of the partitioning column, which'd suggest putting it last if there are no other relevant considerations. However, if you need a particular column order to match query requirements, that's certainly going to be a more important consideration. regards, tom lane
On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2021-Jun-23, Rumpi Gravenstein wrote: > > > As a best practice is it better to create the primary key starting or > > ending with the partition column? > > It is not relevant from the partitioning point of view. Other factors > can be used to decide the column order. I'm not so sure that's really 100% true. There is at least one partitioning feature that will work when the partitioning column is first and won't when it's not. Ordered partition scans work with RANGE and LIST partitioning: create table ab (a int, b int, primary key(a,b)) partition by range(a); create table ab1 partition of ab for values from (0) to (10); create table ab2 partition of ab for values from (10) to (20); explain (costs off) select * from ab order by a; QUERY PLAN -------------------------------------------------- Append -> Index Only Scan using ab1_pkey on ab1 ab_1 -> Index Only Scan using ab2_pkey on ab2 ab_2 Reverse the order and you get: QUERY PLAN ---------------------------------- Sort Sort Key: ab.a -> Append -> Seq Scan on ab1 ab_1 -> Seq Scan on ab2 ab_2 David
David Rowley <dgrowleyml@gmail.com> writes: > On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> It is not relevant from the partitioning point of view. Other factors >> can be used to decide the column order. > I'm not so sure that's really 100% true. There is at least one > partitioning feature that will work when the partitioning column is > first and won't when it's not. > Ordered partition scans work with RANGE and LIST partitioning: Sure, but is that any different from the behavior with unpartitioned tables? You have to make the index column order agree with the ORDER BY you want to use, in either case. regards, tom lane
On Thu, 24 Jun 2021 at 11:56, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > > On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > >> It is not relevant from the partitioning point of view. Other factors > >> can be used to decide the column order. > > > I'm not so sure that's really 100% true. There is at least one > > partitioning feature that will work when the partitioning column is > > first and won't when it's not. > > Ordered partition scans work with RANGE and LIST partitioning: > > Sure, but is that any different from the behavior with unpartitioned > tables? You have to make the index column order agree with the > ORDER BY you want to use, in either case. The reason I mentioned it is that the performance of the ordered partitioned scans pretty good. If the application does ORDER BY a,b just as often as it does ORDER BY b,a and you just get to pick 1 index, then it's better to have the index with the partitioned key first. At least one of the queries can get away without doing a Sort that way. If you have the partition key last in the index then both queries need to sort... You could fix that by adding a 2nd index, but that's not always practical, so it seems worth a mention, at least to me. David