Обсуждение: move PartitionBoundInfo creation code

Поиск
Список
Период
Сортировка

move PartitionBoundInfo creation code

От
Amit Langote
Дата:
Hi,

Currently, the code that creates a PartitionBoundInfo struct from the
PartitionBoundSpec nodes of constituent partitions read from the catalog
is in RelationBuildPartitionDesc that's in partcache.c.  I think that
da6f3e45dd that moved around the partitioning code [1] really missed the
opportunity to move the aforementioned code into partbounds.c.  I think
there is quite a bit of logic contained in that code that can aptly be
said to belong in partbounds.c.  Also, if factored out into a function of
its own with a proper interface, it could be useful to other callers that
may want to build it for, say, fake partitions [2] which are not real
relations.

Attached find a patch that does such refactoring, along with making some
functions in partbounds.c that are not needed outside static.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=da6f3e45ddb

[2]
https://www.postgresql.org/message-id/009901d3f34c%2471e1bdc0%2455a53940%24%40lab.ntt.co.jp

Вложения

Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Thu, Nov 01, 2018 at 12:58:29PM +0900, Amit Langote wrote:
> Attached find a patch that does such refactoring, along with making some
> functions in partbounds.c that are not needed outside static.

This looks like a very good idea to me.  Thanks for digging into that.
Please just make sure to register it to the next CF if not already
done.
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/01 13:02, Michael Paquier wrote:
> On Thu, Nov 01, 2018 at 12:58:29PM +0900, Amit Langote wrote:
>> Attached find a patch that does such refactoring, along with making some
>> functions in partbounds.c that are not needed outside static.
> 
> This looks like a very good idea to me.  Thanks for digging into that.
> Please just make sure to register it to the next CF if not already
> done.

Done a few moments ago. :)

Thanks,
Amit




Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
> Done a few moments ago. :)

From the file size this move is actually negative.  From what I can see
partcache decreases to 400 lines, while partbounds increases to 3k
lines.

There are a couple of things that this patch is doing:
1) Move the functions comparing two bounds into partbounds.c.
2) Remove the chunk in charge of building PartitionBoundData into
partbounds.c for each method: list, hash and values.

From what I can see, this patch brings actually more confusion by doing
more things than just building all the PartitionBound structures as it
fills in each structure and then builds a mapping which is used to save
each partition OID into the correct mapping position.  Wouldn't it move
on with a logic without this mapping so as the partition OIDs are
directly part of PartitionBound?  It looks wrong to me to have
build_partition_boundinfo create not only partdesc->boundinfo but also
partdesc->oids, and the new routine is here to fill in data for the
former, not the latter.

The first phase building the bounds should switch to a switch/case like
the second phase.

PartitionHashBound & friends can become structures local to
partbounds.c as they are used only there.

To be more consistent with all the other routines, like
partition_bounds_equal/copy, wouldn't it be better to call the new
routine partition_bounds_build or partition_bounds_create?
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
Hi,

Thank your for taking a look.

On 2018/11/05 16:21, Michael Paquier wrote:
> On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
>> Done a few moments ago. :)
> 
> From the file size this move is actually negative.  From what I can see
> partcache decreases to 400 lines, while partbounds increases to 3k
> lines.

Hmm, is that because partbounds.c contains more functionality?

> There are a couple of things that this patch is doing:
> 1) Move the functions comparing two bounds into partbounds.c.
> 2) Remove the chunk in charge of building PartitionBoundData into
> partbounds.c for each method: list, hash and values.
> 
> From what I can see, this patch brings actually more confusion by doing
> more things than just building all the PartitionBound structures as it
> fills in each structure and then builds a mapping which is used to save
> each partition OID into the correct mapping position.  Wouldn't it move
> on with a logic without this mapping so as the partition OIDs are
> directly part of PartitionBound?  It looks wrong to me to have
> build_partition_boundinfo create not only partdesc->boundinfo but also
> partdesc->oids, and the new routine is here to fill in data for the
> former, not the latter.

I think we can design the interface of partition_bounds_create such that
it returns information needed to re-arrange OIDs to be in the canonical
partition bound order, so the actual re-ordering of OIDs can be done by
the caller itself.  (It may not always be OIDs, could be something else
like a struct to represent fake partitions.)

The canonical partition bound order is basically partition bounds sorted
in ascending order, or in some manner as defined by the qsort_partition_*
functions of individual partitioning methods.  We need to map the new
canonical positions to the older ones by generating a map, which it only
makes sense to do in partition_bounds_create.

> The first phase building the bounds should switch to a switch/case like
> the second phase.

That bothered me too, so I looked at whether it'd be a good idea to have
just one switch () block and put all the code for different partitioning
methods into it.  The resulting code seems more readable to me as one
doesn't have to shuffle between looking at the first block that generates
Datums from PartitionBoundSpecs and the second block that assigns them to
the Datum array in PartitionBoundInfo, for each partitioning method
respectively.

> PartitionHashBound & friends can become structures local to
> partbounds.c as they are used only there.

Ah, I missed that; moved in.

> To be more consistent with all the other routines, like
> partition_bounds_equal/copy, wouldn't it be better to call the new
> routine partition_bounds_build or partition_bounds_create?

Agree about naming consistency; I went with partition_bounds_create.

I've broken this down this into 3 patches for ease of review:

1. Local re-arrangement of the code in RelationBuildPartitionDesc such
that the code that creates PartitionBoundInfo is clearly separated.  I've
also applied some of the comments above to that piece of code such as
unifying all of that code in one switch() block.  I also found an
opportunity to simplify the code that maps canonical partition indexes to
original partition indexes.

2. Re-arrangement local to partcache.c that moves out the separated code
into partition_bounds_create() with the interface mentioned above.

3. Move partition_bounds_create to partbounds.c and make some functions
that are not needed outside partbounds.c static to that file.

Thanks,
Amit

Вложения

Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Wed, Nov 07, 2018 at 03:34:38PM +0900, Amit Langote wrote:
> On 2018/11/05 16:21, Michael Paquier wrote:
>> On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
>>> Done a few moments ago. :)
>>
>> From the file size this move is actually negative.  From what I can see
>> partcache decreases to 400 lines, while partbounds increases to 3k
>> lines.
>
> Hmm, is that because partbounds.c contains more functionality?

The move you are doing here makes sense, now refactoring is better if we
could avoid transforming a large file into a larger one and preserve
some more balance in the force.

> I think we can design the interface of partition_bounds_create such that
> it returns information needed to re-arrange OIDs to be in the canonical
> partition bound order, so the actual re-ordering of OIDs can be done by
> the caller itself.  (It may not always be OIDs, could be something else
> like a struct to represent fake partitions.)

Interesting point.  Do we have already in the code a new case for fake
partitions or a case where the partition OIDs are not used?  I was
thinking why the partitions OIDs are not actually directly embedded in
each PartitionBoundInfo, removing at the same time the index numbers
used for the mapping.  It looks sensible to return the mapping based
on your argument.

>> The first phase building the bounds should switch to a switch/case like
>> the second phase.
>
> That bothered me too, so I looked at whether it'd be a good idea to have
> just one switch () block and put all the code for different partitioning
> methods into it.  The resulting code seems more readable to me as one
> doesn't have to shuffle between looking at the first block that generates
> Datums from PartitionBoundSpecs and the second block that assigns them to
> the Datum array in PartitionBoundInfo, for each partitioning method
> respectively.

Yeah, merging the first and second parts makes the most sense.

> I've broken this down this into 3 patches for ease of review:
>
> 1. Local re-arrangement of the code in RelationBuildPartitionDesc such
> that the code that creates PartitionBoundInfo is clearly separated.  I've
> also applied some of the comments above to that piece of code such as
> unifying all of that code in one switch() block.  I also found an
> opportunity to simplify the code that maps canonical partition indexes to
> original partition indexes.
>
> 2. Re-arrangement local to partcache.c that moves out the separated code
> into partition_bounds_create() with the interface mentioned above.
>
> 3. Move partition_bounds_create to partbounds.c and make some functions
> that are not needed outside partbounds.c static to that file.

That actually helps a lot in seeing how you handle the move, thanks.
What do you think about splitting the code in partition_bounds_create
even a bit more by having one routine per strategy?

Thinking crazy, we could also create a subfolder partitioning/bounds/
which includes three files with this refactoring:x
- range.c
- values.c
- hash.c
Then we keep partbounds.c which is the entry point used by partcache.c,
and partbounds.c relies on the stuff in each other sub-file when
building a strategy.  This move could be interesting in the long term as
there are comparison routines and structures for each strategy if more
strategies are added.
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Alvaro Herrera
Дата:
On 2018-Nov-07, Michael Paquier wrote:

> On Wed, Nov 07, 2018 at 03:34:38PM +0900, Amit Langote wrote:
> > On 2018/11/05 16:21, Michael Paquier wrote:
> >> On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
> >>> Done a few moments ago. :)
> >> 
> >> From the file size this move is actually negative.  From what I can see
> >> partcache decreases to 400 lines, while partbounds increases to 3k
> >> lines.
> > 
> > Hmm, is that because partbounds.c contains more functionality?
> 
> The move you are doing here makes sense, now refactoring is better if we
> could avoid transforming a large file into a larger one and preserve
> some more balance in the force.

I think the code as it's structured in Amit's patch as a whole makes
sense -- partcache is smaller than partbounds, but why do we care?  As
long as each is a reasonably well defined module, it seems better.  In
the current situation we have a bit of a mess there.  A 3k file is okay.
We're not talking about bloating, say, xlog.c which is almost 400kb now.

$ ls -lh src/backend/access/transam/xlog*
-rw-r--r-- 1 alvherre alvherre  23K oct  4 11:40 src/backend/access/transam/xlogarchive.c
-rw-r--r-- 1 alvherre alvherre 389K nov  7 11:49 src/backend/access/transam/xlog.c
-rw-r--r-- 1 alvherre alvherre  21K nov  3 12:15 src/backend/access/transam/xlogfuncs.c
-rw-r--r-- 1 alvherre alvherre  30K sep  3 12:58 src/backend/access/transam/xloginsert.c
-rw-r--r-- 1 alvherre alvherre  40K sep  3 12:58 src/backend/access/transam/xlogreader.c
-rw-r--r-- 1 alvherre alvherre  32K ago  5 21:34 src/backend/access/transam/xlogutils.c

> > I think we can design the interface of partition_bounds_create such that
> > it returns information needed to re-arrange OIDs to be in the canonical
> > partition bound order, so the actual re-ordering of OIDs can be done by
> > the caller itself.  (It may not always be OIDs, could be something else
> > like a struct to represent fake partitions.)

This is interesting.  I don't think the current interface is so bad that
we can't make a few tweaks later; IOW let's focus on the current patch
for now, and we can improve later.  You (and David, and maybe someone
else) already have half a dozen patches touching this code, and I bet
some of these will have to be rebased over this one.


> Thinking crazy, we could also create a subfolder partitioning/bounds/
> which includes three files with this refactoring:x
> - range.c
> - values.c
> - hash.c
> Then we keep partbounds.c which is the entry point used by partcache.c,
> and partbounds.c relies on the stuff in each other sub-file when
> building a strategy.  This move could be interesting in the long term as
> there are comparison routines and structures for each strategy if more
> strategies are added.

I'm not sure this is worthwhile.  You'll create a bunch of independent
translation units in exchange for ...?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Wed, Nov 07, 2018 at 12:04:27PM -0300, Alvaro Herrera wrote:
> On 2018-Nov-07, Michael Paquier wrote:
> This is interesting.  I don't think the current interface is so bad that
> we can't make a few tweaks later; IOW let's focus on the current patch
> for now, and we can improve later.  You (and David, and maybe someone
> else) already have half a dozen patches touching this code, and I bet
> some of these will have to be rebased over this one.

Yeah, I am actually quite a fan of having the mapping being returned by
partition_bound_create and just let the caller handle the OID
assignment.  That's neater.  I think that the patch could gain a tad
more in clarity by splitting all sub-processing for each strategy into a
single routine though.  The presented switch/case is a bit hard to
follow in the last patch set, and harder to review in details.

>> Thinking crazy, we could also create a subfolder partitioning/bounds/
>> which includes three files with this refactoring:x
>> - range.c
>> - values.c
>> - hash.c
>> Then we keep partbounds.c which is the entry point used by partcache.c,
>> and partbounds.c relies on the stuff in each other sub-file when
>> building a strategy.  This move could be interesting in the long term as
>> there are comparison routines and structures for each strategy if more
>> strategies are added.
>
> I'm not sure this is worthwhile.  You'll create a bunch of independent
> translation units in exchange for ...?

That's the part about going crazy and wild on this refactoring.  Not all
crazy ideas are worth doing, still I like mentioning all possibilities
so as we miss nothing in the review process.
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/08 11:48, Michael Paquier wrote:
>>> Thinking crazy, we could also create a subfolder partitioning/bounds/
>>> which includes three files with this refactoring:x
>>> - range.c
>>> - values.c
>>> - hash.c
>>> Then we keep partbounds.c which is the entry point used by partcache.c,
>>> and partbounds.c relies on the stuff in each other sub-file when
>>> building a strategy.  This move could be interesting in the long term as
>>> there are comparison routines and structures for each strategy if more
>>> strategies are added.
>>
>> I'm not sure this is worthwhile.  You'll create a bunch of independent
>> translation units in exchange for ...?
> 
> That's the part about going crazy and wild on this refactoring.  Not all
> crazy ideas are worth doing, still I like mentioning all possibilities
> so as we miss nothing in the review process.

It might be okay to split the big switch in partition_bounds_create() into
different functions for different partitioning methods for clarity as you
say, but let's avoid creating new files for range, list, and hash.

I will post an updated patch with that break down.

Thanks,
Amit



Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/08 0:04, Alvaro Herrera wrote:
>> On Wed, Nov 07, 2018 at 03:34:38PM +0900, Amit Langote wrote:
>>> I think we can design the interface of partition_bounds_create such that
>>> it returns information needed to re-arrange OIDs to be in the canonical
>>> partition bound order, so the actual re-ordering of OIDs can be done by
>>> the caller itself.  (It may not always be OIDs, could be something else
>>> like a struct to represent fake partitions.)
> 
> This is interesting.  I don't think the current interface is so bad that
> we can't make a few tweaks later; IOW let's focus on the current patch
> for now, and we can improve later.  You (and David, and maybe someone
> else) already have half a dozen patches touching this code, and I bet
> some of these will have to be rebased over this one.

The patch on ATTACH/DETACH concurrently thread is perhaps touching close
to here, but I'm not sure if any of it should be concerned about how we
generate the PartitionBoundInfo which the patch here trying to make into
its own function?

Thanks,
Amit



Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/08 12:59, Amit Langote wrote:
> It might be okay to split the big switch in partition_bounds_create() into
> different functions for different partitioning methods for clarity as you
> say, but let's avoid creating new files for range, list, and hash.
> 
> I will post an updated patch with that break down.

And here is the new version.  The break down into smaller local functions
for different partitioning methods is in patch 0002.

Thanks,
Amit

Вложения

Re: move PartitionBoundInfo creation code

От
Alvaro Herrera
Дата:
On 2018-Nov-08, Amit Langote wrote:

> On 2018/11/08 0:04, Alvaro Herrera wrote:
> >> On Wed, Nov 07, 2018 at 03:34:38PM +0900, Amit Langote wrote:
> >>> I think we can design the interface of partition_bounds_create such that
> >>> it returns information needed to re-arrange OIDs to be in the canonical
> >>> partition bound order, so the actual re-ordering of OIDs can be done by
> >>> the caller itself.  (It may not always be OIDs, could be something else
> >>> like a struct to represent fake partitions.)
> > 
> > This is interesting.  I don't think the current interface is so bad that
> > we can't make a few tweaks later; IOW let's focus on the current patch
> > for now, and we can improve later.  You (and David, and maybe someone
> > else) already have half a dozen patches touching this code, and I bet
> > some of these will have to be rebased over this one.
> 
> The patch on ATTACH/DETACH concurrently thread is perhaps touching close
> to here, but I'm not sure if any of it should be concerned about how we
> generate the PartitionBoundInfo which the patch here trying to make into
> its own function?

Yeah, definitely not.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Thu, Nov 08, 2018 at 03:11:35PM +0900, Amit Langote wrote:
> And here is the new version.  The break down into smaller local
> functions for different partitioning methods is in patch 0002.

Okay, here we go.  I have spent a couple of hours on this patch,
checking the consistency of the new code and the new code, and the main
complain I have about the last version is that the code in charge of
building the mapping was made less robust.  So my notes are:
- The initialization of the mapping should happen in
partition_bounds_create() before going through each routine.  The
proposed patch was only doing the initialization for list bounds, using
an extra layer to track the canonical ordering of indexes associated
with a partition (listpart_canon_idx), and that's something that the
mapping can perfectly do, keeping the code more straight-forward (at
least it seems so to me) with the way null_index and default_index are
assigned.
- Some noise diffs were present, sometimes the indentation being wrong.
- Asserts could be used instead of elog(ERROR) if the strategy is not
the expected one.  That looks cleaner to me.

The result is perhaps less smart than what you did, but that's more
robust.  It may make sense to change the way the mapping is built but
I would suggest to do so after the actual refactoring.  Attached is what
I finish with, and I have spent quite some time making sure that
all the new logic remains consistent with the previous one.
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
Hi,

On 2018/11/12 22:55, Michael Paquier wrote:
> On Thu, Nov 08, 2018 at 03:11:35PM +0900, Amit Langote wrote:
>> And here is the new version.  The break down into smaller local
>> functions for different partitioning methods is in patch 0002.
> 
> Okay, here we go.

Thank you for reviewing.

> I have spent a couple of hours on this patch,
> checking the consistency of the new code and the new code, and the main
> complain I have about the last version is that the code in charge of
> building the mapping was made less robust.
> So my notes are:
> - The initialization of the mapping should happen in
> partition_bounds_create() before going through each routine.  The
> proposed patch was only doing the initialization for list bounds, using
> an extra layer to track the canonical ordering of indexes associated
> with a partition (listpart_canon_idx), and that's something that the
> mapping can perfectly do, keeping the code more straight-forward (at
> least it seems so to me) with the way null_index and default_index are
> assigned.

I looked at this and the new implementation seems safe and is a bit faster
due to removing listpart_canon_idx stuff.

> - Some noise diffs were present, sometimes the indentation being wrong.
> - Asserts could be used instead of elog(ERROR) if the strategy is not
> the expected one.  That looks cleaner to me.

Hmm, Assert or elog is your call, but I'd learned that we prefer elog when
checking a value read from the disk?

> The result is perhaps less smart than what you did, but that's more
> robust.  It may make sense to change the way the mapping is built but
> I would suggest to do so after the actual refactoring.  Attached is what
> I finish with, and I have spent quite some time making sure that
> all the new logic remains consistent with the previous one.

Thank you, the new logic seems sane to me.

I noticed a few things while going over the new patch.

I'd accidentally ended up changing the mapping elements to mean that they
map canonical indexes of partitions to their original indexes, but your
rewrite has restored the original meaning, so the following comment that I
wrote is obsolete:

+ * Upon return from this function, *mapping is set to an array of
+ * list_length(boundspecs) elements, each of which maps the canonical
+ * index of a given partition to its 0-based position in the original list.

It should be: "each of which maps the original index of a partition to its
canonical index."

+ /*
+  * For each partitioning method, we first convert the partition bounds
+  * from their parser node representation to the internal representation,
+  * along with any additional preprocessing (such performing de-duplication
+  * on range bounds).  For each bound, we remember its partition's position
+  * (0-based) in the original list, so that we can add it to the *mapping
+  * array.
+  *
+  * Resulting bound datums are then added to the 'datums' array in
+  * PartitionBoundInfo.  For each datum added, an integer indicating the
+  * canonical partition index is added to the 'indexes' array.
+  */

Maybe we can slightly rearrange this comment (along with fixing typos and
obsolete facts) as:

/*
 * For each partitioning method, we first convert the partition bounds f
 * from their parser node representation to the internal representation,
 * along with any additional preprocessing (such as de-duplicating range
 * bounds).  Resulting bound datums are then added to the 'datums' array
 * in PartitionBoundInfo.  For each datum added, an integer indicating the
 * canonical partition index is added to the 'indexes' array.
 *
 * For each bound, we remember its partition's position (0-based) in the
 * original list to later map it to the canonical index.
 */

I'd proposed to rewrite the following comments in create_list_bounds, but
maybe you thought it's related to listmap_canon_idx stuff which no longer
exists.

+ /*
+  * Copy values.  Indexes of individual values are mapped to canonical
+  * values so that they match for any two list partitioned tables with same
+  * number of partitions and same lists per partition.  One way to
+  * canonicalize is to assign the index in all_values[] of the smallest
+  * value of each partition, as the index of all of the partition's values.
+  */

Here's what I'd proposed (fixing typos and removing the sentence that
mentioned listmap_canon_idx).

/*
 * Copy values.  Canonical indexes are values ranging from 0 to nparts - 1
 * assigned to each partition such that all datums of a given partition
 * receive the same value.  The value for a given partition is the index
 * of that partition's smallest datum in the all_values[] array.
 */

Similarly, for:

+    /*
+     * If null-accepting partition has no mapped index yet, assign one.  This
+     * could happen if such partition accepts only null and hence not covered
+     * in the above loop which only handled non-null values.
+     */

I'd proposed:

/*
 * Set the canonical value for null_index, if any.
 *
 * It's possible that the null-accepting partition has not been
 * assigned an index yet, which could happen if such partition
 * accepts only null and hence not handled in the above loop
 * which only looked at non-null values.
 */

For:

+    /* Assign mapped index for the default partition. */

I'd proposed:

/* Set the canonical value for default_index, if any. */

Thanks,
Amit



Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Tue, Nov 13, 2018 at 10:34:50AM +0900, Amit Langote wrote:
> Hmm, Assert or elog is your call, but I'd learned that we prefer elog when
> checking a value read from the disk?

Let's keep elog() then, which is consistent with the previous code.  If
there is any meaning in switching to an assert(), it could always be
changed later, if that proves necessary.

> Thank you, the new logic seems sane to me.
>
> I noticed a few things while going over the new patch.

Thanks for going through it.

> I'd accidentally ended up changing the mapping elements to mean that they
> map canonical indexes of partitions to their original indexes, but your
> rewrite has restored the original meaning, so the following comment that I
> wrote is obsolete:
>
> + * Upon return from this function, *mapping is set to an array of
> + * list_length(boundspecs) elements, each of which maps the canonical
> + * index of a given partition to its 0-based position in the original list.
>
> It should be: "each of which maps the original index of a partition to its
> canonical index."

Indeed.  Good point.  I have missed that bit.

> Maybe we can slightly rearrange this comment (along with fixing typos and
> obsolete facts) as:
>
> /*
>  * For each partitioning method, we first convert the partition bounds f
>  * from their parser node representation to the internal representation,
>  * along with any additional preprocessing (such as de-duplicating range
>  * bounds).  Resulting bound datums are then added to the 'datums' array
>  * in PartitionBoundInfo.  For each datum added, an integer indicating the
>  * canonical partition index is added to the 'indexes' array.
>  *
>  * For each bound, we remember its partition's position (0-based) in the
>  * original list to later map it to the canonical index.
>  */

OK, that looks cleaner.

> I'd proposed:
>
> /*
>  * Set the canonical value for null_index, if any.
>  *
>  * It's possible that the null-accepting partition has not been
>  * assigned an index yet, which could happen if such partition
>  * accepts only null and hence not handled in the above loop
>  * which only looked at non-null values.
>  */

No issues with that.  So canonical index is used as a more common term,
which seems fine.

> I'd proposed:
>
> /* Set the canonical value for default_index, if any. */

And this one either, using a conditional at the end is more adapted
anyway.  And there are two occurences of that.

Attached is an updated patch.  Perhaps you are spotting something else?
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/13 11:34, Michael Paquier wrote:
> Attached is an updated patch.  Perhaps you are spotting something else?

Looks good to me.

Thanks,
Amit



Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
On 2018/11/13 12:40, Amit Langote wrote:
> On 2018/11/13 11:34, Michael Paquier wrote:
>> Attached is an updated patch.  Perhaps you are spotting something else?
> 
> Looks good to me.

Marked the CF entry as Ready for Committer.

Thanks,
Amit




Re: move PartitionBoundInfo creation code

От
Alvaro Herrera
Дата:
"the context that was active when the function was called" is typically
expressed simply as "the current memory context".  Perhaps the whole
phrase can be reduced to "The object returned by this function is wholly
allocated in the current memory context"?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Tue, Nov 13, 2018 at 09:58:08AM -0300, Alvaro Herrera wrote:
> "the context that was active when the function was called" is typically
> expressed simply as "the current memory context".  Perhaps the whole
> phrase can be reduced to "The object returned by this function is wholly
> allocated in the current memory context"?

Yes, that looks cleaner.  I am planning to do a last lookup round on
tomorrow morning my time before committing, so I may still tweak a
couple of other words...
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Alvaro Herrera
Дата:
On 2018-Nov-13, Michael Paquier wrote:

> On Tue, Nov 13, 2018 at 09:58:08AM -0300, Alvaro Herrera wrote:
> > "the context that was active when the function was called" is typically
> > expressed simply as "the current memory context".  Perhaps the whole
> > phrase can be reduced to "The object returned by this function is wholly
> > allocated in the current memory context"?
> 
> Yes, that looks cleaner.  I am planning to do a last lookup round on
> tomorrow morning my time before committing, so I may still tweak a
> couple of other words...

Cool.

I gave the patch a read and it looks reasonable to me.

Memory management in RelationBuildPartitionDesc is crummy, but I don't
think it's this patch's fault.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: move PartitionBoundInfo creation code

От
Michael Paquier
Дата:
On Tue, Nov 13, 2018 at 10:59:15AM -0300, Alvaro Herrera wrote:
> I gave the patch a read and it looks reasonable to me.
>
> Memory management in RelationBuildPartitionDesc is crummy, but I don't
> think it's this patch's fault.

I agree, and that shows up pretty clearly after the refactoring is done.
The mess is partially caused by the handling around the case where there
is no partition data to attach to PartitionDescData.  I would personally
much prefer if we could also avoid using partition_bounds_copy.

It does not prevent the first refactoring step, so I have committed the
patch as it is already doing a lot.
--
Michael

Вложения

Re: move PartitionBoundInfo creation code

От
Amit Langote
Дата:
Hi,

On 2018/11/13 22:59, Alvaro Herrera wrote:
> On 2018-Nov-13, Michael Paquier wrote:
> 
>> On Tue, Nov 13, 2018 at 09:58:08AM -0300, Alvaro Herrera wrote:
>>> "the context that was active when the function was called" is typically
>>> expressed simply as "the current memory context".  Perhaps the whole
>>> phrase can be reduced to "The object returned by this function is wholly
>>> allocated in the current memory context"?

Yeah, don't know why I had to put it in such convoluted manner.

>> Yes, that looks cleaner.  I am planning to do a last lookup round on
>> tomorrow morning my time before committing, so I may still tweak a
>> couple of other words...
> 
> Cool.

Thanks Michael.

> I gave the patch a read and it looks reasonable to me.
> 
> Memory management in RelationBuildPartitionDesc is crummy, but I don't
> think it's this patch's fault.

Are you perhaps referring to the discussion about partitioning cache
management we had a few months back, but decided to postpone any
development work to PG 12?  The following thread, maybe:

https://www.postgresql.org/message-id/143ed9a4-6038-76d4-9a55-502035815e68%40lab.ntt.co.jp

Thanks,
Amit