Re: [HACKERS] Adding support for Default partition in partitioning
От | Ashutosh Bapat |
---|---|
Тема | Re: [HACKERS] Adding support for Default partition in partitioning |
Дата | |
Msg-id | CAFjFpRfY_WKa7rsPn3C=S=6sQxc9AYtabcQRDMc7NR52EjRubw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] Adding support for Default partition in partitioning (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>) |
Список | pgsql-hackers |
Here's some detailed review of the code. @@ -1883,6 +1883,15 @@ heap_drop_with_catalog(Oid relid) if (OidIsValid(parentOid)) { /* + * Default partition constraints are constructed run-time from the + * constraints of its siblings(basically by negating them), so any + * change in the siblings needs to rebuild the constraints of the + * default partition. So, invalidate the sibling default partition's + * relcache. + */ + InvalidateDefaultPartitionRelcache(parentOid); + Do we need a lock on the default partition for doing this? A query might be scanning the default partition directly and we will invalidate the relcache underneath it. What if two partitions are being dropped simultaneously and change default constraints simultaneously. Probably the lock on the parent helps there, but need to check it. What if the default partition cache is invalidated because partition gets added/dropped to the default partition itself. If we need a lock on the default partition, we will need to check the order in which we should be obtaining the locks so as to avoid deadlocks. This also means that we have to test PREPARED statements involving default partition. Any addition/deletion/attach/detach of other partition should invalidate those cached statements. + if (partition_bound_has_default(boundinfo)) + { + overlap = true; + with = boundinfo->default_index; + } You could possibly rewrite this as overlap = partition_bound_has_default(boundinfo); with = boundinfo->default_index; that would save one indentation and a conditional jump. + if (partdesc->nparts > 0 && partition_bound_has_default(boundinfo)) + check_default_allows_bound(parent, spec); If the table has a default partition, nparts > 0, nparts > 0 check looks redundant. The comments above should also explain that this check doesn't trigger when a default partition is added since we don't expect an existing default partition in such a case. + * Checks if there exists any row in the default partition that passes the + * check for constraints of new partition, if any reports an error. grammar two conflicting ifs in the same statement. You may want to rephrase this as "This function checks if there exists a row in the default partition that fits in the new partition and throws an error if it finds one." + if (new_spec->strategy != PARTITION_STRATEGY_LIST) + return; This should probably be an Assert. When default range partition is supported this function would silently return, meaning there is no row in the default partition which fits the new partition. We don't want that behavior. The code in check_default_allows_bound() to check whether the default partition has any rows that would fit new partition looks quite similar to the code in ATExecAttachPartition() checking whether all rows in the table being attached as a partition fit the partition bounds. One thing that check_default_allows_bound() misses is, if there's already a constraint on the default partition refutes the partition constraint on the new partition, we can skip the scan of the default partition since it can not have rows that would fit the new partition. ATExecAttachPartition() has code to deal with a similar case i.e. the table being attached has a constraint which implies the partition constraint. There may be more cases which check_default_allows_bound() does not handle but ATExecAttachPartition() handles. So, I am wondering whether it's better to somehow take out the common code into a function and use it. We will have to deal with a difference through. The first one would throw an error when finding a row that satisfies partition constraints whereas the second one would throw an error when it doesn't find such a row. But this difference can be handled through a flag or by negating the constraint. This would also take care of Amit Langote's complaint about foreign partitions. There's also another difference that the ATExecAttachPartition() queues the table for scan and the actual scan takes place in ATRewriteTable(), but there is not such queue while creating a table as a partition. But we should check if we can reuse the code to scan the heap for checking a constraint. In case of ATTACH PARTITION, probably we should schedule scan of default partition in the alter table's work queue like what ATExecAttachPartition() is doing for the table being attached. That would fit in the way alter table works. make_partition_op_expr(PartitionKey key, int keynum, - uint16 strategy, Expr *arg1, Expr *arg2) + uint16 strategy, Expr *arg1, Expr *arg2, bool is_default) Indentation + if (is_default && + ((operoid = get_negator(operoid)) == InvalidOid)) + ereport(ERROR, (errcode(ERRCODE_RESTRICT_VIOLATION), + errmsg("DEFAULT partition cannot be used without negator of operator %s", + get_opname(operoid)))); + If the existence of default partition depends upon the negator, shouldn't there be a dependency between the default partition and the negator. At the time of creating the default partition, we will try to constuct the partition constraint for the default partition and if the negator doesn't exist that time, it will throw an error. But in an unlikely event when the user drops the negator, the partitioned table will not be usable at all, as every time it will try to create the relcache, it will try to create default partition constraint and will throw error because of missing negator. That's not a very good scenario. Have you tried this case? Apart from that, while restoring a dump, if the default partition gets restored before the negator is created, restore will fail with this error. /* Generate the main expression, i.e., keyCol = ANY (arr) */ opexpr = make_partition_op_expr(key, 0, BTEqualStrategyNumber, - keyCol, (Expr *) arr); + keyCol, (Expr *) arr, spec->is_default); /* Build leftop = ANY (rightop)*/ saopexpr = makeNode(ScalarArrayOpExpr); The comments in both the places need correction, as for default partition the expression will be keyCol <> ALL(arr). + /* + * In case of the default partition for list, the partition constraint + * is basically any value that is not equal to any of the values in + * boundinfo->datums array. So, construct a list of constants from + * boundinfo->datums to pass to function make_partition_op_expr via + * ArrayExpr, which would return a negated expression for the default + * partition. + */ This is misleading, since the actual constraint would also have NOT NULL or IS NULL in there depending upon the existence of a NULL partition. I would simply rephrase this as "For default list partition, collect lists for all the partitions. The default partition constraint should check that the partition key is equal to none of those." + ndatums = (pdesc->nparts > 0) ? boundinfo->ndatums : 0; wouldn't ndatums be simply boundinfo->ndatums? When nparts = 0, ndatums will be 0. + int ndatums = 0; This assignment looks redundant then. + if (boundinfo && partition_bound_accepts_nulls(boundinfo)) You have not checked existence of boundinfo when extracting ndatums out of it and just few lines below you check that. If the later check is required then we will get a segfault while extracting ndatums. + if ((!list_has_null && !spec->is_default) || + (list_has_null && spec->is_default)) Need a comment explaining what's going on here. The condition is no more a simple condition. - result = -1; - *failed_at = parent; - *failed_slot = slot; - break; + if (partition_bound_has_default(partdesc->boundinfo)) + { + result = parent->indexes[partdesc->boundinfo->default_index]; + + if (result >= 0) + break; + else + parent = pd[-result]; + } + else + { + result = -1; + *failed_at = parent; + *failed_slot = slot; + break; + } The code to handle result is duplicated here and few lines below. I think it would be better to not duplicate it by having separate condition blocks to deal with setting result and setting parent. Basically if (cur_index < 0) ... else would set the result breaking when setting result = -1 explicitly. A follow-on block would adjust the parent if result < 0 or break otherwise. Both the places where DEFAULT_PARTITION_INDEX is used, its result is used to fetch OID of the default partition. So, instead of having this macro, may be we should have macro to fetch OID of default partition. But even there I don't see much value in that. Further, the macro and code using that macro fetches rd_partdesc directly from Relation. We have RelationGetPartitionDesc() for that. Probably we should also add Asserts to check that every pointer in the long pointer chain is Non-null. InvalidateDefaultPartitionRelcache() is called in case of drop and detach. Shouldn't the constraint change when we add or attach a new partition. Shouldn't we invalidate the cache then as well? I am not able to find that code in your patch. /* + * Default partition constraints are constructed run-time from the + * constraints of its siblings(basically by negating them), so any + * change in the siblings needs to rebuild the constraints of the + * default partition. So, invalidate the sibling default partition's + * relcache. + */ May be rephrase this as "The default partition constraints depend upon the partition bounds of other partitions. Detaching a partition invalidates the default partition constraints. Invalidate the default partition's relcache so that the constraints are built anew and any plans dependent on those constraints are invalidated as well." + errmsg("default partition is supported only for list partitioned table"))); for "a" list partitioned table. + /* + * A default partition, that can be partition of either LIST or + * RANGE partitioned table. + * Currently this is supported only for LIST partition. + */ Keep everything in single paragraph without line break. } + ; unnecessary extra line. + /* + * The default partition bound does not have any datums to be + * transformed, return the new bound. + */ Probably not needed. + if (spec->is_default && (strategy == PARTITION_STRATEGY_LIST || + strategy == PARTITION_STRATEGY_RANGE)) + { + appendStringInfoString(buf, "DEFAULT"); + break; + } + What happens if strategy is something other than RANGE or LIST. For that matter why not just LIST? Possibly you could write this as + if (spec->is_default) + { + Assert(strategy == PARTITION_STRATEGY_LIST); + appendStringInfoString(buf, "DEFAULT"); + break; + } @@ -2044,7 +2044,7 @@ psql_completion(const char *text, int start, int end) COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables,""); /* Limited completion support for partition bound specification*/ else if (TailMatches3("ATTACH", "PARTITION", MatchAny)) - COMPLETE_WITH_CONST("FOR VALUES"); + COMPLETE_WITH_LIST2("FOR VALUES", "DEFAULT"); else if (TailMatches2("FOR", "VALUES")) COMPLETE_WITH_LIST2("FROM(", "IN ("); @@ -2483,7 +2483,7 @@ psql_completion(const char *text, int start, int end) COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_partitioned_tables,""); /* Limited completion support for partition boundspecification */ else if (TailMatches3("PARTITION", "OF", MatchAny)) - COMPLETE_WITH_CONST("FOR VALUES"); + COMPLETE_WITH_LIST2("FOR VALUES", "DEFAULT"); Do we include psql tab completion in the main feature patch? I have not seen this earlier. But appreciate taking care of these defails. +char *ExecBuildSlotValueDescription(Oid reloid, needs an "extern" declaration. On Fri, Jun 2, 2017 at 1:05 AM, Jeevan Ladhe <jeevan.ladhe@enterprisedb.com> wrote: > Hi, > > I have addressed Ashutosh's and Amit's comments in the attached patch. > > Please let me know if I have missed anything and any further comments. > > PFA. > > Regards, > Jeevan Ladhe > > On Wed, May 31, 2017 at 9:50 AM, Beena Emerson <memissemerson@gmail.com> > wrote: >> >> On Wed, May 31, 2017 at 8:13 AM, Amit Langote >> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> > On 2017/05/31 9:33, Amit Langote wrote: >> > >> > >> > In get_rule_expr(): >> > >> > case PARTITION_STRATEGY_LIST: >> > Assert(spec->listdatums != NIL); >> > >> > + /* >> > + * If the boundspec is of Default partition, it >> > does >> > + * not have list of datums, but has only one >> > node to >> > + * indicate its a default partition. >> > + */ >> > + if (isDefaultPartitionBound( >> > + (Node *) >> > linitial(spec->listdatums))) >> > + { >> > + appendStringInfoString(buf, "DEFAULT"); >> > + break; >> > + } >> > + >> > >> > How about adding this part before the switch (key->strategy)? That way, >> > we won't have to come back and add this again when we add range default >> > partitions. >> >> I think it is best that we add a bool is_default to PartitionBoundSpec >> and then have a general check for both list and range. Though >> listdatums, upperdatums and lowerdatums are set to default for a >> DEFAULt partition, it does not seem proper that we check listdatums >> for range as well. >> >> >> >> >> -- >> >> Beena Emerson >> >> EnterpriseDB: http://www.enterprisedb.com >> The Enterprise PostgreSQL Company > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Kuntal GhoshДата:
Сообщение: Re: [HACKERS] Why does logical replication launcher set application_name?
Следующее
От: Amit KapilaДата:
Сообщение: Re: [HACKERS] COPY (query) TO ... doesn't allow parallelism