Discussion: Allowing extensions to supply operator-/function-specific info
Over in the thread at [1], we realized that PostGIS has been thrashing around trying to fake its way to having "special index operators", ie a way to automatically convert WHERE clauses into lossy index quals. That's existed in a non-extensible way inside indxpath.c for twenty years come July. Since the beginning I've thought we should provide a way for extensions to do similar things, but it never got to the top of the to-do queue. Now I think it's time.

One low-effort answer is to add a hook call in indxpath.c that lets extensions manipulate the sets of index clauses extracted from a relation's qual clauses, but I don't especially like that: it dumps all the work onto extensions, resulting in lots of code duplication, plus they have a badly-documented and probably moving target for what they have to do.

Another bit of technical debt that's even older is the lack of a way to attach selectivity estimation logic to boolean-returning functions. So that motivates me to think that whatever we do here should be easily extensible to allow different sorts of function- or operator-related knowledge to be supplied by extensions. We already have oprrest, oprjoin, and protransform hooks that allow certain kinds of knowledge to be attached to operators and functions, but we need something a bit more general.

What I'm envisioning therefore is that we allow an auxiliary function to be attached to any operator or function that can provide functionality like this, and that we set things up so that the set of tasks that such functions can perform can be extended over time without SQL-level changes. For example, we could say that the function takes a single Node* argument, and that the type of Node tells it what to do, and if it doesn't recognize the type of Node it should just return NULL indicating "use default handling".
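To make the "single Node* argument, NULL means default handling" convention concrete, here is a minimal self-contained sketch of the dispatch pattern. It deliberately does not use the PostgreSQL headers: every type and function name below (MockNode, MockSelectivityRequest, mock_support, and so on) is a hypothetical stand-in invented for illustration, and the 0.01 estimate is a made-up number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the backend's node machinery; not real headers. */
typedef enum MockNodeTag
{
    T_MockSelectivityRequest,
    T_MockIndexQualRequest,
    T_MockFutureRequest         /* a request type added in some later release */
} MockNodeTag;

typedef struct MockNode
{
    MockNodeTag type;           /* every request begins with its tag */
} MockNode;

typedef struct MockSelectivityRequest
{
    MockNode    node;           /* must be first, so the pointer casts are valid */
    double      selectivity;    /* output: estimated fraction of rows passing */
} MockSelectivityRequest;

/*
 * Support function: look at the tag, handle the request types we know,
 * and return NULL ("use default handling") for everything else.
 */
void *
mock_support(MockNode *rawreq)
{
    assert(rawreq != NULL);

    if (rawreq->type == T_MockSelectivityRequest)
    {
        MockSelectivityRequest *req = (MockSelectivityRequest *) rawreq;

        req->selectivity = 0.01;    /* made-up estimate, for illustration only */
        return req;
    }

    return NULL;                /* unrecognized request type */
}
```

The point of the design is that a caller can introduce a new request type later without breaking old support functions: an old function simply falls through to the final return and the caller applies its default behavior.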
We'd start out with two relevant Node types, one for the selectivity-estimation case and one for the extract-a-lossy-index-qual case, and we could add more over time.

What we can do to attach such a support function to a target function is to repurpose the pg_proc.protransform column to represent the support function. The existing protransform functions already have nearly the sort of API I'm thinking about, but they only accept FuncExpr* not any other node type. It'd be easy to change them though, because there's only about a dozen and they are all in core; we never invented any way for extensions to access that functionality. (So actually, the initial API spec here would include three possibilities, the third one being equivalent to the current protransform behavior.)

As for attaching support functions to operators, we could consider widening the pg_operator catalog to add a new column. But I think it might be a cleaner answer to just say "use the support function attached to the operator's implementation function, if there is one". This would require that the support functions be able to cope with either FuncExpr or OpExpr inputs, but that does not seem like much of a burden as long as it's part of the API spec from day one.

Since there isn't any SQL API for attaching support functions, we'd have to add one, but adding another clause to CREATE FUNCTION isn't all that hard. (Annoyingly, we haven't created any cheaply extensible syntax for CREATE FUNCTION, so this'd likely require adding another keyword. I'm not interested in doing more than that right now, though.)

I'd be inclined to rename pg_proc.protransform to "prosupport" to reflect its wider responsibility, and make the new CREATE FUNCTION clause be "SUPPORT FUNCTION foo" or some such. I'm not wedded to that terminology, if anyone has a better idea.

One thing that's not entirely clear to me is what permissions would be required to use that clause.
The support functions will have signature "f(internal) returns internal", so creating them at all will require superuser privilege, but it seems like we probably also need to restrict the ability to attach one to a target function --- attaching one to the wrong function could probably have bad consequences. The easy way out is to say "you must be superuser"; maybe that's enough for now, since all the plausible use-cases for this are in extensions containing C functions anyway. (A support function would have to be coded in C, although it seems possible that its target function could be something else.)

Thoughts? If we have agreement on this basic design, making it happen seems like a pretty straightforward task.

regards, tom lane

PS: there is, however, a stumbling block that I'll address in a separate message, as it seems independent of this infrastructure.

[1] https://www.postgresql.org/message-id/flat/CACowWR0TXXL0NfPMW2afCKzX++nHHBZLW3-BLusu_B0WjBB1=A@mail.gmail.com
I wrote:
> What I'm envisioning therefore is that we allow an auxiliary function to
> be attached to any operator or function that can provide functionality
> like this, and that we set things up so that the set of tasks that
> such functions can perform can be extended over time without SQL-level
> changes.

Here are some draft patches in pursuit of this goal.

0001 redefines the API for protransform functions, renames that pg_proc column to prosupport, and likewise renames the existing transform functions to be xxx_support. There are no actual functionality changes in this step. I needed to reindent the existing code in the transform functions, so for ease of review, the -review patch uses "git diff -b" to suppress most of the reindentation diffs. If you want to actually apply the patch for testing, use the -apply version.

Possibly worth noting is that I chose to just remove timestamp_zone_transform and timestamp_izone_transform, rather than change them from one no-op state to another. We left them in place in commit c22ecc656 to avoid a catversion bump, but that argument no longer applies, and there seems little likelihood that we'll need them soon.

0002 adds the ability to attach a support function via CREATE/ALTER FUNCTION, and adds the necessary pg_dump and ruleutils support for that. The only thing that's not pretty mechanical about that is that ALTER FUNCTION needs the ability to replace a dependency on a previous support function. For that, we should use changeDependencyFor() ... but there's a problem, which is that that function can't cope with the case where the existing dependency is on a pinned object. We'd left that unimplemented, arguing that it wasn't really necessary for the existing usage of that function to change schema dependencies. But it seems fairly likely that the case would occur for support functions, so I went ahead and fixed changeDependencyFor() to handle it.
That leads to a change in the alter_table regression test, which was pedantically verifying that the limitation existed.

(We could alternatively leave out the ability to set this option in ALTER FUNCTION, requiring people to use CREATE OR REPLACE FUNCTION for it. But I'm figuring that extension update scripts will want to add support functions to existing functions, so it'd be tedious to not be able to do it with a simple ALTER.)

0003 is where something useful happens. It extends the API to allow support functions to define the selectivity estimates, cost estimates, and rowcount estimates (for set-returning functions) of their target functions. I can't overstate how important this is: it's retiring technical debt that has been there for decades. As proof of concept, there is a quick hack in the regression tests that teaches the planner to make accurate rowcount estimates for generate_series(int, int) with constant or estimatable arguments.

There's a considerable amount of follow-up work that ought to happen now to make use of these capabilities for places that have been pain points in the past, such as generate_series() and unnest(). But I haven't touched that yet.

Still to be done is to provide an API responding to Paul's original problem, i.e. allowing an extension to generate lossy index clauses when one of its operators or functions appears in WHERE. That's going to be more complex than 0003 --- for one thing, I think I'd like to try to refactor the existing hard-wired cases in indxpath.c so that they live in datatype-specific support functions instead of the core index code.

But first, I'd like to push forward with committing what I've got. I think this is pretty damn compelling already, even if nothing further got done for v12. Is anybody interested in reviewing?
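For readers wondering what an "accurate rowcount estimate" for generate_series(int, int) amounts to when both arguments are known constants, the arithmetic can be sketched as below. This is only an illustration of the calculation, not the code from the 0003 patch: the function name series_rows_estimate is hypothetical, and the sketch assumes the two-argument form with its implicit step of 1.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of rows generate_series(start, stop)
 * produces with the default step of 1.
 */
double
series_rows_estimate(int32_t start, int32_t stop)
{
    /* An empty range yields no rows at all. */
    if (stop < start)
        return 0.0;

    /* Widen to 64 bits before subtracting so extreme bounds cannot overflow. */
    return (double) ((int64_t) stop - (int64_t) start + 1);
}
```

A support function answering a rowcount request could return a figure like this whenever both arguments are constants, and decline (leaving the default estimate in place) when they are not.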
regards, tom lane diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index af4d062..6dd0700 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -5146,11 +5146,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l </row> <row> - <entry><structfield>protransform</structfield></entry> + <entry><structfield>prosupport</structfield></entry> <entry><type>regproc</type></entry> <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry> - <entry>Calls to this function can be simplified by this other function - (see <xref linkend="xfunc-transform-functions"/>)</entry> + <entry>Optional planner support function for this function + (see <xref linkend="xfunc-optimization"/>)</entry> </row> <row> diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index e18272c..d70aa6e 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -3241,40 +3241,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray </para> </sect2> - <sect2 id="xfunc-transform-functions"> - <title>Transform Functions</title> - - <para> - Some function calls can be simplified during planning based on - properties specific to the function. For example, - <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>. - To define such function-specific optimizations, write a - <firstterm>transform function</firstterm> and place its OID in the - <structfield>protransform</structfield> field of the primary function's - <structname>pg_proc</structname> entry. The transform function must have the SQL - signature <literal>protransform(internal) RETURNS internal</literal>. The - argument, actually <type>FuncExpr *</type>, is a dummy node representing a - call to the primary function. 
If the transform function's study of the - expression tree proves that a simplified expression tree can substitute - for all possible concrete calls represented thereby, build and return - that simplified expression. Otherwise, return a <literal>NULL</literal> - pointer (<emphasis>not</emphasis> a SQL null). - </para> - - <para> - We make no guarantee that <productname>PostgreSQL</productname> will never call the - primary function in cases that the transform function could simplify. - Ensure rigorous equivalence between the simplified expression and an - actual call to the primary function. - </para> - - <para> - Currently, this facility is not exposed to users at the SQL level - because of security concerns, so it is only practical to use for - optimizing built-in functions. - </para> - </sect2> - <sect2> <title>Shared Memory and LWLocks</title> @@ -3388,3 +3354,89 @@ if (!ptr) </sect2> </sect1> + + <sect1 id="xfunc-optimization"> + <title>Function Optimization Information</title> + + <indexterm zone="xfunc-optimization"> + <primary>optimization information</primary> + <secondary>for functions</secondary> + </indexterm> + + <para> + By default, a function is just a <quote>black box</quote> that the + database system knows very little about the behavior of. However, + that means that queries using the function may be executed much less + efficiently than they could be. It is possible to supply additional + knowledge that helps the planner optimize function calls. + </para> + + <para> + Some basic facts can be supplied by declarative annotations provided in + the <xref linkend="sql-createfunction"/> command. Most important of + these is the function's <link linkend="xfunc-volatility">volatility + category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>, + or <literal>VOLATILE</literal>); one should always be careful to + specify this correctly when defining a function. 
+ The parallel safety property (<literal>PARALLEL + UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or + <literal>PARALLEL SAFE</literal>) must also be specified if you hope + to use the function in parallelized queries. + It can also be useful to specify the function's estimated execution + cost, and/or the number of rows a set-returning function is estimated + to return. However, the declarative way of specifying those two + facts only allows specifying a constant value, which is often + inadequate. + </para> + + <para> + It is also possible to attach a <firstterm>planner support + function</firstterm> to a SQL-callable function (called + its <firstterm>target function</firstterm>), and thereby provide + knowledge about the target function that is too complex to be + represented declaratively. Planner support functions have to be + written in C (although their target functions might not be), so this is + an advanced feature that relatively few people will use. + </para> + + <para> + A planner support function must have the SQL signature +<programlisting> +supportfn(internal) returns internal +</programlisting> + It is attached to its target function by specifying + the <literal>SUPPORT</literal> clause when creating the target function. + </para> + + <para> + The details of the API for planner support functions can be found in + file <filename>src/include/nodes/supportnodes.h</filename> in the + <productname>PostgreSQL</productname> source code. Here we provide + just an overview of what planner support functions can do. + The set of possible requests to a support function is extensible, + so more things might be possible in future versions. + </para> + + <para> + Some function calls can be simplified during planning based on + properties specific to the function. For example, + <literal>int4mul(n, 1)</literal> could be simplified to + just <literal>n</literal>. 
This type of transformation can be + performed by a planner support function, by having it implement + the <literal>SupportRequestSimplify</literal> request type. + The support function will be called for each instance of its target + function found in a query parse tree. If it finds that the particular + call can be simplified into some other form, it can build and return a + parse tree representing that expression. This will automatically work + for operators based on the function, too — in the example just + given, <literal>n * 1</literal> would also be simplified to + <literal>n</literal>. + (But note that this is just an example; this particular + optimization is not actually performed by + standard <productname>PostgreSQL</productname>.) + We make no guarantee that <productname>PostgreSQL</productname> will + never call the target function in cases that the support function could + simplify. Ensure rigorous equivalence between the simplified + expression and an actual execution of the target function. + </para> + </sect1> diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml index 2f5560a..260e43c 100644 --- a/doc/src/sgml/xoper.sgml +++ b/doc/src/sgml/xoper.sgml @@ -78,6 +78,11 @@ SELECT (a + b) AS c FROM test_complex; <sect1 id="xoper-optimization"> <title>Operator Optimization Information</title> + <indexterm zone="xoper-optimization"> + <primary>optimization information</primary> + <secondary>for operators</secondary> + </indexterm> + <para> A <productname>PostgreSQL</productname> operator definition can include several optional clauses that tell the system useful things about how @@ -97,6 +102,13 @@ SELECT (a + b) AS c FROM test_complex; the ones that release &version; understands. </para> + <para> + It is also possible to attach a planner support function to the function + that underlies an operator, providing another way of telling the system + about the behavior of the operator. + See <xref linkend="xfunc-optimization"/> for more information. 
+ </para> + <sect2> <title><literal>COMMUTATOR</literal></title> diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index db78061..3a86f1e 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -319,7 +319,7 @@ ProcedureCreate(const char *procedureName, values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost); values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows); values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType); - values[Anum_pg_proc_protransform - 1] = ObjectIdGetDatum(InvalidOid); + values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid); values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind); values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer); values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof); diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index f0ef102..061a855 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -32,6 +32,7 @@ #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/planmain.h" @@ -4269,13 +4270,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod, args, funcvariadic, func_tuple, context); - if (!newexpr && allow_non_const && OidIsValid(func_form->protransform)) + if (!newexpr && allow_non_const && OidIsValid(func_form->prosupport)) { /* - * Build a dummy FuncExpr node containing the simplified arg list. We - * use this approach to present a uniform interface to the transform - * function regardless of how the function is actually being invoked. + * Build a SupportRequestSimplify node to pass to the support + * function, pointing to a dummy FuncExpr node containing the + * simplified arg list. 
We use this approach to present a uniform + * interface to the support function regardless of how the target + * function is actually being invoked. */ + SupportRequestSimplify req; FuncExpr fexpr; fexpr.xpr.type = T_FuncExpr; @@ -4289,9 +4293,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod, fexpr.args = args; fexpr.location = -1; + req.type = T_SupportRequestSimplify; + req.root = context->root; + req.fcall = &fexpr; + newexpr = (Expr *) - DatumGetPointer(OidFunctionCall1(func_form->protransform, - PointerGetDatum(&fexpr))); + DatumGetPointer(OidFunctionCall1(func_form->prosupport, + PointerGetDatum(&req))); + + /* catch a possible API misunderstanding */ + Assert(newexpr != (Expr *) &fexpr); } if (!newexpr && allow_non_const) diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c index 3810e4a..cf5a1c6 100644 --- a/src/backend/utils/adt/date.c +++ b/src/backend/utils/adt/date.c @@ -24,6 +24,7 @@ #include "access/xact.h" #include "libpq/pqformat.h" #include "miscadmin.h" +#include "nodes/supportnodes.h" #include "parser/scansup.h" #include "utils/array.h" #include "utils/builtins.h" @@ -1341,15 +1342,25 @@ make_time(PG_FUNCTION_ARGS) } -/* time_transform() - * Flatten calls to time_scale() and timetz_scale() that solely represent - * increases in allowed precision. +/* time_support() + * + * Planner support function for the time_scale() and timetz_scale() + * length coercion functions (we need not distinguish them here). 
*/ Datum -time_transform(PG_FUNCTION_ARGS) +time_support(PG_FUNCTION_ARGS) { - PG_RETURN_POINTER(TemporalTransform(MAX_TIME_PRECISION, - (Node *) PG_GETARG_POINTER(0))); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + + ret = TemporalSimplify(MAX_TIME_PRECISION, (Node *) req->fcall); + } + + PG_RETURN_POINTER(ret); } /* time_scale() diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c index 61dbd05..0068e71 100644 --- a/src/backend/utils/adt/datetime.c +++ b/src/backend/utils/adt/datetime.c @@ -4462,16 +4462,23 @@ CheckDateTokenTables(void) } /* - * Common code for temporal protransform functions. Types time, timetz, - * timestamp and timestamptz each have a range of allowed precisions. An - * unspecified precision is rigorously equivalent to the highest specifiable - * precision. + * Common code for temporal prosupport functions: simplify, if possible, + * a call to a temporal type's length-coercion function. + * + * Types time, timetz, timestamp and timestamptz each have a range of allowed + * precisions. An unspecified precision is rigorously equivalent to the + * highest specifiable precision. We can replace the function call with a + * no-op RelabelType if it is coercing to the same or higher precision as the + * input is known to have. + * + * The input Node is always a FuncExpr, but to reduce the #include footprint + * of datetime.h, we declare it as Node *. * * Note: timestamp_scale throws an error when the typmod is out of range, but * we can't get there from a cast: our typmodin will have caught it already. 
*/ Node * -TemporalTransform(int32 max_precis, Node *node) +TemporalSimplify(int32 max_precis, Node *node) { FuncExpr *expr = castNode(FuncExpr, node); Node *ret = NULL; diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index 45cd1a0..1c9deeb 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -34,6 +34,7 @@ #include "libpq/pqformat.h" #include "miscadmin.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "utils/array.h" #include "utils/builtins.h" #include "utils/float.h" @@ -890,19 +891,25 @@ numeric_send(PG_FUNCTION_ARGS) /* - * numeric_transform() - + * numeric_support() * - * Flatten calls to numeric's length coercion function that solely represent - * increases in allowable precision. Scale changes mutate every datum, so - * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an - * unconstrained numeric, so a change from an unconstrained numeric to any - * constrained numeric is also unoptimizable. + * Planner support function for the numeric() length coercion function. + * + * Flatten calls that solely represent increases in allowable precision. + * Scale changes mutate every datum, so they are unoptimizable. Some values, + * e.g. 1E-1001, can only fit into an unconstrained numeric, so a change from + * an unconstrained numeric to any constrained numeric is also unoptimizable. 
*/ Datum -numeric_transform(PG_FUNCTION_ARGS) +numeric_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -920,16 +927,18 @@ numeric_transform(PG_FUNCTION_ARGS) int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff; /* - * If new_typmod < VARHDRSZ, the destination is unconstrained; that's - * always OK. If old_typmod >= VARHDRSZ, the source is constrained, - * and we're OK if the scale is unchanged and the precision is not - * decreasing. See further notes in function header comment. + * If new_typmod < VARHDRSZ, the destination is unconstrained; + * that's always OK. If old_typmod >= VARHDRSZ, the source is + * constrained, and we're OK if the scale is unchanged and the + * precision is not decreasing. See further notes in function + * header comment. */ if (new_typmod < (int32) VARHDRSZ || (old_typmod >= (int32) VARHDRSZ && new_scale == old_scale && new_precision >= old_precision)) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c index 7befb6a..e0ef2f7 100644 --- a/src/backend/utils/adt/timestamp.c +++ b/src/backend/utils/adt/timestamp.c @@ -29,6 +29,7 @@ #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "parser/scansup.h" #include "utils/array.h" #include "utils/builtins.h" @@ -297,15 +298,26 @@ timestamptypmodout(PG_FUNCTION_ARGS) } -/* timestamp_transform() - * Flatten calls to timestamp_scale() and timestamptz_scale() that solely - * represent increases in allowed precision. 
+/* + * timestamp_support() + * + * Planner support function for the timestamp_scale() and timestamptz_scale() + * length coercion functions (we need not distinguish them here). */ Datum -timestamp_transform(PG_FUNCTION_ARGS) +timestamp_support(PG_FUNCTION_ARGS) { - PG_RETURN_POINTER(TemporalTransform(MAX_TIMESTAMP_PRECISION, - (Node *) PG_GETARG_POINTER(0))); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + + ret = TemporalSimplify(MAX_TIMESTAMP_PRECISION, (Node *) req->fcall); + } + + PG_RETURN_POINTER(ret); } /* timestamp_scale() @@ -1235,16 +1247,25 @@ intervaltypmodleastfield(int32 typmod) } -/* interval_transform() +/* + * interval_support() + * + * Planner support function for interval_scale(). + * * Flatten superfluous calls to interval_scale(). The interval typmod is * complex to permit accepting and regurgitating all SQL standard variations. * For truncation purposes, it boils down to a single, simple granularity. */ Datum -interval_transform(PG_FUNCTION_ARGS) +interval_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -1277,9 +1298,9 @@ interval_transform(PG_FUNCTION_ARGS) /* * Cast is a no-op if least field stays the same or decreases - * while precision stays the same or increases. But precision, - * which is to say, sub-second precision, only affects ranges that - * include SECOND. + * while precision stays the same or increases. But + * precision, which is to say, sub-second precision, only + * affects ranges that include SECOND. 
*/ noop = (new_least_field <= old_least_field) && (old_least_field > 0 /* SECOND */ || @@ -1289,6 +1310,7 @@ interval_transform(PG_FUNCTION_ARGS) if (noop) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } @@ -1359,7 +1381,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod) * can't do it consistently. (We cannot enforce a range limit on the * highest expected field, since we do not have any equivalent of * SQL's <interval leading field precision>.) If we ever decide to - * revisit this, interval_transform will likely require adjusting. + * revisit this, interval_support will likely require adjusting. * * Note: before PG 8.4 we interpreted a limited set of fields as * actually causing a "modulo" operation on a given value, potentially @@ -5020,18 +5042,6 @@ interval_part(PG_FUNCTION_ARGS) } -/* timestamp_zone_transform() - * The original optimization here caused problems by relabeling Vars that - * could be matched to index entries. It might be possible to resurrect it - * at some point by teaching the planner to be less cavalier with RelabelType - * nodes, but that will take careful analysis. - */ -Datum -timestamp_zone_transform(PG_FUNCTION_ARGS) -{ - PG_RETURN_POINTER(NULL); -} - /* timestamp_zone() * Encode timestamp type with specified time zone. * This function is just timestamp2timestamptz() except instead of @@ -5125,18 +5135,6 @@ timestamp_zone(PG_FUNCTION_ARGS) PG_RETURN_TIMESTAMPTZ(result); } -/* timestamp_izone_transform() - * The original optimization here caused problems by relabeling Vars that - * could be matched to index entries. It might be possible to resurrect it - * at some point by teaching the planner to be less cavalier with RelabelType - * nodes, but that will take careful analysis. - */ -Datum -timestamp_izone_transform(PG_FUNCTION_ARGS) -{ - PG_RETURN_POINTER(NULL); -} - /* timestamp_izone() * Encode timestamp type with specified time interval as time zone. 
*/ diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c index 1585da0..fdcc620 100644 --- a/src/backend/utils/adt/varbit.c +++ b/src/backend/utils/adt/varbit.c @@ -20,6 +20,7 @@ #include "common/int.h" #include "libpq/pqformat.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "utils/array.h" #include "utils/builtins.h" #include "utils/varbit.h" @@ -672,16 +673,24 @@ varbit_send(PG_FUNCTION_ARGS) } /* - * varbit_transform() - * Flatten calls to varbit's length coercion function that set the new maximum - * length >= the previous maximum length. We can ignore the isExplicit - * argument, since that only affects truncation cases. + * varbit_support() + * + * Planner support function for the varbit() length coercion function. + * + * Currently, the only interesting thing we can do is flatten calls that set + * the new maximum length >= the previous maximum length. We can ignore the + * isExplicit argument, since that only affects truncation cases. 
*/ Datum -varbit_transform(PG_FUNCTION_ARGS) +varbit_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -699,6 +708,7 @@ varbit_transform(PG_FUNCTION_ARGS) if (new_max <= 0 || (old_max > 0 && old_max <= new_max)) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c index 5cf927e..c866af0 100644 --- a/src/backend/utils/adt/varchar.c +++ b/src/backend/utils/adt/varchar.c @@ -21,6 +21,7 @@ #include "catalog/pg_type.h" #include "libpq/pqformat.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "utils/array.h" #include "utils/builtins.h" #include "utils/varlena.h" @@ -547,16 +548,24 @@ varcharsend(PG_FUNCTION_ARGS) /* - * varchar_transform() - * Flatten calls to varchar's length coercion function that set the new maximum - * length >= the previous maximum length. We can ignore the isExplicit - * argument, since that only affects truncation cases. + * varchar_support() + * + * Planner support function for the varchar() length coercion function. + * + * Currently, the only interesting thing we can do is flatten calls that set + * the new maximum length >= the previous maximum length. We can ignore the + * isExplicit argument, since that only affects truncation cases. 
*/ Datum -varchar_transform(PG_FUNCTION_ARGS) +varchar_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -574,6 +583,7 @@ varchar_transform(PG_FUNCTION_ARGS) if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max)) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index 245fcbf..4ff358a 100644 --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl @@ -1883,9 +1883,9 @@ my %tests = ( 'CREATE TRANSFORM FOR int' => { create_order => 34, create_sql => - 'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_transform(internal), TO SQL WITH FUNCTIONint4recv(internal));', + 'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTIONint4recv(internal));', regexp => - qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_transform\(internal\),TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m, + qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_support\(internal\),TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m, like => { %full_runs, section_pre_data => 1, }, }, @@ -2880,7 +2880,7 @@ my %tests = ( procost, prorows, provariadic, - protransform, + prosupport, prokind, prosecdef, proleakproof, @@ -2912,7 +2912,7 @@ my %tests = ( \QGRANT SELECT(procost) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(prorows) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(provariadic) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* - \QGRANT SELECT(protransform) ON TABLE 
pg_catalog.pg_proc TO PUBLIC;\E\n.*
+        \QGRANT SELECT(prosupport) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
         \QGRANT SELECT(prokind) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
         \QGRANT SELECT(prosecdef) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
         \QGRANT SELECT(proleakproof) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ecc2e1..e5cb5bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1326,11 +1326,11 @@
 { oid => '668', descr => 'adjust char() to typmod length',
   proname => 'bpchar', prorettype => 'bpchar',
   proargtypes => 'bpchar int4 bool', prosrc => 'bpchar' },
-{ oid => '3097', descr => 'transform a varchar length coercion',
-  proname => 'varchar_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varchar_transform' },
+{ oid => '3097', descr => 'planner support for varchar length coercion',
+  proname => 'varchar_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varchar_support' },
 { oid => '669', descr => 'adjust varchar() to typmod length',
-  proname => 'varchar', protransform => 'varchar_transform',
+  proname => 'varchar', prosupport => 'varchar_support',
   prorettype => 'varchar', proargtypes => 'varchar int4 bool',
   prosrc => 'varchar' },

@@ -1954,13 +1954,9 @@

 # OIDS 1000 - 1999

-{ oid => '3994', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_izone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_izone_transform' },
 { oid => '1026', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamp', proargtypes => 'interval timestamptz',
-  prosrc => 'timestamptz_izone' },
+  proname => 'timezone', prorettype => 'timestamp',
+  proargtypes => 'interval timestamptz', prosrc => 'timestamptz_izone' },
 { oid => '1031', descr => 'I/O',
   proname => 'aclitemin', 
provolatile => 's', prorettype => 'aclitem',
@@ -2190,13 +2186,9 @@
 { oid => '1158', descr => 'convert UNIX epoch to timestamptz',
   proname => 'to_timestamp', prorettype => 'timestamptz',
   proargtypes => 'float8', prosrc => 'float8_timestamptz' },
-{ oid => '3995', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_zone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_zone_transform' },
 { oid => '1159', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamp', proargtypes => 'text timestamptz',
-  prosrc => 'timestamptz_zone' },
+  proname => 'timezone', prorettype => 'timestamp',
+  proargtypes => 'text timestamptz', prosrc => 'timestamptz_zone' },
 { oid => '1160', descr => 'I/O',
   proname => 'interval_in', provolatile => 's', prorettype => 'interval',
@@ -2301,11 +2293,11 @@

 # OIDS 1200 - 1299

-{ oid => '3918', descr => 'transform an interval length coercion',
-  proname => 'interval_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'interval_transform' },
+{ oid => '3918', descr => 'planner support for interval length coercion',
+  proname => 'interval_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'interval_support' },
 { oid => '1200', descr => 'adjust interval precision',
-  proname => 'interval', protransform => 'interval_transform',
+  proname => 'interval', prosupport => 'interval_support',
   prorettype => 'interval', proargtypes => 'interval int4',
   prosrc => 'interval_scale' },

@@ -3713,13 +3705,12 @@
 { oid => '1685', descr => 'adjust bit() to typmod length',
   proname => 'bit', prorettype => 'bit', proargtypes => 'bit int4 bool',
   prosrc => 'bit' },
-{ oid => '3158', descr => 'transform a varbit length coercion',
-  proname => 'varbit_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varbit_transform' },
+{ oid => '3158', descr => 'planner support 
for varbit length coercion',
+  proname => 'varbit_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varbit_support' },
 { oid => '1687', descr => 'adjust varbit() to typmod length',
-  proname => 'varbit', protransform => 'varbit_transform',
-  prorettype => 'varbit', proargtypes => 'varbit int4 bool',
-  prosrc => 'varbit' },
+  proname => 'varbit', prosupport => 'varbit_support', prorettype => 'varbit',
+  proargtypes => 'varbit int4 bool', prosrc => 'varbit' },
 { oid => '1698', descr => 'position of sub-bitstring',
   proname => 'position', prorettype => 'int4', proargtypes => 'bit bit',
@@ -4081,11 +4072,11 @@
 { oid => '2918', descr => 'I/O typmod',
   proname => 'numerictypmodout', prorettype => 'cstring', proargtypes => 'int4',
   prosrc => 'numerictypmodout' },
-{ oid => '3157', descr => 'transform a numeric length coercion',
-  proname => 'numeric_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'numeric_transform' },
+{ oid => '3157', descr => 'planner support for numeric length coercion',
+  proname => 'numeric_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'numeric_support' },
 { oid => '1703', descr => 'adjust numeric to typmod precision/scale',
-  proname => 'numeric', protransform => 'numeric_transform',
+  proname => 'numeric', prosupport => 'numeric_support',
   prorettype => 'numeric', proargtypes => 'numeric int4', prosrc => 'numeric' },
 { oid => '1704',
   proname => 'numeric_abs', prorettype => 'numeric', proargtypes => 'numeric',
@@ -5448,15 +5439,15 @@
   proname => 'bytea_sortsupport', prorettype => 'void',
   proargtypes => 'internal', prosrc => 'bytea_sortsupport' },

-{ oid => '3917', descr => 'transform a timestamp length coercion',
-  proname => 'timestamp_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_transform' },
-{ oid => '3944', descr => 'transform a time length coercion',
-  proname => 'time_transform', prorettype => 'internal',
-  proargtypes => 
'internal', prosrc => 'time_transform' },
+{ oid => '3917', descr => 'planner support for timestamp length coercion',
+  proname => 'timestamp_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'timestamp_support' },
+{ oid => '3944', descr => 'planner support for time length coercion',
+  proname => 'time_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'time_support' },
 { oid => '1961', descr => 'adjust timestamp precision',
-  proname => 'timestamp', protransform => 'timestamp_transform',
+  proname => 'timestamp', prosupport => 'timestamp_support',
   prorettype => 'timestamp', proargtypes => 'timestamp int4',
   prosrc => 'timestamp_scale' },

@@ -5468,14 +5459,14 @@
   prosrc => 'oidsmaller' },

 { oid => '1967', descr => 'adjust timestamptz precision',
-  proname => 'timestamptz', protransform => 'timestamp_transform',
+  proname => 'timestamptz', prosupport => 'timestamp_support',
   prorettype => 'timestamptz', proargtypes => 'timestamptz int4',
   prosrc => 'timestamptz_scale' },
 { oid => '1968', descr => 'adjust time precision',
-  proname => 'time', protransform => 'time_transform', prorettype => 'time',
+  proname => 'time', prosupport => 'time_support', prorettype => 'time',
   proargtypes => 'time int4', prosrc => 'time_scale' },
 { oid => '1969', descr => 'adjust time with time zone precision',
-  proname => 'timetz', protransform => 'time_transform', prorettype => 'timetz',
+  proname => 'timetz', prosupport => 'time_support', prorettype => 'timetz',
   proargtypes => 'timetz int4', prosrc => 'timetz_scale' },
 { oid => '2003',
@@ -5662,13 +5653,11 @@
   prosrc => 'select pg_catalog.age(cast(current_date as timestamp without time zone), $1)' },
 { oid => '2069', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamptz', proargtypes => 'text timestamp',
-  prosrc => 'timestamp_zone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 
'text timestamp', prosrc => 'timestamp_zone' },
 { oid => '2070', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamptz', proargtypes => 'interval timestamp',
-  prosrc => 'timestamp_izone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 'interval timestamp', prosrc => 'timestamp_izone' },
 { oid => '2071',
   proname => 'date_pl_interval', prorettype => 'timestamp',
   proargtypes => 'date interval', prosrc => 'date_pl_interval' },
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c2bb951..b433769 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -53,8 +53,8 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
     /* element type of variadic array, or 0 */
     Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type);

-    /* transforms calls to it during planning */
-    regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc);
+    /* planner support function for this function, or 0 if none */
+    regproc prosupport BKI_DEFAULT(0) BKI_LOOKUP(pg_proc);

     /* see PROKIND_ categories below */
     char prokind BKI_DEFAULT(f);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 10dac60..e029b40 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -505,7 +505,8 @@ typedef enum NodeTag
     T_IndexAmRoutine, /* in access/amapi.h */
     T_TsmRoutine, /* in access/tsmapi.h */
     T_ForeignKeyCacheInfo, /* in utils/rel.h */
-    T_CallContext /* in nodes/parsenodes.h */
+    T_CallContext, /* in nodes/parsenodes.h */
+    T_SupportRequestSimplify /* in nodes/supportnodes.h */
 } NodeTag;

 /*
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
new file mode 100644
index 0000000..1f7d02b
--- /dev/null
+++ b/src/include/nodes/supportnodes.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * supportnodes.h
+ * 
Definitions for planner support functions.
+ *
+ * This file defines the API for "planner support functions", which
+ * are SQL functions (normally written in C) that can be attached to
+ * another "target" function to give the system additional knowledge
+ * about the target function. All the current capabilities have to do
+ * with planning queries that use the target function, though it is
+ * possible that future extensions will add functionality to be invoked
+ * by the parser or executor.
+ *
+ * A support function must have the SQL signature
+ *        supportfn(internal) returns internal
+ * The argument is a pointer to one of the Node types defined in this file.
+ * The result is usually also a Node pointer, though its type depends on
+ * which capability is being invoked. In all cases, a NULL pointer result
+ * (that's PG_RETURN_POINTER(NULL), not PG_RETURN_NULL()) indicates that
+ * the support function cannot do anything useful for the given request.
+ * Support functions must return a NULL pointer, not fail, if they do not
+ * recognize the request node type or cannot handle the given case; this
+ * allows for future extensions of the set of request cases.
+ *
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/supportnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUPPORTNODES_H
+#define SUPPORTNODES_H
+
+#include "nodes/primnodes.h"
+
+struct PlannerInfo; /* avoid including relation.h here */
+
+
+/*
+ * The Simplify request allows the support function to perform plan-time
+ * simplification of a call to its target function. For example, a varchar
+ * length coercion that does not decrease the allowed length of its argument
+ * could be replaced by a RelabelType node, or "x + 0" could be replaced by
+ * "x". 
This is invoked during the planner's constant-folding pass, so the
+ * function's arguments can be presumed already simplified.
+ *
+ * The planner's PlannerInfo "root" is typically not needed, but can be
+ * consulted if it's necessary to obtain info about Vars present in
+ * the given node tree. Beware that root could be NULL in some usages.
+ *
+ * "fcall" will be a FuncExpr invoking the support function's target
+ * function. (This is true even if the original parsetree node was an
+ * operator call; a FuncExpr is synthesized for this purpose.)
+ *
+ * The result should be a semantically-equivalent transformed node tree,
+ * or NULL if no simplification could be performed. Do *not* return or
+ * modify *fcall, as it isn't really a separately allocated Node. But
+ * it's okay to use fcall->args, or parts of it, in the result tree.
+ */
+typedef struct SupportRequestSimplify
+{
+    NodeTag type;
+
+    struct PlannerInfo *root; /* Planner's infrastructure */
+    FuncExpr *fcall; /* Function call to be simplified */
+} SupportRequestSimplify;
+
+#endif /* SUPPORTNODES_H */
diff --git a/src/include/utils/datetime.h b/src/include/utils/datetime.h
index f5ec9bb..87f819e 100644
--- a/src/include/utils/datetime.h
+++ b/src/include/utils/datetime.h
@@ -330,7 +330,7 @@ extern int DecodeUnits(int field, char *lowtoken, int *val);

 extern int j2day(int jd);

-extern Node *TemporalTransform(int32 max_precis, Node *node);
+extern Node *TemporalSimplify(int32 max_precis, Node *node);

 extern bool CheckDateTokenTables(void);

diff --git a/src/test/modules/test_ddl_deparse/expected/create_transform.out b/src/test/modules/test_ddl_deparse/expected/create_transform.out
index 0d1cc36..da7fea2 100644
--- a/src/test/modules/test_ddl_deparse/expected/create_transform.out
+++ b/src/test/modules/test_ddl_deparse/expected/create_transform.out
@@ -7,7 +7,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 NOTICE: DDL test: type simple, tag CREATE TRANSFORM
 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/modules/test_ddl_deparse/sql/create_transform.sql b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
index 0968702..132fc5a 100644
--- a/src/test/modules/test_ddl_deparse/sql/create_transform.sql
+++ b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
@@ -8,7 +8,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));

 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 4085e45..c89ec06 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -38,7 +38,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer";
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user;
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user;
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable;
 CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE);
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index ef268d3..4edc817 100644
--- a/src/test/regress/expected/oidjoins.out
+++ 
b/src/test/regress/expected/oidjoins.out
@@ -809,12 +809,12 @@ WHERE provariadic != 0 AND
 ------+-------------
 (0 rows)

-SELECT ctid, protransform
+SELECT ctid, prosupport
 FROM pg_catalog.pg_proc fk
-WHERE protransform != 0 AND
-    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform);
- ctid | protransform
-------+--------------
+WHERE prosupport != 0 AND
+    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport);
+ ctid | prosupport
+------+------------
 (0 rows)

 SELECT ctid, prorettype
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 7328095..ce25ee0 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -453,10 +453,10 @@ WHERE proallargtypes IS NOT NULL AND
 -----+---------+-------------+----------------+-------------
 (0 rows)

--- Check for protransform functions with the wrong signature
+-- Check for prosupport functions with the wrong signature
 SELECT p1.oid, p1.proname, p2.oid, p2.proname
 FROM pg_proc AS p1, pg_proc AS p2
-WHERE p2.oid = p1.protransform AND
+WHERE p2.oid = p1.prosupport AND
     (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1
      OR p2.proargtypes[0] != 'internal'::regtype);
  oid | proname | oid | proname
diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql
index d7df322..fd79465 100644
--- a/src/test/regress/sql/object_address.sql
+++ b/src/test/regress/sql/object_address.sql
@@ -41,7 +41,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer";
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user;
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user;
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION 
int4recv(internal));
 CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable;
 CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE);
diff --git a/src/test/regress/sql/oidjoins.sql b/src/test/regress/sql/oidjoins.sql
index c8291d3..dbe4a58 100644
--- a/src/test/regress/sql/oidjoins.sql
+++ b/src/test/regress/sql/oidjoins.sql
@@ -405,10 +405,10 @@ SELECT ctid, provariadic
 FROM pg_catalog.pg_proc fk
 WHERE provariadic != 0 AND
     NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type pk WHERE pk.oid = fk.provariadic);
-SELECT ctid, protransform
+SELECT ctid, prosupport
 FROM pg_catalog.pg_proc fk
-WHERE protransform != 0 AND
-    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform);
+WHERE prosupport != 0 AND
+    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport);
 SELECT ctid, prorettype
 FROM pg_catalog.pg_proc fk
 WHERE prorettype != 0 AND
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 8544cbe..e2014fc 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -353,10 +353,10 @@ WHERE proallargtypes IS NOT NULL AND
     FROM generate_series(1, array_length(proallargtypes, 1)) g(i)
     WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v'));

--- Check for protransform functions with the wrong signature
+-- Check for prosupport functions with the wrong signature
 SELECT p1.oid, p1.proname, p2.oid, p2.proname
 FROM pg_proc AS p1, pg_proc AS p2
-WHERE p2.oid = p1.protransform AND
+WHERE p2.oid = p1.prosupport AND
     (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1
      OR p2.proargtypes[0] != 'internal'::regtype);

diff --git a/src/tools/findoidjoins/README b/src/tools/findoidjoins/README
index 305454a..e5fc310 100644
--- a/src/tools/findoidjoins/README
+++ b/src/tools/findoidjoins/README
@@ -161,7 +161,7 @@ Join pg_catalog.pg_proc.pronamespace => pg_catalog.pg_namespace.oid
 Join 
pg_catalog.pg_proc.proowner => pg_catalog.pg_authid.oid
 Join pg_catalog.pg_proc.prolang => pg_catalog.pg_language.oid
 Join pg_catalog.pg_proc.provariadic => pg_catalog.pg_type.oid
-Join pg_catalog.pg_proc.protransform => pg_catalog.pg_proc.oid
+Join pg_catalog.pg_proc.prosupport => pg_catalog.pg_proc.oid
 Join pg_catalog.pg_proc.prorettype => pg_catalog.pg_type.oid
 Join pg_catalog.pg_range.rngtypid => pg_catalog.pg_type.oid
 Join pg_catalog.pg_range.rngsubtype => pg_catalog.pg_type.oid
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index af4d062..6dd0700 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5146,11 +5146,11 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
      </row>

      <row>
-      <entry><structfield>protransform</structfield></entry>
+      <entry><structfield>prosupport</structfield></entry>
       <entry><type>regproc</type></entry>
       <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
-      <entry>Calls to this function can be simplified by this other function
-       (see <xref linkend="xfunc-transform-functions"/>)</entry>
+      <entry>Optional planner support function for this function
+       (see <xref linkend="xfunc-optimization"/>)</entry>
      </row>

      <row>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index e18272c..d70aa6e 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3241,40 +3241,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
     </para>
    </sect2>

-   <sect2 id="xfunc-transform-functions">
-    <title>Transform Functions</title>
-
-    <para>
-     Some function calls can be simplified during planning based on
-     properties specific to the function. For example,
-     <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>.
-     To define such function-specific optimizations, write a
-     <firstterm>transform function</firstterm> and place its OID in the
-     <structfield>protransform</structfield> field of the primary function's
-     <structname>pg_proc</structname> entry. The transform function must have the SQL
-     signature <literal>protransform(internal) RETURNS internal</literal>. The
-     argument, actually <type>FuncExpr *</type>, is a dummy node representing a
-     call to the primary function. If the transform function's study of the
-     expression tree proves that a simplified expression tree can substitute
-     for all possible concrete calls represented thereby, build and return
-     that simplified expression. Otherwise, return a <literal>NULL</literal>
-     pointer (<emphasis>not</emphasis> a SQL null).
-    </para>
-
-    <para>
-     We make no guarantee that <productname>PostgreSQL</productname> will never call the
-     primary function in cases that the transform function could simplify.
-     Ensure rigorous equivalence between the simplified expression and an
-     actual call to the primary function.
-    </para>
-
-    <para>
-     Currently, this facility is not exposed to users at the SQL level
-     because of security concerns, so it is only practical to use for
-     optimizing built-in functions.
-    </para>
-   </sect2>
-
    <sect2>
     <title>Shared Memory and LWLocks</title>

@@ -3388,3 +3354,89 @@ if (!ptr)
    </sect2>
   </sect1>
+
+  <sect1 id="xfunc-optimization">
+   <title>Function Optimization Information</title>
+
+  <indexterm zone="xfunc-optimization">
+   <primary>optimization information</primary>
+   <secondary>for functions</secondary>
+  </indexterm>
+
+   <para>
+    By default, a function is just a <quote>black box</quote> that the
+    database system knows very little about the behavior of. However,
+    that means that queries using the function may be executed much less
+    efficiently than they could be. It is possible to supply additional
+    knowledge that helps the planner optimize function calls.
+   </para>
+
+   <para>
+    Some basic facts can be supplied by declarative annotations provided in
+    the <xref linkend="sql-createfunction"/> command. Most important of
+    these is the function's <link linkend="xfunc-volatility">volatility
+    category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>,
+    or <literal>VOLATILE</literal>); one should always be careful to
+    specify this correctly when defining a function.
+    The parallel safety property (<literal>PARALLEL
+    UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or
+    <literal>PARALLEL SAFE</literal>) must also be specified if you hope
+    to use the function in parallelized queries.
+    It can also be useful to specify the function's estimated execution
+    cost, and/or the number of rows a set-returning function is estimated
+    to return. However, the declarative way of specifying those two
+    facts only allows specifying a constant value, which is often
+    inadequate.
+   </para>
+
+   <para>
+    It is also possible to attach a <firstterm>planner support
+    function</firstterm> to a SQL-callable function (called
+    its <firstterm>target function</firstterm>), and thereby provide
+    knowledge about the target function that is too complex to be
+    represented declaratively. Planner support functions have to be
+    written in C (although their target functions might not be), so this is
+    an advanced feature that relatively few people will use.
+   </para>
+
+   <para>
+    A planner support function must have the SQL signature
+<programlisting>
+supportfn(internal) returns internal
+</programlisting>
+    It is attached to its target function by specifying
+    the <literal>SUPPORT</literal> clause when creating the target function.
+   </para>
+
+   <para>
+    The details of the API for planner support functions can be found in
+    file <filename>src/include/nodes/supportnodes.h</filename> in the
+    <productname>PostgreSQL</productname> source code. Here we provide
+    just an overview of what planner support functions can do.
+    The set of possible requests to a support function is extensible,
+    so more things might be possible in future versions.
+   </para>
+
+   <para>
+    Some function calls can be simplified during planning based on
+    properties specific to the function. For example,
+    <literal>int4mul(n, 1)</literal> could be simplified to
+    just <literal>n</literal>. This type of transformation can be
+    performed by a planner support function, by having it implement
+    the <literal>SupportRequestSimplify</literal> request type.
+    The support function will be called for each instance of its target
+    function found in a query parse tree. If it finds that the particular
+    call can be simplified into some other form, it can build and return a
+    parse tree representing that expression. This will automatically work
+    for operators based on the function, too &mdash; in the example just
+    given, <literal>n * 1</literal> would also be simplified to
+    <literal>n</literal>.
+    (But note that this is just an example; this particular
+    optimization is not actually performed by
+    standard <productname>PostgreSQL</productname>.)
+    We make no guarantee that <productname>PostgreSQL</productname> will
+    never call the target function in cases that the support function could
+    simplify. Ensure rigorous equivalence between the simplified
+    expression and an actual execution of the target function.
+   </para>
+  </sect1>
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml
index 2f5560a..260e43c 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -78,6 +78,11 @@ SELECT (a + b) AS c FROM test_complex;
  <sect1 id="xoper-optimization">
   <title>Operator Optimization Information</title>

+  <indexterm zone="xoper-optimization">
+   <primary>optimization information</primary>
+   <secondary>for operators</secondary>
+  </indexterm>
+
   <para>
    A <productname>PostgreSQL</productname> operator definition can include
    several optional clauses that tell the system useful things about how
@@ -97,6 +102,13 @@ SELECT (a + b) AS c FROM test_complex;
    the ones that release &version; understands.
   </para>

+  <para>
+   It is also possible to attach a planner support function to the function
+   that underlies an operator, providing another way of telling the system
+   about the behavior of the operator.
+   See <xref linkend="xfunc-optimization"/> for more information.
+  </para>
+
   <sect2>
    <title><literal>COMMUTATOR</literal></title>

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index db78061..3a86f1e 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -319,7 +319,7 @@ ProcedureCreate(const char *procedureName,
     values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost);
     values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows);
     values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
-    values[Anum_pg_proc_protransform - 1] = ObjectIdGetDatum(InvalidOid);
+    values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid);
     values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind);
     values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
     values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index f0ef102..061a855 100644
--- a/src/backend/optimizer/util/clauses.c
+++ 
b/src/backend/optimizer/util/clauses.c
@@ -32,6 +32,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/planmain.h"
@@ -4269,13 +4270,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
                                    args, funcvariadic,
                                    func_tuple, context);

-    if (!newexpr && allow_non_const && OidIsValid(func_form->protransform))
+    if (!newexpr && allow_non_const && OidIsValid(func_form->prosupport))
     {
         /*
-         * Build a dummy FuncExpr node containing the simplified arg list. We
-         * use this approach to present a uniform interface to the transform
-         * function regardless of how the function is actually being invoked.
+         * Build a SupportRequestSimplify node to pass to the support
+         * function, pointing to a dummy FuncExpr node containing the
+         * simplified arg list. We use this approach to present a uniform
+         * interface to the support function regardless of how the target
+         * function is actually being invoked.
          */
+        SupportRequestSimplify req;
         FuncExpr fexpr;

         fexpr.xpr.type = T_FuncExpr;
@@ -4289,9 +4293,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
         fexpr.args = args;
         fexpr.location = -1;

+        req.type = T_SupportRequestSimplify;
+        req.root = context->root;
+        req.fcall = &fexpr;
+
         newexpr = (Expr *)
-            DatumGetPointer(OidFunctionCall1(func_form->protransform,
-                                             PointerGetDatum(&fexpr)));
+            DatumGetPointer(OidFunctionCall1(func_form->prosupport,
+                                             PointerGetDatum(&req)));
+
+        /* catch a possible API misunderstanding */
+        Assert(newexpr != (Expr *) &fexpr);
     }

     if (!newexpr && allow_non_const)
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 3810e4a..cf5a1c6 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -24,6 +24,7 @@
 #include "access/xact.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -1341,15 +1342,25 @@ make_time(PG_FUNCTION_ARGS)
 }


-/* time_transform()
- * Flatten calls to time_scale() and timetz_scale() that solely represent
- * increases in allowed precision.
+/* time_support()
+ *
+ * Planner support function for the time_scale() and timetz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-time_transform(PG_FUNCTION_ARGS)
+time_support(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_POINTER(TemporalTransform(MAX_TIME_PRECISION,
-                                        (Node *) PG_GETARG_POINTER(0)));
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestSimplify))
+    {
+        SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+        ret = TemporalSimplify(MAX_TIME_PRECISION, (Node *) req->fcall);
+    }
+
+    PG_RETURN_POINTER(ret);
 }

 /* time_scale()
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 61dbd05..0068e71 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -4462,16 +4462,23 @@ CheckDateTokenTables(void)
 }

 /*
- * Common code for temporal protransform functions. Types time, timetz,
- * timestamp and timestamptz each have a range of allowed precisions. An
- * unspecified precision is rigorously equivalent to the highest specifiable
- * precision.
+ * Common code for temporal prosupport functions: simplify, if possible,
+ * a call to a temporal type's length-coercion function.
+ *
+ * Types time, timetz, timestamp and timestamptz each have a range of allowed
+ * precisions. An unspecified precision is rigorously equivalent to the
+ * highest specifiable precision. We can replace the function call with a
+ * no-op RelabelType if it is coercing to the same or higher precision as the
+ * input is known to have.
+ *
+ * The input Node is always a FuncExpr, but to reduce the #include footprint
+ * of datetime.h, we declare it as Node *.
  *
  * Note: timestamp_scale throws an error when the typmod is out of range, but
  * we can't get there from a cast: our typmodin will have caught it already.
  */
 Node *
-TemporalTransform(int32 max_precis, Node *node)
+TemporalSimplify(int32 max_precis, Node *node)
 {
 	FuncExpr *expr = castNode(FuncExpr, node);
 	Node *ret = NULL;
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 45cd1a0..1c9deeb 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -34,6 +34,7 @@
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/float.h"
@@ -890,45 +891,53 @@ numeric_send(PG_FUNCTION_ARGS)


 /*
- * numeric_transform() -
+ * numeric_support()
  *
- * Flatten calls to numeric's length coercion function that solely represent
- * increases in allowable precision. Scale changes mutate every datum, so
- * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an
- * unconstrained numeric, so a change from an unconstrained numeric to any
- * constrained numeric is also unoptimizable.
+ * Planner support function for the numeric() length coercion function.
+ *
+ * Flatten calls that solely represent increases in allowable precision.
+ * Scale changes mutate every datum, so they are unoptimizable. Some values,
+ * e.g. 1E-1001, can only fit into an unconstrained numeric, so a change from
+ * an unconstrained numeric to any constrained numeric is also unoptimizable.
  */
 Datum
-numeric_transform(PG_FUNCTION_ARGS)
+numeric_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;

-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;

-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);

-	if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 old_typmod = exprTypmod(source);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_scale = (old_typmod - VARHDRSZ) & 0xffff;
-		int32 new_scale = (new_typmod - VARHDRSZ) & 0xffff;
-		int32 old_precision = (old_typmod - VARHDRSZ) >> 16 & 0xffff;
-		int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;
+		typmod = (Node *) lsecond(expr->args);

-		/*
-		 * If new_typmod < VARHDRSZ, the destination is unconstrained; that's
-		 * always OK. If old_typmod >= VARHDRSZ, the source is constrained,
-		 * and we're OK if the scale is unchanged and the precision is not
-		 * decreasing. See further notes in function header comment.
-		 */
-		if (new_typmod < (int32) VARHDRSZ ||
-			(old_typmod >= (int32) VARHDRSZ &&
-			 new_scale == old_scale && new_precision >= old_precision))
-			ret = relabel_to_typmod(source, new_typmod);
+		if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 old_typmod = exprTypmod(source);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_scale = (old_typmod - VARHDRSZ) & 0xffff;
+			int32 new_scale = (new_typmod - VARHDRSZ) & 0xffff;
+			int32 old_precision = (old_typmod - VARHDRSZ) >> 16 & 0xffff;
+			int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;
+
+			/*
+			 * If new_typmod < VARHDRSZ, the destination is unconstrained;
+			 * that's always OK. If old_typmod >= VARHDRSZ, the source is
+			 * constrained, and we're OK if the scale is unchanged and the
+			 * precision is not decreasing. See further notes in function
+			 * header comment.
+			 */
+			if (new_typmod < (int32) VARHDRSZ ||
+				(old_typmod >= (int32) VARHDRSZ &&
+				 new_scale == old_scale && new_precision >= old_precision))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}

 	PG_RETURN_POINTER(ret);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 7befb6a..e0ef2f7 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -29,6 +29,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -297,15 +298,26 @@ timestamptypmodout(PG_FUNCTION_ARGS)
 }


-/* timestamp_transform()
- * Flatten calls to timestamp_scale() and timestamptz_scale() that solely
- * represent increases in allowed precision.
+/*
+ * timestamp_support()
+ *
+ * Planner support function for the timestamp_scale() and timestamptz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-timestamp_transform(PG_FUNCTION_ARGS)
+timestamp_support(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_POINTER(TemporalTransform(MAX_TIMESTAMP_PRECISION,
-										(Node *) PG_GETARG_POINTER(0)));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+		ret = TemporalSimplify(MAX_TIMESTAMP_PRECISION, (Node *) req->fcall);
+	}
+
+	PG_RETURN_POINTER(ret);
 }

 /* timestamp_scale()
@@ -1235,59 +1247,69 @@ intervaltypmodleastfield(int32 typmod)
 }


-/* interval_transform()
+/*
+ * interval_support()
+ *
+ * Planner support function for interval_scale().
+ *
  * Flatten superfluous calls to interval_scale(). The interval typmod is
  * complex to permit accepting and regurgitating all SQL standard variations.
  * For truncation purposes, it boils down to a single, simple granularity.
  */
 Datum
-interval_transform(PG_FUNCTION_ARGS)
+interval_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;

-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;

-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);

-	if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		bool noop;
+		typmod = (Node *) lsecond(expr->args);

-		if (new_typmod < 0)
-			noop = true;
-		else
+		if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
 		{
-			int32 old_typmod = exprTypmod(source);
-			int old_least_field;
-			int new_least_field;
-			int old_precis;
-			int new_precis;
-
-			old_least_field = intervaltypmodleastfield(old_typmod);
-			new_least_field = intervaltypmodleastfield(new_typmod);
-			if (old_typmod < 0)
-				old_precis = INTERVAL_FULL_PRECISION;
+			Node *source = (Node *) linitial(expr->args);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			bool noop;
+
+			if (new_typmod < 0)
+				noop = true;
 			else
-				old_precis = INTERVAL_PRECISION(old_typmod);
-			new_precis = INTERVAL_PRECISION(new_typmod);
-
-			/*
-			 * Cast is a no-op if least field stays the same or decreases
-			 * while precision stays the same or increases. But precision,
-			 * which is to say, sub-second precision, only affects ranges that
-			 * include SECOND.
-			 */
-			noop = (new_least_field <= old_least_field) &&
-				(old_least_field > 0 /* SECOND */ ||
-				 new_precis >= MAX_INTERVAL_PRECISION ||
-				 new_precis >= old_precis);
+			{
+				int32 old_typmod = exprTypmod(source);
+				int old_least_field;
+				int new_least_field;
+				int old_precis;
+				int new_precis;
+
+				old_least_field = intervaltypmodleastfield(old_typmod);
+				new_least_field = intervaltypmodleastfield(new_typmod);
+				if (old_typmod < 0)
+					old_precis = INTERVAL_FULL_PRECISION;
+				else
+					old_precis = INTERVAL_PRECISION(old_typmod);
+				new_precis = INTERVAL_PRECISION(new_typmod);
+
+				/*
+				 * Cast is a no-op if least field stays the same or decreases
+				 * while precision stays the same or increases. But
+				 * precision, which is to say, sub-second precision, only
+				 * affects ranges that include SECOND.
+				 */
+				noop = (new_least_field <= old_least_field) &&
+					(old_least_field > 0 /* SECOND */ ||
+					 new_precis >= MAX_INTERVAL_PRECISION ||
+					 new_precis >= old_precis);
+			}
+			if (noop)
+				ret = relabel_to_typmod(source, new_typmod);
 		}
-		if (noop)
-			ret = relabel_to_typmod(source, new_typmod);
 	}

 	PG_RETURN_POINTER(ret);
@@ -1359,7 +1381,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
 	 * can't do it consistently. (We cannot enforce a range limit on the
 	 * highest expected field, since we do not have any equivalent of
 	 * SQL's <interval leading field precision>.) If we ever decide to
-	 * revisit this, interval_transform will likely require adjusting.
+	 * revisit this, interval_support will likely require adjusting.
 	 *
 	 * Note: before PG 8.4 we interpreted a limited set of fields as
 	 * actually causing a "modulo" operation on a given value, potentially
@@ -5020,18 +5042,6 @@ interval_part(PG_FUNCTION_ARGS)
 }


-/* timestamp_zone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_zone_transform(PG_FUNCTION_ARGS)
-{
-	PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_zone()
  * Encode timestamp type with specified time zone.
  * This function is just timestamp2timestamptz() except instead of
@@ -5125,18 +5135,6 @@ timestamp_zone(PG_FUNCTION_ARGS)
 	PG_RETURN_TIMESTAMPTZ(result);
 }

-/* timestamp_izone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_izone_transform(PG_FUNCTION_ARGS)
-{
-	PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_izone()
  * Encode timestamp type with specified time interval as time zone.
  */
diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 1585da0..fdcc620 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -20,6 +20,7 @@
 #include "common/int.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/varbit.h"
@@ -672,32 +673,41 @@ varbit_send(PG_FUNCTION_ARGS)
 }

 /*
- * varbit_transform()
- * Flatten calls to varbit's length coercion function that set the new maximum
- * length >= the previous maximum length. We can ignore the isExplicit
- * argument, since that only affects truncation cases.
+ * varbit_support()
+ *
+ * Planner support function for the varbit() length coercion function.
+ *
+ * Currently, the only interesting thing we can do is flatten calls that set
+ * the new maximum length >= the previous maximum length. We can ignore the
+ * isExplicit argument, since that only affects truncation cases.
  */
 Datum
-varbit_transform(PG_FUNCTION_ARGS)
+varbit_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;

-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;

-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);

-	if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_max = exprTypmod(source);
-		int32 new_max = new_typmod;
-
-		/* Note: varbit() treats typmod 0 as invalid, so we do too */
-		if (new_max <= 0 || (old_max > 0 && old_max <= new_max))
-			ret = relabel_to_typmod(source, new_typmod);
+		typmod = (Node *) lsecond(expr->args);
+
+		if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_max = exprTypmod(source);
+			int32 new_max = new_typmod;
+
+			/* Note: varbit() treats typmod 0 as invalid, so we do too */
+			if (new_max <= 0 || (old_max > 0 && old_max <= new_max))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}

 	PG_RETURN_POINTER(ret);
diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c
index 5cf927e..c866af0 100644
--- a/src/backend/utils/adt/varchar.c
+++ b/src/backend/utils/adt/varchar.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_type.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/varlena.h"
@@ -547,32 +548,41 @@ varcharsend(PG_FUNCTION_ARGS)


 /*
- * varchar_transform()
- * Flatten calls to varchar's length coercion function that set the new maximum
- * length >= the previous maximum length. We can ignore the isExplicit
- * argument, since that only affects truncation cases.
+ * varchar_support()
+ *
+ * Planner support function for the varchar() length coercion function.
+ *
+ * Currently, the only interesting thing we can do is flatten calls that set
+ * the new maximum length >= the previous maximum length. We can ignore the
+ * isExplicit argument, since that only affects truncation cases.
  */
 Datum
-varchar_transform(PG_FUNCTION_ARGS)
+varchar_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;

-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;

-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);

-	if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 old_typmod = exprTypmod(source);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_max = old_typmod - VARHDRSZ;
-		int32 new_max = new_typmod - VARHDRSZ;
-
-		if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max))
-			ret = relabel_to_typmod(source, new_typmod);
+		typmod = (Node *) lsecond(expr->args);
+
+		if (IsA(typmod, Const) && !((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 old_typmod = exprTypmod(source);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_max = old_typmod - VARHDRSZ;
+			int32 new_max = new_typmod - VARHDRSZ;
+
+			if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}

 	PG_RETURN_POINTER(ret);
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf..4ff358a 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1883,9 +1883,9 @@ my %tests = (
 	'CREATE TRANSFORM FOR int' => {
 		create_order => 34,
 		create_sql =>
-		  'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_transform(internal), TO SQL WITH FUNCTION int4recv(internal));',
+		  'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal));',
 		regexp =>
-		  qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_transform\(internal\), TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m,
+		  qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_support\(internal\), TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m,
 		like => { %full_runs, section_pre_data => 1, },
 	},

@@ -2880,7 +2880,7 @@ my %tests = (
 			procost,
 			prorows,
 			provariadic,
-			protransform,
+			prosupport,
 			prokind,
 			prosecdef,
 			proleakproof,
@@ -2912,7 +2912,7 @@ my %tests = (
 		\QGRANT SELECT(procost) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prorows) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(provariadic) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
-		\QGRANT SELECT(protransform) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
+		\QGRANT SELECT(prosupport) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prokind) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prosecdef) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(proleakproof) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ecc2e1..e5cb5bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1326,11 +1326,11 @@
 { oid => '668', descr => 'adjust char() to typmod length',
   proname => 'bpchar', prorettype => 'bpchar',
   proargtypes => 'bpchar int4 bool', prosrc => 'bpchar' },
-{ oid => '3097', descr => 'transform a varchar length coercion',
-  proname => 'varchar_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varchar_transform' },
+{ oid => '3097', descr => 'planner support for varchar length coercion',
+  proname => 'varchar_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varchar_support' },
 { oid => '669', descr => 'adjust varchar() to typmod length',
-  proname => 'varchar', protransform => 'varchar_transform',
+  proname => 'varchar', prosupport => 'varchar_support',
   prorettype => 'varchar', proargtypes => 'varchar int4 bool',
   prosrc => 'varchar' },

@@ -1954,13 +1954,9 @@

 # OIDS 1000 - 1999

-{ oid => '3994', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_izone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_izone_transform' },
 { oid => '1026', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamp', proargtypes => 'interval timestamptz',
-  prosrc => 'timestamptz_izone' },
+  proname => 'timezone', prorettype => 'timestamp',
+  proargtypes => 'interval timestamptz', prosrc => 'timestamptz_izone' },

 { oid => '1031', descr => 'I/O',
   proname => 'aclitemin', provolatile => 's', prorettype => 'aclitem',
@@ -2190,13 +2186,9 @@
 { oid => '1158', descr => 'convert UNIX epoch to timestamptz',
   proname => 'to_timestamp', prorettype => 'timestamptz',
   proargtypes => 'float8', prosrc => 'float8_timestamptz' },
-{ oid => '3995', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_zone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_zone_transform' },
 { oid => '1159', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamp', proargtypes => 'text timestamptz',
-  prosrc => 'timestamptz_zone' },
+  proname => 'timezone', prorettype => 'timestamp',
+  proargtypes => 'text timestamptz', prosrc => 'timestamptz_zone' },

 { oid => '1160', descr => 'I/O',
   proname => 'interval_in', provolatile => 's', prorettype => 'interval',
@@ -2301,11 +2293,11 @@

 # OIDS 1200 - 1299

-{ oid => '3918', descr => 'transform an interval length coercion',
-  proname => 'interval_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'interval_transform' },
+{ oid => '3918', descr => 'planner support for interval length coercion',
+  proname => 'interval_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'interval_support' },
 { oid => '1200', descr => 'adjust interval precision',
-  proname => 'interval', protransform => 'interval_transform',
+  proname => 'interval', prosupport => 'interval_support',
   prorettype => 'interval', proargtypes => 'interval int4',
   prosrc => 'interval_scale' },

@@ -3713,13 +3705,12 @@
 { oid => '1685', descr => 'adjust bit() to typmod length',
   proname => 'bit', prorettype => 'bit', proargtypes => 'bit int4 bool',
   prosrc => 'bit' },
-{ oid => '3158', descr => 'transform a varbit length coercion',
-  proname => 'varbit_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varbit_transform' },
+{ oid => '3158', descr => 'planner support for varbit length coercion',
+  proname => 'varbit_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varbit_support' },
 { oid => '1687', descr => 'adjust varbit() to typmod length',
-  proname => 'varbit', protransform => 'varbit_transform',
-  prorettype => 'varbit', proargtypes => 'varbit int4 bool',
-  prosrc => 'varbit' },
+  proname => 'varbit', prosupport => 'varbit_support', prorettype => 'varbit',
+  proargtypes => 'varbit int4 bool', prosrc => 'varbit' },

 { oid => '1698', descr => 'position of sub-bitstring',
   proname => 'position', prorettype => 'int4', proargtypes => 'bit bit',
@@ -4081,11 +4072,11 @@
 { oid => '2918', descr => 'I/O typmod',
   proname => 'numerictypmodout', prorettype => 'cstring',
   proargtypes => 'int4', prosrc => 'numerictypmodout' },
-{ oid => '3157', descr => 'transform a numeric length coercion',
-  proname => 'numeric_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'numeric_transform' },
+{ oid => '3157', descr => 'planner support for numeric length coercion',
+  proname => 'numeric_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'numeric_support' },
 { oid => '1703', descr => 'adjust numeric to typmod precision/scale',
-  proname => 'numeric', protransform => 'numeric_transform',
+  proname => 'numeric', prosupport => 'numeric_support',
   prorettype => 'numeric', proargtypes => 'numeric int4', prosrc => 'numeric' },
 { oid => '1704',
   proname => 'numeric_abs', prorettype => 'numeric', proargtypes => 'numeric',
@@ -5448,15 +5439,15 @@
   proname => 'bytea_sortsupport', prorettype => 'void',
   proargtypes => 'internal', prosrc => 'bytea_sortsupport' },

-{ oid => '3917', descr => 'transform a timestamp length coercion',
-  proname => 'timestamp_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_transform' },
-{ oid => '3944', descr => 'transform a time length coercion',
-  proname => 'time_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'time_transform' },
+{ oid => '3917', descr => 'planner support for timestamp length coercion',
+  proname => 'timestamp_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'timestamp_support' },
+{ oid => '3944', descr => 'planner support for time length coercion',
+  proname => 'time_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'time_support' },

 { oid => '1961', descr => 'adjust timestamp precision',
-  proname => 'timestamp', protransform => 'timestamp_transform',
+  proname => 'timestamp', prosupport => 'timestamp_support',
   prorettype => 'timestamp', proargtypes => 'timestamp int4',
   prosrc => 'timestamp_scale' },

@@ -5468,14 +5459,14 @@
   prosrc => 'oidsmaller' },

 { oid => '1967', descr => 'adjust timestamptz precision',
-  proname => 'timestamptz', protransform => 'timestamp_transform',
+  proname => 'timestamptz', prosupport => 'timestamp_support',
   prorettype => 'timestamptz', proargtypes => 'timestamptz int4',
   prosrc => 'timestamptz_scale' },
 { oid => '1968', descr => 'adjust time precision',
-  proname => 'time', protransform => 'time_transform', prorettype => 'time',
+  proname => 'time', prosupport => 'time_support', prorettype => 'time',
   proargtypes => 'time int4', prosrc => 'time_scale' },
 { oid => '1969', descr => 'adjust time with time zone precision',
-  proname => 'timetz', protransform => 'time_transform', prorettype => 'timetz',
+  proname => 'timetz', prosupport => 'time_support', prorettype => 'timetz',
   proargtypes => 'timetz int4', prosrc => 'timetz_scale' },

 { oid => '2003',
@@ -5662,13 +5653,11 @@
   prosrc => 'select pg_catalog.age(cast(current_date as timestamp without time zone), $1)' },

 { oid => '2069', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamptz', proargtypes => 'text timestamp',
-  prosrc => 'timestamp_zone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 'text timestamp', prosrc => 'timestamp_zone' },
 { oid => '2070', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamptz', proargtypes => 'interval timestamp',
-  prosrc => 'timestamp_izone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 'interval timestamp', prosrc => 'timestamp_izone' },
 { oid => '2071',
   proname => 'date_pl_interval', prorettype => 'timestamp',
   proargtypes => 'date interval', prosrc => 'date_pl_interval' },
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c2bb951..b433769 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -53,8 +53,8 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
 	/* element type of variadic array, or 0 */
 	Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type);

-	/* transforms calls to it during planning */
-	regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc);
+	/* planner support function for this function, or 0 if none */
+	regproc prosupport BKI_DEFAULT(0) BKI_LOOKUP(pg_proc);

 	/* see PROKIND_ categories below */
 	char prokind BKI_DEFAULT(f);
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 10dac60..e029b40 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -505,7 +505,8 @@ typedef enum NodeTag
 	T_IndexAmRoutine,			/* in access/amapi.h */
 	T_TsmRoutine,				/* in access/tsmapi.h */
 	T_ForeignKeyCacheInfo,		/* in utils/rel.h */
-	T_CallContext				/* in nodes/parsenodes.h */
+	T_CallContext,				/* in nodes/parsenodes.h */
+	T_SupportRequestSimplify	/* in nodes/supportnodes.h */
 } NodeTag;

 /*
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
new file mode 100644
index 0000000..1f7d02b
--- /dev/null
+++ b/src/include/nodes/supportnodes.h
@@ -0,0 +1,70 @@
+/*-------------------------------------------------------------------------
+ *
+ * supportnodes.h
+ *	  Definitions for planner support functions.
+ *
+ * This file defines the API for "planner support functions", which
+ * are SQL functions (normally written in C) that can be attached to
+ * another "target" function to give the system additional knowledge
+ * about the target function. All the current capabilities have to do
+ * with planning queries that use the target function, though it is
+ * possible that future extensions will add functionality to be invoked
+ * by the parser or executor.
+ *
+ * A support function must have the SQL signature
+ *		supportfn(internal) returns internal
+ * The argument is a pointer to one of the Node types defined in this file.
+ * The result is usually also a Node pointer, though its type depends on
+ * which capability is being invoked. In all cases, a NULL pointer result
+ * (that's PG_RETURN_POINTER(NULL), not PG_RETURN_NULL()) indicates that
+ * the support function cannot do anything useful for the given request.
+ * Support functions must return a NULL pointer, not fail, if they do not
+ * recognize the request node type or cannot handle the given case; this
+ * allows for future extensions of the set of request cases.
+ *
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/supportnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUPPORTNODES_H
+#define SUPPORTNODES_H
+
+#include "nodes/primnodes.h"
+
+struct PlannerInfo;				/* avoid including relation.h here */
+
+
+/*
+ * The Simplify request allows the support function to perform plan-time
+ * simplification of a call to its target function. For example, a varchar
+ * length coercion that does not decrease the allowed length of its argument
+ * could be replaced by a RelabelType node, or "x + 0" could be replaced by
+ * "x". This is invoked during the planner's constant-folding pass, so the
+ * function's arguments can be presumed already simplified.
+ *
+ * The planner's PlannerInfo "root" is typically not needed, but can be
+ * consulted if it's necessary to obtain info about Vars present in
+ * the given node tree. Beware that root could be NULL in some usages.
+ *
+ * "fcall" will be a FuncExpr invoking the support function's target
+ * function. (This is true even if the original parsetree node was an
+ * operator call; a FuncExpr is synthesized for this purpose.)
+ *
+ * The result should be a semantically-equivalent transformed node tree,
+ * or NULL if no simplification could be performed. Do *not* return or
+ * modify *fcall, as it isn't really a separately allocated Node. But
+ * it's okay to use fcall->args, or parts of it, in the result tree.
+ */
+typedef struct SupportRequestSimplify
+{
+	NodeTag type;
+
+	struct PlannerInfo *root;	/* Planner's infrastructure */
+	FuncExpr *fcall;			/* Function call to be simplified */
+} SupportRequestSimplify;
+
+#endif							/* SUPPORTNODES_H */
diff --git a/src/include/utils/datetime.h b/src/include/utils/datetime.h
index f5ec9bb..87f819e 100644
--- a/src/include/utils/datetime.h
+++ b/src/include/utils/datetime.h
@@ -330,7 +330,7 @@ extern int DecodeUnits(int field, char *lowtoken, int *val);

 extern int j2day(int jd);

-extern Node *TemporalTransform(int32 max_precis, Node *node);
+extern Node *TemporalSimplify(int32 max_precis, Node *node);

 extern bool CheckDateTokenTables(void);

diff --git a/src/test/modules/test_ddl_deparse/expected/create_transform.out b/src/test/modules/test_ddl_deparse/expected/create_transform.out
index 0d1cc36..da7fea2 100644
--- a/src/test/modules/test_ddl_deparse/expected/create_transform.out
+++ b/src/test/modules/test_ddl_deparse/expected/create_transform.out
@@ -7,7 +7,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 NOTICE: DDL test: type simple, tag CREATE TRANSFORM
 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/modules/test_ddl_deparse/sql/create_transform.sql b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
index 0968702..132fc5a 100644
--- a/src/test/modules/test_ddl_deparse/sql/create_transform.sql
+++ b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
@@ -8,7 +8,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));

 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 4085e45..c89ec06 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -38,7 +38,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer";
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user;
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user;
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-	FROM SQL WITH FUNCTION varchar_transform(internal),
+	FROM SQL WITH FUNCTION varchar_support(internal),
 	TO SQL WITH FUNCTION int4recv(internal));
 CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable;
 CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE);
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index ef268d3..4edc817 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -809,12 +809,12 @@ WHERE provariadic != 0 AND
 ------+-------------
 (0 rows)

-SELECT ctid, protransform
+SELECT ctid, prosupport
 FROM pg_catalog.pg_proc fk
-WHERE protransform != 0 AND
-	NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform);
- ctid | protransform
-------+--------------
+WHERE prosupport != 0 AND
+	NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport);
+ ctid | prosupport
+------+------------
 (0 rows)

 SELECT ctid, prorettype
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 7328095..ce25ee0 100644
--- a/src/test/regress/expected/opr_sanity.out +++ b/src/test/regress/expected/opr_sanity.out @@ -453,10 +453,10 @@ WHERE proallargtypes IS NOT NULL AND -----+---------+-------------+----------------+------------- (0 rows) --- Check for protransform functions with the wrong signature +-- Check for prosupport functions with the wrong signature SELECT p1.oid, p1.proname, p2.oid, p2.proname FROM pg_proc AS p1, pg_proc AS p2 -WHERE p2.oid = p1.protransform AND +WHERE p2.oid = p1.prosupport AND (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1 OR p2.proargtypes[0] != 'internal'::regtype); oid | proname | oid | proname diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql index d7df322..fd79465 100644 --- a/src/test/regress/sql/object_address.sql +++ b/src/test/regress/sql/object_address.sql @@ -41,7 +41,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer"; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user; CREATE TRANSFORM FOR int LANGUAGE SQL ( - FROM SQL WITH FUNCTION varchar_transform(internal), + FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal)); CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE); diff --git a/src/test/regress/sql/oidjoins.sql b/src/test/regress/sql/oidjoins.sql index c8291d3..dbe4a58 100644 --- a/src/test/regress/sql/oidjoins.sql +++ b/src/test/regress/sql/oidjoins.sql @@ -405,10 +405,10 @@ SELECT ctid, provariadic FROM pg_catalog.pg_proc fk WHERE provariadic != 0 AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type pk WHERE pk.oid = fk.provariadic); -SELECT ctid, protransform +SELECT ctid, prosupport FROM pg_catalog.pg_proc fk -WHERE protransform != 0 AND - NOT EXISTS(SELECT 1 
FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform); +WHERE prosupport != 0 AND + NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport); SELECT ctid, prorettype FROM pg_catalog.pg_proc fk WHERE prorettype != 0 AND diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql index 8544cbe..e2014fc 100644 --- a/src/test/regress/sql/opr_sanity.sql +++ b/src/test/regress/sql/opr_sanity.sql @@ -353,10 +353,10 @@ WHERE proallargtypes IS NOT NULL AND FROM generate_series(1, array_length(proallargtypes, 1)) g(i) WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v')); --- Check for protransform functions with the wrong signature +-- Check for prosupport functions with the wrong signature SELECT p1.oid, p1.proname, p2.oid, p2.proname FROM pg_proc AS p1, pg_proc AS p2 -WHERE p2.oid = p1.protransform AND +WHERE p2.oid = p1.prosupport AND (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1 OR p2.proargtypes[0] != 'internal'::regtype); diff --git a/src/tools/findoidjoins/README b/src/tools/findoidjoins/README index 305454a..e5fc310 100644 --- a/src/tools/findoidjoins/README +++ b/src/tools/findoidjoins/README @@ -161,7 +161,7 @@ Join pg_catalog.pg_proc.pronamespace => pg_catalog.pg_namespace.oid Join pg_catalog.pg_proc.proowner => pg_catalog.pg_authid.oid Join pg_catalog.pg_proc.prolang => pg_catalog.pg_language.oid Join pg_catalog.pg_proc.provariadic => pg_catalog.pg_type.oid -Join pg_catalog.pg_proc.protransform => pg_catalog.pg_proc.oid +Join pg_catalog.pg_proc.prosupport => pg_catalog.pg_proc.oid Join pg_catalog.pg_proc.prorettype => pg_catalog.pg_type.oid Join pg_catalog.pg_range.rngtypid => pg_catalog.pg_type.oid Join pg_catalog.pg_range.rngsubtype => pg_catalog.pg_type.oid diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml index a37d0b7..fa32a88 100644 --- a/doc/src/sgml/keywords.sgml +++ b/doc/src/sgml/keywords.sgml @@ -4522,6 +4522,13 @@ <entry>reserved</entry> 
</row> <row> + <entry><token>SUPPORT</token></entry> + <entry>non-reserved</entry> + <entry></entry> + <entry></entry> + <entry></entry> + </row> + <row> <entry><token>SYMMETRIC</token></entry> <entry>reserved</entry> <entry>reserved</entry> diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml index d8747e0..03ffa59 100644 --- a/doc/src/sgml/ref/alter_function.sgml +++ b/doc/src/sgml/ref/alter_function.sgml @@ -40,6 +40,7 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param PARALLEL { UNSAFE | RESTRICTED | SAFE } COST <replaceable class="parameter">execution_cost</replaceable> ROWS <replaceable class="parameter">result_rows</replaceable> + SUPPORT <replaceable class="parameter">support_function</replaceable> SET <replaceable class="parameter">configuration_parameter</replaceable> { TO | = } { <replaceable class="parameter">value</replaceable>| DEFAULT } SET <replaceable class="parameter">configuration_parameter</replaceable> FROM CURRENT RESET <replaceable class="parameter">configuration_parameter</replaceable> @@ -248,6 +249,24 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param </listitem> </varlistentry> + <varlistentry> + <term><literal>SUPPORT</literal> <replaceable class="parameter">support_function</replaceable></term> + + <listitem> + <para> + Set or change the planner support function to use for this function. + See <xref linkend="xfunc-optimization"/> for details. You must be + superuser to use this option. + </para> + + <para> + This option cannot be used to remove the support function altogether, + since it must name a new support function. Use <command>CREATE OR + REPLACE FUNCTION</command> if you need to do that. 
+ </para> + </listitem> + </varlistentry> + <varlistentry> <term><replaceable>configuration_parameter</replaceable></term> <term><replaceable>value</replaceable></term> diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml index 4072543..dd6a2f7 100644 --- a/doc/src/sgml/ref/create_function.sgml +++ b/doc/src/sgml/ref/create_function.sgml @@ -33,6 +33,7 @@ CREATE [ OR REPLACE ] FUNCTION | PARALLEL { UNSAFE | RESTRICTED | SAFE } | COST <replaceable class="parameter">execution_cost</replaceable> | ROWS <replaceable class="parameter">result_rows</replaceable> + | SUPPORT <replaceable class="parameter">support_function</replaceable> | SET <replaceable class="parameter">configuration_parameter</replaceable> { TO <replaceable class="parameter">value</replaceable>| = <replaceable class="parameter">value</replaceable> | FROM CURRENT } | AS '<replaceable class="parameter">definition</replaceable>' | AS '<replaceable class="parameter">obj_file</replaceable>', '<replaceable class="parameter">link_symbol</replaceable>' @@ -478,6 +479,19 @@ CREATE [ OR REPLACE ] FUNCTION </varlistentry> <varlistentry> + <term><literal>SUPPORT</literal> <replaceable class="parameter">support_function</replaceable></term> + + <listitem> + <para> + The name (optionally schema-qualified) of a <firstterm>planner support + function</firstterm> to use for this function. See + <xref linkend="xfunc-optimization"/> for details. + You must be superuser to use this option. 
+ </para> + </listitem> + </varlistentry> + + <varlistentry> <term><replaceable>configuration_parameter</replaceable></term> <term><replaceable>value</replaceable></term> <listitem> diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c index cc3806e..19e3171 100644 --- a/src/backend/catalog/pg_aggregate.c +++ b/src/backend/catalog/pg_aggregate.c @@ -632,6 +632,7 @@ AggregateCreate(const char *aggName, parameterDefaults, /* parameterDefaults */ PointerGetDatum(NULL), /* trftypes */ PointerGetDatum(NULL), /* proconfig */ + InvalidOid, /* no prosupport */ 1, /* procost */ 0); /* prorows */ procOid = myself.objectId; diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c index 2b8f651..23b01f8 100644 --- a/src/backend/catalog/pg_depend.c +++ b/src/backend/catalog/pg_depend.c @@ -286,9 +286,12 @@ deleteDependencyRecordsForClass(Oid classId, Oid objectId, * newRefObjectId is the new referenced object (must be of class refClassId). * * Note the lack of objsubid parameters. If there are subobject references - * they will all be readjusted. + * they will all be readjusted. Also, there is an expectation that we are + * dealing with NORMAL dependencies: if we have to replace an (implicit) + * dependency on a pinned object with an explicit dependency on an unpinned + * one, the new one will be NORMAL. * - * Returns the number of records updated. + * Returns the number of records updated -- zero indicates a problem. */ long changeDependencyFor(Oid classId, Oid objectId, @@ -301,35 +304,52 @@ changeDependencyFor(Oid classId, Oid objectId, SysScanDesc scan; HeapTuple tup; ObjectAddress objAddr; + ObjectAddress depAddr; + bool oldIsPinned; bool newIsPinned; depRel = table_open(DependRelationId, RowExclusiveLock); /* - * If oldRefObjectId is pinned, there won't be any dependency entries on - * it --- we can't cope in that case. 
(This isn't really worth expending - * code to fix, in current usage; it just means you can't rename stuff out - * of pg_catalog, which would likely be a bad move anyway.) + * Check to see if either oldRefObjectId or newRefObjectId is pinned. + * Pinned objects should not have any dependency entries pointing to them, + * so in these cases we should add or remove a pg_depend entry, or do + * nothing at all, rather than update an entry as in the normal case. */ objAddr.classId = refClassId; objAddr.objectId = oldRefObjectId; objAddr.objectSubId = 0; - if (isObjectPinned(&objAddr, depRel)) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot remove dependency on %s because it is a system object", - getObjectDescription(&objAddr)))); + oldIsPinned = isObjectPinned(&objAddr, depRel); - /* - * We can handle adding a dependency on something pinned, though, since - * that just means deleting the dependency entry. - */ objAddr.objectId = newRefObjectId; newIsPinned = isObjectPinned(&objAddr, depRel); - /* Now search for dependency records */ + if (oldIsPinned) + { + table_close(depRel, RowExclusiveLock); + + /* + * If both are pinned, we need do nothing. However, return 1 not 0, + * else callers will think this is an error case. + */ + if (newIsPinned) + return 1; + + /* + * There is no old dependency record, but we should insert a new one. + * Assume a normal dependency is wanted. + */ + depAddr.classId = classId; + depAddr.objectId = objectId; + depAddr.objectSubId = 0; + recordDependencyOn(&depAddr, &objAddr, DEPENDENCY_NORMAL); + + return 1; + } + + /* There should be existing dependency record(s), so search. 
*/ ScanKeyInit(&key[0], Anum_pg_depend_classid, BTEqualStrategyNumber, F_OIDEQ, diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index 3a86f1e..557e0ea 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -88,6 +88,7 @@ ProcedureCreate(const char *procedureName, List *parameterDefaults, Datum trftypes, Datum proconfig, + Oid prosupport, float4 procost, float4 prorows) { @@ -319,7 +320,7 @@ ProcedureCreate(const char *procedureName, values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost); values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows); values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType); - values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid); + values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(prosupport); values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind); values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer); values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof); @@ -656,6 +657,15 @@ ProcedureCreate(const char *procedureName, recordDependencyOnExpr(&myself, (Node *) parameterDefaults, NIL, DEPENDENCY_NORMAL); + /* dependency on support function, if any */ + if (OidIsValid(prosupport)) + { + referenced.classId = ProcedureRelationId; + referenced.objectId = prosupport; + referenced.objectSubId = 0; + recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL); + } + /* dependency on owner */ if (!is_update) recordDependencyOnOwner(ProcedureRelationId, retval, proowner); diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index eae2b09..4b13fdb 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -480,6 +480,7 @@ compute_common_attribute(ParseState *pstate, List **set_items, DefElem **cost_item, DefElem **rows_item, + DefElem **support_item, DefElem **parallel_item) { if (strcmp(defel->defname, "volatility") == 0) @@ -538,6 +539,15 @@ 
compute_common_attribute(ParseState *pstate, *rows_item = defel; } + else if (strcmp(defel->defname, "support") == 0) + { + if (is_procedure) + goto procedure_error; + if (*support_item) + goto duplicate_error; + + *support_item = defel; + } else if (strcmp(defel->defname, "parallel") == 0) { if (is_procedure) @@ -636,6 +646,45 @@ update_proconfig_value(ArrayType *a, List *set_items) return a; } +static Oid +interpret_func_support(DefElem *defel) +{ + List *procName = defGetQualifiedName(defel); + Oid procOid; + Oid argList[1]; + + /* + * Support functions always take one INTERNAL argument and return + * INTERNAL. + */ + argList[0] = INTERNALOID; + + procOid = LookupFuncName(procName, 1, argList, true); + if (!OidIsValid(procOid)) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_FUNCTION), + errmsg("function %s does not exist", + func_signature_string(procName, 1, NIL, argList)))); + + if (get_func_rettype(procOid) != INTERNALOID) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("support function %s must return type %s", + NameListToString(procName), "internal"))); + + /* + * Someday we might want an ACL check here; but for now, we insist that + * you be superuser to specify a support function, so privilege on the + * support function is moot. 
+ */ + if (!superuser()) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to specify a support function"))); + + return procOid; +} + /* * Dissect the list of options assembled in gram.y into function @@ -656,6 +705,7 @@ compute_function_attributes(ParseState *pstate, ArrayType **proconfig, float4 *procost, float4 *prorows, + Oid *prosupport, char *parallel_p) { ListCell *option; @@ -670,6 +720,7 @@ compute_function_attributes(ParseState *pstate, List *set_items = NIL; DefElem *cost_item = NULL; DefElem *rows_item = NULL; + DefElem *support_item = NULL; DefElem *parallel_item = NULL; foreach(option, options) @@ -727,6 +778,7 @@ compute_function_attributes(ParseState *pstate, &set_items, &cost_item, &rows_item, + &support_item, &parallel_item)) { /* recognized common option */ @@ -789,6 +841,8 @@ compute_function_attributes(ParseState *pstate, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ROWS must be positive"))); } + if (support_item) + *prosupport = interpret_func_support(support_item); if (parallel_item) *parallel_p = interpret_func_parallel(parallel_item); } @@ -894,6 +948,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) ArrayType *proconfig; float4 procost; float4 prorows; + Oid prosupport; HeapTuple languageTuple; Form_pg_language languageStruct; List *as_clause; @@ -918,6 +973,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) proconfig = NULL; procost = -1; /* indicates not set */ prorows = -1; /* indicates not set */ + prosupport = InvalidOid; parallel = PROPARALLEL_UNSAFE; /* Extract non-default attributes from stmt->options list */ @@ -927,7 +983,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) &as_clause, &language, &transformDefElem, &isWindowFunc, &volatility, &isStrict, &security, &isLeakProof, - &proconfig, &procost, &prorows, &parallel); + &proconfig, &procost, &prorows, + &prosupport, &parallel); /* Look up the language and validate permissions */ languageTuple = 
SearchSysCache1(LANGNAME, PointerGetDatum(language)); @@ -1114,6 +1171,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) parameterDefaults, PointerGetDatum(trftypes), PointerGetDatum(proconfig), + prosupport, procost, prorows); } @@ -1188,6 +1246,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) List *set_items = NIL; DefElem *cost_item = NULL; DefElem *rows_item = NULL; + DefElem *support_item = NULL; DefElem *parallel_item = NULL; ObjectAddress address; @@ -1195,6 +1254,8 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) funcOid = LookupFuncWithArgs(stmt->objtype, stmt->func, false); + ObjectAddressSet(address, ProcedureRelationId, funcOid); + tup = SearchSysCacheCopy1(PROCOID, ObjectIdGetDatum(funcOid)); if (!HeapTupleIsValid(tup)) /* should not happen */ elog(ERROR, "cache lookup failed for function %u", funcOid); @@ -1229,6 +1290,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) &set_items, &cost_item, &rows_item, + &support_item, &parallel_item) == false) elog(ERROR, "option \"%s\" not recognized", defel->defname); } @@ -1267,6 +1329,28 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ROWS is not applicable when function does not return a set"))); } + if (support_item) + { + /* interpret_func_support handles the privilege check */ + Oid newsupport = interpret_func_support(support_item); + + /* Add or replace dependency on support function */ + if (OidIsValid(procForm->prosupport)) + changeDependencyFor(ProcedureRelationId, funcOid, + ProcedureRelationId, procForm->prosupport, + newsupport); + else + { + ObjectAddress referenced; + + referenced.classId = ProcedureRelationId; + referenced.objectId = newsupport; + referenced.objectSubId = 0; + recordDependencyOn(&address, &referenced, DEPENDENCY_NORMAL); + } + + procForm->prosupport = newsupport; + } if (set_items) { Datum datum; @@ -1309,8 +1393,6 @@ AlterFunction(ParseState *pstate, 
AlterFunctionStmt *stmt) InvokeObjectPostAlterHook(ProcedureRelationId, funcOid, 0); - ObjectAddressSet(address, ProcedureRelationId, funcOid); - table_close(rel, NoLock); heap_freetuple(tup); diff --git a/src/backend/commands/proclang.c b/src/backend/commands/proclang.c index c2e9e41..59c4e8d 100644 --- a/src/backend/commands/proclang.c +++ b/src/backend/commands/proclang.c @@ -141,6 +141,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); handlerOid = tmpAddr.objectId; @@ -180,6 +181,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); inlineOid = tmpAddr.objectId; @@ -222,6 +224,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); valOid = tmpAddr.objectId; diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c index 35a6485..b0a61a3 100644 --- a/src/backend/commands/typecmds.c +++ b/src/backend/commands/typecmds.c @@ -1664,6 +1664,7 @@ makeRangeConstructors(const char *name, Oid namespace, NIL, /* parameterDefaults */ PointerGetDatum(NULL), /* trftypes */ PointerGetDatum(NULL), /* proconfig */ + InvalidOid, /* prosupport */ 1.0, /* procost */ 0.0); /* prorows */ diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index d8a3c2d..4c5f00d 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -677,7 +677,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); SERIALIZABLE SERVER SESSION SESSION_USER SET SETS SETOF SHARE SHOW SIMILAR SIMPLE SKIP SMALLINT SNAPSHOT SOME SQL_P STABLE STANDALONE_P START STATEMENT STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P - SUBSCRIPTION SUBSTRING SYMMETRIC SYSID SYSTEM_P + SUBSCRIPTION SUBSTRING SUPPORT SYMMETRIC SYSID SYSTEM_P TABLE TABLES TABLESAMPLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIES TIME TIMESTAMP TO TRAILING 
TRANSACTION TRANSFORM @@ -7888,6 +7888,10 @@ common_func_opt_item: { $$ = makeDefElem("rows", (Node *)$2, @1); } + | SUPPORT any_name + { + $$ = makeDefElem("support", (Node *)$2, @1); + } | FunctionSetResetClause { /* we abuse the normal content of a DefElem here */ @@ -15218,6 +15222,7 @@ unreserved_keyword: | STRICT_P | STRIP_P | SUBSCRIPTION + | SUPPORT | SYSID | SYSTEM_P | TABLES diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 302df16..4bda11c 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -2638,6 +2638,21 @@ pg_get_functiondef(PG_FUNCTION_ARGS) if (proc->prorows > 0 && proc->prorows != 1000) appendStringInfo(&buf, " ROWS %g", proc->prorows); + if (proc->prosupport) + { + Oid argtypes[1]; + + /* + * We should qualify the support function's name if it wouldn't be + * resolved by lookup in the current search path. + */ + argtypes[0] = INTERNALOID; + appendStringInfo(&buf, " SUPPORT %s", + generate_function_name(proc->prosupport, 1, + NIL, argtypes, + false, NULL, EXPR_KIND_NONE)); + } + if (oldlen != buf.len) appendStringInfoChar(&buf, '\n'); diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 2b1a947..615d997 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -11466,6 +11466,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) char *proconfig; char *procost; char *prorows; + char *prosupport; char *proparallel; char *lanname; char *rettypename; @@ -11488,7 +11489,26 @@ dumpFunc(Archive *fout, FuncInfo *finfo) asPart = createPQExpBuffer(); /* Fetch function-specific details */ - if (fout->remoteVersion >= 110000) + if (fout->remoteVersion >= 120000) + { + /* + * prosupport was added in 12 + */ + appendPQExpBuffer(query, + "SELECT proretset, prosrc, probin, " + "pg_catalog.pg_get_function_arguments(oid) AS funcargs, " + "pg_catalog.pg_get_function_identity_arguments(oid) AS funciargs, " + "pg_catalog.pg_get_function_result(oid) AS funcresult, " 
+ "array_to_string(protrftypes, ' ') AS protrftypes, " + "prokind, provolatile, proisstrict, prosecdef, " + "proleakproof, proconfig, procost, prorows, " + "prosupport, proparallel, " + "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " + "FROM pg_catalog.pg_proc " + "WHERE oid = '%u'::pg_catalog.oid", + finfo->dobj.catId.oid); + } + else if (fout->remoteVersion >= 110000) { /* * prokind was added in 11 @@ -11501,7 +11521,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "array_to_string(protrftypes, ' ') AS protrftypes, " "prokind, provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " - "proparallel, " + "'-' AS prosupport, proparallel, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11521,7 +11541,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " - "proparallel, " + "'-' AS prosupport, proparallel, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11541,6 +11561,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11559,6 +11580,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = 
'%u'::pg_catalog.oid", @@ -11579,6 +11601,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " " proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11593,6 +11616,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11607,6 +11631,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "null AS proconfig, 0 AS procost, 0 AS prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11623,6 +11648,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "null AS proconfig, 0 AS procost, 0 AS prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11660,6 +11686,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) proconfig = PQgetvalue(res, 0, PQfnumber(res, "proconfig")); procost = PQgetvalue(res, 0, PQfnumber(res, "procost")); prorows = PQgetvalue(res, 0, PQfnumber(res, "prorows")); + prosupport = PQgetvalue(res, 0, PQfnumber(res, "prosupport")); if (PQfnumber(res, "proparallel") != -1) proparallel = PQgetvalue(res, 0, PQfnumber(res, "proparallel")); @@ -11873,6 +11900,12 @@ dumpFunc(Archive *fout, FuncInfo *finfo) strcmp(prorows, "0") != 0 && strcmp(prorows, "1000") != 0) appendPQExpBuffer(q, " ROWS %s", prorows); + if (strcmp(prosupport, "-") != 0) + { + /* 
We rely on regprocout to provide quoting and qualification */ + appendPQExpBuffer(q, " SUPPORT %s", prosupport); + } + if (proparallel != NULL && proparallel[0] != PROPARALLEL_UNSAFE) { if (proparallel[0] == PROPARALLEL_SAFE) diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index 4ff358a..d22ca73 100644 --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl @@ -1774,6 +1774,20 @@ my %tests = ( unlike => { exclude_dump_test_schema => 1, }, }, + 'CREATE FUNCTION ... SUPPORT' => { + create_order => 41, + create_sql => + 'CREATE FUNCTION dump_test.func_with_support() RETURNS int LANGUAGE sql AS $$ SELECT 1 $$ SUPPORT varchar_support;', + regexp => qr/^ + \QCREATE FUNCTION dump_test.func_with_support() RETURNS integer\E + \n\s+\QLANGUAGE sql SUPPORT varchar_support\E + \n\s+AS\ \$\$\Q SELECT 1 \E\$\$; + /xm, + like => + { %full_runs, %dump_test_schema_runs, section_pre_data => 1, }, + unlike => { exclude_dump_test_schema => 1, }, + }, + 'CREATE PROCEDURE dump_test.ptest1' => { create_order => 41, create_sql => 'CREATE PROCEDURE dump_test.ptest1(a int) diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b433769..e5270d2 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -201,6 +201,7 @@ extern ObjectAddress ProcedureCreate(const char *procedureName, List *parameterDefaults, Datum trftypes, Datum proconfig, + Oid prosupport, float4 procost, float4 prorows); diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h index adeb834..f054440 100644 --- a/src/include/parser/kwlist.h +++ b/src/include/parser/kwlist.h @@ -387,6 +387,7 @@ PG_KEYWORD("strict", STRICT_P, UNRESERVED_KEYWORD) PG_KEYWORD("strip", STRIP_P, UNRESERVED_KEYWORD) PG_KEYWORD("subscription", SUBSCRIPTION, UNRESERVED_KEYWORD) PG_KEYWORD("substring", SUBSTRING, COL_NAME_KEYWORD) +PG_KEYWORD("support", SUPPORT, UNRESERVED_KEYWORD) PG_KEYWORD("symmetric", SYMMETRIC, RESERVED_KEYWORD) 
PG_KEYWORD("sysid", SYSID, UNRESERVED_KEYWORD) PG_KEYWORD("system", SYSTEM_P, UNRESERVED_KEYWORD) diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index 7bb8ca9..4db792c 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -3050,10 +3050,9 @@ DETAIL: System catalog modifications are currently disallowed. -- instead create in public first, move to catalog CREATE TABLE new_system_table(id serial primary key, othercol text); ALTER TABLE new_system_table SET SCHEMA pg_catalog; --- XXX: it's currently impossible to move relations out of pg_catalog ALTER TABLE new_system_table SET SCHEMA public; -ERROR: cannot remove dependency on schema pg_catalog because it is a system object --- move back, will be ignored -- already there +ALTER TABLE new_system_table SET SCHEMA pg_catalog; +-- will be ignored -- already there: ALTER TABLE new_system_table SET SCHEMA pg_catalog; ALTER TABLE new_system_table RENAME TO old_system_table; CREATE INDEX old_system_table__othercol ON old_system_table (othercol); diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index a498e4e..d806430 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -1896,10 +1896,9 @@ CREATE TABLE pg_catalog.new_system_table(); -- instead create in public first, move to catalog CREATE TABLE new_system_table(id serial primary key, othercol text); ALTER TABLE new_system_table SET SCHEMA pg_catalog; - --- XXX: it's currently impossible to move relations out of pg_catalog ALTER TABLE new_system_table SET SCHEMA public; --- move back, will be ignored -- already there +ALTER TABLE new_system_table SET SCHEMA pg_catalog; +-- will be ignored -- already there: ALTER TABLE new_system_table SET SCHEMA pg_catalog; ALTER TABLE new_system_table RENAME TO old_system_table; CREATE INDEX old_system_table__othercol ON old_system_table (othercol); diff --git 
a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index d85a83a..3a906c7 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -2756,6 +2756,7 @@ estimate_path_cost_size(PlannerInfo *root, startup_cost = ofpinfo->rel_startup_cost; startup_cost += aggcosts.transCost.startup; startup_cost += aggcosts.transCost.per_tuple * input_rows; + startup_cost += aggcosts.finalCost.startup; startup_cost += (cpu_operator_cost * numGroupCols) * input_rows; startup_cost += ptarget->cost.startup; @@ -2767,7 +2768,7 @@ estimate_path_cost_size(PlannerInfo *root, *----- */ run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost; - run_cost += aggcosts.finalCost * numGroups; + run_cost += aggcosts.finalCost.per_tuple * numGroups; run_cost += cpu_tuple_cost * numGroups; run_cost += ptarget->cost.per_tuple * numGroups; diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index d70aa6e..b486ef3 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -3439,4 +3439,25 @@ supportfn(internal) returns internal simplify. Ensure rigorous equivalence between the simplified expression and an actual execution of the target function. </para> + + <para> + For target functions that return boolean, it is often useful to estimate + the fraction of rows that will be selected by a WHERE clause using that + function. This can be done by a support function that implements + the <literal>SupportRequestSelectivity</literal> request type. + </para> + + <para> + If the target function's runtime is highly dependent on its inputs, + it may be useful to provide a non-constant cost estimate for it. + This can be done by a support function that implements + the <literal>SupportRequestCost</literal> request type. + </para> + + <para> + For target functions that return sets, it is often useful to provide + a non-constant estimate for the number of rows that will be returned. 
+ This can be done by a support function that implements + the <literal>SupportRequestRows</literal> request type. + </para> </sect1> diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c index 3739b98..9e3f708 100644 --- a/src/backend/optimizer/path/clausesel.c +++ b/src/backend/optimizer/path/clausesel.c @@ -760,6 +760,21 @@ clause_selectivity(PlannerInfo *root, if (IsA(clause, DistinctExpr)) s1 = 1.0 - s1; } + else if (is_funcclause(clause)) + { + FuncExpr *funcclause = (FuncExpr *) clause; + + /* Try to get an estimate from the support function, if any */ + s1 = function_selectivity(root, + funcclause->funcid, + funcclause->args, + funcclause->inputcollid, + treat_as_join_clause(clause, rinfo, + varRelid, sjinfo), + varRelid, + jointype, + sjinfo); + } else if (IsA(clause, ScalarArrayOpExpr)) { /* Use node specific selectivity calculation function */ diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 99c5ad9..f4f1c95 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -2080,9 +2080,9 @@ cost_agg(Path *path, PlannerInfo *root, /* * The transCost.per_tuple component of aggcosts should be charged once * per input tuple, corresponding to the costs of evaluating the aggregate - * transfns and their input expressions (with any startup cost of course - * charged but once). The finalCost component is charged once per output - * tuple, corresponding to the costs of evaluating the finalfns. + * transfns and their input expressions. The finalCost.per_tuple component + * is charged once per output tuple, corresponding to the costs of + * evaluating the finalfns. Startup costs are of course charged but once. * * If we are grouping, we charge an additional cpu_operator_cost per * grouping column per input tuple for grouping comparisons. 
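(Editorial aside, not part of the patch.) The charging rule in the cost_agg comment just above can be sketched as a small standalone C function. This is only an illustration under stated assumptions: the QualCost and AggCosts structs are cut-down stand-ins for the planner's types, sketch_grouped_agg_cost is a hypothetical name, and starting from the input path's total cost is an assumption about the surrounding branch that the hunks do not show. The cpu_tuple_cost charge per output group comes from the cost_agg hunks rather than from the comment itself.

```c
/* Cut-down stand-ins for the planner's types; field names follow the patch. */
typedef double Cost;

typedef struct QualCost
{
	Cost		startup;	/* charged once */
	Cost		per_tuple;	/* charged per evaluation */
} QualCost;

typedef struct AggCosts
{
	QualCost	transCost;	/* transition functions and their inputs */
	QualCost	finalCost;	/* final functions */
} AggCosts;

/*
 * Hypothetical helper: cost of a grouped aggregation that must consume all
 * its input before emitting anything.  Assumes we start from the input
 * path's total cost (not shown in the patch hunks).
 */
static void
sketch_grouped_agg_cost(const AggCosts *aggcosts,
						Cost input_total_cost, double input_tuples,
						int numGroupCols, double numGroups,
						Cost cpu_operator_cost, Cost cpu_tuple_cost,
						Cost *startup_cost, Cost *total_cost)
{
	Cost		startup = input_total_cost;
	Cost		total;

	/* startup parts are charged once, transfns once per input tuple */
	startup += aggcosts->transCost.startup;
	startup += aggcosts->transCost.per_tuple * input_tuples;
	/* grouping comparisons: one operator per grouping column per input tuple */
	startup += (cpu_operator_cost * numGroupCols) * input_tuples;
	startup += aggcosts->finalCost.startup;

	/* finalfns and tuple overhead are charged once per output group */
	total = startup;
	total += aggcosts->finalCost.per_tuple * numGroups;
	total += cpu_tuple_cost * numGroups;

	*startup_cost = startup;
	*total_cost = total;
}
```

The point of splitting finalCost into startup and per_tuple parts is that a one-time cost reported for a final function is no longer multiplied by the number of groups.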
@@ -2104,7 +2104,8 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost = input_total_cost;
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
-		startup_cost += aggcosts->finalCost;
+		startup_cost += aggcosts->finalCost.startup;
+		startup_cost += aggcosts->finalCost.per_tuple;
 		/* we aren't grouping */
 		total_cost = startup_cost + cpu_tuple_cost;
 		output_tuples = 1;
@@ -2123,7 +2124,8 @@ cost_agg(Path *path, PlannerInfo *root,
 		total_cost += aggcosts->transCost.startup;
 		total_cost += aggcosts->transCost.per_tuple * input_tuples;
 		total_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
-		total_cost += aggcosts->finalCost * numGroups;
+		total_cost += aggcosts->finalCost.startup;
+		total_cost += aggcosts->finalCost.per_tuple * numGroups;
 		total_cost += cpu_tuple_cost * numGroups;
 		output_tuples = numGroups;
 	}
@@ -2136,8 +2138,9 @@ cost_agg(Path *path, PlannerInfo *root,
 		startup_cost += aggcosts->transCost.startup;
 		startup_cost += aggcosts->transCost.per_tuple * input_tuples;
 		startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples;
+		startup_cost += aggcosts->finalCost.startup;
 		total_cost = startup_cost;
-		total_cost += aggcosts->finalCost * numGroups;
+		total_cost += aggcosts->finalCost.per_tuple * numGroups;
 		total_cost += cpu_tuple_cost * numGroups;
 		output_tuples = numGroups;
 	}
@@ -2202,7 +2205,11 @@ cost_windowagg(Path *path, PlannerInfo *root,
 		Cost		wfunccost;
 		QualCost	argcosts;
 
-		wfunccost = get_func_cost(wfunc->winfnoid) * cpu_operator_cost;
+		argcosts.startup = argcosts.per_tuple = 0;
+		add_function_cost(root, wfunc->winfnoid, (Node *) wfunc,
+						  &argcosts);
+		startup_cost += argcosts.startup;
+		wfunccost = argcosts.per_tuple;
 
 		/* also add the input expressions' cost to per-input-row costs */
 		cost_qual_eval_node(&argcosts, (Node *) wfunc->args, root);
@@ -3832,8 +3839,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	 */
 	if (IsA(node, FuncExpr))
 	{
-		context->total.per_tuple +=
-			get_func_cost(((FuncExpr *) node)->funcid) * cpu_operator_cost;
+		add_function_cost(context->root, ((FuncExpr *) node)->funcid, node,
+						  &context->total);
 	}
 	else if (IsA(node, OpExpr) ||
 			 IsA(node, DistinctExpr) ||
@@ -3841,8 +3848,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 	{
 		/* rely on struct equivalence to treat these all alike */
 		set_opfuncid((OpExpr *) node);
-		context->total.per_tuple +=
-			get_func_cost(((OpExpr *) node)->opfuncid) * cpu_operator_cost;
+		add_function_cost(context->root, ((OpExpr *) node)->opfuncid, node,
+						  &context->total);
 	}
 	else if (IsA(node, ScalarArrayOpExpr))
 	{
@@ -3852,10 +3859,15 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		 */
 		ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) node;
 		Node	   *arraynode = (Node *) lsecond(saop->args);
+		QualCost	sacosts;
 
 		set_sa_opfuncid(saop);
-		context->total.per_tuple += get_func_cost(saop->opfuncid) *
-			cpu_operator_cost * estimate_array_length(arraynode) * 0.5;
+		sacosts.startup = sacosts.per_tuple = 0;
+		add_function_cost(context->root, saop->opfuncid, NULL,
+						  &sacosts);
+		context->total.startup += sacosts.startup;
+		context->total.per_tuple += sacosts.per_tuple *
+			estimate_array_length(arraynode) * 0.5;
 	}
 	else if (IsA(node, Aggref) ||
 			 IsA(node, WindowFunc))
@@ -3881,11 +3893,13 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		/* check the result type's input function */
 		getTypeInputInfo(iocoerce->resulttype,
 						 &iofunc, &typioparam);
-		context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost;
+		add_function_cost(context->root, iofunc, NULL,
+						  &context->total);
 		/* check the input type's output function */
 		getTypeOutputInfo(exprType((Node *) iocoerce->arg),
 						  &iofunc, &typisvarlena);
-		context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost;
+		add_function_cost(context->root, iofunc, NULL,
+						  &context->total);
 	}
 	else if (IsA(node, ArrayCoerceExpr))
 	{
@@ -3909,8 +3923,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
 		{
 			Oid			opid = lfirst_oid(lc);
 
-			context->total.per_tuple += get_func_cost(get_opcode(opid)) *
-				cpu_operator_cost;
+			add_function_cost(context->root, get_opcode(opid), NULL,
+							  &context->total);
 		}
 	}
 	else if (IsA(node, MinMaxExpr) ||
@@ -4910,7 +4924,7 @@ set_function_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 	foreach(lc, rte->functions)
 	{
 		RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc);
-		double		ntup = expression_returns_set_rows(rtfunc->funcexpr);
+		double		ntup = expression_returns_set_rows(root, rtfunc->funcexpr);
 
 		if (ntup > rel->tuples)
 			rel->tuples = ntup;
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 061a855..93eddd3 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -35,6 +35,7 @@
 #include "nodes/supportnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
+#include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
 #include "optimizer/prep.h"
 #include "optimizer/var.h"
@@ -583,19 +584,24 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
 			if (DO_AGGSPLIT_COMBINE(context->aggsplit))
 			{
 				/* charge for combining previously aggregated states */
-				costs->transCost.per_tuple += get_func_cost(aggcombinefn) * cpu_operator_cost;
+				add_function_cost(context->root, aggcombinefn, NULL,
+								  &costs->transCost);
 			}
 			else
-				costs->transCost.per_tuple += get_func_cost(aggtransfn) * cpu_operator_cost;
+				add_function_cost(context->root, aggtransfn, NULL,
+								  &costs->transCost);
 			if (DO_AGGSPLIT_DESERIALIZE(context->aggsplit) &&
 				OidIsValid(aggdeserialfn))
-				costs->transCost.per_tuple += get_func_cost(aggdeserialfn) * cpu_operator_cost;
+				add_function_cost(context->root, aggdeserialfn, NULL,
+								  &costs->transCost);
 			if (DO_AGGSPLIT_SERIALIZE(context->aggsplit) &&
 				OidIsValid(aggserialfn))
-				costs->finalCost += get_func_cost(aggserialfn) * cpu_operator_cost;
+				add_function_cost(context->root, aggserialfn, NULL,
+								  &costs->finalCost);
 			if (!DO_AGGSPLIT_SKIPFINAL(context->aggsplit) &&
 				OidIsValid(aggfinalfn))
-				costs->finalCost += get_func_cost(aggfinalfn) * cpu_operator_cost;
+				add_function_cost(context->root, aggfinalfn, NULL,
+								  &costs->finalCost);
 
 			/*
 			 * These costs are incurred only by the initial aggregate node, so we
@@ -632,8 +638,8 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context)
 			{
 				cost_qual_eval_node(&argcosts, (Node *) aggref->aggdirectargs,
 									context->root);
-				costs->transCost.startup += argcosts.startup;
-				costs->finalCost += argcosts.per_tuple;
+				costs->finalCost.startup += argcosts.startup;
+				costs->finalCost.per_tuple += argcosts.per_tuple;
 			}
 
 			/*
@@ -801,7 +807,7 @@ find_window_functions_walker(Node *node, WindowFuncLists *lists)
  * Note: keep this in sync with expression_returns_set() in nodes/nodeFuncs.c.
  */
 double
-expression_returns_set_rows(Node *clause)
+expression_returns_set_rows(PlannerInfo *root, Node *clause)
 {
 	if (clause == NULL)
 		return 1.0;
@@ -810,7 +816,7 @@ expression_returns_set_rows(Node *clause)
 		FuncExpr   *expr = (FuncExpr *) clause;
 
 		if (expr->funcretset)
-			return clamp_row_est(get_func_rows(expr->funcid));
+			return clamp_row_est(get_function_rows(root, expr->funcid, clause));
 	}
 	if (IsA(clause, OpExpr))
 	{
@@ -819,7 +825,7 @@ expression_returns_set_rows(Node *clause)
 		if (expr->opretset)
 		{
 			set_opfuncid(expr);
-			return clamp_row_est(get_func_rows(expr->opfuncid));
+			return clamp_row_est(get_function_rows(root, expr->opfuncid, clause));
 		}
 	}
 	return 1.0;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index b2637d0..999571a 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2596,7 +2596,7 @@ create_set_projection_path(PlannerInfo *root,
 		Node	   *node = (Node *) lfirst(lc);
 		double		itemrows;
 
-		itemrows = expression_returns_set_rows(node);
+		itemrows = expression_returns_set_rows(root, node);
 		if (tlist_rows < itemrows)
 			tlist_rows = itemrows;
 	}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 261492e..f2250eb 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -29,10 +29,12 @@
 #include "catalog/heap.h"
 #include "catalog/partition.h"
 #include "catalog/pg_am.h"
+#include "catalog/pg_proc.h"
 #include "catalog/pg_statistic_ext.h"
 #include "foreign/fdwapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
+#include "nodes/supportnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/plancat.h"
@@ -1771,6 +1773,8 @@ restriction_selectivity(PlannerInfo *root,
  * Returns the selectivity of a specified join operator clause.
  * This code executes registered procedures stored in the
  * operator relation, by calling the function manager.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
  */
 Selectivity
 join_selectivity(PlannerInfo *root,
@@ -1805,6 +1809,184 @@ join_selectivity(PlannerInfo *root,
 }
 
 /*
+ * function_selectivity
+ *
+ * Returns the selectivity of a specified boolean function clause.
+ * This code executes registered procedures stored in the
+ * pg_proc relation, by calling the function manager.
+ *
+ * See clause_selectivity() for the meaning of the additional parameters.
+ */
+Selectivity
+function_selectivity(PlannerInfo *root,
+					 Oid funcid,
+					 List *args,
+					 Oid inputcollid,
+					 bool is_join,
+					 int varRelid,
+					 JoinType jointype,
+					 SpecialJoinInfo *sjinfo)
+{
+	RegProcedure prosupport = get_func_support(funcid);
+	SupportRequestSelectivity req;
+	SupportRequestSelectivity *sresult;
+
+	/*
+	 * If no support function is provided, use our historical default
+	 * estimate, 0.3333333. This seems a pretty unprincipled choice, but
+	 * Postgres has been using that estimate for function calls since 1992.
+	 * The hoariness of this behavior suggests that we should not be in too
+	 * much hurry to use another value.
+	 */
+	if (!prosupport)
+		return (Selectivity) 0.3333333;
+
+	req.type = T_SupportRequestSelectivity;
+	req.root = root;
+	req.funcid = funcid;
+	req.args = args;
+	req.inputcollid = inputcollid;
+	req.is_join = is_join;
+	req.varRelid = varRelid;
+	req.jointype = jointype;
+	req.sjinfo = sjinfo;
+	req.selectivity = -1;		/* to catch failure to set the value */
+
+	sresult = (SupportRequestSelectivity *)
+		DatumGetPointer(OidFunctionCall1(prosupport,
+										 PointerGetDatum(&req)));
+
+	/* If support function fails, use default */
+	if (sresult != &req)
+		return (Selectivity) 0.3333333;
+
+	if (req.selectivity < 0.0 || req.selectivity > 1.0)
+		elog(ERROR, "invalid function selectivity: %f", req.selectivity);
+
+	return (Selectivity) req.selectivity;
+}
+
+/*
+ * add_function_cost
+ *
+ * Get an estimate of the execution cost of a function, and *add* it to
+ * the contents of *cost. The estimate may include both one-time and
+ * per-tuple components, since QualCost does.
+ *
+ * The funcid must always be supplied. If it is being called as the
+ * implementation of a specific parsetree node (FuncExpr, OpExpr,
+ * WindowFunc, etc), pass that as "node", else pass NULL.
+ *
+ * In some usages root might be NULL, too.
+ */
+void
+add_function_cost(PlannerInfo *root, Oid funcid, Node *node,
+				  QualCost *cost)
+{
+	HeapTuple	proctup;
+	Form_pg_proc procform;
+
+	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+	if (!HeapTupleIsValid(proctup))
+		elog(ERROR, "cache lookup failed for function %u", funcid);
+	procform = (Form_pg_proc) GETSTRUCT(proctup);
+
+	if (OidIsValid(procform->prosupport))
+	{
+		SupportRequestCost req;
+		SupportRequestCost *sresult;
+
+		req.type = T_SupportRequestCost;
+		req.root = root;
+		req.funcid = funcid;
+		req.node = node;
+
+		/* Initialize cost fields so that support function doesn't have to */
+		req.startup = 0;
+		req.per_tuple = 0;
+
+		sresult = (SupportRequestCost *)
+			DatumGetPointer(OidFunctionCall1(procform->prosupport,
+											 PointerGetDatum(&req)));
+
+		if (sresult == &req)
+		{
+			/* Success, so accumulate support function's estimate into *cost */
+			cost->startup += req.startup;
+			cost->per_tuple += req.per_tuple;
+			ReleaseSysCache(proctup);
+			return;
+		}
+	}
+
+	/* No support function, or it failed, so rely on procost */
+	cost->per_tuple += procform->procost * cpu_operator_cost;
+
+	ReleaseSysCache(proctup);
+}
+
+/*
+ * get_function_rows
+ *
+ * Get an estimate of the number of rows returned by a set-returning function.
+ *
+ * The funcid must always be supplied. In current usage, the calling node
+ * will always be supplied, and will be either a FuncExpr or OpExpr.
+ * But it's a good idea to not fail if it's NULL.
+ *
+ * In some usages root might be NULL, too.
+ *
+ * Note: this returns the unfiltered result of the support function, if any.
+ * It's usually a good idea to apply clamp_row_est() to the result, but we
+ * leave it to the caller to do so.
+ */
+double
+get_function_rows(PlannerInfo *root, Oid funcid, Node *node)
+{
+	HeapTuple	proctup;
+	Form_pg_proc procform;
+	double		result;
+
+	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+	if (!HeapTupleIsValid(proctup))
+		elog(ERROR, "cache lookup failed for function %u", funcid);
+	procform = (Form_pg_proc) GETSTRUCT(proctup);
+
+	Assert(procform->proretset);	/* else caller error */
+
+	if (OidIsValid(procform->prosupport))
+	{
+		SupportRequestRows req;
+		SupportRequestRows *sresult;
+
+		req.type = T_SupportRequestRows;
+		req.root = root;
+		req.funcid = funcid;
+		req.node = node;
+
+		req.rows = 0;			/* just for sanity */
+
+		sresult = (SupportRequestRows *)
+			DatumGetPointer(OidFunctionCall1(procform->prosupport,
+											 PointerGetDatum(&req)));
+
+		if (sresult == &req)
+		{
+			/* Success */
+			ReleaseSysCache(proctup);
+			return req.rows;
+		}
+	}
+
+	/* No support function, or it failed, so rely on prorows */
+	result = procform->prorows;
+
+	ReleaseSysCache(proctup);
+
+	return result;
+}
+
+/*
  * has_unique_index
  *
  * Detect whether there is a unique index on the specified attribute
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index dcb35d8..7b15581 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1576,17 +1576,6 @@ boolvarsel(PlannerInfo *root, Node *arg, int varRelid)
 		selec = var_eq_const(&vardata, BooleanEqualOperator,
 							 BoolGetDatum(true), false, true, false);
 	}
-	else if (is_funcclause(arg))
-	{
-		/*
-		 * If we have no stats and it's a function call, estimate 0.3333333.
-		 * This seems a pretty unprincipled choice, but Postgres has been
-		 * using that estimate for function calls since 1992. The hoariness
-		 * of this behavior suggests that we should not be in too much hurry
-		 * to use another value.
-		 */
-		selec = 0.3333333;
-	}
 	else
 	{
 		/* Otherwise, the default estimate is 0.5 */
@@ -3492,7 +3481,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 		 * pointless to worry too much about this without much better
 		 * estimates for SRF output rowcounts than we have today.)
 		 */
-		this_srf_multiplier = expression_returns_set_rows(groupexpr);
+		this_srf_multiplier = expression_returns_set_rows(root, groupexpr);
 		if (srf_multiplier < this_srf_multiplier)
 			srf_multiplier = this_srf_multiplier;
 
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index fba0ee8..e88c45d 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1605,41 +1605,28 @@ get_func_leakproof(Oid funcid)
 }
 
 /*
- * get_func_cost
- *		Given procedure id, return the function's procost field.
- */
-float4
-get_func_cost(Oid funcid)
-{
-	HeapTuple	tp;
-	float4		result;
-
-	tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
-	if (!HeapTupleIsValid(tp))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
-
-	result = ((Form_pg_proc) GETSTRUCT(tp))->procost;
-	ReleaseSysCache(tp);
-	return result;
-}
-
-/*
- * get_func_rows
- *		Given procedure id, return the function's prorows field.
+ * get_func_support
+ *
+ *		Returns the support function OID associated with a given function,
+ *		or InvalidOid if there is none.
  */
-float4
-get_func_rows(Oid funcid)
+RegProcedure
+get_func_support(Oid funcid)
 {
 	HeapTuple	tp;
-	float4		result;
 
 	tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
-	if (!HeapTupleIsValid(tp))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_proc functup = (Form_pg_proc) GETSTRUCT(tp);
+		RegProcedure result;
 
-	result = ((Form_pg_proc) GETSTRUCT(tp))->prorows;
-	ReleaseSysCache(tp);
-	return result;
+		result = functup->prosupport;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return (RegProcedure) InvalidOid;
 }
 
 /*				---------- RELATION CACHE ----------					 */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index e029b40..e33b2ee 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -506,7 +506,10 @@ typedef enum NodeTag
 	T_TsmRoutine,				/* in access/tsmapi.h */
 	T_ForeignKeyCacheInfo,		/* in utils/rel.h */
 	T_CallContext,				/* in nodes/parsenodes.h */
-	T_SupportRequestSimplify	/* in nodes/supportnodes.h */
+	T_SupportRequestSimplify,	/* in nodes/supportnodes.h */
+	T_SupportRequestSelectivity,	/* in nodes/supportnodes.h */
+	T_SupportRequestCost,		/* in nodes/supportnodes.h */
+	T_SupportRequestRows		/* in nodes/supportnodes.h */
 } NodeTag;
 
 /*
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 3430061..af046f5 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -61,7 +61,7 @@ typedef struct AggClauseCosts
 	bool		hasNonPartial;	/* does any agg not support partial mode? */
 	bool		hasNonSerial;	/* is any partial agg non-serializable? */
 	QualCost	transCost;		/* total per-input-row execution costs */
-	Cost		finalCost;		/* total per-aggregated-row costs */
+	QualCost	finalCost;		/* total per-aggregated-row costs */
 	Size		transitionSpace;	/* space for pass-by-ref transition data */
 } AggClauseCosts;
 
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index 1f7d02b..1a3a36b 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -36,6 +36,7 @@
 #include "nodes/primnodes.h"
 
 struct PlannerInfo;				/* avoid including relation.h here */
+struct SpecialJoinInfo;
 
 
 /*
@@ -67,4 +68,103 @@ typedef struct SupportRequestSimplify
 	FuncExpr   *fcall;			/* Function call to be simplified */
 } SupportRequestSimplify;
 
+/*
+ * The Selectivity request allows the support function to provide a
+ * selectivity estimate for a function appearing at top level of a WHERE
+ * clause (so it applies only to functions returning boolean).
+ *
+ * The input arguments are the same as are supplied to operator restriction
+ * and join estimators, except that we unify those two APIs into just one
+ * request type. See clause_selectivity() for the details.
+ *
+ * If an estimate can be made, store it into the "selectivity" field and
+ * return the address of the SupportRequestSelectivity node; the estimate
+ * must be between 0 and 1 inclusive. Return NULL if no estimate can be
+ * made (in which case the planner will fall back to a default estimate,
+ * traditionally 1/3).
+ *
+ * If the target function is being used as the implementation of an operator,
+ * the support function will not be used for this purpose; the operator's
+ * restriction or join estimator is consulted instead.
+ */
+typedef struct SupportRequestSelectivity
+{
+	NodeTag		type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure */
+	Oid			funcid;			/* function we are inquiring about */
+	List	   *args;			/* pre-simplified arguments to function */
+	Oid			inputcollid;	/* function's input collation */
+	bool		is_join;		/* is this a join or restriction case? */
+	int			varRelid;		/* if restriction, RTI of target relation */
+	JoinType	jointype;		/* if join, outer join type */
+	struct SpecialJoinInfo *sjinfo; /* if outer join, info about join */
+
+	/* Output fields: */
+	Selectivity selectivity;	/* returned selectivity estimate */
+} SupportRequestSelectivity;
+
+/*
+ * The Cost request allows the support function to provide an execution
+ * cost estimate for its target function. The cost estimate can include
+ * both a one-time (query startup) component and a per-execution component.
+ * The estimate should *not* include the costs of evaluating the target
+ * function's arguments, only the target function itself.
+ *
+ * The "node" argument is normally the parse node that is invoking the
+ * target function. This is a FuncExpr in the simplest case, but it could
+ * also be an OpExpr, DistinctExpr, NullIfExpr, or WindowFunc, or possibly
+ * other cases in future. NULL is passed if the function cannot presume
+ * its arguments to be equivalent to what the calling node presents as
+ * arguments; that happens for, e.g., aggregate support functions and
+ * per-column comparison operators used by RowExprs.
+ *
+ * If an estimate can be made, store it into the cost fields and return the
+ * address of the SupportRequestCost node. Return NULL if no estimate can be
+ * made, in which case the planner will rely on the target function's procost
+ * field. (Note: while procost is automatically scaled by cpu_operator_cost,
+ * this is not the case for the outputs of the Cost request; the support
+ * function must scale its results appropriately on its own.)
+ */
+typedef struct SupportRequestCost
+{
+	NodeTag		type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure (could be NULL) */
+	Oid			funcid;			/* function we are inquiring about */
+	Node	   *node;			/* parse node invoking function, or NULL */
+
+	/* Output fields: */
+	Cost		startup;		/* one-time cost */
+	Cost		per_tuple;		/* per-evaluation cost */
+} SupportRequestCost;
+
+/*
+ * The Rows request allows the support function to provide an output rowcount
+ * estimate for its target function (so it applies only to set-returning
+ * functions).
+ *
+ * The "node" argument is the parse node that is invoking the target function;
+ * currently this will always be a FuncExpr or OpExpr.
+ *
+ * If an estimate can be made, store it into the rows field and return the
+ * address of the SupportRequestRows node. Return NULL if no estimate can be
+ * made, in which case the planner will rely on the target function's prorows
+ * field.
+ */
+typedef struct SupportRequestRows
+{
+	NodeTag		type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure (could be NULL) */
+	Oid			funcid;			/* function we are inquiring about */
+	Node	   *node;			/* parse node invoking function */
+
+	/* Output fields: */
+	double		rows;			/* number of rows expected to be returned */
+} SupportRequestRows;
+
 #endif							/* SUPPORTNODES_H */
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 5c8580e..626fb1c 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -53,7 +53,7 @@ extern void get_agg_clause_costs(PlannerInfo *root, Node *clause,
 extern bool contain_window_function(Node *clause);
 extern WindowFuncLists *find_window_functions(Node *clause, Index maxWinRef);
 
-extern double expression_returns_set_rows(Node *clause);
+extern double expression_returns_set_rows(PlannerInfo *root, Node *clause);
 
 extern bool contain_subplans(Node *clause);
 
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index a1b2325..4eb2e42 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -55,6 +55,20 @@ extern Selectivity join_selectivity(PlannerInfo *root,
 				 JoinType jointype,
 				 SpecialJoinInfo *sjinfo);
 
+extern Selectivity function_selectivity(PlannerInfo *root,
+					 Oid funcid,
+					 List *args,
+					 Oid inputcollid,
+					 bool is_join,
+					 int varRelid,
+					 JoinType jointype,
+					 SpecialJoinInfo *sjinfo);
+
+extern void add_function_cost(PlannerInfo *root, Oid funcid, Node *node,
+				  QualCost *cost);
+
+extern double get_function_rows(PlannerInfo *root, Oid funcid, Node *node);
+
 extern bool has_row_triggers(PlannerInfo *root, Index rti, CmdType event);
 
 #endif							/* PLANCAT_H */
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index ceec85d..16b0b1d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -120,8 +120,7 @@ extern char func_volatile(Oid funcid);
 extern char func_parallel(Oid funcid);
 extern char get_func_prokind(Oid funcid);
 extern bool get_func_leakproof(Oid funcid);
-extern float4 get_func_cost(Oid funcid);
-extern float4 get_func_rows(Oid funcid);
+extern RegProcedure get_func_support(Oid funcid);
 extern Oid	get_relname_relid(const char *relname, Oid relnamespace);
 extern char *get_rel_name(Oid relid);
 extern Oid	get_rel_namespace(Oid relid);
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 130a0e4..0879c88 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -133,3 +133,63 @@ ERROR: function num_nulls() does not exist
 LINE 1: SELECT num_nulls();
                ^
 HINT: No function matches the given name and argument types. You might need to add explicit type casts.
+--
+-- Test adding a support function to a subject function
+--
+CREATE FUNCTION my_int_eq(int, int) RETURNS bool
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$int4eq$$;
+-- By default, planner does not think that's selective
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+                  QUERY PLAN
+----------------------------------------------
+ Hash Join
+   Hash Cond: (b.unique1 = a.unique1)
+   ->  Seq Scan on tenk1 b
+   ->  Hash
+         ->  Seq Scan on tenk1 a
+               Filter: my_int_eq(unique2, 42)
+(6 rows)
+
+-- With support function that knows it's int4eq, we get a different plan
+ALTER FUNCTION my_int_eq(int, int) SUPPORT test_support_func;
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+                   QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on tenk1 a
+         Filter: my_int_eq(unique2, 42)
+   ->  Index Scan using tenk1_unique1 on tenk1 b
+         Index Cond: (unique1 = a.unique1)
+(5 rows)
+
+-- Also test non-default rowcount estimate
+CREATE FUNCTION my_gen_series(int, int) RETURNS SETOF integer
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$generate_series_int4$$
+  SUPPORT test_support_func;
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
+               QUERY PLAN
+----------------------------------------
+ Hash Join
+   Hash Cond: (g.g = a.unique1)
+   ->  Function Scan on my_gen_series g
+   ->  Hash
+         ->  Seq Scan on tenk1 a
+(5 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
+                   QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+   ->  Function Scan on my_gen_series g
+   ->  Index Scan using tenk1_unique1 on tenk1 a
+         Index Cond: (unique1 = g.g)
+(4 rows)
+
diff --git a/src/test/regress/input/create_function_1.source b/src/test/regress/input/create_function_1.source
index 26e2227..223454a 100644
--- a/src/test/regress/input/create_function_1.source
+++ b/src/test/regress/input/create_function_1.source
@@ -68,6 +68,11 @@ CREATE FUNCTION test_fdw_handler()
     AS '@libdir@/regress@DLSUFFIX@', 'test_fdw_handler'
     LANGUAGE C;
 
+CREATE FUNCTION test_support_func(internal)
+    RETURNS internal
+    AS '@libdir@/regress@DLSUFFIX@', 'test_support_func'
+    LANGUAGE C STRICT;
+
 -- Things that shouldn't work:
 
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
diff --git a/src/test/regress/output/create_function_1.source b/src/test/regress/output/create_function_1.source
index 8c50d9b..5f43e8d 100644
--- a/src/test/regress/output/create_function_1.source
+++ b/src/test/regress/output/create_function_1.source
@@ -60,6 +60,10 @@ CREATE FUNCTION test_fdw_handler()
     RETURNS fdw_handler
     AS '@libdir@/regress@DLSUFFIX@', 'test_fdw_handler'
     LANGUAGE C;
+CREATE FUNCTION test_support_func(internal)
+    RETURNS internal
+    AS '@libdir@/regress@DLSUFFIX@', 'test_support_func'
+    LANGUAGE C STRICT;
 -- Things that shouldn't work:
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'SELECT ''not an integer'';';
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 7072728..ec14f2c 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -23,12 +23,16 @@
 #include "access/transam.h"
 #include "access/tuptoaster.h"
 #include "access/xact.h"
+#include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "commands/sequence.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "executor/spi.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/cost.h"
+#include "optimizer/plancat.h"
 #include "port/atomics.h"
 #include "utils/builtins.h"
 #include "utils/geo_decls.h"
@@ -863,3 +867,76 @@ test_fdw_handler(PG_FUNCTION_ARGS)
 	elog(ERROR, "test_fdw_handler is not implemented");
 	PG_RETURN_NULL();
 }
+
+PG_FUNCTION_INFO_V1(test_support_func);
+Datum
+test_support_func(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSelectivity))
+	{
+		/*
+		 * Assume that the target is int4eq; that's safe as long as we don't
+		 * attach this to any other boolean-returning function.
+		 */
+		SupportRequestSelectivity *req = (SupportRequestSelectivity *) rawreq;
+		Selectivity s1;
+
+		if (req->is_join)
+			s1 = join_selectivity(req->root, Int4EqualOperator,
+								  req->args,
+								  req->inputcollid,
+								  req->jointype,
+								  req->sjinfo);
+		else
+			s1 = restriction_selectivity(req->root, Int4EqualOperator,
+										 req->args,
+										 req->inputcollid,
+										 req->varRelid);
+
+		req->selectivity = s1;
+		ret = (Node *) req;
+	}
+
+	if (IsA(rawreq, SupportRequestCost))
+	{
+		/* Provide some generic estimate */
+		SupportRequestCost *req = (SupportRequestCost *) rawreq;
+
+		req->startup = 0;
+		req->per_tuple = 2 * cpu_operator_cost;
+		ret = (Node *) req;
+	}
+
+	if (IsA(rawreq, SupportRequestRows))
+	{
+		/*
+		 * Assume that the target is generate_series_int4; that's safe as long
+		 * as we don't attach this to any other set-returning function.
+		 */
+		SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+		if (req->node && IsA(req->node, FuncExpr))	/* be paranoid */
+		{
+			List	   *args = ((FuncExpr *) req->node)->args;
+			Node	   *arg1 = linitial(args);
+			Node	   *arg2 = lsecond(args);
+
+			if (IsA(arg1, Const) &&
+				!((Const *) arg1)->constisnull &&
+				IsA(arg2, Const) &&
+				!((Const *) arg2)->constisnull)
+			{
+				int32		val1 = DatumGetInt32(((Const *) arg1)->constvalue);
+				int32		val2 = DatumGetInt32(((Const *) arg2)->constvalue);
+
+				req->rows = val2 - val1 + 1;
+				ret = (Node *) req;
+			}
+		}
+	}
+
+	PG_RETURN_POINTER(ret);
+}
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 1a20c1f..7a71f76 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -29,3 +29,35 @@ SELECT num_nulls(VARIADIC '{}'::int[]);
 -- should fail, one or more arguments is required
 SELECT num_nonnulls();
 SELECT num_nulls();
+
+--
+-- Test adding a support function to a subject function
+--
+
+CREATE FUNCTION my_int_eq(int, int) RETURNS bool
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$int4eq$$;
+
+-- By default, planner does not think that's selective
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+
+-- With support function that knows it's int4eq, we get a different plan
+ALTER FUNCTION my_int_eq(int, int) SUPPORT test_support_func;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+
+-- Also test non-default rowcount estimate
+CREATE FUNCTION my_gen_series(int, int) RETURNS SETOF integer
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$generate_series_int4$$
+  SUPPORT test_support_func;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
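
For anyone who wants to look at the calling convention in isolation, here is a minimal standalone sketch in plain C. It does not use any PostgreSQL headers, and every name in it (RequestTag, RowsRequest, CostRequest, QualCostSketch, series_support, get_rows, add_cost) is a hypothetical stand-in invented for illustration, not something from the patch. It only mimics the contract described in the supportnodes.h comments: the caller fills in a tagged request struct, the support function either fills the output fields and returns the request's own address, or returns NULL, and the caller falls back to the catalog default (prorows, or procost scaled by cpu_operator_cost) in the latter case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request tags, standing in for the T_SupportRequest* NodeTags. */
typedef enum { REQ_COST, REQ_ROWS } RequestTag;

/* Stand-in for SupportRequestRows; two ints replace the Const arguments. */
typedef struct
{
	RequestTag	type;
	int			args_known;		/* are both arguments plan-time constants? */
	int			lo;
	int			hi;
	double		rows;			/* output field */
} RowsRequest;

/* Stand-in for SupportRequestCost. */
typedef struct
{
	RequestTag	type;
	double		startup;		/* output fields */
	double		per_tuple;
} CostRequest;

/* Stand-in for QualCost. */
typedef struct
{
	double		startup;
	double		per_tuple;
} QualCostSketch;

typedef void *(*SupportFn) (void *rawreq);

/* A support function that only understands the Rows request. */
void *
series_support(void *rawreq)
{
	assert(rawreq != NULL);

	/* The tag is the first member of every request struct. */
	if (*(RequestTag *) rawreq == REQ_ROWS)
	{
		RowsRequest *req = (RowsRequest *) rawreq;

		if (!req->args_known)
			return NULL;		/* no estimate possible */
		req->rows = (double) req->hi - (double) req->lo + 1.0;
		return req;				/* success: hand back the request itself */
	}

	return NULL;				/* unrecognized request: use default handling */
}

/* Caller side, shaped like get_function_rows(): try support, else default. */
double
get_rows(SupportFn support, int args_known, int lo, int hi, double prorows)
{
	if (support != NULL)
	{
		RowsRequest req = {REQ_ROWS, args_known, lo, hi, 0.0};

		if (support(&req) == &req)
			return req.rows;
	}
	return prorows;
}

/* Caller side, shaped like add_function_cost(): accumulate into *cost. */
void
add_cost(SupportFn support, double procost, double cpu_operator_cost,
		 QualCostSketch *cost)
{
	if (support != NULL)
	{
		CostRequest req = {REQ_COST, 0.0, 0.0};

		if (support(&req) == &req)
		{
			cost->startup += req.startup;
			cost->per_tuple += req.per_tuple;
			return;
		}
	}
	/* No support function, or it declined: only procost gets scaled. */
	cost->per_tuple += procost * cpu_operator_cost;
}
```

In this sketch series_support declines the cost request, so add_cost ends up on the procost path even though a support function is attached; that is the "return NULL for anything you don't recognize" behavior the design relies on for future request types. As in the patch, nothing here clamps the row estimate; that is left to the caller.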
I wrote:
> There's a considerable amount of follow-up work that ought to happen
> now to make use of these capabilities for places that have been
> pain points in the past, such as generate_series() and unnest().
> But I haven't touched that yet.

Attached is an 0004 that makes a stab at providing some intelligence
for unnest() and the integer cases of generate_series(). This only
affects one plan choice in the existing regression tests; I tweaked
that test to keep the plan the same. I didn't add new test cases
demonstrating the functionality, since it's a bit hard to show it
directly within the constraints of EXPLAIN (COSTS OFF). We could do
something along the lines of the quick-hack rowcount test in 0003,
perhaps, but that's pretty indirect.

Looking at this, I'm dissatisfied with the amount of new #include's
being dragged into datatype-specific .c files. I don't really want
to end up with most of utils/adt/ having dependencies on planner
data structures, but that's where we would be headed. I can think
of a couple of possibilities:

* Instead of putting support functions beside their target function,
group all the core's support functions into one new .c file. I'm
afraid this would lead to the reverse problem of having to import
lots of datatype-private info into that file.

* Try to refactor the planner's .h files so that there's just one
"external use" header providing stuff like estimate_expression_value,
while keeping PlannerInfo as an opaque struct. Then importing that
into utils/adt/ files would not represent such a big dependency
footprint.

I find the second choice more appealing, though it's getting a bit
far afield from where this started. OTOH, lots of other header
refactoring is going on right now, so why not ...

Thoughts?
regards, tom lane

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index e457d81..14cc202 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -22,12 +22,15 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/clauses.h"
 #include "utils/array.h"
 #include "utils/arrayaccess.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/selfuncs.h"
 #include "utils/typcache.h"
@@ -6026,6 +6029,36 @@ array_unnest(PG_FUNCTION_ARGS)
     }
 }
 
+/*
+ * Planner support function for array_unnest(anyarray)
+ */
+Datum
+array_unnest_support(PG_FUNCTION_ARGS)
+{
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestRows))
+    {
+        /* Try to estimate the number of rows returned */
+        SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+        if (is_funcclause(req->node))   /* be paranoid */
+        {
+            List *args = ((FuncExpr *) req->node)->args;
+            Node *arg1;
+
+            /* We can use estimated argument values here */
+            arg1 = estimate_expression_value(req->root, linitial(args));
+
+            req->rows = estimate_array_length(arg1);
+            ret = (Node *) req;
+        }
+    }
+
+    PG_RETURN_POINTER(ret);
+}
+
 
 /*
  * array_replace/array_remove support
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index fd82a83..263920c 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -30,11 +30,14 @@
 #include <ctype.h>
 #include <limits.h>
+#include <math.h>
 
 #include "catalog/pg_type.h"
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/clauses.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -1427,3 +1430,73 @@ generate_series_step_int4(PG_FUNCTION_ARGS)
         /* do when there is no more left */
         SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * Planner support function for generate_series(int4, int4 [, int4])
+ */
+Datum
+generate_series_int4_support(PG_FUNCTION_ARGS)
+{
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestRows))
+    {
+        /* Try to estimate the number of rows returned */
+        SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+        if (is_funcclause(req->node))   /* be paranoid */
+        {
+            List *args = ((FuncExpr *) req->node)->args;
+            Node *arg1,
+                 *arg2,
+                 *arg3;
+
+            /* We can use estimated argument values here */
+            arg1 = estimate_expression_value(req->root, linitial(args));
+            arg2 = estimate_expression_value(req->root, lsecond(args));
+            if (list_length(args) >= 3)
+                arg3 = estimate_expression_value(req->root, lthird(args));
+            else
+                arg3 = NULL;
+
+            /*
+             * If any argument is constant NULL, we can safely assume that
+             * zero rows are returned. Otherwise, if they're all non-NULL
+             * constants, we can calculate the number of rows that will be
+             * returned. Use double arithmetic to avoid overflow hazards.
+             */
+            if ((IsA(arg1, Const) &&
+                 ((Const *) arg1)->constisnull) ||
+                (IsA(arg2, Const) &&
+                 ((Const *) arg2)->constisnull) ||
+                (arg3 != NULL && IsA(arg3, Const) &&
+                 ((Const *) arg3)->constisnull))
+            {
+                req->rows = 0;
+                ret = (Node *) req;
+            }
+            else if (IsA(arg1, Const) &&
+                     IsA(arg2, Const) &&
+                     (arg3 == NULL || IsA(arg3, Const)))
+            {
+                double start,
+                       finish,
+                       step;
+
+                start = DatumGetInt32(((Const *) arg1)->constvalue);
+                finish = DatumGetInt32(((Const *) arg2)->constvalue);
+                step = arg3 ? DatumGetInt32(((Const *) arg3)->constvalue) : 1;
+
+                /* This equation works for either sign of step */
+                if (step != 0)
+                {
+                    req->rows = floor((finish - start + step) / step);
+                    ret = (Node *) req;
+                }
+            }
+        }
+    }
+
+    PG_RETURN_POINTER(ret);
+}
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index d16cc9e..5157de4 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -20,6 +20,8 @@
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/clauses.h"
 #include "utils/int8.h"
 #include "utils/builtins.h"
@@ -1373,3 +1375,73 @@ generate_series_step_int8(PG_FUNCTION_ARGS)
         /* do when there is no more left */
         SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * Planner support function for generate_series(int8, int8 [, int8])
+ */
+Datum
+generate_series_int8_support(PG_FUNCTION_ARGS)
+{
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestRows))
+    {
+        /* Try to estimate the number of rows returned */
+        SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+        if (is_funcclause(req->node))   /* be paranoid */
+        {
+            List *args = ((FuncExpr *) req->node)->args;
+            Node *arg1,
+                 *arg2,
+                 *arg3;
+
+            /* We can use estimated argument values here */
+            arg1 = estimate_expression_value(req->root, linitial(args));
+            arg2 = estimate_expression_value(req->root, lsecond(args));
+            if (list_length(args) >= 3)
+                arg3 = estimate_expression_value(req->root, lthird(args));
+            else
+                arg3 = NULL;
+
+            /*
+             * If any argument is constant NULL, we can safely assume that
+             * zero rows are returned. Otherwise, if they're all non-NULL
+             * constants, we can calculate the number of rows that will be
+             * returned. Use double arithmetic to avoid overflow hazards.
+             */
+            if ((IsA(arg1, Const) &&
+                 ((Const *) arg1)->constisnull) ||
+                (IsA(arg2, Const) &&
+                 ((Const *) arg2)->constisnull) ||
+                (arg3 != NULL && IsA(arg3, Const) &&
+                 ((Const *) arg3)->constisnull))
+            {
+                req->rows = 0;
+                ret = (Node *) req;
+            }
+            else if (IsA(arg1, Const) &&
+                     IsA(arg2, Const) &&
+                     (arg3 == NULL || IsA(arg3, Const)))
+            {
+                double start,
+                       finish,
+                       step;
+
+                start = DatumGetInt64(((Const *) arg1)->constvalue);
+                finish = DatumGetInt64(((Const *) arg2)->constvalue);
+                step = arg3 ? DatumGetInt64(((Const *) arg3)->constvalue) : 1;
+
+                /* This equation works for either sign of step */
+                if (step != 0)
+                {
+                    req->rows = floor((finish - start + step) / step);
+                    ret = (Node *) req;
+                }
+            }
+        }
+    }
+
+    PG_RETURN_POINTER(ret);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e5cb5bb..039b596 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1530,9 +1530,12 @@
   proargtypes => 'anyelement _int4 _int4',
   prosrc => 'array_fill_with_lower_bounds' },
 { oid => '2331', descr => 'expand array to set of rows',
-  proname => 'unnest', prorows => '100', proretset => 't',
-  prorettype => 'anyelement', proargtypes => 'anyarray',
+  proname => 'unnest', prorows => '100', prosupport => 'array_unnest_support',
+  proretset => 't', prorettype => 'anyelement', proargtypes => 'anyarray',
   prosrc => 'array_unnest' },
+{ oid => '3996', descr => 'planner support for array_unnest',
+  proname => 'array_unnest_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'array_unnest_support' },
 { oid => '3167',
   descr => 'remove any occurrences of an element from an array',
   proname => 'array_remove', proisstrict => 'f', prorettype => 'anyarray',
@@ -7536,21 +7539,31 @@
 # non-persistent series generator
 { oid => '1066', descr => 'non-persistent series generator',
-  proname => 'generate_series', prorows => '1000', proretset => 't',
+  proname => 'generate_series', prorows => '1000',
+  prosupport => 'generate_series_int4_support', proretset => 't',
   prorettype => 'int4', proargtypes => 'int4 int4 int4',
   prosrc => 'generate_series_step_int4' },
 { oid => '1067', descr => 'non-persistent series generator',
-  proname => 'generate_series', prorows => '1000', proretset => 't',
+  proname => 'generate_series', prorows => '1000',
+  prosupport => 'generate_series_int4_support', proretset => 't',
   prorettype => 'int4', proargtypes => 'int4 int4',
   prosrc => 'generate_series_int4' },
+{ oid => '3994', descr => 'planner support for generate_series',
+  proname => 'generate_series_int4_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'generate_series_int4_support' },
 { oid => '1068', descr => 'non-persistent series generator',
-  proname => 'generate_series', prorows => '1000', proretset => 't',
+  proname => 'generate_series', prorows => '1000',
+  prosupport => 'generate_series_int8_support', proretset => 't',
   prorettype => 'int8', proargtypes => 'int8 int8 int8',
   prosrc => 'generate_series_step_int8' },
 { oid => '1069', descr => 'non-persistent series generator',
-  proname => 'generate_series', prorows => '1000', proretset => 't',
+  proname => 'generate_series', prorows => '1000',
+  prosupport => 'generate_series_int8_support', proretset => 't',
   prorettype => 'int8', proargtypes => 'int8 int8',
   prosrc => 'generate_series_int8' },
+{ oid => '3995', descr => 'planner support for generate_series',
+  proname => 'generate_series_int8_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'generate_series_int8_support' },
 { oid => '3259', descr => 'non-persistent series generator',
   proname => 'generate_series', prorows => '1000', proretset => 't',
   prorettype => 'numeric', proargtypes => 'numeric numeric numeric',
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index 588d069..4056afa 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -904,7 +904,7 @@ select * from int4_tbl where
 --
 explain (verbose, costs off)
 select * from int4_tbl o where (f1, f1) in
-  (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
+  (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);
                             QUERY PLAN
 -------------------------------------------------------------------
  Nested Loop Semi Join
@@ -918,9 +918,9 @@ select * from int4_tbl o where (f1, f1) in
                Output: "ANY_subquery".f1, "ANY_subquery".g
                Filter: ("ANY_subquery".f1 = "ANY_subquery".g)
                ->  Result
-                     Output: i.f1, ((generate_series(1, 2)) / 10)
+                     Output: i.f1, ((generate_series(1, 50)) / 10)
                      ->  ProjectSet
-                           Output: generate_series(1, 2), i.f1
+                           Output: generate_series(1, 50), i.f1
                            ->  HashAggregate
                                  Output: i.f1
                                  Group Key: i.f1
@@ -929,7 +929,7 @@ select * from int4_tbl o where (f1, f1) in
 (19 rows)
 
 select * from int4_tbl o where (f1, f1) in
-  (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
+  (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);
  f1
 ----
   0
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 843f511..ccbe8a1 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -498,9 +498,9 @@ select * from int4_tbl where
 --
 explain (verbose, costs off)
 select * from int4_tbl o where (f1, f1) in
-  (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
+  (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);
 select * from int4_tbl o where (f1, f1) in
-  (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1);
+  (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1);
 
 --
 -- check for over-optimization of whole-row Var referencing an Append plan
On Sun, 20 Jan 2019 at 23:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I'm envisioning therefore is that we allow an auxiliary function to
> be attached to any operator or function that can provide functionality
> like this, and that we set things up so that the set of tasks that
> such functions can perform can be extended over time without SQL-level
> changes. For example, we could say that the function takes a single
> Node* argument, and that the type of Node tells it what to do, and if it
> doesn't recognize the type of Node it should just return NULL indicating
> "use default handling". We'd start out with two relevant Node types,
> one for the selectivity-estimation case and one for the extract-a-lossy-
> index-qual case, and we could add more over time.
Does this help with these cases?
* Allow a set returning function to specify number of output rows, in cases where that is variable and dependent upon the input params?
* Allow a normal term to match a functional index, e.g. WHERE x = 'abcdefgh' => WHERE substr(x, 1 , 5) = 'abcde' AND x = 'abcdefgh'
* Allow us to realise that ORDER BY f(x) => ORDER BY x so we can use ordered paths from indexes, or avoid sorts.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
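
A minimal sketch of the "single Node* argument, return NULL for anything unrecognized" convention quoted in the message above. It assumes toy stand-in types: ReqTag, Req, ReqSelectivity and toy_support are hypothetical names invented for illustration and are not PostgreSQL's real Node machinery, and the selectivity value is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request tags standing in for PostgreSQL node types. */
typedef enum { T_ReqSelectivity, T_ReqIndexQual, T_ReqUnknown } ReqTag;

/* Every request begins with a tag, so one pointer type can carry them all. */
typedef struct { ReqTag tag; } Req;

/* A selectivity request: the support function fills in "selectivity". */
typedef struct { Req base; double selectivity; } ReqSelectivity;

/*
 * Toy support function.  It answers the request types it understands by
 * filling in the request and returning it.  For any other request type it
 * returns NULL, which the caller treats as "use default handling".
 */
Req *toy_support(Req *rawreq)
{
    if (rawreq->tag == T_ReqSelectivity)
    {
        ReqSelectivity *req = (ReqSelectivity *) rawreq;

        req->selectivity = 0.005;   /* made-up estimate for illustration */
        return rawreq;
    }

    return NULL;                    /* unrecognized request type */
}
```

Because unknown tags fall through to NULL, new request types can be introduced later without breaking support functions that were written before they existed, which is the extensibility property the proposal relies on.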
Simon Riggs <simon@2ndquadrant.com> writes:
> On Sun, 20 Jan 2019 at 23:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'm envisioning therefore is that we allow an auxiliary function ...

> Does this help with these cases?

> * Allow a set returning function to specify number of output rows, in cases
> where that is variable and dependent upon the input params?

Yes, within the usual limits of what the planner can know. The 0004
patch I posted yesterday correctly estimates the number of rows for
constant-arguments cases of generate_series() and unnest(anyarray),
and it also understands unnest(array[x,y,z,...]) even when some of the
array[] elements aren't constants. There's room to add knowledge
about other SRFs, but those are cases I can recall hearing complaints
about.

> * Allow a normal term to match a functional index, e.g. WHERE x =
> 'abcdefgh' => WHERE substr(x, 1 , 5) = 'abcde' AND x = 'abcdefgh'

I'm a bit confused about what you think this example means. I do
intend to work on letting extensions define rules for extracting
index clauses from function calls, because that's the requirement
that PostGIS is after in the thread that started this. I don't
know whether that would satisfy your concern, because I'm not clear
on what your concern is.

> * Allow us to realise that ORDER BY f(x) => ORDER BY x so we can use
> ordered paths from indexes, or avoid sorts.

Hm. That's not part of what I'm hoping to get done for v12, but you
could imagine a future extension to add a support request type that
allows deriving related pathkeys. There would be a lot of work to do
to make that happen, but the aspect of it that requires adding
function-specific knowledge could usefully be packaged as a
support-function request.

regards, tom lane
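
For the constant-argument generate_series() case mentioned above, the arithmetic in the 0004 patch is rows = floor((finish - start + step) / step), done in double precision. Below is a standalone sketch of that calculation for int4 inputs. The function name series_rows_int4, the -1.0 "no estimate" return value, and the clamp to zero for empty series are assumptions of this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many rows generate_series(start, finish, step) yields for
 * 32-bit integer arguments.  Returns -1.0 when step is zero, meaning
 * "no estimate" (the patch simply declines to answer in that case).
 */
double series_rows_int4(int32_t start_i, int32_t finish_i, int32_t step_i)
{
    /* Use double arithmetic so that finish - start + step cannot overflow. */
    double start = start_i;
    double finish = finish_i;
    double step = step_i;
    double q;

    if (step_i == 0)
        return -1.0;

    /* The same expression works for either sign of step. */
    q = (finish - start + step) / step;

    /* A negative quotient means an empty series; clamping is this sketch's addition. */
    if (q < 0)
        return 0.0;

    /* For q >= 0, truncating toward zero gives the same result as floor(). */
    return (double) (int64_t) q;
}
```

With these definitions, series_rows_int4(1, 10, 1) gives 10, series_rows_int4(1, 10, 3) gives 4 (the values 1, 4, 7, 10), and series_rows_int4(10, 1, -3) also gives 4 (the values 10, 7, 4, 1).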
On Sat, Jan 26, 2019 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Attached is an 0004 that makes a stab at providing some intelligence
> for unnest() and the integer cases of generate_series().

That looks awesome.

I'm somewhat dubious about the whole API. It's basically -- if you have a
problem and a PhD in PostgreSQL-ology, you can write some C code to
fix it. On the other hand, the status quo is that you may as well
just forget about fixing it, which is clearly even worse. And I don't
really know how to do better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jan 26, 2019 at 12:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Attached is an 0004 that makes a stab at providing some intelligence
>> for unnest() and the integer cases of generate_series().

> That looks awesome.

> I'm somewhat dubious about the whole API. It's basically -- if you have a
> problem and a PhD in PostgreSQL-ology, you can write some C code to
> fix it. On the other hand, the status quo is that you may as well
> just forget about fixing it, which is clearly even worse. And I don't
> really know how to do better.

Well, you need to be able to write a C extension :-(. I kinda wish
that were not a requirement, but in practice I think the main audience
is people like PostGIS, who already cleared that bar. I hope that
we'll soon have a bunch of examples, like those in the 0004 patch,
that people can look at to see how to do things in this area. I see
no reason to believe it'll be all that much harder than anything
else extension authors have to do.

regards, tom lane
On Sun, 27 Jan 2019 at 19:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * Allow a normal term to match a functional index, e.g. WHERE x =
>> 'abcdefgh' => WHERE substr(x, 1 , 5) = 'abcde' AND x = 'abcdefgh'
> I'm a bit confused about what you think this example means. I do
> intend to work on letting extensions define rules for extracting
> index clauses from function calls, because that's the requirement
> that PostGIS is after in the thread that started this. I don't
> know whether that would satisfy your concern, because I'm not clear
> on what your concern is.
To be able to extract indexable clauses where none existed before.
Hash functions assume that x = N => hash(x) = hash(N) AND x = N
so I want to be able to assume
x = K => f(x) = f(K) AND x = K
for specific f()
to allow indexable operations when we have an index on f(x) only
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
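
A small sketch of why the derived clause in the message above is lossy. It takes f(x) to be "the first five bytes of x", as in the earlier substr(x, 1, 5) example: rows with f(x) = f(K) form a superset of the rows with x = K, so the original clause has to be kept as a recheck. The names PREFIX_LEN, prefix_matches, count_candidates and count_matches are hypothetical, and a plain array scan stands in for an index lookup on f(x).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PREFIX_LEN 5

/* f(x) = f(k): compare only the first PREFIX_LEN bytes, like substr(x, 1, 5). */
static bool prefix_matches(const char *x, const char *k)
{
    return strncmp(x, k, PREFIX_LEN) == 0;
}

/* Rows an index on f(x) could return for f(x) = f(k): a superset of the answer. */
size_t count_candidates(const char *const *rows, size_t n, const char *k)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (prefix_matches(rows[i], k))
            hits++;
    return hits;
}

/* Rows that also pass the recheck x = k: the exact answer. */
size_t count_matches(const char *const *rows, size_t n, const char *k)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (prefix_matches(rows[i], k) && strcmp(rows[i], k) == 0)
            hits++;
    return hits;
}
```

For the rows "abcdefgh", "abcdezzz", "abcxyz", "abcdefgh" and K = "abcdefgh", the prefix test alone keeps three rows while the recheck narrows that to two, which is why f(x) = f(K) can only be added to x = K and cannot replace it.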
Simon Riggs <simon@2ndquadrant.com> writes:
> On Sun, 27 Jan 2019 at 19:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... I don't
>> know whether that would satisfy your concern, because I'm not clear
>> on what your concern is.

> To be able to extract indexable clauses where none existed before.

That's a pretty vague statement, because it describes what I want
to do perfectly, but this doesn't:

> Hash functions assume that x = N => hash(x) = hash(N) AND x = N
> so I want to be able to assume
> x = K => f(x) = f(K) AND x = K
> for specific f()
> to allow indexable operations when we have an index on f(x) only

The problem with that is that if the only thing that's in the query is
"x = K" then there is nothing to cue the planner that it'd be worth
expending cycles thinking about f(x). Sure, you could hang a planner
support function on the equals operator that would go off and expend
arbitrary amounts of computation looking for speculative matches ...
but nobody is going to accept that as a patch, because the cost/benefit
ratio is going to be awful for 99% of users.

The mechanism I'm proposing is based on the thought that for
specialized functions (or operators) like PostGIS' ST_Intersects(),
it'll be worth expending extra cycles when one of those shows up
in WHERE. I don't think that scales to plain-vanilla equality though.

Conceivably, you could turn that around and look for support functions
attached to the functions/operators that are in an index expression,
and give them the opportunity to derive lossy indexquals based on
comparing the index expression to query quals. I have no particular
interest in working on that right now, because it doesn't respond to
what I understand PostGIS' need to be, and there are only so many
hours in the day. But maybe it could be made workable in the future.

regards, tom lane
On Tue, 29 Jan 2019 at 09:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>> On Sun, 27 Jan 2019 at 19:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ... I don't
>>> know whether that would satisfy your concern, because I'm not clear
>>> on what your concern is.
>> To be able to extract indexable clauses where none existed before.
> That's a pretty vague statement, because it describes what I want
> to do perfectly, but this doesn't:
>> Hash functions assume that x = N => hash(x) = hash(N) AND x = N
>> so I want to be able to assume
>> x = K => f(x) = f(K) AND x = K
>> for specific f()
>> to allow indexable operations when we have an index on f(x) only
> The problem with that is that if the only thing that's in the query is
> "x = K" then there is nothing to cue the planner that it'd be worth
> expending cycles thinking about f(x).

I agree. That is the equivalent of a SeqScan; the wrong way to approach it.

> Sure, you could hang a planner
> support function on the equals operator that would go off and expend
> arbitrary amounts of computation looking for speculative matches ...
> but nobody is going to accept that as a patch, because the cost/benefit
> ratio is going to be awful for 99% of users.
> The mechanism I'm proposing is based on the thought that for
> specialized functions (or operators) like PostGIS' ST_Intersects(),
> it'll be worth expending extra cycles when one of those shows up
> in WHERE. I don't think that scales to plain-vanilla equality though.
> Conceivably, you could turn that around and look for support functions
> attached to the functions/operators that are in an index expression,
> and give them the opportunity to derive lossy indexquals based on
> comparing the index expression to query quals.

That way around is the right way. If an index exists, explore whether it can be used or not. If there are no indexes with appropriate support functions, it will cost almost nothing to normal queries.

The problem of deriving potentially useful indexes is more expensive, I understand.

> I have no particular
> interest in working on that right now, because it doesn't respond to
> what I understand PostGIS' need to be, and there are only so many
> hours in the day. But maybe it could be made workable in the future.

I thought the whole exercise was about adding generic tools for everybody to use. The Tom I've worked with for more than a few years would not have said that; that is my normal line! You said PostGIS was looking to "automatically convert WHERE clauses into lossy index quals." which sounds very similar to what I outlined.
Either way, thanks.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Simon Riggs <simon@2ndquadrant.com> writes:
> On Tue, 29 Jan 2019 at 09:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I have no particular
>> interest in working on that right now, because it doesn't respond to
>> what I understand PostGIS' need to be, and there are only so many
>> hours in the day. But maybe it could be made workable in the future.

> I thought the whole exercise was about adding generic tools for everybody
> to use.

Well, I'm building infrastructure plus a small subset of what might
someday sit atop that infrastructure. I'm not prepared to commit
right now to building stuff I can't finish for v12.

> You said PostGIS was looking to
> "automatically convert WHERE clauses into lossy index quals." which sounds
> very similar to what I outlined.

As I understand it, what they have is complex WHERE clauses from which
they want to extract clauses usable with simple (non-expression) indexes.
The case you seem to be worried about is the reverse: complicated index
definition and simple WHERE clause. I think we're agreed that these two
cases can't be solved with the very same facility. The support-function
mechanism probably can be used to provide extensibility for logic that
tries to attack the complicated-index case, but its mere existence won't
cause that logic to spring into being.

regards, tom lane
Just to show I'm not completely crazy, here's a more or less
feature-complete patch set for doing $SUBJECT.

Patches 0001-0005 are the same as previously posted, either in
this thread or <22182.1549124950@sss.pgh.pa.us>, but rebased
over the planner header refactoring I committed recently.

Patch 0006 is the new work: it removes all the "special index
operator" cruft from indxpath.c and puts it into planner support
functions. I need to write (a lot) more about the API specification
for this support request type, but I think the code is pretty much OK.

I'm still dithering about where to put these planner support functions.
0006 drops them into a new file "utils/adt/likesupport.c", but I'm
not sold on that as a final answer. The LIKE and regex support
functions should share code, but the execution functions for those
are in different files (like.c and regexp.c), so the "put it beside the
execution function" heuristic isn't much help. Also, those functions
rely on the pattern_fixed_prefix() functionality that's currently in
selfuncs.c. I'd kind of like to end up with that in the same file
as its callers. In any case, the network-subset support code need not
stay beside the LIKE/regex functions, but I didn't bother to find a
new home for it yet.

Another thing worth commenting about is that I'd intended to have
all the LIKE/regex functions share one support function, using a
switch on function OID to determine what to do exactly, much as the
existing code used a switch on operator OID. That crashed and
burned though, because some of those functions have multiple aliases
in pg_proc, but fmgroids.h has a macro for only one of the aliases.
Maybe it's time to do something about that? The factorization I used
instead, with a separate support function for each pattern-matching
rule, isn't awful; but I can foresee that this won't be a great answer
for all cases.

Barring objections, I hope to push forward and commit this soon.
regards, tom lane diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index af4d062..6dd0700 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -5146,11 +5146,11 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l </row> <row> - <entry><structfield>protransform</structfield></entry> + <entry><structfield>prosupport</structfield></entry> <entry><type>regproc</type></entry> <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry> - <entry>Calls to this function can be simplified by this other function - (see <xref linkend="xfunc-transform-functions"/>)</entry> + <entry>Optional planner support function for this function + (see <xref linkend="xfunc-optimization"/>)</entry> </row> <row> diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index e18272c..d70aa6e 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -3241,40 +3241,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray </para> </sect2> - <sect2 id="xfunc-transform-functions"> - <title>Transform Functions</title> - - <para> - Some function calls can be simplified during planning based on - properties specific to the function. For example, - <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>. - To define such function-specific optimizations, write a - <firstterm>transform function</firstterm> and place its OID in the - <structfield>protransform</structfield> field of the primary function's - <structname>pg_proc</structname> entry. The transform function must have the SQL - signature <literal>protransform(internal) RETURNS internal</literal>. The - argument, actually <type>FuncExpr *</type>, is a dummy node representing a - call to the primary function. 
If the transform function's study of the - expression tree proves that a simplified expression tree can substitute - for all possible concrete calls represented thereby, build and return - that simplified expression. Otherwise, return a <literal>NULL</literal> - pointer (<emphasis>not</emphasis> a SQL null). - </para> - - <para> - We make no guarantee that <productname>PostgreSQL</productname> will never call the - primary function in cases that the transform function could simplify. - Ensure rigorous equivalence between the simplified expression and an - actual call to the primary function. - </para> - - <para> - Currently, this facility is not exposed to users at the SQL level - because of security concerns, so it is only practical to use for - optimizing built-in functions. - </para> - </sect2> - <sect2> <title>Shared Memory and LWLocks</title> @@ -3388,3 +3354,89 @@ if (!ptr) </sect2> </sect1> + + <sect1 id="xfunc-optimization"> + <title>Function Optimization Information</title> + + <indexterm zone="xfunc-optimization"> + <primary>optimization information</primary> + <secondary>for functions</secondary> + </indexterm> + + <para> + By default, a function is just a <quote>black box</quote> that the + database system knows very little about the behavior of. However, + that means that queries using the function may be executed much less + efficiently than they could be. It is possible to supply additional + knowledge that helps the planner optimize function calls. + </para> + + <para> + Some basic facts can be supplied by declarative annotations provided in + the <xref linkend="sql-createfunction"/> command. Most important of + these is the function's <link linkend="xfunc-volatility">volatility + category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>, + or <literal>VOLATILE</literal>); one should always be careful to + specify this correctly when defining a function. 
+ The parallel safety property (<literal>PARALLEL + UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or + <literal>PARALLEL SAFE</literal>) must also be specified if you hope + to use the function in parallelized queries. + It can also be useful to specify the function's estimated execution + cost, and/or the number of rows a set-returning function is estimated + to return. However, the declarative way of specifying those two + facts only allows specifying a constant value, which is often + inadequate. + </para> + + <para> + It is also possible to attach a <firstterm>planner support + function</firstterm> to a SQL-callable function (called + its <firstterm>target function</firstterm>), and thereby provide + knowledge about the target function that is too complex to be + represented declaratively. Planner support functions have to be + written in C (although their target functions might not be), so this is + an advanced feature that relatively few people will use. + </para> + + <para> + A planner support function must have the SQL signature +<programlisting> +supportfn(internal) returns internal +</programlisting> + It is attached to its target function by specifying + the <literal>SUPPORT</literal> clause when creating the target function. + </para> + + <para> + The details of the API for planner support functions can be found in + file <filename>src/include/nodes/supportnodes.h</filename> in the + <productname>PostgreSQL</productname> source code. Here we provide + just an overview of what planner support functions can do. + The set of possible requests to a support function is extensible, + so more things might be possible in future versions. + </para> + + <para> + Some function calls can be simplified during planning based on + properties specific to the function. For example, + <literal>int4mul(n, 1)</literal> could be simplified to + just <literal>n</literal>. 
This type of transformation can be
+   performed by a planner support function, by having it implement
+   the <literal>SupportRequestSimplify</literal> request type.
+   The support function will be called for each instance of its target
+   function found in a query parse tree. If it finds that the particular
+   call can be simplified into some other form, it can build and return a
+   parse tree representing that expression. This will automatically work
+   for operators based on the function, too &mdash; in the example just
+   given, <literal>n * 1</literal> would also be simplified to
+   <literal>n</literal>.
+   (But note that this is just an example; this particular
+   optimization is not actually performed by
+   standard <productname>PostgreSQL</productname>.)
+   We make no guarantee that <productname>PostgreSQL</productname> will
+   never call the target function in cases that the support function could
+   simplify. Ensure rigorous equivalence between the simplified
+   expression and an actual execution of the target function.
+  </para>
+ </sect1>
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml
index 2f5560a..260e43c 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -78,6 +78,11 @@ SELECT (a + b) AS c FROM test_complex;
  <sect1 id="xoper-optimization">
   <title>Operator Optimization Information</title>
+  <indexterm zone="xoper-optimization">
+   <primary>optimization information</primary>
+   <secondary>for operators</secondary>
+  </indexterm>
+
    <para>
     A <productname>PostgreSQL</productname> operator definition can include
     several optional clauses that tell the system useful things about how
@@ -97,6 +102,13 @@ SELECT (a + b) AS c FROM test_complex;
     the ones that release &version; understands.
    </para>
+   <para>
+    It is also possible to attach a planner support function to the function
+    that underlies an operator, providing another way of telling the system
+    about the behavior of the operator.
+    See <xref linkend="xfunc-optimization"/> for more information.
+   </para>
+
+
   <sect2>
    <title><literal>COMMUTATOR</literal></title>
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index db78061..3a86f1e 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -319,7 +319,7 @@ ProcedureCreate(const char *procedureName,
 	values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost);
 	values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows);
 	values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
-	values[Anum_pg_proc_protransform - 1] = ObjectIdGetDatum(InvalidOid);
+	values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid);
 	values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind);
 	values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
 	values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 86c346b..1f60be2 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -32,6 +32,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
@@ -4046,13 +4047,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 								args, funcvariadic,
 								func_tuple, context);
-	if (!newexpr && allow_non_const && OidIsValid(func_form->protransform))
+	if (!newexpr && allow_non_const && OidIsValid(func_form->prosupport))
 	{
 		/*
-		 * Build a dummy FuncExpr node containing the simplified arg list. We
-		 * use this approach to present a uniform interface to the transform
-		 * function regardless of how the function is actually being invoked.
+		 * Build a SupportRequestSimplify node to pass to the support
+		 * function, pointing to a dummy FuncExpr node containing the
+		 * simplified arg list.
We use this approach to present a uniform
+		 * interface to the support function regardless of how the target
+		 * function is actually being invoked.
 		 */
+		SupportRequestSimplify req;
 		FuncExpr fexpr;
 		fexpr.xpr.type = T_FuncExpr;
@@ -4066,9 +4070,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 		fexpr.args = args;
 		fexpr.location = -1;
+		req.type = T_SupportRequestSimplify;
+		req.root = context->root;
+		req.fcall = &fexpr;
+
 		newexpr = (Expr *)
-			DatumGetPointer(OidFunctionCall1(func_form->protransform,
-											 PointerGetDatum(&fexpr)));
+			DatumGetPointer(OidFunctionCall1(func_form->prosupport,
+											 PointerGetDatum(&req)));
+
+		/* catch a possible API misunderstanding */
+		Assert(newexpr != (Expr *) &fexpr);
 	}
 	if (!newexpr && allow_non_const)
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 3810e4a..cf5a1c6 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -24,6 +24,7 @@
 #include "access/xact.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -1341,15 +1342,25 @@ make_time(PG_FUNCTION_ARGS)
 }
-/* time_transform()
- * Flatten calls to time_scale() and timetz_scale() that solely represent
- * increases in allowed precision.
+/* time_support()
+ *
+ * Planner support function for the time_scale() and timetz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-time_transform(PG_FUNCTION_ARGS)
+time_support(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_POINTER(TemporalTransform(MAX_TIME_PRECISION,
-										(Node *) PG_GETARG_POINTER(0)));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+		ret = TemporalSimplify(MAX_TIME_PRECISION, (Node *) req->fcall);
+	}
+
+	PG_RETURN_POINTER(ret);
 }
 /* time_scale()
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 61dbd05..0068e71 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -4462,16 +4462,23 @@ CheckDateTokenTables(void)
 }
 /*
- * Common code for temporal protransform functions. Types time, timetz,
- * timestamp and timestamptz each have a range of allowed precisions. An
- * unspecified precision is rigorously equivalent to the highest specifiable
- * precision.
+ * Common code for temporal prosupport functions: simplify, if possible,
+ * a call to a temporal type's length-coercion function.
+ *
+ * Types time, timetz, timestamp and timestamptz each have a range of allowed
+ * precisions. An unspecified precision is rigorously equivalent to the
+ * highest specifiable precision. We can replace the function call with a
+ * no-op RelabelType if it is coercing to the same or higher precision as the
+ * input is known to have.
+ *
+ * The input Node is always a FuncExpr, but to reduce the #include footprint
+ * of datetime.h, we declare it as Node *.
  *
  * Note: timestamp_scale throws an error when the typmod is out of range, but
  * we can't get there from a cast: our typmodin will have caught it already.
  */
 Node *
-TemporalTransform(int32 max_precis, Node *node)
+TemporalSimplify(int32 max_precis, Node *node)
 {
 	FuncExpr *expr = castNode(FuncExpr, node);
 	Node *ret = NULL;
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 45cd1a0..1c9deeb 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -34,6 +34,7 @@
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/float.h"
@@ -890,45 +891,53 @@ numeric_send(PG_FUNCTION_ARGS)
 /*
- * numeric_transform() -
+ * numeric_support()
  *
- * Flatten calls to numeric's length coercion function that solely represent
- * increases in allowable precision. Scale changes mutate every datum, so
- * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an
- * unconstrained numeric, so a change from an unconstrained numeric to any
- * constrained numeric is also unoptimizable.
+ * Planner support function for the numeric() length coercion function.
+ *
+ * Flatten calls that solely represent increases in allowable precision.
+ * Scale changes mutate every datum, so they are unoptimizable. Some values,
+ * e.g. 1E-1001, can only fit into an unconstrained numeric, so a change from
+ * an unconstrained numeric to any constrained numeric is also unoptimizable.
  */
 Datum
-numeric_transform(PG_FUNCTION_ARGS)
+numeric_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;
-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;
-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);
-	if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 old_typmod = exprTypmod(source);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_scale = (old_typmod - VARHDRSZ) & 0xffff;
-		int32 new_scale = (new_typmod - VARHDRSZ) & 0xffff;
-		int32 old_precision = (old_typmod - VARHDRSZ) >> 16 & 0xffff;
-		int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;
+		typmod = (Node *) lsecond(expr->args);
-		/*
-		 * If new_typmod < VARHDRSZ, the destination is unconstrained; that's
-		 * always OK. If old_typmod >= VARHDRSZ, the source is constrained,
-		 * and we're OK if the scale is unchanged and the precision is not
-		 * decreasing. See further notes in function header comment.
-		 */
-		if (new_typmod < (int32) VARHDRSZ ||
-			(old_typmod >= (int32) VARHDRSZ &&
-			 new_scale == old_scale && new_precision >= old_precision))
-			ret = relabel_to_typmod(source, new_typmod);
+		if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 old_typmod = exprTypmod(source);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_scale = (old_typmod - VARHDRSZ) & 0xffff;
+			int32 new_scale = (new_typmod - VARHDRSZ) & 0xffff;
+			int32 old_precision = (old_typmod - VARHDRSZ) >> 16 & 0xffff;
+			int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;
+
+			/*
+			 * If new_typmod < VARHDRSZ, the destination is unconstrained;
+			 * that's always OK. If old_typmod >= VARHDRSZ, the source is
+			 * constrained, and we're OK if the scale is unchanged and the
+			 * precision is not decreasing. See further notes in function
+			 * header comment.
+			 */
+			if (new_typmod < (int32) VARHDRSZ ||
+				(old_typmod >= (int32) VARHDRSZ &&
+				 new_scale == old_scale && new_precision >= old_precision))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}
 	PG_RETURN_POINTER(ret);
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 7befb6a..e0ef2f7 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -29,6 +29,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -297,15 +298,26 @@ timestamptypmodout(PG_FUNCTION_ARGS)
 }
-/* timestamp_transform()
- * Flatten calls to timestamp_scale() and timestamptz_scale() that solely
- * represent increases in allowed precision.
+/*
+ * timestamp_support()
+ *
+ * Planner support function for the timestamp_scale() and timestamptz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-timestamp_transform(PG_FUNCTION_ARGS)
+timestamp_support(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_POINTER(TemporalTransform(MAX_TIMESTAMP_PRECISION,
-										(Node *) PG_GETARG_POINTER(0)));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+		ret = TemporalSimplify(MAX_TIMESTAMP_PRECISION, (Node *) req->fcall);
+	}
+
+	PG_RETURN_POINTER(ret);
 }
 /* timestamp_scale()
@@ -1235,59 +1247,69 @@ intervaltypmodleastfield(int32 typmod)
 }
-/* interval_transform()
+/*
+ * interval_support()
+ *
+ * Planner support function for interval_scale().
+ *
  * Flatten superfluous calls to interval_scale(). The interval typmod is
  * complex to permit accepting and regurgitating all SQL standard variations.
  * For truncation purposes, it boils down to a single, simple granularity.
  */
 Datum
-interval_transform(PG_FUNCTION_ARGS)
+interval_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;
-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;
-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);
-	if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		bool noop;
+		typmod = (Node *) lsecond(expr->args);
-		if (new_typmod < 0)
-			noop = true;
-		else
+		if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
 		{
-			int32 old_typmod = exprTypmod(source);
-			int old_least_field;
-			int new_least_field;
-			int old_precis;
-			int new_precis;
-
-			old_least_field = intervaltypmodleastfield(old_typmod);
-			new_least_field = intervaltypmodleastfield(new_typmod);
-			if
(old_typmod < 0)
-				old_precis = INTERVAL_FULL_PRECISION;
+			Node *source = (Node *) linitial(expr->args);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			bool noop;
+
+			if (new_typmod < 0)
+				noop = true;
 			else
-				old_precis = INTERVAL_PRECISION(old_typmod);
-			new_precis = INTERVAL_PRECISION(new_typmod);
-
-			/*
-			 * Cast is a no-op if least field stays the same or decreases
-			 * while precision stays the same or increases. But precision,
-			 * which is to say, sub-second precision, only affects ranges that
-			 * include SECOND.
-			 */
-			noop = (new_least_field <= old_least_field) &&
-				(old_least_field > 0 /* SECOND */ ||
-				 new_precis >= MAX_INTERVAL_PRECISION ||
-				 new_precis >= old_precis);
+			{
+				int32 old_typmod = exprTypmod(source);
+				int old_least_field;
+				int new_least_field;
+				int old_precis;
+				int new_precis;
+
+				old_least_field = intervaltypmodleastfield(old_typmod);
+				new_least_field = intervaltypmodleastfield(new_typmod);
+				if (old_typmod < 0)
+					old_precis = INTERVAL_FULL_PRECISION;
+				else
+					old_precis = INTERVAL_PRECISION(old_typmod);
+				new_precis = INTERVAL_PRECISION(new_typmod);
+
+				/*
+				 * Cast is a no-op if least field stays the same or decreases
+				 * while precision stays the same or increases. But
+				 * precision, which is to say, sub-second precision, only
+				 * affects ranges that include SECOND.
+				 */
+				noop = (new_least_field <= old_least_field) &&
+					(old_least_field > 0 /* SECOND */ ||
+					 new_precis >= MAX_INTERVAL_PRECISION ||
+					 new_precis >= old_precis);
+			}
+			if (noop)
+				ret = relabel_to_typmod(source, new_typmod);
 		}
-		if (noop)
-			ret = relabel_to_typmod(source, new_typmod);
 	}
 	PG_RETURN_POINTER(ret);
@@ -1359,7 +1381,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
 		 * can't do it consistently. (We cannot enforce a range limit on the
 		 * highest expected field, since we do not have any equivalent of
 		 * SQL's <interval leading field precision>.)
If we ever decide to
-		 * revisit this, interval_transform will likely require adjusting.
+		 * revisit this, interval_support will likely require adjusting.
 		 *
 		 * Note: before PG 8.4 we interpreted a limited set of fields as
 		 * actually causing a "modulo" operation on a given value, potentially
@@ -5020,18 +5042,6 @@ interval_part(PG_FUNCTION_ARGS)
 }
-/* timestamp_zone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_zone_transform(PG_FUNCTION_ARGS)
-{
-	PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_zone()
  * Encode timestamp type with specified time zone.
  * This function is just timestamp2timestamptz() except instead of
@@ -5125,18 +5135,6 @@ timestamp_zone(PG_FUNCTION_ARGS)
 	PG_RETURN_TIMESTAMPTZ(result);
 }
-/* timestamp_izone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_izone_transform(PG_FUNCTION_ARGS)
-{
-	PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_izone()
  * Encode timestamp type with specified time interval as time zone.
  */
diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 1585da0..fdcc620 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -20,6 +20,7 @@
 #include "common/int.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/varbit.h"
@@ -672,32 +673,41 @@ varbit_send(PG_FUNCTION_ARGS)
 }
 /*
- * varbit_transform()
- * Flatten calls to varbit's length coercion function that set the new maximum
- * length >= the previous maximum length. We can ignore the isExplicit
- * argument, since that only affects truncation cases.
+ * varbit_support()
+ *
+ * Planner support function for the varbit() length coercion function.
+ *
+ * Currently, the only interesting thing we can do is flatten calls that set
+ * the new maximum length >= the previous maximum length. We can ignore the
+ * isExplicit argument, since that only affects truncation cases.
  */
 Datum
-varbit_transform(PG_FUNCTION_ARGS)
+varbit_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;
-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;
-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);
-	if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_max = exprTypmod(source);
-		int32 new_max = new_typmod;
-
-		/* Note: varbit() treats typmod 0 as invalid, so we do too */
-		if (new_max <= 0 || (old_max > 0 && old_max <= new_max))
-			ret = relabel_to_typmod(source, new_typmod);
+		typmod = (Node *) lsecond(expr->args);
+
+		if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_max = exprTypmod(source);
+			int32 new_max = new_typmod;
+
+			/* Note: varbit() treats typmod 0 as invalid, so we do too */
+			if (new_max <= 0 || (old_max > 0 && old_max <= new_max))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}
 	PG_RETURN_POINTER(ret);
diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c
index 5cf927e..c866af0 100644
--- a/src/backend/utils/adt/varchar.c
+++ b/src/backend/utils/adt/varchar.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_type.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/varlena.h"
@@ -547,32 +548,41 @@ varcharsend(PG_FUNCTION_ARGS)
 /*
- * varchar_transform()
- * Flatten calls to varchar's length coercion function that set the new maximum
- * length >=
the previous maximum length. We can ignore the isExplicit
- * argument, since that only affects truncation cases.
+ * varchar_support()
+ *
+ * Planner support function for the varchar() length coercion function.
+ *
+ * Currently, the only interesting thing we can do is flatten calls that set
+ * the new maximum length >= the previous maximum length. We can ignore the
+ * isExplicit argument, since that only affects truncation cases.
  */
 Datum
-varchar_transform(PG_FUNCTION_ARGS)
+varchar_support(PG_FUNCTION_ARGS)
 {
-	FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
 	Node *ret = NULL;
-	Node *typmod;
-	Assert(list_length(expr->args) >= 2);
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr *expr = req->fcall;
+		Node *typmod;
-	typmod = (Node *) lsecond(expr->args);
+		Assert(list_length(expr->args) >= 2);
-	if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
-	{
-		Node *source = (Node *) linitial(expr->args);
-		int32 old_typmod = exprTypmod(source);
-		int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
-		int32 old_max = old_typmod - VARHDRSZ;
-		int32 new_max = new_typmod - VARHDRSZ;
-
-		if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max))
-			ret = relabel_to_typmod(source, new_typmod);
+		typmod = (Node *) lsecond(expr->args);
+
+		if (IsA(typmod, Const) &&!((Const *) typmod)->constisnull)
+		{
+			Node *source = (Node *) linitial(expr->args);
+			int32 old_typmod = exprTypmod(source);
+			int32 new_typmod = DatumGetInt32(((Const *) typmod)->constvalue);
+			int32 old_max = old_typmod - VARHDRSZ;
+			int32 new_max = new_typmod - VARHDRSZ;
+
+			if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max))
+				ret = relabel_to_typmod(source, new_typmod);
+		}
 	}
 	PG_RETURN_POINTER(ret);
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf..4ff358a 100644
---
a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1883,9 +1883,9 @@ my %tests = (
 	'CREATE TRANSFORM FOR int' => {
 		create_order => 34,
 		create_sql =>
-		  'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_transform(internal), TO SQL WITH FUNCTION int4recv(internal));',
+		  'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal));',
 		regexp =>
-		  qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_transform\(internal\), TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m,
+		  qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_support\(internal\), TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m,
 		like => { %full_runs, section_pre_data => 1, },
 	},
@@ -2880,7 +2880,7 @@ my %tests = (
 						procost,
 						prorows,
 						provariadic,
-						protransform,
+						prosupport,
 						prokind,
 						prosecdef,
 						proleakproof,
@@ -2912,7 +2912,7 @@ my %tests = (
 		\QGRANT SELECT(procost) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prorows) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(provariadic) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
-		\QGRANT SELECT(protransform) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
+		\QGRANT SELECT(prosupport) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prokind) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(prosecdef) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
 		\QGRANT SELECT(proleakproof) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.*
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3ecc2e1..e5cb5bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1326,11 +1326,11 @@
 { oid => '668', descr => 'adjust char() to typmod length',
   proname => 'bpchar', prorettype => 'bpchar',
   proargtypes => 'bpchar int4 bool', prosrc => 'bpchar' },
-{ oid => '3097', descr =>
'transform a varchar length coercion',
-  proname => 'varchar_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varchar_transform' },
+{ oid => '3097', descr => 'planner support for varchar length coercion',
+  proname => 'varchar_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varchar_support' },
 { oid => '669', descr => 'adjust varchar() to typmod length',
-  proname => 'varchar', protransform => 'varchar_transform',
+  proname => 'varchar', prosupport => 'varchar_support',
   prorettype => 'varchar', proargtypes => 'varchar int4 bool',
   prosrc => 'varchar' },
@@ -1954,13 +1954,9 @@
 # OIDS 1000 - 1999
-{ oid => '3994', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_izone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_izone_transform' },
 { oid => '1026', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamp', proargtypes => 'interval timestamptz',
-  prosrc => 'timestamptz_izone' },
+  proname => 'timezone', prorettype => 'timestamp',
+  proargtypes => 'interval timestamptz', prosrc => 'timestamptz_izone' },
 { oid => '1031', descr => 'I/O',
   proname => 'aclitemin', provolatile => 's', prorettype => 'aclitem',
@@ -2190,13 +2186,9 @@
 { oid => '1158', descr => 'convert UNIX epoch to timestamptz',
   proname => 'to_timestamp', prorettype => 'timestamptz',
   proargtypes => 'float8', prosrc => 'float8_timestamptz' },
-{ oid => '3995', descr => 'transform a time zone adjustment',
-  proname => 'timestamp_zone_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_zone_transform' },
 { oid => '1159', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamp', proargtypes => 'text timestamptz',
-  prosrc => 'timestamptz_zone' },
+  proname => 'timezone', prorettype =>
'timestamp',
+  proargtypes => 'text timestamptz', prosrc => 'timestamptz_zone' },
 { oid => '1160', descr => 'I/O',
   proname => 'interval_in', provolatile => 's', prorettype => 'interval',
@@ -2301,11 +2293,11 @@
 # OIDS 1200 - 1299
-{ oid => '3918', descr => 'transform an interval length coercion',
-  proname => 'interval_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'interval_transform' },
+{ oid => '3918', descr => 'planner support for interval length coercion',
+  proname => 'interval_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'interval_support' },
 { oid => '1200', descr => 'adjust interval precision',
-  proname => 'interval', protransform => 'interval_transform',
+  proname => 'interval', prosupport => 'interval_support',
   prorettype => 'interval', proargtypes => 'interval int4',
   prosrc => 'interval_scale' },
@@ -3713,13 +3705,12 @@
 { oid => '1685', descr => 'adjust bit() to typmod length',
   proname => 'bit', prorettype => 'bit', proargtypes => 'bit int4 bool',
   prosrc => 'bit' },
-{ oid => '3158', descr => 'transform a varbit length coercion',
-  proname => 'varbit_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'varbit_transform' },
+{ oid => '3158', descr => 'planner support for varbit length coercion',
+  proname => 'varbit_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'varbit_support' },
 { oid => '1687', descr => 'adjust varbit() to typmod length',
-  proname => 'varbit', protransform => 'varbit_transform',
-  prorettype => 'varbit', proargtypes => 'varbit int4 bool',
-  prosrc => 'varbit' },
+  proname => 'varbit', prosupport => 'varbit_support', prorettype => 'varbit',
+  proargtypes => 'varbit int4 bool', prosrc => 'varbit' },
 { oid => '1698', descr => 'position of sub-bitstring',
   proname => 'position', prorettype => 'int4', proargtypes => 'bit bit',
@@ -4081,11 +4072,11 @@
 { oid => '2918', descr => 'I/O typmod',
   proname => 'numerictypmodout',
prorettype => 'cstring',
   proargtypes => 'int4', prosrc => 'numerictypmodout' },
-{ oid => '3157', descr => 'transform a numeric length coercion',
-  proname => 'numeric_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'numeric_transform' },
+{ oid => '3157', descr => 'planner support for numeric length coercion',
+  proname => 'numeric_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'numeric_support' },
 { oid => '1703', descr => 'adjust numeric to typmod precision/scale',
-  proname => 'numeric', protransform => 'numeric_transform',
+  proname => 'numeric', prosupport => 'numeric_support',
   prorettype => 'numeric', proargtypes => 'numeric int4', prosrc => 'numeric' },
 { oid => '1704',
   proname => 'numeric_abs', prorettype => 'numeric', proargtypes => 'numeric',
@@ -5448,15 +5439,15 @@
   proname => 'bytea_sortsupport', prorettype => 'void',
   proargtypes => 'internal', prosrc => 'bytea_sortsupport' },
-{ oid => '3917', descr => 'transform a timestamp length coercion',
-  proname => 'timestamp_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'timestamp_transform' },
-{ oid => '3944', descr => 'transform a time length coercion',
-  proname => 'time_transform', prorettype => 'internal',
-  proargtypes => 'internal', prosrc => 'time_transform' },
+{ oid => '3917', descr => 'planner support for timestamp length coercion',
+  proname => 'timestamp_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'timestamp_support' },
+{ oid => '3944', descr => 'planner support for time length coercion',
+  proname => 'time_support', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'time_support' },
 { oid => '1961', descr => 'adjust timestamp precision',
-  proname => 'timestamp', protransform => 'timestamp_transform',
+  proname => 'timestamp', prosupport => 'timestamp_support',
   prorettype => 'timestamp', proargtypes => 'timestamp int4',
   prosrc => 'timestamp_scale' },
@@ -5468,14
+5459,14 @@
   prosrc => 'oidsmaller' },
 { oid => '1967', descr => 'adjust timestamptz precision',
-  proname => 'timestamptz', protransform => 'timestamp_transform',
+  proname => 'timestamptz', prosupport => 'timestamp_support',
   prorettype => 'timestamptz', proargtypes => 'timestamptz int4',
   prosrc => 'timestamptz_scale' },
 { oid => '1968', descr => 'adjust time precision',
-  proname => 'time', protransform => 'time_transform', prorettype => 'time',
+  proname => 'time', prosupport => 'time_support', prorettype => 'time',
   proargtypes => 'time int4', prosrc => 'time_scale' },
 { oid => '1969', descr => 'adjust time with time zone precision',
-  proname => 'timetz', protransform => 'time_transform', prorettype => 'timetz',
+  proname => 'timetz', prosupport => 'time_support', prorettype => 'timetz',
   proargtypes => 'timetz int4', prosrc => 'timetz_scale' },
 { oid => '2003',
@@ -5662,13 +5653,11 @@
   prosrc => 'select pg_catalog.age(cast(current_date as timestamp without time zone), $1)' },
 { oid => '2069', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_zone_transform',
-  prorettype => 'timestamptz', proargtypes => 'text timestamp',
-  prosrc => 'timestamp_zone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 'text timestamp', prosrc => 'timestamp_zone' },
 { oid => '2070', descr => 'adjust timestamp to new time zone',
-  proname => 'timezone', protransform => 'timestamp_izone_transform',
-  prorettype => 'timestamptz', proargtypes => 'interval timestamp',
-  prosrc => 'timestamp_izone' },
+  proname => 'timezone', prorettype => 'timestamptz',
+  proargtypes => 'interval timestamp', prosrc => 'timestamp_izone' },
 { oid => '2071',
   proname => 'date_pl_interval', prorettype => 'timestamp',
   proargtypes => 'date interval', prosrc => 'date_pl_interval' },
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c2bb951..b433769 100644
--- a/src/include/catalog/pg_proc.h
+++
b/src/include/catalog/pg_proc.h @@ -53,8 +53,8 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce /* element type of variadic array, or 0 */ Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type); - /* transforms calls to it during planning */ - regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); + /* planner support function for this function, or 0 if none */ + regproc prosupport BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); /* see PROKIND_ categories below */ char prokind BKI_DEFAULT(f); diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index fbe2dc1..3d3adc2 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -505,7 +505,8 @@ typedef enum NodeTag T_IndexAmRoutine, /* in access/amapi.h */ T_TsmRoutine, /* in access/tsmapi.h */ T_ForeignKeyCacheInfo, /* in utils/rel.h */ - T_CallContext /* in nodes/parsenodes.h */ + T_CallContext, /* in nodes/parsenodes.h */ + T_SupportRequestSimplify /* in nodes/supportnodes.h */ } NodeTag; /* diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h new file mode 100644 index 0000000..1f7d02b --- /dev/null +++ b/src/include/nodes/supportnodes.h @@ -0,0 +1,70 @@ +/*------------------------------------------------------------------------- + * + * supportnodes.h + * Definitions for planner support functions. + * + * This file defines the API for "planner support functions", which + * are SQL functions (normally written in C) that can be attached to + * another "target" function to give the system additional knowledge + * about the target function. All the current capabilities have to do + * with planning queries that use the target function, though it is + * possible that future extensions will add functionality to be invoked + * by the parser or executor. + * + * A support function must have the SQL signature + * supportfn(internal) returns internal + * The argument is a pointer to one of the Node types defined in this file. 
+ * The result is usually also a Node pointer, though its type depends on
+ * which capability is being invoked. In all cases, a NULL pointer result
+ * (that's PG_RETURN_POINTER(NULL), not PG_RETURN_NULL()) indicates that
+ * the support function cannot do anything useful for the given request.
+ * Support functions must return a NULL pointer, not fail, if they do not
+ * recognize the request node type or cannot handle the given case; this
+ * allows for future extensions of the set of request cases.
+ *
+ *
+ * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/nodes/supportnodes.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef SUPPORTNODES_H
+#define SUPPORTNODES_H
+
+#include "nodes/primnodes.h"
+
+struct PlannerInfo; /* avoid including relation.h here */
+
+
+/*
+ * The Simplify request allows the support function to perform plan-time
+ * simplification of a call to its target function. For example, a varchar
+ * length coercion that does not decrease the allowed length of its argument
+ * could be replaced by a RelabelType node, or "x + 0" could be replaced by
+ * "x". This is invoked during the planner's constant-folding pass, so the
+ * function's arguments can be presumed already simplified.
+ *
+ * The planner's PlannerInfo "root" is typically not needed, but can be
+ * consulted if it's necessary to obtain info about Vars present in
+ * the given node tree. Beware that root could be NULL in some usages.
+ *
+ * "fcall" will be a FuncExpr invoking the support function's target
+ * function. (This is true even if the original parsetree node was an
+ * operator call; a FuncExpr is synthesized for this purpose.)
+ *
+ * The result should be a semantically-equivalent transformed node tree,
+ * or NULL if no simplification could be performed. Do *not* return or
+ * modify *fcall, as it isn't really a separately allocated Node. But
+ * it's okay to use fcall->args, or parts of it, in the result tree.
+ */
+typedef struct SupportRequestSimplify
+{
+    NodeTag type;
+
+    struct PlannerInfo *root; /* Planner's infrastructure */
+    FuncExpr *fcall; /* Function call to be simplified */
+} SupportRequestSimplify;
+
+#endif /* SUPPORTNODES_H */
diff --git a/src/include/utils/datetime.h b/src/include/utils/datetime.h
index f5ec9bb..87f819e 100644
--- a/src/include/utils/datetime.h
+++ b/src/include/utils/datetime.h
@@ -330,7 +330,7 @@ extern int DecodeUnits(int field, char *lowtoken, int *val);

 extern int j2day(int jd);

-extern Node *TemporalTransform(int32 max_precis, Node *node);
+extern Node *TemporalSimplify(int32 max_precis, Node *node);

 extern bool CheckDateTokenTables(void);

diff --git a/src/test/modules/test_ddl_deparse/expected/create_transform.out b/src/test/modules/test_ddl_deparse/expected/create_transform.out
index 0d1cc36..da7fea2 100644
--- a/src/test/modules/test_ddl_deparse/expected/create_transform.out
+++ b/src/test/modules/test_ddl_deparse/expected/create_transform.out
@@ -7,7 +7,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 NOTICE: DDL test: type simple, tag CREATE TRANSFORM
 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/modules/test_ddl_deparse/sql/create_transform.sql b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
index 0968702..132fc5a 100644
--- a/src/test/modules/test_ddl_deparse/sql/create_transform.sql
+++ b/src/test/modules/test_ddl_deparse/sql/create_transform.sql
@@ -8,7 +8,7 @@
 -- internal and as return argument the datatype of the transform done.
 -- pl/plpgsql does not authorize the use of internal as data type.
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));

 DROP TRANSFORM FOR int LANGUAGE SQL;
diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out
index 4085e45..c89ec06 100644
--- a/src/test/regress/expected/object_address.out
+++ b/src/test/regress/expected/object_address.out
@@ -38,7 +38,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer";
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user;
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user;
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable;
 CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE);
diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out
index ef268d3..4edc817 100644
--- a/src/test/regress/expected/oidjoins.out
+++ b/src/test/regress/expected/oidjoins.out
@@ -809,12 +809,12 @@ WHERE provariadic != 0 AND
 ------+-------------
 (0 rows)

-SELECT ctid, protransform
+SELECT ctid, prosupport
 FROM pg_catalog.pg_proc fk
-WHERE protransform != 0 AND
-    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform);
- ctid | protransform
-------+--------------
+WHERE prosupport != 0 AND
+    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport);
+ ctid | prosupport
+------+------------
 (0 rows)

 SELECT ctid, prorettype
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 7328095..ce25ee0 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -453,10 +453,10 @@ WHERE proallargtypes IS NOT NULL AND
 -----+---------+-------------+----------------+-------------
 (0 rows)

--- Check for protransform functions with the wrong signature
+-- Check for prosupport functions with the wrong signature
 SELECT p1.oid, p1.proname, p2.oid, p2.proname
 FROM pg_proc AS p1, pg_proc AS p2
-WHERE p2.oid = p1.protransform AND
+WHERE p2.oid = p1.prosupport AND
     (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1
      OR p2.proargtypes[0] != 'internal'::regtype);
  oid | proname | oid | proname
diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql
index d7df322..fd79465 100644
--- a/src/test/regress/sql/object_address.sql
+++ b/src/test/regress/sql/object_address.sql
@@ -41,7 +41,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer";
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user;
 ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user;
 CREATE TRANSFORM FOR int LANGUAGE SQL (
-    FROM SQL WITH FUNCTION varchar_transform(internal),
+    FROM SQL WITH FUNCTION varchar_support(internal),
     TO SQL WITH FUNCTION int4recv(internal));
 CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable;
 CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE);
diff --git a/src/test/regress/sql/oidjoins.sql b/src/test/regress/sql/oidjoins.sql
index c8291d3..dbe4a58 100644
--- a/src/test/regress/sql/oidjoins.sql
+++ b/src/test/regress/sql/oidjoins.sql
@@ -405,10 +405,10 @@ SELECT ctid, provariadic
 FROM pg_catalog.pg_proc fk
 WHERE provariadic != 0 AND
     NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type pk WHERE pk.oid = fk.provariadic);
-SELECT ctid, protransform
+SELECT ctid, prosupport
 FROM pg_catalog.pg_proc fk
-WHERE protransform != 0 AND
-    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform);
+WHERE prosupport != 0 AND
+    NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport);
 SELECT ctid, prorettype
 FROM pg_catalog.pg_proc fk
 WHERE prorettype != 0 AND
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 8544cbe..e2014fc 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -353,10 +353,10 @@ WHERE proallargtypes IS NOT NULL AND
         FROM generate_series(1, array_length(proallargtypes, 1)) g(i)
         WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v'));

--- Check for protransform functions with the wrong signature
+-- Check for prosupport functions with the wrong signature
 SELECT p1.oid, p1.proname, p2.oid, p2.proname
 FROM pg_proc AS p1, pg_proc AS p2
-WHERE p2.oid = p1.protransform AND
+WHERE p2.oid = p1.prosupport AND
     (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1
      OR p2.proargtypes[0] != 'internal'::regtype);

diff --git a/src/tools/findoidjoins/README b/src/tools/findoidjoins/README
index 305454a..e5fc310 100644
--- a/src/tools/findoidjoins/README
+++ b/src/tools/findoidjoins/README
@@ -161,7 +161,7 @@ Join pg_catalog.pg_proc.pronamespace => pg_catalog.pg_namespace.oid
 Join pg_catalog.pg_proc.proowner => pg_catalog.pg_authid.oid
 Join pg_catalog.pg_proc.prolang => pg_catalog.pg_language.oid
 Join pg_catalog.pg_proc.provariadic => pg_catalog.pg_type.oid
-Join pg_catalog.pg_proc.protransform => pg_catalog.pg_proc.oid
+Join pg_catalog.pg_proc.prosupport => pg_catalog.pg_proc.oid
 Join pg_catalog.pg_proc.prorettype => pg_catalog.pg_type.oid
 Join pg_catalog.pg_range.rngtypid => pg_catalog.pg_type.oid
 Join pg_catalog.pg_range.rngsubtype => pg_catalog.pg_type.oid
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index af4d062..6dd0700 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -5146,11 +5146,11 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
      </row>

      <row>
-      <entry><structfield>protransform</structfield></entry>
+      <entry><structfield>prosupport</structfield></entry>
       <entry><type>regproc</type></entry>
       <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
-      <entry>Calls to this function can be simplified by this other function
-       (see <xref linkend="xfunc-transform-functions"/>)</entry>
+      <entry>Optional planner support function for this function
+       (see <xref linkend="xfunc-optimization"/>)</entry>
      </row>

      <row>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index e18272c..d70aa6e 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3241,40 +3241,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
     </para>
    </sect2>

-   <sect2 id="xfunc-transform-functions">
-    <title>Transform Functions</title>
-
-    <para>
-     Some function calls can be simplified during planning based on
-     properties specific to the function. For example,
-     <literal>int4mul(n, 1)</literal> could be simplified to just <literal>n</literal>.
-     To define such function-specific optimizations, write a
-     <firstterm>transform function</firstterm> and place its OID in the
-     <structfield>protransform</structfield> field of the primary function's
-     <structname>pg_proc</structname> entry. The transform function must have the SQL
-     signature <literal>protransform(internal) RETURNS internal</literal>. The
-     argument, actually <type>FuncExpr *</type>, is a dummy node representing a
-     call to the primary function. If the transform function's study of the
-     expression tree proves that a simplified expression tree can substitute
-     for all possible concrete calls represented thereby, build and return
-     that simplified expression. Otherwise, return a <literal>NULL</literal>
-     pointer (<emphasis>not</emphasis> a SQL null).
-    </para>
-
-    <para>
-     We make no guarantee that <productname>PostgreSQL</productname> will never call the
-     primary function in cases that the transform function could simplify.
-     Ensure rigorous equivalence between the simplified expression and an
-     actual call to the primary function.
-    </para>
-
-    <para>
-     Currently, this facility is not exposed to users at the SQL level
-     because of security concerns, so it is only practical to use for
-     optimizing built-in functions.
-    </para>
-   </sect2>
-
    <sect2>
     <title>Shared Memory and LWLocks</title>

@@ -3388,3 +3354,89 @@ if (!ptr)
    </sect2>

   </sect1>
+
+  <sect1 id="xfunc-optimization">
+   <title>Function Optimization Information</title>
+
+  <indexterm zone="xfunc-optimization">
+   <primary>optimization information</primary>
+   <secondary>for functions</secondary>
+  </indexterm>
+
+   <para>
+    By default, a function is just a <quote>black box</quote> that the
+    database system knows very little about the behavior of. However,
+    that means that queries using the function may be executed much less
+    efficiently than they could be. It is possible to supply additional
+    knowledge that helps the planner optimize function calls.
+   </para>
+
+   <para>
+    Some basic facts can be supplied by declarative annotations provided in
+    the <xref linkend="sql-createfunction"/> command. Most important of
+    these is the function's <link linkend="xfunc-volatility">volatility
+    category</link> (<literal>IMMUTABLE</literal>, <literal>STABLE</literal>,
+    or <literal>VOLATILE</literal>); one should always be careful to
+    specify this correctly when defining a function.
+    The parallel safety property (<literal>PARALLEL
+    UNSAFE</literal>, <literal>PARALLEL RESTRICTED</literal>, or
+    <literal>PARALLEL SAFE</literal>) must also be specified if you hope
+    to use the function in parallelized queries.
+    It can also be useful to specify the function's estimated execution
+    cost, and/or the number of rows a set-returning function is estimated
+    to return. However, the declarative way of specifying those two
+    facts only allows specifying a constant value, which is often
+    inadequate.
+   </para>
+
+   <para>
+    It is also possible to attach a <firstterm>planner support
+    function</firstterm> to a SQL-callable function (called
+    its <firstterm>target function</firstterm>), and thereby provide
+    knowledge about the target function that is too complex to be
+    represented declaratively. Planner support functions have to be
+    written in C (although their target functions might not be), so this is
+    an advanced feature that relatively few people will use.
+   </para>
+
+   <para>
+    A planner support function must have the SQL signature
+<programlisting>
+supportfn(internal) returns internal
+</programlisting>
+    It is attached to its target function by specifying
+    the <literal>SUPPORT</literal> clause when creating the target function.
+   </para>
+
+   <para>
+    The details of the API for planner support functions can be found in
+    file <filename>src/include/nodes/supportnodes.h</filename> in the
+    <productname>PostgreSQL</productname> source code. Here we provide
+    just an overview of what planner support functions can do.
+    The set of possible requests to a support function is extensible,
+    so more things might be possible in future versions.
+   </para>
+
+   <para>
+    Some function calls can be simplified during planning based on
+    properties specific to the function. For example,
+    <literal>int4mul(n, 1)</literal> could be simplified to
+    just <literal>n</literal>. This type of transformation can be
+    performed by a planner support function, by having it implement
+    the <literal>SupportRequestSimplify</literal> request type.
+    The support function will be called for each instance of its target
+    function found in a query parse tree. If it finds that the particular
+    call can be simplified into some other form, it can build and return a
+    parse tree representing that expression. This will automatically work
+    for operators based on the function, too &mdash; in the example just
+    given, <literal>n * 1</literal> would also be simplified to
+    <literal>n</literal>.
+    (But note that this is just an example; this particular
+    optimization is not actually performed by
+    standard <productname>PostgreSQL</productname>.)
+    We make no guarantee that <productname>PostgreSQL</productname> will
+    never call the target function in cases that the support function could
+    simplify. Ensure rigorous equivalence between the simplified
+    expression and an actual execution of the target function.
+   </para>
+  </sect1>
diff --git a/doc/src/sgml/xoper.sgml b/doc/src/sgml/xoper.sgml
index 2f5560a..260e43c 100644
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -78,6 +78,11 @@ SELECT (a + b) AS c FROM test_complex;
  <sect1 id="xoper-optimization">
   <title>Operator Optimization Information</title>

+  <indexterm zone="xoper-optimization">
+   <primary>optimization information</primary>
+   <secondary>for operators</secondary>
+  </indexterm>
+
   <para>
    A <productname>PostgreSQL</productname> operator definition can include
    several optional clauses that tell the system useful things about how
@@ -97,6 +102,13 @@ SELECT (a + b) AS c FROM test_complex;
    the ones that release &version; understands.
   </para>

+  <para>
+   It is also possible to attach a planner support function to the function
+   that underlies an operator, providing another way of telling the system
+   about the behavior of the operator.
+   See <xref linkend="xfunc-optimization"/> for more information.
+  </para>
+
   <sect2>
    <title><literal>COMMUTATOR</literal></title>

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index db78061..3a86f1e 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -319,7 +319,7 @@ ProcedureCreate(const char *procedureName,
     values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost);
     values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows);
     values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType);
-    values[Anum_pg_proc_protransform - 1] = ObjectIdGetDatum(InvalidOid);
+    values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid);
     values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind);
     values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer);
     values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof);
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 86c346b..1f60be2 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -32,6 +32,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
@@ -4046,13 +4047,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
                                            args, funcvariadic,
                                            func_tuple, context);

-    if (!newexpr && allow_non_const && OidIsValid(func_form->protransform))
+    if (!newexpr && allow_non_const && OidIsValid(func_form->prosupport))
     {
         /*
-         * Build a dummy FuncExpr node containing the simplified arg list. We
-         * use this approach to present a uniform interface to the transform
-         * function regardless of how the function is actually being invoked.
+         * Build a SupportRequestSimplify node to pass to the support
+         * function, pointing to a dummy FuncExpr node containing the
+         * simplified arg list. We use this approach to present a uniform
+         * interface to the support function regardless of how the target
+         * function is actually being invoked.
          */
+        SupportRequestSimplify req;
         FuncExpr fexpr;

         fexpr.xpr.type = T_FuncExpr;
@@ -4066,9 +4070,16 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
         fexpr.args = args;
         fexpr.location = -1;

+        req.type = T_SupportRequestSimplify;
+        req.root = context->root;
+        req.fcall = &fexpr;
+
         newexpr = (Expr *)
-            DatumGetPointer(OidFunctionCall1(func_form->protransform,
-                                             PointerGetDatum(&fexpr)));
+            DatumGetPointer(OidFunctionCall1(func_form->prosupport,
+                                             PointerGetDatum(&req)));
+
+        /* catch a possible API misunderstanding */
+        Assert(newexpr != (Expr *) &fexpr);
     }

     if (!newexpr && allow_non_const)
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 3810e4a..cf5a1c6 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -24,6 +24,7 @@
 #include "access/xact.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -1341,15 +1342,25 @@ make_time(PG_FUNCTION_ARGS)
 }


-/* time_transform()
- * Flatten calls to time_scale() and timetz_scale() that solely represent
- * increases in allowed precision.
+/* time_support()
+ *
+ * Planner support function for the time_scale() and timetz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-time_transform(PG_FUNCTION_ARGS)
+time_support(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_POINTER(TemporalTransform(MAX_TIME_PRECISION,
-                                        (Node *) PG_GETARG_POINTER(0)));
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestSimplify))
+    {
+        SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+        ret = TemporalSimplify(MAX_TIME_PRECISION, (Node *) req->fcall);
+    }
+
+    PG_RETURN_POINTER(ret);
 }

 /* time_scale()
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 61dbd05..0068e71 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -4462,16 +4462,23 @@ CheckDateTokenTables(void)
 }

 /*
- * Common code for temporal protransform functions. Types time, timetz,
- * timestamp and timestamptz each have a range of allowed precisions. An
- * unspecified precision is rigorously equivalent to the highest specifiable
- * precision.
+ * Common code for temporal prosupport functions: simplify, if possible,
+ * a call to a temporal type's length-coercion function.
+ *
+ * Types time, timetz, timestamp and timestamptz each have a range of allowed
+ * precisions. An unspecified precision is rigorously equivalent to the
+ * highest specifiable precision. We can replace the function call with a
+ * no-op RelabelType if it is coercing to the same or higher precision as the
+ * input is known to have.
+ *
+ * The input Node is always a FuncExpr, but to reduce the #include footprint
+ * of datetime.h, we declare it as Node *.
  *
  * Note: timestamp_scale throws an error when the typmod is out of range, but
  * we can't get there from a cast: our typmodin will have caught it already.
  */
 Node *
-TemporalTransform(int32 max_precis, Node *node)
+TemporalSimplify(int32 max_precis, Node *node)
 {
     FuncExpr *expr = castNode(FuncExpr, node);
     Node *ret = NULL;
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 45cd1a0..1c9deeb 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -34,6 +34,7 @@
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/float.h"
@@ -890,19 +891,25 @@ numeric_send(PG_FUNCTION_ARGS)


 /*
- * numeric_transform() -
+ * numeric_support()
  *
- * Flatten calls to numeric's length coercion function that solely represent
- * increases in allowable precision. Scale changes mutate every datum, so
- * they are unoptimizable. Some values, e.g. 1E-1001, can only fit into an
- * unconstrained numeric, so a change from an unconstrained numeric to any
- * constrained numeric is also unoptimizable.
+ * Planner support function for the numeric() length coercion function.
+ *
+ * Flatten calls that solely represent increases in allowable precision.
+ * Scale changes mutate every datum, so they are unoptimizable. Some values,
+ * e.g. 1E-1001, can only fit into an unconstrained numeric, so a change from
+ * an unconstrained numeric to any constrained numeric is also unoptimizable.
  */
 Datum
-numeric_transform(PG_FUNCTION_ARGS)
+numeric_support(PG_FUNCTION_ARGS)
 {
-    FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
     Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestSimplify))
+    {
+        SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+        FuncExpr *expr = req->fcall;
     Node *typmod;

     Assert(list_length(expr->args) >= 2);
@@ -920,16 +927,18 @@ numeric_transform(PG_FUNCTION_ARGS)
         int32 new_precision = (new_typmod - VARHDRSZ) >> 16 & 0xffff;

         /*
-         * If new_typmod < VARHDRSZ, the destination is unconstrained; that's
-         * always OK. If old_typmod >= VARHDRSZ, the source is constrained,
-         * and we're OK if the scale is unchanged and the precision is not
-         * decreasing. See further notes in function header comment.
+         * If new_typmod < VARHDRSZ, the destination is unconstrained;
+         * that's always OK. If old_typmod >= VARHDRSZ, the source is
+         * constrained, and we're OK if the scale is unchanged and the
+         * precision is not decreasing. See further notes in function
+         * header comment.
          */
         if (new_typmod < (int32) VARHDRSZ ||
             (old_typmod >= (int32) VARHDRSZ &&
              new_scale == old_scale && new_precision >= old_precision))
             ret = relabel_to_typmod(source, new_typmod);
     }
+    }

     PG_RETURN_POINTER(ret);
 }
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 7befb6a..e0ef2f7 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -29,6 +29,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "parser/scansup.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -297,15 +298,26 @@ timestamptypmodout(PG_FUNCTION_ARGS)
 }


-/* timestamp_transform()
- * Flatten calls to timestamp_scale() and timestamptz_scale() that solely
- * represent increases in allowed precision.
+/*
+ * timestamp_support()
+ *
+ * Planner support function for the timestamp_scale() and timestamptz_scale()
+ * length coercion functions (we need not distinguish them here).
  */
 Datum
-timestamp_transform(PG_FUNCTION_ARGS)
+timestamp_support(PG_FUNCTION_ARGS)
 {
-    PG_RETURN_POINTER(TemporalTransform(MAX_TIMESTAMP_PRECISION,
-                                        (Node *) PG_GETARG_POINTER(0)));
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+    Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestSimplify))
+    {
+        SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+
+        ret = TemporalSimplify(MAX_TIMESTAMP_PRECISION, (Node *) req->fcall);
+    }
+
+    PG_RETURN_POINTER(ret);
 }

 /* timestamp_scale()
@@ -1235,16 +1247,25 @@ intervaltypmodleastfield(int32 typmod)
 }


-/* interval_transform()
+/*
+ * interval_support()
+ *
+ * Planner support function for interval_scale().
+ *
  * Flatten superfluous calls to interval_scale(). The interval typmod is
  * complex to permit accepting and regurgitating all SQL standard variations.
  * For truncation purposes, it boils down to a single, simple granularity.
  */
 Datum
-interval_transform(PG_FUNCTION_ARGS)
+interval_support(PG_FUNCTION_ARGS)
 {
-    FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0));
+    Node *rawreq = (Node *) PG_GETARG_POINTER(0);
     Node *ret = NULL;
+
+    if (IsA(rawreq, SupportRequestSimplify))
+    {
+        SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+        FuncExpr *expr = req->fcall;
     Node *typmod;

     Assert(list_length(expr->args) >= 2);
@@ -1277,9 +1298,9 @@ interval_transform(PG_FUNCTION_ARGS)

         /*
          * Cast is a no-op if least field stays the same or decreases
-         * while precision stays the same or increases. But precision,
-         * which is to say, sub-second precision, only affects ranges that
-         * include SECOND.
+         * while precision stays the same or increases. But
+         * precision, which is to say, sub-second precision, only
+         * affects ranges that include SECOND.
          */
         noop = (new_least_field <= old_least_field) &&
             (old_least_field > 0 /* SECOND */ ||
@@ -1289,6 +1310,7 @@ interval_transform(PG_FUNCTION_ARGS)
         if (noop)
             ret = relabel_to_typmod(source, new_typmod);
     }
+    }

     PG_RETURN_POINTER(ret);
 }
@@ -1359,7 +1381,7 @@ AdjustIntervalForTypmod(Interval *interval, int32 typmod)
          * can't do it consistently. (We cannot enforce a range limit on the
          * highest expected field, since we do not have any equivalent of
          * SQL's <interval leading field precision>.) If we ever decide to
-         * revisit this, interval_transform will likely require adjusting.
+         * revisit this, interval_support will likely require adjusting.
          *
          * Note: before PG 8.4 we interpreted a limited set of fields as
          * actually causing a "modulo" operation on a given value, potentially
@@ -5020,18 +5042,6 @@ interval_part(PG_FUNCTION_ARGS)
 }


-/* timestamp_zone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_zone_transform(PG_FUNCTION_ARGS)
-{
-    PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_zone()
  * Encode timestamp type with specified time zone.
  * This function is just timestamp2timestamptz() except instead of
@@ -5125,18 +5135,6 @@ timestamp_zone(PG_FUNCTION_ARGS)
     PG_RETURN_TIMESTAMPTZ(result);
 }

-/* timestamp_izone_transform()
- * The original optimization here caused problems by relabeling Vars that
- * could be matched to index entries. It might be possible to resurrect it
- * at some point by teaching the planner to be less cavalier with RelabelType
- * nodes, but that will take careful analysis.
- */
-Datum
-timestamp_izone_transform(PG_FUNCTION_ARGS)
-{
-    PG_RETURN_POINTER(NULL);
-}
-
 /* timestamp_izone()
  * Encode timestamp type with specified time interval as time zone.
  */
diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 1585da0..fdcc620 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -20,6 +20,7 @@
 #include "common/int.h"
 #include "libpq/pqformat.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/varbit.h"
@@ -672,16 +673,24 @@ varbit_send(PG_FUNCTION_ARGS)
 }

 /*
- * varbit_transform()
- * Flatten calls to varbit's length coercion function that set the new maximum
- * length >= the previous maximum length. We can ignore the isExplicit
- * argument, since that only affects truncation cases.
+ * varbit_support()
+ *
+ * Planner support function for the varbit() length coercion function.
+ *
+ * Currently, the only interesting thing we can do is flatten calls that set
+ * the new maximum length >= the previous maximum length. We can ignore the
+ * isExplicit argument, since that only affects truncation cases.
*/ Datum -varbit_transform(PG_FUNCTION_ARGS) +varbit_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -699,6 +708,7 @@ varbit_transform(PG_FUNCTION_ARGS) if (new_max <= 0 || (old_max > 0 && old_max <= new_max)) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } diff --git a/src/backend/utils/adt/varchar.c b/src/backend/utils/adt/varchar.c index 5cf927e..c866af0 100644 --- a/src/backend/utils/adt/varchar.c +++ b/src/backend/utils/adt/varchar.c @@ -21,6 +21,7 @@ #include "catalog/pg_type.h" #include "libpq/pqformat.h" #include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" #include "utils/array.h" #include "utils/builtins.h" #include "utils/varlena.h" @@ -547,16 +548,24 @@ varcharsend(PG_FUNCTION_ARGS) /* - * varchar_transform() - * Flatten calls to varchar's length coercion function that set the new maximum - * length >= the previous maximum length. We can ignore the isExplicit - * argument, since that only affects truncation cases. + * varchar_support() + * + * Planner support function for the varchar() length coercion function. + * + * Currently, the only interesting thing we can do is flatten calls that set + * the new maximum length >= the previous maximum length. We can ignore the + * isExplicit argument, since that only affects truncation cases. 
*/ Datum -varchar_transform(PG_FUNCTION_ARGS) +varchar_support(PG_FUNCTION_ARGS) { - FuncExpr *expr = castNode(FuncExpr, PG_GETARG_POINTER(0)); + Node *rawreq = (Node *) PG_GETARG_POINTER(0); Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *expr = req->fcall; Node *typmod; Assert(list_length(expr->args) >= 2); @@ -574,6 +583,7 @@ varchar_transform(PG_FUNCTION_ARGS) if (new_typmod < 0 || (old_typmod >= 0 && old_max <= new_max)) ret = relabel_to_typmod(source, new_typmod); } + } PG_RETURN_POINTER(ret); } diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index 245fcbf..4ff358a 100644 --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl @@ -1883,9 +1883,9 @@ my %tests = ( 'CREATE TRANSFORM FOR int' => { create_order => 34, create_sql => - 'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_transform(internal), TO SQL WITH FUNCTIONint4recv(internal));', + 'CREATE TRANSFORM FOR int LANGUAGE SQL (FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTIONint4recv(internal));', regexp => - qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_transform\(internal\),TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m, + qr/CREATE TRANSFORM FOR integer LANGUAGE sql \(FROM SQL WITH FUNCTION pg_catalog\.varchar_support\(internal\),TO SQL WITH FUNCTION pg_catalog\.int4recv\(internal\)\);/m, like => { %full_runs, section_pre_data => 1, }, }, @@ -2880,7 +2880,7 @@ my %tests = ( procost, prorows, provariadic, - protransform, + prosupport, prokind, prosecdef, proleakproof, @@ -2912,7 +2912,7 @@ my %tests = ( \QGRANT SELECT(procost) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(prorows) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(provariadic) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* - \QGRANT SELECT(protransform) ON TABLE 
pg_catalog.pg_proc TO PUBLIC;\E\n.* + \QGRANT SELECT(prosupport) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(prokind) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(prosecdef) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* \QGRANT SELECT(proleakproof) ON TABLE pg_catalog.pg_proc TO PUBLIC;\E\n.* diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 3ecc2e1..e5cb5bb 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -1326,11 +1326,11 @@ { oid => '668', descr => 'adjust char() to typmod length', proname => 'bpchar', prorettype => 'bpchar', proargtypes => 'bpchar int4 bool', prosrc => 'bpchar' }, -{ oid => '3097', descr => 'transform a varchar length coercion', - proname => 'varchar_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'varchar_transform' }, +{ oid => '3097', descr => 'planner support for varchar length coercion', + proname => 'varchar_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'varchar_support' }, { oid => '669', descr => 'adjust varchar() to typmod length', - proname => 'varchar', protransform => 'varchar_transform', + proname => 'varchar', prosupport => 'varchar_support', prorettype => 'varchar', proargtypes => 'varchar int4 bool', prosrc => 'varchar' }, @@ -1954,13 +1954,9 @@ # OIDS 1000 - 1999 -{ oid => '3994', descr => 'transform a time zone adjustment', - proname => 'timestamp_izone_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'timestamp_izone_transform' }, { oid => '1026', descr => 'adjust timestamp to new time zone', - proname => 'timezone', protransform => 'timestamp_izone_transform', - prorettype => 'timestamp', proargtypes => 'interval timestamptz', - prosrc => 'timestamptz_izone' }, + proname => 'timezone', prorettype => 'timestamp', + proargtypes => 'interval timestamptz', prosrc => 'timestamptz_izone' }, { oid => '1031', descr => 'I/O', proname => 'aclitemin', 
provolatile => 's', prorettype => 'aclitem', @@ -2190,13 +2186,9 @@ { oid => '1158', descr => 'convert UNIX epoch to timestamptz', proname => 'to_timestamp', prorettype => 'timestamptz', proargtypes => 'float8', prosrc => 'float8_timestamptz' }, -{ oid => '3995', descr => 'transform a time zone adjustment', - proname => 'timestamp_zone_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'timestamp_zone_transform' }, { oid => '1159', descr => 'adjust timestamp to new time zone', - proname => 'timezone', protransform => 'timestamp_zone_transform', - prorettype => 'timestamp', proargtypes => 'text timestamptz', - prosrc => 'timestamptz_zone' }, + proname => 'timezone', prorettype => 'timestamp', + proargtypes => 'text timestamptz', prosrc => 'timestamptz_zone' }, { oid => '1160', descr => 'I/O', proname => 'interval_in', provolatile => 's', prorettype => 'interval', @@ -2301,11 +2293,11 @@ # OIDS 1200 - 1299 -{ oid => '3918', descr => 'transform an interval length coercion', - proname => 'interval_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'interval_transform' }, +{ oid => '3918', descr => 'planner support for interval length coercion', + proname => 'interval_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'interval_support' }, { oid => '1200', descr => 'adjust interval precision', - proname => 'interval', protransform => 'interval_transform', + proname => 'interval', prosupport => 'interval_support', prorettype => 'interval', proargtypes => 'interval int4', prosrc => 'interval_scale' }, @@ -3713,13 +3705,12 @@ { oid => '1685', descr => 'adjust bit() to typmod length', proname => 'bit', prorettype => 'bit', proargtypes => 'bit int4 bool', prosrc => 'bit' }, -{ oid => '3158', descr => 'transform a varbit length coercion', - proname => 'varbit_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'varbit_transform' }, +{ oid => '3158', descr => 'planner support 
for varbit length coercion', + proname => 'varbit_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'varbit_support' }, { oid => '1687', descr => 'adjust varbit() to typmod length', - proname => 'varbit', protransform => 'varbit_transform', - prorettype => 'varbit', proargtypes => 'varbit int4 bool', - prosrc => 'varbit' }, + proname => 'varbit', prosupport => 'varbit_support', prorettype => 'varbit', + proargtypes => 'varbit int4 bool', prosrc => 'varbit' }, { oid => '1698', descr => 'position of sub-bitstring', proname => 'position', prorettype => 'int4', proargtypes => 'bit bit', @@ -4081,11 +4072,11 @@ { oid => '2918', descr => 'I/O typmod', proname => 'numerictypmodout', prorettype => 'cstring', proargtypes => 'int4', prosrc => 'numerictypmodout' }, -{ oid => '3157', descr => 'transform a numeric length coercion', - proname => 'numeric_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'numeric_transform' }, +{ oid => '3157', descr => 'planner support for numeric length coercion', + proname => 'numeric_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'numeric_support' }, { oid => '1703', descr => 'adjust numeric to typmod precision/scale', - proname => 'numeric', protransform => 'numeric_transform', + proname => 'numeric', prosupport => 'numeric_support', prorettype => 'numeric', proargtypes => 'numeric int4', prosrc => 'numeric' }, { oid => '1704', proname => 'numeric_abs', prorettype => 'numeric', proargtypes => 'numeric', @@ -5448,15 +5439,15 @@ proname => 'bytea_sortsupport', prorettype => 'void', proargtypes => 'internal', prosrc => 'bytea_sortsupport' }, -{ oid => '3917', descr => 'transform a timestamp length coercion', - proname => 'timestamp_transform', prorettype => 'internal', - proargtypes => 'internal', prosrc => 'timestamp_transform' }, -{ oid => '3944', descr => 'transform a time length coercion', - proname => 'time_transform', prorettype => 'internal', - proargtypes => 
'internal', prosrc => 'time_transform' }, +{ oid => '3917', descr => 'planner support for timestamp length coercion', + proname => 'timestamp_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'timestamp_support' }, +{ oid => '3944', descr => 'planner support for time length coercion', + proname => 'time_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'time_support' }, { oid => '1961', descr => 'adjust timestamp precision', - proname => 'timestamp', protransform => 'timestamp_transform', + proname => 'timestamp', prosupport => 'timestamp_support', prorettype => 'timestamp', proargtypes => 'timestamp int4', prosrc => 'timestamp_scale' }, @@ -5468,14 +5459,14 @@ prosrc => 'oidsmaller' }, { oid => '1967', descr => 'adjust timestamptz precision', - proname => 'timestamptz', protransform => 'timestamp_transform', + proname => 'timestamptz', prosupport => 'timestamp_support', prorettype => 'timestamptz', proargtypes => 'timestamptz int4', prosrc => 'timestamptz_scale' }, { oid => '1968', descr => 'adjust time precision', - proname => 'time', protransform => 'time_transform', prorettype => 'time', + proname => 'time', prosupport => 'time_support', prorettype => 'time', proargtypes => 'time int4', prosrc => 'time_scale' }, { oid => '1969', descr => 'adjust time with time zone precision', - proname => 'timetz', protransform => 'time_transform', prorettype => 'timetz', + proname => 'timetz', prosupport => 'time_support', prorettype => 'timetz', proargtypes => 'timetz int4', prosrc => 'timetz_scale' }, { oid => '2003', @@ -5662,13 +5653,11 @@ prosrc => 'select pg_catalog.age(cast(current_date as timestamp without time zone), $1)' }, { oid => '2069', descr => 'adjust timestamp to new time zone', - proname => 'timezone', protransform => 'timestamp_zone_transform', - prorettype => 'timestamptz', proargtypes => 'text timestamp', - prosrc => 'timestamp_zone' }, + proname => 'timezone', prorettype => 'timestamptz', + proargtypes => 
'text timestamp', prosrc => 'timestamp_zone' }, { oid => '2070', descr => 'adjust timestamp to new time zone', - proname => 'timezone', protransform => 'timestamp_izone_transform', - prorettype => 'timestamptz', proargtypes => 'interval timestamp', - prosrc => 'timestamp_izone' }, + proname => 'timezone', prorettype => 'timestamptz', + proargtypes => 'interval timestamp', prosrc => 'timestamp_izone' }, { oid => '2071', proname => 'date_pl_interval', prorettype => 'timestamp', proargtypes => 'date interval', prosrc => 'date_pl_interval' }, diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index c2bb951..b433769 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -53,8 +53,8 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce /* element type of variadic array, or 0 */ Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type); - /* transforms calls to it during planning */ - regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); + /* planner support function for this function, or 0 if none */ + regproc prosupport BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); /* see PROKIND_ categories below */ char prokind BKI_DEFAULT(f); diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index fbe2dc1..3d3adc2 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -505,7 +505,8 @@ typedef enum NodeTag T_IndexAmRoutine, /* in access/amapi.h */ T_TsmRoutine, /* in access/tsmapi.h */ T_ForeignKeyCacheInfo, /* in utils/rel.h */ - T_CallContext /* in nodes/parsenodes.h */ + T_CallContext, /* in nodes/parsenodes.h */ + T_SupportRequestSimplify /* in nodes/supportnodes.h */ } NodeTag; /* diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h new file mode 100644 index 0000000..1f7d02b --- /dev/null +++ b/src/include/nodes/supportnodes.h @@ -0,0 +1,70 @@ +/*------------------------------------------------------------------------- + * + * supportnodes.h + * 
Definitions for planner support functions. + * + * This file defines the API for "planner support functions", which + * are SQL functions (normally written in C) that can be attached to + * another "target" function to give the system additional knowledge + * about the target function. All the current capabilities have to do + * with planning queries that use the target function, though it is + * possible that future extensions will add functionality to be invoked + * by the parser or executor. + * + * A support function must have the SQL signature + * supportfn(internal) returns internal + * The argument is a pointer to one of the Node types defined in this file. + * The result is usually also a Node pointer, though its type depends on + * which capability is being invoked. In all cases, a NULL pointer result + * (that's PG_RETURN_POINTER(NULL), not PG_RETURN_NULL()) indicates that + * the support function cannot do anything useful for the given request. + * Support functions must return a NULL pointer, not fail, if they do not + * recognize the request node type or cannot handle the given case; this + * allows for future extensions of the set of request cases. + * + * + * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/nodes/supportnodes.h + * + *------------------------------------------------------------------------- + */ +#ifndef SUPPORTNODES_H +#define SUPPORTNODES_H + +#include "nodes/primnodes.h" + +struct PlannerInfo; /* avoid including relation.h here */ + + +/* + * The Simplify request allows the support function to perform plan-time + * simplification of a call to its target function. For example, a varchar + * length coercion that does not decrease the allowed length of its argument + * could be replaced by a RelabelType node, or "x + 0" could be replaced by + * "x". 
This is invoked during the planner's constant-folding pass, so the + * function's arguments can be presumed already simplified. + * + * The planner's PlannerInfo "root" is typically not needed, but can be + * consulted if it's necessary to obtain info about Vars present in + * the given node tree. Beware that root could be NULL in some usages. + * + * "fcall" will be a FuncExpr invoking the support function's target + * function. (This is true even if the original parsetree node was an + * operator call; a FuncExpr is synthesized for this purpose.) + * + * The result should be a semantically-equivalent transformed node tree, + * or NULL if no simplification could be performed. Do *not* return or + * modify *fcall, as it isn't really a separately allocated Node. But + * it's okay to use fcall->args, or parts of it, in the result tree. + */ +typedef struct SupportRequestSimplify +{ + NodeTag type; + + struct PlannerInfo *root; /* Planner's infrastructure */ + FuncExpr *fcall; /* Function call to be simplified */ +} SupportRequestSimplify; + +#endif /* SUPPORTNODES_H */ diff --git a/src/include/utils/datetime.h b/src/include/utils/datetime.h index f5ec9bb..87f819e 100644 --- a/src/include/utils/datetime.h +++ b/src/include/utils/datetime.h @@ -330,7 +330,7 @@ extern int DecodeUnits(int field, char *lowtoken, int *val); extern int j2day(int jd); -extern Node *TemporalTransform(int32 max_precis, Node *node); +extern Node *TemporalSimplify(int32 max_precis, Node *node); extern bool CheckDateTokenTables(void); diff --git a/src/test/modules/test_ddl_deparse/expected/create_transform.out b/src/test/modules/test_ddl_deparse/expected/create_transform.out index 0d1cc36..da7fea2 100644 --- a/src/test/modules/test_ddl_deparse/expected/create_transform.out +++ b/src/test/modules/test_ddl_deparse/expected/create_transform.out @@ -7,7 +7,7 @@ -- internal and as return argument the datatype of the transform done. -- pl/plpgsql does not authorize the use of internal as data type. 
CREATE TRANSFORM FOR int LANGUAGE SQL ( - FROM SQL WITH FUNCTION varchar_transform(internal), + FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal)); NOTICE: DDL test: type simple, tag CREATE TRANSFORM DROP TRANSFORM FOR int LANGUAGE SQL; diff --git a/src/test/modules/test_ddl_deparse/sql/create_transform.sql b/src/test/modules/test_ddl_deparse/sql/create_transform.sql index 0968702..132fc5a 100644 --- a/src/test/modules/test_ddl_deparse/sql/create_transform.sql +++ b/src/test/modules/test_ddl_deparse/sql/create_transform.sql @@ -8,7 +8,7 @@ -- internal and as return argument the datatype of the transform done. -- pl/plpgsql does not authorize the use of internal as data type. CREATE TRANSFORM FOR int LANGUAGE SQL ( - FROM SQL WITH FUNCTION varchar_transform(internal), + FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal)); DROP TRANSFORM FOR int LANGUAGE SQL; diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out index 4085e45..c89ec06 100644 --- a/src/test/regress/expected/object_address.out +++ b/src/test/regress/expected/object_address.out @@ -38,7 +38,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer"; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user; CREATE TRANSFORM FOR int LANGUAGE SQL ( - FROM SQL WITH FUNCTION varchar_transform(internal), + FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION int4recv(internal)); CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE); diff --git a/src/test/regress/expected/oidjoins.out b/src/test/regress/expected/oidjoins.out index ef268d3..4edc817 100644 --- a/src/test/regress/expected/oidjoins.out +++ 
b/src/test/regress/expected/oidjoins.out @@ -809,12 +809,12 @@ WHERE provariadic != 0 AND ------+------------- (0 rows) -SELECT ctid, protransform +SELECT ctid, prosupport FROM pg_catalog.pg_proc fk -WHERE protransform != 0 AND - NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform); - ctid | protransform -------+-------------- +WHERE prosupport != 0 AND + NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport); + ctid | prosupport +------+------------ (0 rows) SELECT ctid, prorettype diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out index 7328095..ce25ee0 100644 --- a/src/test/regress/expected/opr_sanity.out +++ b/src/test/regress/expected/opr_sanity.out @@ -453,10 +453,10 @@ WHERE proallargtypes IS NOT NULL AND -----+---------+-------------+----------------+------------- (0 rows) --- Check for protransform functions with the wrong signature +-- Check for prosupport functions with the wrong signature SELECT p1.oid, p1.proname, p2.oid, p2.proname FROM pg_proc AS p1, pg_proc AS p2 -WHERE p2.oid = p1.protransform AND +WHERE p2.oid = p1.prosupport AND (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1 OR p2.proargtypes[0] != 'internal'::regtype); oid | proname | oid | proname diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql index d7df322..fd79465 100644 --- a/src/test/regress/sql/object_address.sql +++ b/src/test/regress/sql/object_address.sql @@ -41,7 +41,7 @@ CREATE USER MAPPING FOR regress_addr_user SERVER "integer"; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user IN SCHEMA public GRANT ALL ON TABLES TO regress_addr_user; ALTER DEFAULT PRIVILEGES FOR ROLE regress_addr_user REVOKE DELETE ON TABLES FROM regress_addr_user; CREATE TRANSFORM FOR int LANGUAGE SQL ( - FROM SQL WITH FUNCTION varchar_transform(internal), + FROM SQL WITH FUNCTION varchar_support(internal), TO SQL WITH FUNCTION 
int4recv(internal)); CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (connect = false, slot_name = NONE); diff --git a/src/test/regress/sql/oidjoins.sql b/src/test/regress/sql/oidjoins.sql index c8291d3..dbe4a58 100644 --- a/src/test/regress/sql/oidjoins.sql +++ b/src/test/regress/sql/oidjoins.sql @@ -405,10 +405,10 @@ SELECT ctid, provariadic FROM pg_catalog.pg_proc fk WHERE provariadic != 0 AND NOT EXISTS(SELECT 1 FROM pg_catalog.pg_type pk WHERE pk.oid = fk.provariadic); -SELECT ctid, protransform +SELECT ctid, prosupport FROM pg_catalog.pg_proc fk -WHERE protransform != 0 AND - NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.protransform); +WHERE prosupport != 0 AND + NOT EXISTS(SELECT 1 FROM pg_catalog.pg_proc pk WHERE pk.oid = fk.prosupport); SELECT ctid, prorettype FROM pg_catalog.pg_proc fk WHERE prorettype != 0 AND diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql index 8544cbe..e2014fc 100644 --- a/src/test/regress/sql/opr_sanity.sql +++ b/src/test/regress/sql/opr_sanity.sql @@ -353,10 +353,10 @@ WHERE proallargtypes IS NOT NULL AND FROM generate_series(1, array_length(proallargtypes, 1)) g(i) WHERE proargmodes IS NULL OR proargmodes[i] IN ('i', 'b', 'v')); --- Check for protransform functions with the wrong signature +-- Check for prosupport functions with the wrong signature SELECT p1.oid, p1.proname, p2.oid, p2.proname FROM pg_proc AS p1, pg_proc AS p2 -WHERE p2.oid = p1.protransform AND +WHERE p2.oid = p1.prosupport AND (p2.prorettype != 'internal'::regtype OR p2.proretset OR p2.pronargs != 1 OR p2.proargtypes[0] != 'internal'::regtype); diff --git a/src/tools/findoidjoins/README b/src/tools/findoidjoins/README index 305454a..e5fc310 100644 --- a/src/tools/findoidjoins/README +++ b/src/tools/findoidjoins/README @@ -161,7 +161,7 @@ Join pg_catalog.pg_proc.pronamespace => pg_catalog.pg_namespace.oid Join 
pg_catalog.pg_proc.proowner => pg_catalog.pg_authid.oid Join pg_catalog.pg_proc.prolang => pg_catalog.pg_language.oid Join pg_catalog.pg_proc.provariadic => pg_catalog.pg_type.oid -Join pg_catalog.pg_proc.protransform => pg_catalog.pg_proc.oid +Join pg_catalog.pg_proc.prosupport => pg_catalog.pg_proc.oid Join pg_catalog.pg_proc.prorettype => pg_catalog.pg_type.oid Join pg_catalog.pg_range.rngtypid => pg_catalog.pg_type.oid Join pg_catalog.pg_range.rngsubtype => pg_catalog.pg_type.oid diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml index a37d0b7..fa32a88 100644 --- a/doc/src/sgml/keywords.sgml +++ b/doc/src/sgml/keywords.sgml @@ -4522,6 +4522,13 @@ <entry>reserved</entry> </row> <row> + <entry><token>SUPPORT</token></entry> + <entry>non-reserved</entry> + <entry></entry> + <entry></entry> + <entry></entry> + </row> + <row> <entry><token>SYMMETRIC</token></entry> <entry>reserved</entry> <entry>reserved</entry> diff --git a/doc/src/sgml/ref/alter_function.sgml b/doc/src/sgml/ref/alter_function.sgml index d8747e0..03ffa59 100644 --- a/doc/src/sgml/ref/alter_function.sgml +++ b/doc/src/sgml/ref/alter_function.sgml @@ -40,6 +40,7 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param PARALLEL { UNSAFE | RESTRICTED | SAFE } COST <replaceable class="parameter">execution_cost</replaceable> ROWS <replaceable class="parameter">result_rows</replaceable> + SUPPORT <replaceable class="parameter">support_function</replaceable> SET <replaceable class="parameter">configuration_parameter</replaceable> { TO | = } { <replaceable class="parameter">value</replaceable>| DEFAULT } SET <replaceable class="parameter">configuration_parameter</replaceable> FROM CURRENT RESET <replaceable class="parameter">configuration_parameter</replaceable> @@ -248,6 +249,24 @@ ALTER FUNCTION <replaceable>name</replaceable> [ ( [ [ <replaceable class="param </listitem> </varlistentry> + <varlistentry> + <term><literal>SUPPORT</literal> <replaceable 
class="parameter">support_function</replaceable></term> + + <listitem> + <para> + Set or change the planner support function to use for this function. + See <xref linkend="xfunc-optimization"/> for details. You must be + superuser to use this option. + </para> + + <para> + This option cannot be used to remove the support function altogether, + since it must name a new support function. Use <command>CREATE OR + REPLACE FUNCTION</command> if you need to do that. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><replaceable>configuration_parameter</replaceable></term> <term><replaceable>value</replaceable></term> diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml index 4072543..dd6a2f7 100644 --- a/doc/src/sgml/ref/create_function.sgml +++ b/doc/src/sgml/ref/create_function.sgml @@ -33,6 +33,7 @@ CREATE [ OR REPLACE ] FUNCTION | PARALLEL { UNSAFE | RESTRICTED | SAFE } | COST <replaceable class="parameter">execution_cost</replaceable> | ROWS <replaceable class="parameter">result_rows</replaceable> + | SUPPORT <replaceable class="parameter">support_function</replaceable> | SET <replaceable class="parameter">configuration_parameter</replaceable> { TO <replaceable class="parameter">value</replaceable>| = <replaceable class="parameter">value</replaceable> | FROM CURRENT } | AS '<replaceable class="parameter">definition</replaceable>' | AS '<replaceable class="parameter">obj_file</replaceable>', '<replaceable class="parameter">link_symbol</replaceable>' @@ -478,6 +479,19 @@ CREATE [ OR REPLACE ] FUNCTION </varlistentry> <varlistentry> + <term><literal>SUPPORT</literal> <replaceable class="parameter">support_function</replaceable></term> + + <listitem> + <para> + The name (optionally schema-qualified) of a <firstterm>planner support + function</firstterm> to use for this function. See + <xref linkend="xfunc-optimization"/> for details. + You must be superuser to use this option. 
+ </para> + </listitem> + </varlistentry> + + <varlistentry> <term><replaceable>configuration_parameter</replaceable></term> <term><replaceable>value</replaceable></term> <listitem> diff --git a/src/backend/catalog/pg_aggregate.c b/src/backend/catalog/pg_aggregate.c index cc3806e..19e3171 100644 --- a/src/backend/catalog/pg_aggregate.c +++ b/src/backend/catalog/pg_aggregate.c @@ -632,6 +632,7 @@ AggregateCreate(const char *aggName, parameterDefaults, /* parameterDefaults */ PointerGetDatum(NULL), /* trftypes */ PointerGetDatum(NULL), /* proconfig */ + InvalidOid, /* no prosupport */ 1, /* procost */ 0); /* prorows */ procOid = myself.objectId; diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c index 2b8f651..23b01f8 100644 --- a/src/backend/catalog/pg_depend.c +++ b/src/backend/catalog/pg_depend.c @@ -286,9 +286,12 @@ deleteDependencyRecordsForClass(Oid classId, Oid objectId, * newRefObjectId is the new referenced object (must be of class refClassId). * * Note the lack of objsubid parameters. If there are subobject references - * they will all be readjusted. + * they will all be readjusted. Also, there is an expectation that we are + * dealing with NORMAL dependencies: if we have to replace an (implicit) + * dependency on a pinned object with an explicit dependency on an unpinned + * one, the new one will be NORMAL. * - * Returns the number of records updated. + * Returns the number of records updated -- zero indicates a problem. */ long changeDependencyFor(Oid classId, Oid objectId, @@ -301,35 +304,52 @@ changeDependencyFor(Oid classId, Oid objectId, SysScanDesc scan; HeapTuple tup; ObjectAddress objAddr; + ObjectAddress depAddr; + bool oldIsPinned; bool newIsPinned; depRel = table_open(DependRelationId, RowExclusiveLock); /* - * If oldRefObjectId is pinned, there won't be any dependency entries on - * it --- we can't cope in that case. 
(This isn't really worth expending - * code to fix, in current usage; it just means you can't rename stuff out - * of pg_catalog, which would likely be a bad move anyway.) + * Check to see if either oldRefObjectId or newRefObjectId is pinned. + * Pinned objects should not have any dependency entries pointing to them, + * so in these cases we should add or remove a pg_depend entry, or do + * nothing at all, rather than update an entry as in the normal case. */ objAddr.classId = refClassId; objAddr.objectId = oldRefObjectId; objAddr.objectSubId = 0; - if (isObjectPinned(&objAddr, depRel)) - ereport(ERROR, - (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), - errmsg("cannot remove dependency on %s because it is a system object", - getObjectDescription(&objAddr)))); + oldIsPinned = isObjectPinned(&objAddr, depRel); - /* - * We can handle adding a dependency on something pinned, though, since - * that just means deleting the dependency entry. - */ objAddr.objectId = newRefObjectId; newIsPinned = isObjectPinned(&objAddr, depRel); - /* Now search for dependency records */ + if (oldIsPinned) + { + table_close(depRel, RowExclusiveLock); + + /* + * If both are pinned, we need do nothing. However, return 1 not 0, + * else callers will think this is an error case. + */ + if (newIsPinned) + return 1; + + /* + * There is no old dependency record, but we should insert a new one. + * Assume a normal dependency is wanted. + */ + depAddr.classId = classId; + depAddr.objectId = objectId; + depAddr.objectSubId = 0; + recordDependencyOn(&depAddr, &objAddr, DEPENDENCY_NORMAL); + + return 1; + } + + /* There should be existing dependency record(s), so search. 
*/ ScanKeyInit(&key[0], Anum_pg_depend_classid, BTEqualStrategyNumber, F_OIDEQ, diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c index 3a86f1e..557e0ea 100644 --- a/src/backend/catalog/pg_proc.c +++ b/src/backend/catalog/pg_proc.c @@ -88,6 +88,7 @@ ProcedureCreate(const char *procedureName, List *parameterDefaults, Datum trftypes, Datum proconfig, + Oid prosupport, float4 procost, float4 prorows) { @@ -319,7 +320,7 @@ ProcedureCreate(const char *procedureName, values[Anum_pg_proc_procost - 1] = Float4GetDatum(procost); values[Anum_pg_proc_prorows - 1] = Float4GetDatum(prorows); values[Anum_pg_proc_provariadic - 1] = ObjectIdGetDatum(variadicType); - values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(InvalidOid); + values[Anum_pg_proc_prosupport - 1] = ObjectIdGetDatum(prosupport); values[Anum_pg_proc_prokind - 1] = CharGetDatum(prokind); values[Anum_pg_proc_prosecdef - 1] = BoolGetDatum(security_definer); values[Anum_pg_proc_proleakproof - 1] = BoolGetDatum(isLeakProof); @@ -656,6 +657,15 @@ ProcedureCreate(const char *procedureName, recordDependencyOnExpr(&myself, (Node *) parameterDefaults, NIL, DEPENDENCY_NORMAL); + /* dependency on support function, if any */ + if (OidIsValid(prosupport)) + { + referenced.classId = ProcedureRelationId; + referenced.objectId = prosupport; + referenced.objectSubId = 0; + recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL); + } + /* dependency on owner */ if (!is_update) recordDependencyOnOwner(ProcedureRelationId, retval, proowner); diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c index 9a2f1a8..4f62e48 100644 --- a/src/backend/commands/functioncmds.c +++ b/src/backend/commands/functioncmds.c @@ -479,6 +479,7 @@ compute_common_attribute(ParseState *pstate, List **set_items, DefElem **cost_item, DefElem **rows_item, + DefElem **support_item, DefElem **parallel_item) { if (strcmp(defel->defname, "volatility") == 0) @@ -537,6 +538,15 @@ 
compute_common_attribute(ParseState *pstate, *rows_item = defel; } + else if (strcmp(defel->defname, "support") == 0) + { + if (is_procedure) + goto procedure_error; + if (*support_item) + goto duplicate_error; + + *support_item = defel; + } else if (strcmp(defel->defname, "parallel") == 0) { if (is_procedure) @@ -635,6 +645,45 @@ update_proconfig_value(ArrayType *a, List *set_items) return a; } +static Oid +interpret_func_support(DefElem *defel) +{ + List *procName = defGetQualifiedName(defel); + Oid procOid; + Oid argList[1]; + + /* + * Support functions always take one INTERNAL argument and return + * INTERNAL. + */ + argList[0] = INTERNALOID; + + procOid = LookupFuncName(procName, 1, argList, true); + if (!OidIsValid(procOid)) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_FUNCTION), + errmsg("function %s does not exist", + func_signature_string(procName, 1, NIL, argList)))); + + if (get_func_rettype(procOid) != INTERNALOID) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("support function %s must return type %s", + NameListToString(procName), "internal"))); + + /* + * Someday we might want an ACL check here; but for now, we insist that + * you be superuser to specify a support function, so privilege on the + * support function is moot. 
+ */ + if (!superuser()) + ereport(ERROR, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to specify a support function"))); + + return procOid; +} + /* * Dissect the list of options assembled in gram.y into function @@ -655,6 +704,7 @@ compute_function_attributes(ParseState *pstate, ArrayType **proconfig, float4 *procost, float4 *prorows, + Oid *prosupport, char *parallel_p) { ListCell *option; @@ -669,6 +719,7 @@ compute_function_attributes(ParseState *pstate, List *set_items = NIL; DefElem *cost_item = NULL; DefElem *rows_item = NULL; + DefElem *support_item = NULL; DefElem *parallel_item = NULL; foreach(option, options) @@ -726,6 +777,7 @@ compute_function_attributes(ParseState *pstate, &set_items, &cost_item, &rows_item, + &support_item, &parallel_item)) { /* recognized common option */ @@ -788,6 +840,8 @@ compute_function_attributes(ParseState *pstate, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ROWS must be positive"))); } + if (support_item) + *prosupport = interpret_func_support(support_item); if (parallel_item) *parallel_p = interpret_func_parallel(parallel_item); } @@ -893,6 +947,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) ArrayType *proconfig; float4 procost; float4 prorows; + Oid prosupport; HeapTuple languageTuple; Form_pg_language languageStruct; List *as_clause; @@ -917,6 +972,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) proconfig = NULL; procost = -1; /* indicates not set */ prorows = -1; /* indicates not set */ + prosupport = InvalidOid; parallel = PROPARALLEL_UNSAFE; /* Extract non-default attributes from stmt->options list */ @@ -926,7 +982,8 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) &as_clause, &language, &transformDefElem, &isWindowFunc, &volatility, &isStrict, &security, &isLeakProof, - &proconfig, &procost, &prorows, &parallel); + &proconfig, &procost, &prorows, + &prosupport, &parallel); /* Look up the language and validate permissions */ languageTuple = 
SearchSysCache1(LANGNAME, PointerGetDatum(language)); @@ -1113,6 +1170,7 @@ CreateFunction(ParseState *pstate, CreateFunctionStmt *stmt) parameterDefaults, PointerGetDatum(trftypes), PointerGetDatum(proconfig), + prosupport, procost, prorows); } @@ -1187,6 +1245,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) List *set_items = NIL; DefElem *cost_item = NULL; DefElem *rows_item = NULL; + DefElem *support_item = NULL; DefElem *parallel_item = NULL; ObjectAddress address; @@ -1194,6 +1253,8 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) funcOid = LookupFuncWithArgs(stmt->objtype, stmt->func, false); + ObjectAddressSet(address, ProcedureRelationId, funcOid); + tup = SearchSysCacheCopy1(PROCOID, ObjectIdGetDatum(funcOid)); if (!HeapTupleIsValid(tup)) /* should not happen */ elog(ERROR, "cache lookup failed for function %u", funcOid); @@ -1228,6 +1289,7 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) &set_items, &cost_item, &rows_item, + &support_item, &parallel_item) == false) elog(ERROR, "option \"%s\" not recognized", defel->defname); } @@ -1266,6 +1328,28 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ROWS is not applicable when function does not return a set"))); } + if (support_item) + { + /* interpret_func_support handles the privilege check */ + Oid newsupport = interpret_func_support(support_item); + + /* Add or replace dependency on support function */ + if (OidIsValid(procForm->prosupport)) + changeDependencyFor(ProcedureRelationId, funcOid, + ProcedureRelationId, procForm->prosupport, + newsupport); + else + { + ObjectAddress referenced; + + referenced.classId = ProcedureRelationId; + referenced.objectId = newsupport; + referenced.objectSubId = 0; + recordDependencyOn(&address, &referenced, DEPENDENCY_NORMAL); + } + + procForm->prosupport = newsupport; + } if (set_items) { Datum datum; @@ -1308,8 +1392,6 @@ AlterFunction(ParseState *pstate, 
AlterFunctionStmt *stmt) InvokeObjectPostAlterHook(ProcedureRelationId, funcOid, 0); - ObjectAddressSet(address, ProcedureRelationId, funcOid); - table_close(rel, NoLock); heap_freetuple(tup); diff --git a/src/backend/commands/proclang.c b/src/backend/commands/proclang.c index c2e9e41..59c4e8d 100644 --- a/src/backend/commands/proclang.c +++ b/src/backend/commands/proclang.c @@ -141,6 +141,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); handlerOid = tmpAddr.objectId; @@ -180,6 +181,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); inlineOid = tmpAddr.objectId; @@ -222,6 +224,7 @@ CreateProceduralLanguage(CreatePLangStmt *stmt) NIL, PointerGetDatum(NULL), PointerGetDatum(NULL), + InvalidOid, 1, 0); valOid = tmpAddr.objectId; diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c index fa7161e..448926d 100644 --- a/src/backend/commands/typecmds.c +++ b/src/backend/commands/typecmds.c @@ -1664,6 +1664,7 @@ makeRangeConstructors(const char *name, Oid namespace, NIL, /* parameterDefaults */ PointerGetDatum(NULL), /* trftypes */ PointerGetDatum(NULL), /* proconfig */ + InvalidOid, /* prosupport */ 1.0, /* procost */ 0.0); /* prorows */ diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index c1faf41..ef6bbe3 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -676,7 +676,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); SERIALIZABLE SERVER SESSION SESSION_USER SET SETS SETOF SHARE SHOW SIMILAR SIMPLE SKIP SMALLINT SNAPSHOT SOME SQL_P STABLE STANDALONE_P START STATEMENT STATISTICS STDIN STDOUT STORAGE STRICT_P STRIP_P - SUBSCRIPTION SUBSTRING SYMMETRIC SYSID SYSTEM_P + SUBSCRIPTION SUBSTRING SUPPORT SYMMETRIC SYSID SYSTEM_P TABLE TABLES TABLESAMPLE TABLESPACE TEMP TEMPLATE TEMPORARY TEXT_P THEN TIES TIME TIMESTAMP TO TRAILING 
TRANSACTION TRANSFORM @@ -7834,6 +7834,10 @@ common_func_opt_item: { $$ = makeDefElem("rows", (Node *)$2, @1); } + | SUPPORT any_name + { + $$ = makeDefElem("support", (Node *)$2, @1); + } | FunctionSetResetClause { /* we abuse the normal content of a DefElem here */ @@ -15164,6 +15168,7 @@ unreserved_keyword: | STRICT_P | STRIP_P | SUBSCRIPTION + | SUPPORT | SYSID | SYSTEM_P | TABLES diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 51e4c27..e555e08 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -2638,6 +2638,21 @@ pg_get_functiondef(PG_FUNCTION_ARGS) if (proc->prorows > 0 && proc->prorows != 1000) appendStringInfo(&buf, " ROWS %g", proc->prorows); + if (proc->prosupport) + { + Oid argtypes[1]; + + /* + * We should qualify the support function's name if it wouldn't be + * resolved by lookup in the current search path. + */ + argtypes[0] = INTERNALOID; + appendStringInfo(&buf, " SUPPORT %s", + generate_function_name(proc->prosupport, 1, + NIL, argtypes, + false, NULL, EXPR_KIND_NONE)); + } + if (oldlen != buf.len) appendStringInfoChar(&buf, '\n'); diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 2b1a947..615d997 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -11466,6 +11466,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) char *proconfig; char *procost; char *prorows; + char *prosupport; char *proparallel; char *lanname; char *rettypename; @@ -11488,7 +11489,26 @@ dumpFunc(Archive *fout, FuncInfo *finfo) asPart = createPQExpBuffer(); /* Fetch function-specific details */ - if (fout->remoteVersion >= 110000) + if (fout->remoteVersion >= 120000) + { + /* + * prosupport was added in 12 + */ + appendPQExpBuffer(query, + "SELECT proretset, prosrc, probin, " + "pg_catalog.pg_get_function_arguments(oid) AS funcargs, " + "pg_catalog.pg_get_function_identity_arguments(oid) AS funciargs, " + "pg_catalog.pg_get_function_result(oid) AS funcresult, " 
+ "array_to_string(protrftypes, ' ') AS protrftypes, " + "prokind, provolatile, proisstrict, prosecdef, " + "proleakproof, proconfig, procost, prorows, " + "prosupport, proparallel, " + "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " + "FROM pg_catalog.pg_proc " + "WHERE oid = '%u'::pg_catalog.oid", + finfo->dobj.catId.oid); + } + else if (fout->remoteVersion >= 110000) { /* * prokind was added in 11 @@ -11501,7 +11521,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "array_to_string(protrftypes, ' ') AS protrftypes, " "prokind, provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " - "proparallel, " + "'-' AS prosupport, proparallel, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11521,7 +11541,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " - "proparallel, " + "'-' AS prosupport, proparallel, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11541,6 +11561,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11559,6 +11580,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "CASE WHEN proiswindow THEN 'w' ELSE 'f' END AS prokind, " "provolatile, proisstrict, prosecdef, " "proleakproof, proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = 
'%u'::pg_catalog.oid", @@ -11579,6 +11601,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " " proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11593,6 +11616,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "proconfig, procost, prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11607,6 +11631,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "null AS proconfig, 0 AS procost, 0 AS prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11623,6 +11648,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) "provolatile, proisstrict, prosecdef, " "false AS proleakproof, " "null AS proconfig, 0 AS procost, 0 AS prorows, " + "'-' AS prosupport, " "(SELECT lanname FROM pg_catalog.pg_language WHERE oid = prolang) AS lanname " "FROM pg_catalog.pg_proc " "WHERE oid = '%u'::pg_catalog.oid", @@ -11660,6 +11686,7 @@ dumpFunc(Archive *fout, FuncInfo *finfo) proconfig = PQgetvalue(res, 0, PQfnumber(res, "proconfig")); procost = PQgetvalue(res, 0, PQfnumber(res, "procost")); prorows = PQgetvalue(res, 0, PQfnumber(res, "prorows")); + prosupport = PQgetvalue(res, 0, PQfnumber(res, "prosupport")); if (PQfnumber(res, "proparallel") != -1) proparallel = PQgetvalue(res, 0, PQfnumber(res, "proparallel")); @@ -11873,6 +11900,12 @@ dumpFunc(Archive *fout, FuncInfo *finfo) strcmp(prorows, "0") != 0 && strcmp(prorows, "1000") != 0) appendPQExpBuffer(q, " ROWS %s", prorows); + if (strcmp(prosupport, "-") != 0) + { + /* 
We rely on regprocout to provide quoting and qualification */ + appendPQExpBuffer(q, " SUPPORT %s", prosupport); + } + if (proparallel != NULL && proparallel[0] != PROPARALLEL_UNSAFE) { if (proparallel[0] == PROPARALLEL_SAFE) diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index 4ff358a..d22ca73 100644 --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl @@ -1774,6 +1774,20 @@ my %tests = ( unlike => { exclude_dump_test_schema => 1, }, }, + 'CREATE FUNCTION ... SUPPORT' => { + create_order => 41, + create_sql => + 'CREATE FUNCTION dump_test.func_with_support() RETURNS int LANGUAGE sql AS $$ SELECT 1 $$ SUPPORT varchar_support;', + regexp => qr/^ + \QCREATE FUNCTION dump_test.func_with_support() RETURNS integer\E + \n\s+\QLANGUAGE sql SUPPORT varchar_support\E + \n\s+AS\ \$\$\Q SELECT 1 \E\$\$; + /xm, + like => + { %full_runs, %dump_test_schema_runs, section_pre_data => 1, }, + unlike => { exclude_dump_test_schema => 1, }, + }, + 'CREATE PROCEDURE dump_test.ptest1' => { create_order => 41, create_sql => 'CREATE PROCEDURE dump_test.ptest1(a int) diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b433769..e5270d2 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -201,6 +201,7 @@ extern ObjectAddress ProcedureCreate(const char *procedureName, List *parameterDefaults, Datum trftypes, Datum proconfig, + Oid prosupport, float4 procost, float4 prorows); diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h index adeb834..f054440 100644 --- a/src/include/parser/kwlist.h +++ b/src/include/parser/kwlist.h @@ -387,6 +387,7 @@ PG_KEYWORD("strict", STRICT_P, UNRESERVED_KEYWORD) PG_KEYWORD("strip", STRIP_P, UNRESERVED_KEYWORD) PG_KEYWORD("subscription", SUBSCRIPTION, UNRESERVED_KEYWORD) PG_KEYWORD("substring", SUBSTRING, COL_NAME_KEYWORD) +PG_KEYWORD("support", SUPPORT, UNRESERVED_KEYWORD) PG_KEYWORD("symmetric", SYMMETRIC, RESERVED_KEYWORD) 
PG_KEYWORD("sysid", SYSID, UNRESERVED_KEYWORD) PG_KEYWORD("system", SYSTEM_P, UNRESERVED_KEYWORD) diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out index 7bb8ca9..4db792c 100644 --- a/src/test/regress/expected/alter_table.out +++ b/src/test/regress/expected/alter_table.out @@ -3050,10 +3050,9 @@ DETAIL: System catalog modifications are currently disallowed. -- instead create in public first, move to catalog CREATE TABLE new_system_table(id serial primary key, othercol text); ALTER TABLE new_system_table SET SCHEMA pg_catalog; --- XXX: it's currently impossible to move relations out of pg_catalog ALTER TABLE new_system_table SET SCHEMA public; -ERROR: cannot remove dependency on schema pg_catalog because it is a system object --- move back, will be ignored -- already there +ALTER TABLE new_system_table SET SCHEMA pg_catalog; +-- will be ignored -- already there: ALTER TABLE new_system_table SET SCHEMA pg_catalog; ALTER TABLE new_system_table RENAME TO old_system_table; CREATE INDEX old_system_table__othercol ON old_system_table (othercol); diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql index a498e4e..d806430 100644 --- a/src/test/regress/sql/alter_table.sql +++ b/src/test/regress/sql/alter_table.sql @@ -1896,10 +1896,9 @@ CREATE TABLE pg_catalog.new_system_table(); -- instead create in public first, move to catalog CREATE TABLE new_system_table(id serial primary key, othercol text); ALTER TABLE new_system_table SET SCHEMA pg_catalog; - --- XXX: it's currently impossible to move relations out of pg_catalog ALTER TABLE new_system_table SET SCHEMA public; --- move back, will be ignored -- already there +ALTER TABLE new_system_table SET SCHEMA pg_catalog; +-- will be ignored -- already there: ALTER TABLE new_system_table SET SCHEMA pg_catalog; ALTER TABLE new_system_table RENAME TO old_system_table; CREATE INDEX old_system_table__othercol ON old_system_table (othercol); diff --git 
a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 7fcac81..cf3345c 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -2776,6 +2776,7 @@ estimate_path_cost_size(PlannerInfo *root, startup_cost = ofpinfo->rel_startup_cost; startup_cost += aggcosts.transCost.startup; startup_cost += aggcosts.transCost.per_tuple * input_rows; + startup_cost += aggcosts.finalCost.startup; startup_cost += (cpu_operator_cost * numGroupCols) * input_rows; /*----- @@ -2785,7 +2786,7 @@ estimate_path_cost_size(PlannerInfo *root, *----- */ run_cost = ofpinfo->rel_total_cost - ofpinfo->rel_startup_cost; - run_cost += aggcosts.finalCost * numGroups; + run_cost += aggcosts.finalCost.per_tuple * numGroups; run_cost += cpu_tuple_cost * numGroups; /* Account for the eval cost of HAVING quals, if any */ diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml index d70aa6e..b486ef3 100644 --- a/doc/src/sgml/xfunc.sgml +++ b/doc/src/sgml/xfunc.sgml @@ -3439,4 +3439,25 @@ supportfn(internal) returns internal simplify. Ensure rigorous equivalence between the simplified expression and an actual execution of the target function. </para> + + <para> + For target functions that return boolean, it is often useful to estimate + the fraction of rows that will be selected by a WHERE clause using that + function. This can be done by a support function that implements + the <literal>SupportRequestSelectivity</literal> request type. + </para> + + <para> + If the target function's runtime is highly dependent on its inputs, + it may be useful to provide a non-constant cost estimate for it. + This can be done by a support function that implements + the <literal>SupportRequestCost</literal> request type. + </para> + + <para> + For target functions that return sets, it is often useful to provide + a non-constant estimate for the number of rows that will be returned. 
+ This can be done by a support function that implements + the <literal>SupportRequestRows</literal> request type. + </para> </sect1> diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c index abca03b..e8142bd 100644 --- a/src/backend/optimizer/path/clausesel.c +++ b/src/backend/optimizer/path/clausesel.c @@ -762,6 +762,21 @@ clause_selectivity(PlannerInfo *root, if (IsA(clause, DistinctExpr)) s1 = 1.0 - s1; } + else if (is_funcclause(clause)) + { + FuncExpr *funcclause = (FuncExpr *) clause; + + /* Try to get an estimate from the support function, if any */ + s1 = function_selectivity(root, + funcclause->funcid, + funcclause->args, + funcclause->inputcollid, + treat_as_join_clause(clause, rinfo, + varRelid, sjinfo), + varRelid, + jointype, + sjinfo); + } else if (IsA(clause, ScalarArrayOpExpr)) { /* Use node specific selectivity calculation function */ diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index b8d406f..44cdf71 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -2116,9 +2116,9 @@ cost_agg(Path *path, PlannerInfo *root, /* * The transCost.per_tuple component of aggcosts should be charged once * per input tuple, corresponding to the costs of evaluating the aggregate - * transfns and their input expressions (with any startup cost of course - * charged but once). The finalCost component is charged once per output - * tuple, corresponding to the costs of evaluating the finalfns. + * transfns and their input expressions. The finalCost.per_tuple component + * is charged once per output tuple, corresponding to the costs of + * evaluating the finalfns. Startup costs are of course charged but once. * * If we are grouping, we charge an additional cpu_operator_cost per * grouping column per input tuple for grouping comparisons. 
@@ -2140,7 +2140,8 @@ cost_agg(Path *path, PlannerInfo *root, startup_cost = input_total_cost; startup_cost += aggcosts->transCost.startup; startup_cost += aggcosts->transCost.per_tuple * input_tuples; - startup_cost += aggcosts->finalCost; + startup_cost += aggcosts->finalCost.startup; + startup_cost += aggcosts->finalCost.per_tuple; /* we aren't grouping */ total_cost = startup_cost + cpu_tuple_cost; output_tuples = 1; @@ -2159,7 +2160,8 @@ cost_agg(Path *path, PlannerInfo *root, total_cost += aggcosts->transCost.startup; total_cost += aggcosts->transCost.per_tuple * input_tuples; total_cost += (cpu_operator_cost * numGroupCols) * input_tuples; - total_cost += aggcosts->finalCost * numGroups; + total_cost += aggcosts->finalCost.startup; + total_cost += aggcosts->finalCost.per_tuple * numGroups; total_cost += cpu_tuple_cost * numGroups; output_tuples = numGroups; } @@ -2172,8 +2174,9 @@ cost_agg(Path *path, PlannerInfo *root, startup_cost += aggcosts->transCost.startup; startup_cost += aggcosts->transCost.per_tuple * input_tuples; startup_cost += (cpu_operator_cost * numGroupCols) * input_tuples; + startup_cost += aggcosts->finalCost.startup; total_cost = startup_cost; - total_cost += aggcosts->finalCost * numGroups; + total_cost += aggcosts->finalCost.per_tuple * numGroups; total_cost += cpu_tuple_cost * numGroups; output_tuples = numGroups; } @@ -2238,7 +2241,11 @@ cost_windowagg(Path *path, PlannerInfo *root, Cost wfunccost; QualCost argcosts; - wfunccost = get_func_cost(wfunc->winfnoid) * cpu_operator_cost; + argcosts.startup = argcosts.per_tuple = 0; + add_function_cost(root, wfunc->winfnoid, (Node *) wfunc, + &argcosts); + startup_cost += argcosts.startup; + wfunccost = argcosts.per_tuple; /* also add the input expressions' cost to per-input-row costs */ cost_qual_eval_node(&argcosts, (Node *) wfunc->args, root); @@ -3868,8 +3875,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) */ if (IsA(node, FuncExpr)) { - context->total.per_tuple 
+= - get_func_cost(((FuncExpr *) node)->funcid) * cpu_operator_cost; + add_function_cost(context->root, ((FuncExpr *) node)->funcid, node, + &context->total); } else if (IsA(node, OpExpr) || IsA(node, DistinctExpr) || @@ -3877,8 +3884,8 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) { /* rely on struct equivalence to treat these all alike */ set_opfuncid((OpExpr *) node); - context->total.per_tuple += - get_func_cost(((OpExpr *) node)->opfuncid) * cpu_operator_cost; + add_function_cost(context->root, ((OpExpr *) node)->opfuncid, node, + &context->total); } else if (IsA(node, ScalarArrayOpExpr)) { @@ -3888,10 +3895,15 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) */ ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) node; Node *arraynode = (Node *) lsecond(saop->args); + QualCost sacosts; set_sa_opfuncid(saop); - context->total.per_tuple += get_func_cost(saop->opfuncid) * - cpu_operator_cost * estimate_array_length(arraynode) * 0.5; + sacosts.startup = sacosts.per_tuple = 0; + add_function_cost(context->root, saop->opfuncid, NULL, + &sacosts); + context->total.startup += sacosts.startup; + context->total.per_tuple += sacosts.per_tuple * + estimate_array_length(arraynode) * 0.5; } else if (IsA(node, Aggref) || IsA(node, WindowFunc)) @@ -3917,11 +3929,13 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) /* check the result type's input function */ getTypeInputInfo(iocoerce->resulttype, &iofunc, &typioparam); - context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost; + add_function_cost(context->root, iofunc, NULL, + &context->total); /* check the input type's output function */ getTypeOutputInfo(exprType((Node *) iocoerce->arg), &iofunc, &typisvarlena); - context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost; + add_function_cost(context->root, iofunc, NULL, + &context->total); } else if (IsA(node, ArrayCoerceExpr)) { @@ -3945,8 +3959,8 @@ cost_qual_eval_walker(Node *node, 
cost_qual_eval_context *context) { Oid opid = lfirst_oid(lc); - context->total.per_tuple += get_func_cost(get_opcode(opid)) * - cpu_operator_cost; + add_function_cost(context->root, get_opcode(opid), NULL, + &context->total); } } else if (IsA(node, MinMaxExpr) || @@ -4946,7 +4960,7 @@ set_function_size_estimates(PlannerInfo *root, RelOptInfo *rel) foreach(lc, rte->functions) { RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc); - double ntup = expression_returns_set_rows(rtfunc->funcexpr); + double ntup = expression_returns_set_rows(root, rtfunc->funcexpr); if (ntup > rel->tuples) rel->tuples = ntup; diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 1f60be2..5e1a4f5 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -36,8 +36,8 @@ #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/optimizer.h" +#include "optimizer/plancat.h" #include "optimizer/planmain.h" -#include "optimizer/prep.h" #include "parser/analyze.h" #include "parser/parse_agg.h" #include "parser/parse_coerce.h" @@ -343,19 +343,24 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context) if (DO_AGGSPLIT_COMBINE(context->aggsplit)) { /* charge for combining previously aggregated states */ - costs->transCost.per_tuple += get_func_cost(aggcombinefn) * cpu_operator_cost; + add_function_cost(context->root, aggcombinefn, NULL, + &costs->transCost); } else - costs->transCost.per_tuple += get_func_cost(aggtransfn) * cpu_operator_cost; + add_function_cost(context->root, aggtransfn, NULL, + &costs->transCost); if (DO_AGGSPLIT_DESERIALIZE(context->aggsplit) && OidIsValid(aggdeserialfn)) - costs->transCost.per_tuple += get_func_cost(aggdeserialfn) * cpu_operator_cost; + add_function_cost(context->root, aggdeserialfn, NULL, + &costs->transCost); if (DO_AGGSPLIT_SERIALIZE(context->aggsplit) && OidIsValid(aggserialfn)) - costs->finalCost += get_func_cost(aggserialfn) * 
cpu_operator_cost; + add_function_cost(context->root, aggserialfn, NULL, + &costs->finalCost); if (!DO_AGGSPLIT_SKIPFINAL(context->aggsplit) && OidIsValid(aggfinalfn)) - costs->finalCost += get_func_cost(aggfinalfn) * cpu_operator_cost; + add_function_cost(context->root, aggfinalfn, NULL, + &costs->finalCost); /* * These costs are incurred only by the initial aggregate node, so we @@ -392,8 +397,8 @@ get_agg_clause_costs_walker(Node *node, get_agg_clause_costs_context *context) { cost_qual_eval_node(&argcosts, (Node *) aggref->aggdirectargs, context->root); - costs->transCost.startup += argcosts.startup; - costs->finalCost += argcosts.per_tuple; + costs->finalCost.startup += argcosts.startup; + costs->finalCost.per_tuple += argcosts.per_tuple; } /* @@ -561,7 +566,7 @@ find_window_functions_walker(Node *node, WindowFuncLists *lists) * Note: keep this in sync with expression_returns_set() in nodes/nodeFuncs.c. */ double -expression_returns_set_rows(Node *clause) +expression_returns_set_rows(PlannerInfo *root, Node *clause) { if (clause == NULL) return 1.0; @@ -570,7 +575,7 @@ expression_returns_set_rows(Node *clause) FuncExpr *expr = (FuncExpr *) clause; if (expr->funcretset) - return clamp_row_est(get_func_rows(expr->funcid)); + return clamp_row_est(get_function_rows(root, expr->funcid, clause)); } if (IsA(clause, OpExpr)) { @@ -579,7 +584,7 @@ expression_returns_set_rows(Node *clause) if (expr->opretset) { set_opfuncid(expr); - return clamp_row_est(get_func_rows(expr->opfuncid)); + return clamp_row_est(get_function_rows(root, expr->opfuncid, clause)); } } return 1.0; diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index b57de6b..db8566f 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -2626,7 +2626,7 @@ create_set_projection_path(PlannerInfo *root, Node *node = (Node *) lfirst(lc); double itemrows; - itemrows = expression_returns_set_rows(node); + itemrows = 
expression_returns_set_rows(root, node); if (tlist_rows < itemrows) tlist_rows = itemrows; } diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c index 3efa1bd..d6dc83c 100644 --- a/src/backend/optimizer/util/plancat.c +++ b/src/backend/optimizer/util/plancat.c @@ -29,10 +29,12 @@ #include "catalog/heap.h" #include "catalog/partition.h" #include "catalog/pg_am.h" +#include "catalog/pg_proc.h" #include "catalog/pg_statistic_ext.h" #include "foreign/fdwapi.h" #include "miscadmin.h" #include "nodes/makefuncs.h" +#include "nodes/supportnodes.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/optimizer.h" @@ -1772,6 +1774,8 @@ restriction_selectivity(PlannerInfo *root, * Returns the selectivity of a specified join operator clause. * This code executes registered procedures stored in the * operator relation, by calling the function manager. + * + * See clause_selectivity() for the meaning of the additional parameters. */ Selectivity join_selectivity(PlannerInfo *root, @@ -1806,6 +1810,184 @@ join_selectivity(PlannerInfo *root, } /* + * function_selectivity + * + * Returns the selectivity of a specified boolean function clause. + * This code executes registered procedures stored in the + * pg_proc relation, by calling the function manager. + * + * See clause_selectivity() for the meaning of the additional parameters. + */ +Selectivity +function_selectivity(PlannerInfo *root, + Oid funcid, + List *args, + Oid inputcollid, + bool is_join, + int varRelid, + JoinType jointype, + SpecialJoinInfo *sjinfo) +{ + RegProcedure prosupport = get_func_support(funcid); + SupportRequestSelectivity req; + SupportRequestSelectivity *sresult; + + /* + * If no support function is provided, use our historical default + * estimate, 0.3333333. This seems a pretty unprincipled choice, but + * Postgres has been using that estimate for function calls since 1992. 
+ * The hoariness of this behavior suggests that we should not be in too + * much hurry to use another value. + */ + if (!prosupport) + return (Selectivity) 0.3333333; + + req.type = T_SupportRequestSelectivity; + req.root = root; + req.funcid = funcid; + req.args = args; + req.inputcollid = inputcollid; + req.is_join = is_join; + req.varRelid = varRelid; + req.jointype = jointype; + req.sjinfo = sjinfo; + req.selectivity = -1; /* to catch failure to set the value */ + + sresult = (SupportRequestSelectivity *) + DatumGetPointer(OidFunctionCall1(prosupport, + PointerGetDatum(&req))); + + /* If support function fails, use default */ + if (sresult != &req) + return (Selectivity) 0.3333333; + + if (req.selectivity < 0.0 || req.selectivity > 1.0) + elog(ERROR, "invalid function selectivity: %f", req.selectivity); + + return (Selectivity) req.selectivity; +} + +/* + * add_function_cost + * + * Get an estimate of the execution cost of a function, and *add* it to + * the contents of *cost. The estimate may include both one-time and + * per-tuple components, since QualCost does. + * + * The funcid must always be supplied. If it is being called as the + * implementation of a specific parsetree node (FuncExpr, OpExpr, + * WindowFunc, etc), pass that as "node", else pass NULL. + * + * In some usages root might be NULL, too. 
+ */
+void
+add_function_cost(PlannerInfo *root, Oid funcid, Node *node,
+				  QualCost *cost)
+{
+	HeapTuple proctup;
+	Form_pg_proc procform;
+
+	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+	if (!HeapTupleIsValid(proctup))
+		elog(ERROR, "cache lookup failed for function %u", funcid);
+	procform = (Form_pg_proc) GETSTRUCT(proctup);
+
+	if (OidIsValid(procform->prosupport))
+	{
+		SupportRequestCost req;
+		SupportRequestCost *sresult;
+
+		req.type = T_SupportRequestCost;
+		req.root = root;
+		req.funcid = funcid;
+		req.node = node;
+
+		/* Initialize cost fields so that support function doesn't have to */
+		req.startup = 0;
+		req.per_tuple = 0;
+
+		sresult = (SupportRequestCost *)
+			DatumGetPointer(OidFunctionCall1(procform->prosupport,
+											 PointerGetDatum(&req)));
+
+		if (sresult == &req)
+		{
+			/* Success, so accumulate support function's estimate into *cost */
+			cost->startup += req.startup;
+			cost->per_tuple += req.per_tuple;
+			ReleaseSysCache(proctup);
+			return;
+		}
+	}
+
+	/* No support function, or it failed, so rely on procost */
+	cost->per_tuple += procform->procost * cpu_operator_cost;
+
+	ReleaseSysCache(proctup);
+}
+
+/*
+ * get_function_rows
+ *
+ * Get an estimate of the number of rows returned by a set-returning function.
+ *
+ * The funcid must always be supplied. In current usage, the calling node
+ * will always be supplied, and will be either a FuncExpr or OpExpr.
+ * But it's a good idea to not fail if it's NULL.
+ *
+ * In some usages root might be NULL, too.
+ *
+ * Note: this returns the unfiltered result of the support function, if any.
+ * It's usually a good idea to apply clamp_row_est() to the result, but we
+ * leave it to the caller to do so.
+ */
+double
+get_function_rows(PlannerInfo *root, Oid funcid, Node *node)
+{
+	HeapTuple proctup;
+	Form_pg_proc procform;
+	double result;
+
+	proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+	if (!HeapTupleIsValid(proctup))
+		elog(ERROR, "cache lookup failed for function %u", funcid);
+	procform = (Form_pg_proc) GETSTRUCT(proctup);
+
+	Assert(procform->proretset); /* else caller error */
+
+	if (OidIsValid(procform->prosupport))
+	{
+		SupportRequestRows req;
+		SupportRequestRows *sresult;
+
+		req.type = T_SupportRequestRows;
+		req.root = root;
+		req.funcid = funcid;
+		req.node = node;
+
+		req.rows = 0; /* just for sanity */
+
+		sresult = (SupportRequestRows *)
+			DatumGetPointer(OidFunctionCall1(procform->prosupport,
+											 PointerGetDatum(&req)));
+
+		if (sresult == &req)
+		{
+			/* Success */
+			ReleaseSysCache(proctup);
+			return req.rows;
+		}
+	}
+
+	/* No support function, or it failed, so rely on prorows */
+	result = procform->prorows;
+
+	ReleaseSysCache(proctup);
+
+	return result;
+}
+
+/*
  * has_unique_index
  *
  * Detect whether there is a unique index on the specified attribute
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fb00504..da64860 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1575,17 +1575,6 @@ boolvarsel(PlannerInfo *root, Node *arg, int varRelid)
 		selec = var_eq_const(&vardata, BooleanEqualOperator,
 							 BoolGetDatum(true), false, true, false);
 	}
-	else if (is_funcclause(arg))
-	{
-		/*
-		 * If we have no stats and it's a function call, estimate 0.3333333.
-		 * This seems a pretty unprincipled choice, but Postgres has been
-		 * using that estimate for function calls since 1992. The hoariness
-		 * of this behavior suggests that we should not be in too much hurry
-		 * to use another value.
-		 */
-		selec = 0.3333333;
-	}
 	else
 	{
 		/* Otherwise, the default estimate is 0.5 */
@@ -3500,7 +3489,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 			 * pointless to worry too much about this without much better
 			 * estimates for SRF output rowcounts than we have today.)
 			 */
-			this_srf_multiplier = expression_returns_set_rows(groupexpr);
+			this_srf_multiplier = expression_returns_set_rows(root, groupexpr);
 			if (srf_multiplier < this_srf_multiplier)
 				srf_multiplier = this_srf_multiplier;

diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index fba0ee8..e88c45d 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1605,41 +1605,28 @@ get_func_leakproof(Oid funcid)
 }

 /*
- * get_func_cost
- *		Given procedure id, return the function's procost field.
- */
-float4
-get_func_cost(Oid funcid)
-{
-	HeapTuple tp;
-	float4 result;
-
-	tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
-	if (!HeapTupleIsValid(tp))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
-
-	result = ((Form_pg_proc) GETSTRUCT(tp))->procost;
-	ReleaseSysCache(tp);
-	return result;
-}
-
-/*
- * get_func_rows
- *		Given procedure id, return the function's prorows field.
+ * get_func_support
+ *
+ *		Returns the support function OID associated with a given function,
+ *		or InvalidOid if there is none.
  */
-float4
-get_func_rows(Oid funcid)
+RegProcedure
+get_func_support(Oid funcid)
 {
 	HeapTuple tp;
-	float4 result;

 	tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
-	if (!HeapTupleIsValid(tp))
-		elog(ERROR, "cache lookup failed for function %u", funcid);
+	if (HeapTupleIsValid(tp))
+	{
+		Form_pg_proc functup = (Form_pg_proc) GETSTRUCT(tp);
+		RegProcedure result;

-	result = ((Form_pg_proc) GETSTRUCT(tp))->prorows;
-	ReleaseSysCache(tp);
-	return result;
+		result = functup->prosupport;
+		ReleaseSysCache(tp);
+		return result;
+	}
+	else
+		return (RegProcedure) InvalidOid;
 }

 /*				---------- RELATION CACHE ----------					 */
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 3d3adc2..802d983 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -506,7 +506,10 @@ typedef enum NodeTag
 	T_TsmRoutine,				/* in access/tsmapi.h */
 	T_ForeignKeyCacheInfo,		/* in utils/rel.h */
 	T_CallContext,				/* in nodes/parsenodes.h */
-	T_SupportRequestSimplify	/* in nodes/supportnodes.h */
+	T_SupportRequestSimplify,	/* in nodes/supportnodes.h */
+	T_SupportRequestSelectivity,	/* in nodes/supportnodes.h */
+	T_SupportRequestCost,		/* in nodes/supportnodes.h */
+	T_SupportRequestRows		/* in nodes/supportnodes.h */
 } NodeTag;

 /*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d3c477a..437bb18 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -61,7 +61,7 @@ typedef struct AggClauseCosts
 	bool hasNonPartial;	/* does any agg not support partial mode? */
 	bool hasNonSerial;	/* is any partial agg non-serializable? */
 	QualCost transCost;	/* total per-input-row execution costs */
-	Cost finalCost;		/* total per-aggregated-row costs */
+	QualCost finalCost;	/* total per-aggregated-row costs */
 	Size transitionSpace;	/* space for pass-by-ref transition data */
 } AggClauseCosts;

diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index 1f7d02b..1a3a36b 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -36,6 +36,7 @@
 #include "nodes/primnodes.h"

 struct PlannerInfo;	/* avoid including relation.h here */
+struct SpecialJoinInfo;


 /*
@@ -67,4 +68,103 @@ typedef struct SupportRequestSimplify
 	FuncExpr *fcall;	/* Function call to be simplified */
 } SupportRequestSimplify;

+/*
+ * The Selectivity request allows the support function to provide a
+ * selectivity estimate for a function appearing at top level of a WHERE
+ * clause (so it applies only to functions returning boolean).
+ *
+ * The input arguments are the same as are supplied to operator restriction
+ * and join estimators, except that we unify those two APIs into just one
+ * request type. See clause_selectivity() for the details.
+ *
+ * If an estimate can be made, store it into the "selectivity" field and
+ * return the address of the SupportRequestSelectivity node; the estimate
+ * must be between 0 and 1 inclusive. Return NULL if no estimate can be
+ * made (in which case the planner will fall back to a default estimate,
+ * traditionally 1/3).
+ *
+ * If the target function is being used as the implementation of an operator,
+ * the support function will not be used for this purpose; the operator's
+ * restriction or join estimator is consulted instead.
+ */
+typedef struct SupportRequestSelectivity
+{
+	NodeTag type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure */
+	Oid funcid;			/* function we are inquiring about */
+	List *args;			/* pre-simplified arguments to function */
+	Oid inputcollid;	/* function's input collation */
+	bool is_join;		/* is this a join or restriction case? */
+	int varRelid;		/* if restriction, RTI of target relation */
+	JoinType jointype;	/* if join, outer join type */
+	struct SpecialJoinInfo *sjinfo; /* if outer join, info about join */
+
+	/* Output fields: */
+	Selectivity selectivity;	/* returned selectivity estimate */
+} SupportRequestSelectivity;
+
+/*
+ * The Cost request allows the support function to provide an execution
+ * cost estimate for its target function. The cost estimate can include
+ * both a one-time (query startup) component and a per-execution component.
+ * The estimate should *not* include the costs of evaluating the target
+ * function's arguments, only the target function itself.
+ *
+ * The "node" argument is normally the parse node that is invoking the
+ * target function. This is a FuncExpr in the simplest case, but it could
+ * also be an OpExpr, DistinctExpr, NullIfExpr, or WindowFunc, or possibly
+ * other cases in future. NULL is passed if the function cannot presume
+ * its arguments to be equivalent to what the calling node presents as
+ * arguments; that happens for, e.g., aggregate support functions and
+ * per-column comparison operators used by RowExprs.
+ *
+ * If an estimate can be made, store it into the cost fields and return the
+ * address of the SupportRequestCost node. Return NULL if no estimate can be
+ * made, in which case the planner will rely on the target function's procost
+ * field. (Note: while procost is automatically scaled by cpu_operator_cost,
+ * this is not the case for the outputs of the Cost request; the support
+ * function must scale its results appropriately on its own.)
+ */
+typedef struct SupportRequestCost
+{
+	NodeTag type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure (could be NULL) */
+	Oid funcid;			/* function we are inquiring about */
+	Node *node;			/* parse node invoking function, or NULL */
+
+	/* Output fields: */
+	Cost startup;		/* one-time cost */
+	Cost per_tuple;		/* per-evaluation cost */
+} SupportRequestCost;
+
+/*
+ * The Rows request allows the support function to provide an output rowcount
+ * estimate for its target function (so it applies only to set-returning
+ * functions).
+ *
+ * The "node" argument is the parse node that is invoking the target function;
+ * currently this will always be a FuncExpr or OpExpr.
+ *
+ * If an estimate can be made, store it into the rows field and return the
+ * address of the SupportRequestRows node. Return NULL if no estimate can be
+ * made, in which case the planner will rely on the target function's prorows
+ * field.
+ */
+typedef struct SupportRequestRows
+{
+	NodeTag type;
+
+	/* Input fields: */
+	struct PlannerInfo *root;	/* Planner's infrastructure (could be NULL) */
+	Oid funcid;			/* function we are inquiring about */
+	Node *node;			/* parse node invoking function */
+
+	/* Output fields: */
+	double rows;		/* number of rows expected to be returned */
+} SupportRequestRows;
+
 #endif							/* SUPPORTNODES_H */
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 23073c0..f46b6ef 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -31,7 +31,7 @@ extern void get_agg_clause_costs(PlannerInfo *root, Node *clause,
 extern bool contain_window_function(Node *clause);
 extern WindowFuncLists *find_window_functions(Node *clause, Index maxWinRef);

-extern double expression_returns_set_rows(Node *clause);
+extern double expression_returns_set_rows(PlannerInfo *root, Node *clause);

 extern bool contain_subplans(Node *clause);

diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index 40f70f9..c337f04 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -55,6 +55,20 @@ extern Selectivity join_selectivity(PlannerInfo *root,
 				 JoinType jointype,
 				 SpecialJoinInfo *sjinfo);

+extern Selectivity function_selectivity(PlannerInfo *root,
+					 Oid funcid,
+					 List *args,
+					 Oid inputcollid,
+					 bool is_join,
+					 int varRelid,
+					 JoinType jointype,
+					 SpecialJoinInfo *sjinfo);
+
+extern void add_function_cost(PlannerInfo *root, Oid funcid, Node *node,
+				  QualCost *cost);
+
+extern double get_function_rows(PlannerInfo *root, Oid funcid, Node *node);
+
 extern bool has_row_triggers(PlannerInfo *root, Index rti, CmdType event);

 #endif							/* PLANCAT_H */
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index ceec85d..16b0b1d 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -120,8 +120,7 @@ extern char func_volatile(Oid funcid);
 extern char func_parallel(Oid funcid);
 extern char get_func_prokind(Oid funcid);
 extern bool get_func_leakproof(Oid funcid);
-extern float4 get_func_cost(Oid funcid);
-extern float4 get_func_rows(Oid funcid);
+extern RegProcedure get_func_support(Oid funcid);
 extern Oid get_relname_relid(const char *relname, Oid relnamespace);
 extern char *get_rel_name(Oid relid);
 extern Oid get_rel_namespace(Oid relid);
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 130a0e4..0879c88 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -133,3 +133,63 @@ ERROR:  function num_nulls() does not exist
 LINE 1: SELECT num_nulls();
                ^
 HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
+--
+-- Test adding a support function to a subject function
+--
+CREATE FUNCTION my_int_eq(int, int) RETURNS bool
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$int4eq$$;
+-- By default, planner does not think that's selective
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+                  QUERY PLAN
+----------------------------------------------
+ Hash Join
+   Hash Cond: (b.unique1 = a.unique1)
+   ->  Seq Scan on tenk1 b
+   ->  Hash
+         ->  Seq Scan on tenk1 a
+               Filter: my_int_eq(unique2, 42)
+(6 rows)
+
+-- With support function that knows it's int4eq, we get a different plan
+ALTER FUNCTION my_int_eq(int, int) SUPPORT test_support_func;
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+                   QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+   ->  Seq Scan on tenk1 a
+         Filter: my_int_eq(unique2, 42)
+   ->  Index Scan using tenk1_unique1 on tenk1 b
+         Index Cond: (unique1 = a.unique1)
+(5 rows)
+
+-- Also test non-default rowcount estimate
+CREATE FUNCTION my_gen_series(int, int) RETURNS SETOF integer
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$generate_series_int4$$
+  SUPPORT test_support_func;
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
+               QUERY PLAN
+----------------------------------------
+ Hash Join
+   Hash Cond: (g.g = a.unique1)
+   ->  Function Scan on my_gen_series g
+   ->  Hash
+         ->  Seq Scan on tenk1 a
+(5 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
+                   QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+   ->  Function Scan on my_gen_series g
+   ->  Index Scan using tenk1_unique1 on tenk1 a
+         Index Cond: (unique1 = g.g)
+(4 rows)
+
diff --git a/src/test/regress/input/create_function_1.source b/src/test/regress/input/create_function_1.source
index 26e2227..223454a 100644
--- a/src/test/regress/input/create_function_1.source
+++ b/src/test/regress/input/create_function_1.source
@@ -68,6 +68,11 @@ CREATE FUNCTION test_fdw_handler()
     AS '@libdir@/regress@DLSUFFIX@', 'test_fdw_handler'
     LANGUAGE C;

+CREATE FUNCTION test_support_func(internal)
+    RETURNS internal
+    AS '@libdir@/regress@DLSUFFIX@', 'test_support_func'
+    LANGUAGE C STRICT;
+
 -- Things that shouldn't work:

 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
diff --git a/src/test/regress/output/create_function_1.source b/src/test/regress/output/create_function_1.source
index 8c50d9b..5f43e8d 100644
--- a/src/test/regress/output/create_function_1.source
+++ b/src/test/regress/output/create_function_1.source
@@ -60,6 +60,10 @@ CREATE FUNCTION test_fdw_handler()
     RETURNS fdw_handler
     AS '@libdir@/regress@DLSUFFIX@', 'test_fdw_handler'
     LANGUAGE C;
+CREATE FUNCTION test_support_func(internal)
+    RETURNS internal
+    AS '@libdir@/regress@DLSUFFIX@', 'test_support_func'
+    LANGUAGE C STRICT;
 -- Things that shouldn't work:
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'SELECT ''not an integer'';';
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 7072728..ad3e803 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -23,12 +23,16 @@
 #include "access/transam.h"
 #include "access/tuptoaster.h"
 #include "access/xact.h"
+#include "catalog/pg_operator.h"
 #include "catalog/pg_type.h"
 #include "commands/sequence.h"
 #include "commands/trigger.h"
 #include "executor/executor.h"
 #include "executor/spi.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/optimizer.h"
+#include "optimizer/plancat.h"
 #include "port/atomics.h"
 #include "utils/builtins.h"
 #include "utils/geo_decls.h"
@@ -863,3 +867,76 @@ test_fdw_handler(PG_FUNCTION_ARGS)
 	elog(ERROR, "test_fdw_handler is not implemented");
 	PG_RETURN_NULL();
 }
+
+PG_FUNCTION_INFO_V1(test_support_func);
+Datum
+test_support_func(PG_FUNCTION_ARGS)
+{
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSelectivity))
+	{
+		/*
+		 * Assume that the target is int4eq; that's safe as long as we don't
+		 * attach this to any other boolean-returning function.
+		 */
+		SupportRequestSelectivity *req = (SupportRequestSelectivity *) rawreq;
+		Selectivity s1;
+
+		if (req->is_join)
+			s1 = join_selectivity(req->root, Int4EqualOperator,
+								  req->args,
+								  req->inputcollid,
+								  req->jointype,
+								  req->sjinfo);
+		else
+			s1 = restriction_selectivity(req->root, Int4EqualOperator,
+										 req->args,
+										 req->inputcollid,
+										 req->varRelid);
+
+		req->selectivity = s1;
+		ret = (Node *) req;
+	}
+
+	if (IsA(rawreq, SupportRequestCost))
+	{
+		/* Provide some generic estimate */
+		SupportRequestCost *req = (SupportRequestCost *) rawreq;
+
+		req->startup = 0;
+		req->per_tuple = 2 * cpu_operator_cost;
+		ret = (Node *) req;
+	}
+
+	if (IsA(rawreq, SupportRequestRows))
+	{
+		/*
+		 * Assume that the target is generate_series_int4; that's safe as long
+		 * as we don't attach this to any other set-returning function.
+		 */
+		SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+		if (req->node && IsA(req->node, FuncExpr)) /* be paranoid */
+		{
+			List *args = ((FuncExpr *) req->node)->args;
+			Node *arg1 = linitial(args);
+			Node *arg2 = lsecond(args);
+
+			if (IsA(arg1, Const) &&
+				!((Const *) arg1)->constisnull &&
+				IsA(arg2, Const) &&
+				!((Const *) arg2)->constisnull)
+			{
+				int32 val1 = DatumGetInt32(((Const *) arg1)->constvalue);
+				int32 val2 = DatumGetInt32(((Const *) arg2)->constvalue);
+
+				req->rows = val2 - val1 + 1;
+				ret = (Node *) req;
+			}
+		}
+	}
+
+	PG_RETURN_POINTER(ret);
+}
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 1a20c1f..7a71f76 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -29,3 +29,35 @@ SELECT num_nulls(VARIADIC '{}'::int[]);
 -- should fail, one or more arguments is required
 SELECT num_nonnulls();
 SELECT num_nulls();
+
+--
+-- Test adding a support function to a subject function
+--
+
+CREATE FUNCTION my_int_eq(int, int) RETURNS bool
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$int4eq$$;
+
+-- By default, planner does not think that's selective
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+
+-- With support function that knows it's int4eq, we get a different plan
+ALTER FUNCTION my_int_eq(int, int) SUPPORT test_support_func;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN tenk1 b ON a.unique1 = b.unique1
+WHERE my_int_eq(a.unique2, 42);
+
+-- Also test non-default rowcount estimate
+CREATE FUNCTION my_gen_series(int, int) RETURNS SETOF integer
+  LANGUAGE internal STRICT IMMUTABLE PARALLEL SAFE
+  AS $$generate_series_int4$$
+  SUPPORT test_support_func;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index a785361..5b2917d 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -22,12 +22,16 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/optimizer.h"
 #include "utils/array.h"
 #include "utils/arrayaccess.h"
 #include "utils/builtins.h"
 #include "utils/datum.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/selfuncs.h"
 #include "utils/typcache.h"


@@ -6025,6 +6029,36 @@ array_unnest(PG_FUNCTION_ARGS)
 	}
 }

+/*
+ * Planner support function for array_unnest(anyarray)
+ */
+Datum
+array_unnest_support(PG_FUNCTION_ARGS)
+{
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestRows))
+	{
+		/* Try to estimate the number of rows returned */
+		SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+		if (is_funcclause(req->node)) /* be paranoid */
+		{
+			List *args = ((FuncExpr *) req->node)->args;
+			Node *arg1;
+
+			/* We can use estimated argument values here */
+			arg1 = estimate_expression_value(req->root, linitial(args));
+
+			req->rows = estimate_array_length(arg1);
+			ret = (Node *) req;
+		}
+	}
+
+	PG_RETURN_POINTER(ret);
+}
+

 /*
  * array_replace/array_remove support
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index ad8e6d0..04825fc 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -30,11 +30,15 @@

 #include <ctype.h>
 #include <limits.h>
+#include <math.h>

 #include "catalog/pg_type.h"
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/optimizer.h"
 #include "utils/array.h"
 #include "utils/builtins.h"

@@ -1427,3 +1431,73 @@ generate_series_step_int4(PG_FUNCTION_ARGS)
 		/* do when there is no more left */
 		SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * Planner support function for generate_series(int4, int4 [, int4])
+ */
+Datum
+generate_series_int4_support(PG_FUNCTION_ARGS)
+{
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestRows))
+	{
+		/* Try to estimate the number of rows returned */
+		SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+		if (is_funcclause(req->node)) /* be paranoid */
+		{
+			List *args = ((FuncExpr *) req->node)->args;
+			Node *arg1,
+				 *arg2,
+				 *arg3;
+
+			/* We can use estimated argument values here */
+			arg1 = estimate_expression_value(req->root, linitial(args));
+			arg2 = estimate_expression_value(req->root, lsecond(args));
+			if (list_length(args) >= 3)
+				arg3 = estimate_expression_value(req->root, lthird(args));
+			else
+				arg3 = NULL;
+
+			/*
+			 * If any argument is constant NULL, we can safely assume that
+			 * zero rows are returned. Otherwise, if they're all non-NULL
+			 * constants, we can calculate the number of rows that will be
+			 * returned. Use double arithmetic to avoid overflow hazards.
+			 */
+			if ((IsA(arg1, Const) &&
+				 ((Const *) arg1)->constisnull) ||
+				(IsA(arg2, Const) &&
+				 ((Const *) arg2)->constisnull) ||
+				(arg3 != NULL && IsA(arg3, Const) &&
+				 ((Const *) arg3)->constisnull))
+			{
+				req->rows = 0;
+				ret = (Node *) req;
+			}
+			else if (IsA(arg1, Const) &&
+					 IsA(arg2, Const) &&
+					 (arg3 == NULL || IsA(arg3, Const)))
+			{
+				double start,
+					   finish,
+					   step;
+
+				start = DatumGetInt32(((Const *) arg1)->constvalue);
+				finish = DatumGetInt32(((Const *) arg2)->constvalue);
+				step = arg3 ? DatumGetInt32(((Const *) arg3)->constvalue) : 1;
+
+				/* This equation works for either sign of step */
+				if (step != 0)
+				{
+					req->rows = floor((finish - start + step) / step);
+					ret = (Node *) req;
+				}
+			}
+		}
+	}
+
+	PG_RETURN_POINTER(ret);
+}
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index d16cc9e..0ff9394 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -20,6 +20,9 @@
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/optimizer.h"
 #include "utils/int8.h"
 #include "utils/builtins.h"

@@ -1373,3 +1376,73 @@ generate_series_step_int8(PG_FUNCTION_ARGS)
 		/* do when there is no more left */
 		SRF_RETURN_DONE(funcctx);
 }
+
+/*
+ * Planner support function for generate_series(int8, int8 [, int8])
+ */
+Datum
+generate_series_int8_support(PG_FUNCTION_ARGS)
+{
+	Node *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestRows))
+	{
+		/* Try to estimate the number of rows returned */
+		SupportRequestRows *req = (SupportRequestRows *) rawreq;
+
+		if (is_funcclause(req->node)) /* be paranoid */
+		{
+			List *args = ((FuncExpr *) req->node)->args;
+			Node *arg1,
+				 *arg2,
+				 *arg3;
+
+			/* We can use estimated argument values here */
+			arg1 = estimate_expression_value(req->root, linitial(args));
+			arg2 = estimate_expression_value(req->root, lsecond(args));
+			if (list_length(args) >= 3)
+				arg3 = estimate_expression_value(req->root, lthird(args));
+			else
+				arg3 = NULL;
+
+			/*
+			 * If any argument is constant NULL, we can safely assume that
+			 * zero rows are returned. Otherwise, if they're all non-NULL
+			 * constants, we can calculate the number of rows that will be
+			 * returned. Use double arithmetic to avoid overflow hazards.
+ */ + if ((IsA(arg1, Const) && + ((Const *) arg1)->constisnull) || + (IsA(arg2, Const) && + ((Const *) arg2)->constisnull) || + (arg3 != NULL && IsA(arg3, Const) && + ((Const *) arg3)->constisnull)) + { + req->rows = 0; + ret = (Node *) req; + } + else if (IsA(arg1, Const) && + IsA(arg2, Const) && + (arg3 == NULL || IsA(arg3, Const))) + { + double start, + finish, + step; + + start = DatumGetInt64(((Const *) arg1)->constvalue); + finish = DatumGetInt64(((Const *) arg2)->constvalue); + step = arg3 ? DatumGetInt64(((Const *) arg3)->constvalue) : 1; + + /* This equation works for either sign of step */ + if (step != 0) + { + req->rows = floor((finish - start + step) / step); + ret = (Node *) req; + } + } + } + } + + PG_RETURN_POINTER(ret); +} diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index e5cb5bb..039b596 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -1530,9 +1530,12 @@ proargtypes => 'anyelement _int4 _int4', prosrc => 'array_fill_with_lower_bounds' }, { oid => '2331', descr => 'expand array to set of rows', - proname => 'unnest', prorows => '100', proretset => 't', - prorettype => 'anyelement', proargtypes => 'anyarray', + proname => 'unnest', prorows => '100', prosupport => 'array_unnest_support', + proretset => 't', prorettype => 'anyelement', proargtypes => 'anyarray', prosrc => 'array_unnest' }, +{ oid => '3996', descr => 'planner support for array_unnest', + proname => 'array_unnest_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'array_unnest_support' }, { oid => '3167', descr => 'remove any occurrences of an element from an array', proname => 'array_remove', proisstrict => 'f', prorettype => 'anyarray', @@ -7536,21 +7539,31 @@ # non-persistent series generator { oid => '1066', descr => 'non-persistent series generator', - proname => 'generate_series', prorows => '1000', proretset => 't', + proname => 'generate_series', prorows => '1000', + prosupport => 
'generate_series_int4_support', proretset => 't', prorettype => 'int4', proargtypes => 'int4 int4 int4', prosrc => 'generate_series_step_int4' }, { oid => '1067', descr => 'non-persistent series generator', - proname => 'generate_series', prorows => '1000', proretset => 't', + proname => 'generate_series', prorows => '1000', + prosupport => 'generate_series_int4_support', proretset => 't', prorettype => 'int4', proargtypes => 'int4 int4', prosrc => 'generate_series_int4' }, +{ oid => '3994', descr => 'planner support for generate_series', + proname => 'generate_series_int4_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'generate_series_int4_support' }, { oid => '1068', descr => 'non-persistent series generator', - proname => 'generate_series', prorows => '1000', proretset => 't', + proname => 'generate_series', prorows => '1000', + prosupport => 'generate_series_int8_support', proretset => 't', prorettype => 'int8', proargtypes => 'int8 int8 int8', prosrc => 'generate_series_step_int8' }, { oid => '1069', descr => 'non-persistent series generator', - proname => 'generate_series', prorows => '1000', proretset => 't', + proname => 'generate_series', prorows => '1000', + prosupport => 'generate_series_int8_support', proretset => 't', prorettype => 'int8', proargtypes => 'int8 int8', prosrc => 'generate_series_int8' }, +{ oid => '3995', descr => 'planner support for generate_series', + proname => 'generate_series_int8_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'generate_series_int8_support' }, { oid => '3259', descr => 'non-persistent series generator', proname => 'generate_series', prorows => '1000', proretset => 't', prorettype => 'numeric', proargtypes => 'numeric numeric numeric', diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out index a54b4a5..6e238e8 100644 --- a/src/test/regress/expected/subselect.out +++ b/src/test/regress/expected/subselect.out @@ -904,7 
+904,7 @@ select * from int4_tbl where -- explain (verbose, costs off) select * from int4_tbl o where (f1, f1) in - (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); + (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1); QUERY PLAN ------------------------------------------------------------------- Nested Loop Semi Join @@ -918,9 +918,9 @@ select * from int4_tbl o where (f1, f1) in Output: "ANY_subquery".f1, "ANY_subquery".g Filter: ("ANY_subquery".f1 = "ANY_subquery".g) -> Result - Output: i.f1, ((generate_series(1, 2)) / 10) + Output: i.f1, ((generate_series(1, 50)) / 10) -> ProjectSet - Output: generate_series(1, 2), i.f1 + Output: generate_series(1, 50), i.f1 -> HashAggregate Output: i.f1 Group Key: i.f1 @@ -929,7 +929,7 @@ select * from int4_tbl o where (f1, f1) in (19 rows) select * from int4_tbl o where (f1, f1) in - (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); + (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1); f1 ---- 0 diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql index 843f511..ccbe8a1 100644 --- a/src/test/regress/sql/subselect.sql +++ b/src/test/regress/sql/subselect.sql @@ -498,9 +498,9 @@ select * from int4_tbl where -- explain (verbose, costs off) select * from int4_tbl o where (f1, f1) in - (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); + (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1); select * from int4_tbl o where (f1, f1) in - (select f1, generate_series(1,2) / 10 g from int4_tbl i group by f1); + (select f1, generate_series(1,50) / 10 g from int4_tbl i group by f1); -- -- check for over-optimization of whole-row Var referencing an Append plan diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 2385d02..8ed30c0 100644 *** a/src/backend/nodes/nodeFuncs.c --- b/src/backend/nodes/nodeFuncs.c *************** expression_tree_walker(Node *node, *** 
2192,2197 ****
--- 2192,2208 ----
                  /* groupClauses are deemed uninteresting */
              }
              break;
+         case T_IndexClause:
+             {
+                 IndexClause *iclause = (IndexClause *) node;
+
+                 if (walker(iclause->rinfo, context))
+                     return true;
+                 if (expression_tree_walker((Node *) iclause->indexquals,
+                                            walker, context))
+                     return true;
+             }
+             break;
          case T_PlaceHolderVar:
              return walker(((PlaceHolderVar *) node)->phexpr, context);
          case T_InferenceElem:
*************** expression_tree_mutator(Node *node,
*** 2999,3004 ****
--- 3010,3026 ----
                  return (Node *) newnode;
              }
              break;
+         case T_IndexClause:
+             {
+                 IndexClause *iclause = (IndexClause *) node;
+                 IndexClause *newnode;
+
+                 FLATCOPY(newnode, iclause, IndexClause);
+                 MUTATE(newnode->rinfo, iclause->rinfo, RestrictInfo *);
+                 MUTATE(newnode->indexquals, iclause->indexquals, List *);
+                 return (Node *) newnode;
+             }
+             break;
          case T_PlaceHolderVar:
              {
                  PlaceHolderVar *phv = (PlaceHolderVar *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index f97cf37..10038a2 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outIndexPath(StringInfo str, const Inde
*** 1744,1751 ****
      WRITE_NODE_FIELD(indexinfo);
      WRITE_NODE_FIELD(indexclauses);
-     WRITE_NODE_FIELD(indexquals);
-     WRITE_NODE_FIELD(indexqualcols);
      WRITE_NODE_FIELD(indexorderbys);
      WRITE_NODE_FIELD(indexorderbycols);
      WRITE_ENUM_FIELD(indexscandir, ScanDirection);
--- 1744,1749 ----
*************** _outRestrictInfo(StringInfo str, const R
*** 2448,2453 ****
--- 2446,2463 ----
  }
  static void
+ _outIndexClause(StringInfo str, const IndexClause *node)
+ {
+     WRITE_NODE_TYPE("INDEXCLAUSE");
+
+     WRITE_NODE_FIELD(rinfo);
+     WRITE_NODE_FIELD(indexquals);
+     WRITE_BOOL_FIELD(lossy);
+     WRITE_INT_FIELD(indexcol);
+     WRITE_NODE_FIELD(indexcols);
+ }
+
+ static void
  _outPlaceHolderVar(StringInfo str, const PlaceHolderVar *node)
  {
      WRITE_NODE_TYPE("PLACEHOLDERVAR");
*************** outNode(StringInfo str, const void *obj)
*** 4044,4049 ****
--- 4054,4062
---- case T_RestrictInfo: _outRestrictInfo(str, obj); break; + case T_IndexClause: + _outIndexClause(str, obj); + break; case T_PlaceHolderVar: _outPlaceHolderVar(str, obj); break; diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index b8d406f..1057dda 100644 *** a/src/backend/optimizer/path/costsize.c --- b/src/backend/optimizer/path/costsize.c *************** typedef struct *** 145,151 **** QualCost total; } cost_qual_eval_context; ! static List *extract_nonindex_conditions(List *qual_clauses, List *indexquals); static MergeScanSelCache *cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey); --- 145,151 ---- QualCost total; } cost_qual_eval_context; ! static List *extract_nonindex_conditions(List *qual_clauses, List *indexclauses); static MergeScanSelCache *cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey); *************** cost_index(IndexPath *path, PlannerInfo *** 517,534 **** { path->path.rows = path->path.param_info->ppi_rows; /* qpquals come from the rel's restriction clauses and ppi_clauses */ ! qpquals = list_concat( ! extract_nonindex_conditions(path->indexinfo->indrestrictinfo, ! path->indexquals), extract_nonindex_conditions(path->path.param_info->ppi_clauses, ! path->indexquals)); } else { path->path.rows = baserel->rows; /* qpquals come from just the rel's restriction clauses */ qpquals = extract_nonindex_conditions(path->indexinfo->indrestrictinfo, ! path->indexquals); } if (!enable_indexscan) --- 517,533 ---- { path->path.rows = path->path.param_info->ppi_rows; /* qpquals come from the rel's restriction clauses and ppi_clauses */ ! qpquals = list_concat(extract_nonindex_conditions(path->indexinfo->indrestrictinfo, ! path->indexclauses), extract_nonindex_conditions(path->path.param_info->ppi_clauses, ! 
path->indexclauses)); } else { path->path.rows = baserel->rows; /* qpquals come from just the rel's restriction clauses */ qpquals = extract_nonindex_conditions(path->indexinfo->indrestrictinfo, ! path->indexclauses); } if (!enable_indexscan) *************** cost_index(IndexPath *path, PlannerInfo *** 753,772 **** * * Given a list of quals to be enforced in an indexscan, extract the ones that * will have to be applied as qpquals (ie, the index machinery won't handle ! * them). The actual rules for this appear in create_indexscan_plan() in ! * createplan.c, but the full rules are fairly expensive and we don't want to ! * go to that much effort for index paths that don't get selected for the ! * final plan. So we approximate it as quals that don't appear directly in ! * indexquals and also are not redundant children of the same EquivalenceClass ! * as some indexqual. This method neglects some infrequently-relevant ! * considerations, specifically clauses that needn't be checked because they ! * are implied by an indexqual. It does not seem worth the cycles to try to ! * factor that in at this stage, even though createplan.c will take pains to ! * remove such unnecessary clauses from the qpquals list if this path is ! * selected for use. */ static List * ! extract_nonindex_conditions(List *qual_clauses, List *indexquals) { List *result = NIL; ListCell *lc; --- 752,770 ---- * * Given a list of quals to be enforced in an indexscan, extract the ones that * will have to be applied as qpquals (ie, the index machinery won't handle ! * them). Here we detect only whether a qual clause is directly redundant ! * with some indexclause. If the index path is chosen for use, createplan.c ! * will try a bit harder to get rid of redundant qual conditions; specifically ! * it will see if quals can be proven to be implied by the indexquals. But ! * it does not seem worth the cycles to try to factor that in at this stage, ! * since we're only trying to estimate qual eval costs. 
Otherwise this must ! * match the logic in create_indexscan_plan(). ! * ! * qual_clauses, and the result, are lists of RestrictInfos. ! * indexclauses is a list of IndexClauses. */ static List * ! extract_nonindex_conditions(List *qual_clauses, List *indexclauses) { List *result = NIL; ListCell *lc; *************** extract_nonindex_conditions(List *qual_c *** 777,786 **** if (rinfo->pseudoconstant) continue; /* we may drop pseudoconstants here */ ! if (list_member_ptr(indexquals, rinfo)) ! continue; /* simple duplicate */ ! if (is_redundant_derived_clause(rinfo, indexquals)) ! continue; /* derived from same EquivalenceClass */ /* ... skip the predicate proof attempt createplan.c will try ... */ result = lappend(result, rinfo); } --- 775,782 ---- if (rinfo->pseudoconstant) continue; /* we may drop pseudoconstants here */ ! if (is_redundant_with_indexclauses(rinfo, indexclauses)) ! continue; /* dup or derived from same EquivalenceClass */ /* ... skip the predicate proof attempt createplan.c will try ... */ result = lappend(result, rinfo); } *************** has_indexed_join_quals(NestPath *joinpat *** 4242,4249 **** innerpath->parent->relids, joinrelids)) { ! if (!(list_member_ptr(indexclauses, rinfo) || ! is_redundant_derived_clause(rinfo, indexclauses))) return false; found_one = true; } --- 4238,4244 ---- innerpath->parent->relids, joinrelids)) { ! if (!is_redundant_with_indexclauses(rinfo, indexclauses)) return false; found_one = true; } diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index 3454f12..2379250 100644 *** a/src/backend/optimizer/path/equivclass.c --- b/src/backend/optimizer/path/equivclass.c *************** is_redundant_derived_clause(RestrictInfo *** 2511,2513 **** --- 2511,2550 ---- return false; } + + /* + * is_redundant_with_indexclauses + * Test whether rinfo is redundant with any clause in the IndexClause + * list. 
Here, for convenience, we test both simple identity and + * whether it is derived from the same EC as any member of the list. + */ + bool + is_redundant_with_indexclauses(RestrictInfo *rinfo, List *indexclauses) + { + EquivalenceClass *parent_ec = rinfo->parent_ec; + ListCell *lc; + + foreach(lc, indexclauses) + { + IndexClause *iclause = lfirst_node(IndexClause, lc); + RestrictInfo *otherrinfo = iclause->rinfo; + + /* If indexclause is lossy, it won't enforce the condition exactly */ + if (iclause->lossy) + continue; + + /* Match if it's same clause (pointer equality should be enough) */ + if (rinfo == otherrinfo) + return true; + /* Match if derived from same EC */ + if (parent_ec && otherrinfo->parent_ec == parent_ec) + return true; + + /* + * No need to look at the derived clauses in iclause->indexquals; they + * couldn't match if the parent clause didn't. + */ + } + + return false; + } diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 7e1a390..51d2da5 100644 *** a/src/backend/optimizer/path/indxpath.c --- b/src/backend/optimizer/path/indxpath.c *************** typedef enum *** 56,62 **** typedef struct { bool nonempty; /* True if lists are not all empty */ ! /* Lists of RestrictInfos, one per index column */ List *indexclauses[INDEX_MAX_KEYS]; } IndexClauseSet; --- 56,62 ---- typedef struct { bool nonempty; /* True if lists are not all empty */ ! /* Lists of IndexClause nodes, one list per index column */ List *indexclauses[INDEX_MAX_KEYS]; } IndexClauseSet; *************** static bool match_boolean_index_clause(N *** 175,187 **** static bool match_special_index_operator(Expr *clause, Oid opfamily, Oid idxcollation, bool indexkey_on_left); static Expr *expand_boolean_index_clause(Node *clause, int indexcol, IndexOptInfo *index); static List *expand_indexqual_opclause(RestrictInfo *rinfo, ! Oid opfamily, Oid idxcollation); static RestrictInfo *expand_indexqual_rowcompare(RestrictInfo *rinfo, IndexOptInfo *index, ! 
int indexcol); static List *prefix_quals(Node *leftop, Oid opfamily, Oid collation, Const *prefix, Pattern_Prefix_Status pstatus); static List *network_prefix_quals(Node *leftop, Oid expr_op, Oid opfamily, --- 175,193 ---- static bool match_special_index_operator(Expr *clause, Oid opfamily, Oid idxcollation, bool indexkey_on_left); + static IndexClause *expand_indexqual_conditions(IndexOptInfo *index, + int indexcol, + RestrictInfo *rinfo); static Expr *expand_boolean_index_clause(Node *clause, int indexcol, IndexOptInfo *index); static List *expand_indexqual_opclause(RestrictInfo *rinfo, ! Oid opfamily, Oid idxcollation, ! bool *lossy); static RestrictInfo *expand_indexqual_rowcompare(RestrictInfo *rinfo, IndexOptInfo *index, ! int indexcol, ! List **indexcolnos, ! bool *lossy); static List *prefix_quals(Node *leftop, Oid opfamily, Oid collation, Const *prefix, Pattern_Prefix_Status pstatus); static List *network_prefix_quals(Node *leftop, Oid expr_op, Oid opfamily, *************** consider_index_join_clauses(PlannerInfo *** 496,502 **** * * 'rel', 'index', 'rclauseset', 'jclauseset', 'eclauseset', and * 'bitindexpaths' as above ! * 'indexjoinclauses' is a list of RestrictInfos for join clauses * 'considered_clauses' is the total number of clauses considered (so far) * '*considered_relids' is a list of all relids sets already considered */ --- 502,508 ---- * * 'rel', 'index', 'rclauseset', 'jclauseset', 'eclauseset', and * 'bitindexpaths' as above ! * 'indexjoinclauses' is a list of IndexClauses for join clauses * 'considered_clauses' is the total number of clauses considered (so far) * '*considered_relids' is a list of all relids sets already considered */ *************** consider_index_join_outer_rels(PlannerIn *** 516,523 **** /* Examine relids of each joinclause in the given list */ foreach(lc, indexjoinclauses) { ! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); ! 
Relids clause_relids = rinfo->clause_relids; ListCell *lc2; /* If we already tried its relids set, no need to do so again */ --- 522,530 ---- /* Examine relids of each joinclause in the given list */ foreach(lc, indexjoinclauses) { ! IndexClause *iclause = (IndexClause *) lfirst(lc); ! Relids clause_relids = iclause->rinfo->clause_relids; ! EquivalenceClass *parent_ec = iclause->rinfo->parent_ec; ListCell *lc2; /* If we already tried its relids set, no need to do so again */ *************** consider_index_join_outer_rels(PlannerIn *** 558,565 **** * parameterization; so skip if any clause derived from the same * eclass would already have been included when using oldrelids. */ ! if (rinfo->parent_ec && ! eclass_already_used(rinfo->parent_ec, oldrelids, indexjoinclauses)) continue; --- 565,572 ---- * parameterization; so skip if any clause derived from the same * eclass would already have been included when using oldrelids. */ ! if (parent_ec && ! eclass_already_used(parent_ec, oldrelids, indexjoinclauses)) continue; *************** get_join_index_paths(PlannerInfo *root, *** 628,638 **** /* First find applicable simple join clauses */ foreach(lc, jclauseset->indexclauses[indexcol]) { ! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); ! if (bms_is_subset(rinfo->clause_relids, relids)) clauseset.indexclauses[indexcol] = ! lappend(clauseset.indexclauses[indexcol], rinfo); } /* --- 635,645 ---- /* First find applicable simple join clauses */ foreach(lc, jclauseset->indexclauses[indexcol]) { ! IndexClause *iclause = (IndexClause *) lfirst(lc); ! if (bms_is_subset(iclause->rinfo->clause_relids, relids)) clauseset.indexclauses[indexcol] = ! lappend(clauseset.indexclauses[indexcol], iclause); } /* *************** get_join_index_paths(PlannerInfo *root, *** 643,654 **** */ foreach(lc, eclauseset->indexclauses[indexcol]) { ! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); ! if (bms_is_subset(rinfo->clause_relids, relids)) { clauseset.indexclauses[indexcol] = ! 
lappend(clauseset.indexclauses[indexcol], rinfo); break; } } --- 650,661 ---- */ foreach(lc, eclauseset->indexclauses[indexcol]) { ! IndexClause *iclause = (IndexClause *) lfirst(lc); ! if (bms_is_subset(iclause->rinfo->clause_relids, relids)) { clauseset.indexclauses[indexcol] = ! lappend(clauseset.indexclauses[indexcol], iclause); break; } } *************** eclass_already_used(EquivalenceClass *pa *** 688,694 **** foreach(lc, indexjoinclauses) { ! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); if (rinfo->parent_ec == parent_ec && bms_is_subset(rinfo->clause_relids, oldrelids)) --- 695,702 ---- foreach(lc, indexjoinclauses) { ! IndexClause *iclause = (IndexClause *) lfirst(lc); ! RestrictInfo *rinfo = iclause->rinfo; if (rinfo->parent_ec == parent_ec && bms_is_subset(rinfo->clause_relids, oldrelids)) *************** get_index_paths(PlannerInfo *root, RelOp *** 848,854 **** * * 'rel' is the index's heap relation * 'index' is the index for which we want to generate paths ! * 'clauses' is the collection of indexable clauses (RestrictInfo nodes) * 'useful_predicate' indicates whether the index has a useful predicate * 'scantype' indicates whether we need plain or bitmap scan support * 'skip_nonnative_saop' indicates whether to accept SAOP if index AM doesn't --- 856,862 ---- * * 'rel' is the index's heap relation * 'index' is the index for which we want to generate paths ! 
* 'clauses' is the collection of indexable clauses (IndexClause nodes) * 'useful_predicate' indicates whether the index has a useful predicate * 'scantype' indicates whether we need plain or bitmap scan support * 'skip_nonnative_saop' indicates whether to accept SAOP if index AM doesn't *************** build_index_paths(PlannerInfo *root, Rel *** 865,871 **** List *result = NIL; IndexPath *ipath; List *index_clauses; - List *clause_columns; Relids outer_relids; double loop_count; List *orderbyclauses; --- 873,878 ---- *************** build_index_paths(PlannerInfo *root, Rel *** 897,910 **** } /* ! * 1. Collect the index clauses into a single list. * ! * We build a list of RestrictInfo nodes for clauses to be used with this ! * index, along with an integer list of the index column numbers (zero ! * based) that each clause should be used with. The clauses are ordered ! * by index key, so that the column numbers form a nondecreasing sequence. ! * (This order is depended on by btree and possibly other places.) The ! * lists can be empty, if the index AM allows that. * * found_lower_saop_clause is set true if we accept a ScalarArrayOpExpr * index clause for a non-first index column. This prevents us from --- 904,915 ---- } /* ! * 1. Combine the per-column IndexClause lists into an overall list. * ! * In the resulting list, clauses are ordered by index key, so that the ! * column numbers form a nondecreasing sequence. (This order is depended ! * on by btree and possibly other places.) The list can be empty, if the ! * index AM allows that. * * found_lower_saop_clause is set true if we accept a ScalarArrayOpExpr * index clause for a non-first index column. This prevents us from *************** build_index_paths(PlannerInfo *root, Rel *** 918,924 **** * otherwise accounted for. 
*/ index_clauses = NIL; - clause_columns = NIL; found_lower_saop_clause = false; outer_relids = bms_copy(rel->lateral_relids); for (indexcol = 0; indexcol < index->ncolumns; indexcol++) --- 923,928 ---- *************** build_index_paths(PlannerInfo *root, Rel *** 927,934 **** foreach(lc, clauses->indexclauses[indexcol]) { ! RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); if (IsA(rinfo->clause, ScalarArrayOpExpr)) { if (!index->amsearcharray) --- 931,940 ---- foreach(lc, clauses->indexclauses[indexcol]) { ! IndexClause *iclause = (IndexClause *) lfirst(lc); ! RestrictInfo *rinfo = iclause->rinfo; + /* We might need to omit ScalarArrayOpExpr clauses */ if (IsA(rinfo->clause, ScalarArrayOpExpr)) { if (!index->amsearcharray) *************** build_index_paths(PlannerInfo *root, Rel *** 953,960 **** found_lower_saop_clause = true; } } ! index_clauses = lappend(index_clauses, rinfo); ! clause_columns = lappend_int(clause_columns, indexcol); outer_relids = bms_add_members(outer_relids, rinfo->clause_relids); } --- 959,967 ---- found_lower_saop_clause = true; } } ! ! /* OK to include this clause */ ! 
index_clauses = lappend(index_clauses, iclause); outer_relids = bms_add_members(outer_relids, rinfo->clause_relids); } *************** build_index_paths(PlannerInfo *root, Rel *** 1036,1042 **** { ipath = create_index_path(root, index, index_clauses, - clause_columns, orderbyclauses, orderbyclausecols, useful_pathkeys, --- 1043,1048 ---- *************** build_index_paths(PlannerInfo *root, Rel *** 1059,1065 **** { ipath = create_index_path(root, index, index_clauses, - clause_columns, orderbyclauses, orderbyclausecols, useful_pathkeys, --- 1065,1070 ---- *************** build_index_paths(PlannerInfo *root, Rel *** 1095,1101 **** { ipath = create_index_path(root, index, index_clauses, - clause_columns, NIL, NIL, useful_pathkeys, --- 1100,1105 ---- *************** build_index_paths(PlannerInfo *root, Rel *** 1113,1119 **** { ipath = create_index_path(root, index, index_clauses, - clause_columns, NIL, NIL, useful_pathkeys, --- 1117,1122 ---- *************** get_bitmap_tree_required_outer(Path *bit *** 1810,1816 **** * find_indexpath_quals * * Given the Path structure for a plain or bitmap indexscan, extract lists ! * of all the indexquals and index predicate conditions used in the Path. * These are appended to the initial contents of *quals and *preds (hence * caller should initialize those to NIL). * --- 1813,1819 ---- * find_indexpath_quals * * Given the Path structure for a plain or bitmap indexscan, extract lists ! * of all the index clauses and index predicate conditions used in the Path. * These are appended to the initial contents of *quals and *preds (hence * caller should initialize those to NIL). * *************** find_indexpath_quals(Path *bitmapqual, L *** 1847,1854 **** else if (IsA(bitmapqual, IndexPath)) { IndexPath *ipath = (IndexPath *) bitmapqual; ! 
*quals = list_concat(*quals, get_actual_clauses(ipath->indexclauses)); *preds = list_concat(*preds, list_copy(ipath->indexinfo->indpred)); } else --- 1850,1863 ---- else if (IsA(bitmapqual, IndexPath)) { IndexPath *ipath = (IndexPath *) bitmapqual; + ListCell *l; ! foreach(l, ipath->indexclauses) ! { ! IndexClause *iclause = (IndexClause *) lfirst(l); ! ! *quals = lappend(*quals, iclause->rinfo->clause); ! } *preds = list_concat(*preds, list_copy(ipath->indexinfo->indpred)); } else *************** match_clauses_to_index(IndexOptInfo *ind *** 2239,2246 **** * match_clause_to_index * Test whether a qual clause can be used with an index. * ! * If the clause is usable, add it to the appropriate list in *clauseset. ! * *clauseset must be initialized to zeroes before first call. * * Note: in some circumstances we may find the same RestrictInfos coming from * multiple places. Defend against redundant outputs by refusing to add a --- 2248,2256 ---- * match_clause_to_index * Test whether a qual clause can be used with an index. * ! * If the clause is usable, add an IndexClause entry for it to the appropriate ! * list in *clauseset. (*clauseset must be initialized to zeroes before first ! * call.) * * Note: in some circumstances we may find the same RestrictInfos coming from * multiple places. Defend against redundant outputs by refusing to add a *************** match_clause_to_index(IndexOptInfo *inde *** 2277,2289 **** /* OK, check each index key column for a match */ for (indexcol = 0; indexcol < index->nkeycolumns; indexcol++) { if (match_clause_to_indexcol(index, indexcol, rinfo)) { clauseset->indexclauses[indexcol] = ! list_append_unique_ptr(clauseset->indexclauses[indexcol], ! 
rinfo); clauseset->nonempty = true; return; } --- 2287,2316 ---- /* OK, check each index key column for a match */ for (indexcol = 0; indexcol < index->nkeycolumns; indexcol++) { + ListCell *lc; + + /* Ignore duplicates */ + foreach(lc, clauseset->indexclauses[indexcol]) + { + IndexClause *iclause = (IndexClause *) lfirst(lc); + + if (iclause->rinfo == rinfo) + return; + } + + /* + * XXX this should be changed so that we generate an IndexClause + * immediately upon matching, to avoid repeated work. To-do soon. + */ if (match_clause_to_indexcol(index, indexcol, rinfo)) { + IndexClause *iclause; + + iclause = expand_indexqual_conditions(index, indexcol, rinfo); clauseset->indexclauses[indexcol] = ! lappend(clauseset->indexclauses[indexcol], iclause); clauseset->nonempty = true; return; } *************** match_clause_to_index(IndexOptInfo *inde *** 2335,2341 **** * target index column. This is sufficient to guarantee that some index * condition can be constructed from the RowCompareExpr --- whether the * remaining columns match the index too is considered in ! * adjust_rowcompare_for_index(). * * It is also possible to match ScalarArrayOpExpr clauses to indexes, when * the clause is of the form "indexkey op ANY (arrayconst)". --- 2362,2368 ---- * target index column. This is sufficient to guarantee that some index * condition can be constructed from the RowCompareExpr --- whether the * remaining columns match the index too is considered in ! * expand_indexqual_rowcompare(). * * It is also possible to match ScalarArrayOpExpr clauses to indexes, when * the clause is of the form "indexkey op ANY (arrayconst)". *************** match_index_to_operand(Node *operand, *** 3342,3349 **** * match_boolean_index_clause() similarly detects clauses that can be * converted into boolean equality operators. * ! * expand_indexqual_conditions() converts a list of RestrictInfo nodes ! 
* (with implicit AND semantics across list elements) into a list of clauses * that the executor can actually handle. For operators that are members of * the index's opfamily this transformation is a no-op, but clauses recognized * by match_special_index_operator() or match_boolean_index_clause() must be --- 3369,3376 ---- * match_boolean_index_clause() similarly detects clauses that can be * converted into boolean equality operators. * ! * expand_indexqual_conditions() converts a RestrictInfo node ! * into an IndexClause, which contains clauses * that the executor can actually handle. For operators that are members of * the index's opfamily this transformation is a no-op, but clauses recognized * by match_special_index_operator() or match_boolean_index_clause() must be *************** match_special_index_operator(Expr *claus *** 3556,3593 **** /* * expand_indexqual_conditions ! * Given a list of RestrictInfo nodes, produce a list of directly usable ! * index qual clauses. * * Standard qual clauses (those in the index's opfamily) are passed through * unchanged. Boolean clauses and "special" index operators are expanded * into clauses that the indexscan machinery will know what to do with. * RowCompare clauses are simplified if necessary to create a clause that is * fully checkable by the index. - * - * In addition to the expressions themselves, there are auxiliary lists - * of the index column numbers that the clauses are meant to be used with; - * we generate an updated column number list for the result. (This is not - * the identical list because one input clause sometimes produces more than - * one output clause.) - * - * The input clauses are sorted by column number, and so the output is too. - * (This is depended on in various places in both planner and executor.) */ ! void expand_indexqual_conditions(IndexOptInfo *index, ! List *indexclauses, List *indexclausecols, ! 
List **indexquals_p, List **indexqualcols_p) { List *indexquals = NIL; - List *indexqualcols = NIL; - ListCell *lcc, - *lci; ! forboth(lcc, indexclauses, lci, indexclausecols) { - RestrictInfo *rinfo = (RestrictInfo *) lfirst(lcc); - int indexcol = lfirst_int(lci); Expr *clause = rinfo->clause; Oid curFamily; Oid curCollation; --- 3583,3610 ---- /* * expand_indexqual_conditions ! * Given a RestrictInfo node, create an IndexClause. * * Standard qual clauses (those in the index's opfamily) are passed through * unchanged. Boolean clauses and "special" index operators are expanded * into clauses that the indexscan machinery will know what to do with. * RowCompare clauses are simplified if necessary to create a clause that is * fully checkable by the index. */ ! static IndexClause * expand_indexqual_conditions(IndexOptInfo *index, ! int indexcol, ! RestrictInfo *rinfo) { + IndexClause *iclause = makeNode(IndexClause); List *indexquals = NIL; ! iclause->rinfo = rinfo; ! iclause->lossy = false; /* might get changed below */ ! iclause->indexcol = indexcol; ! iclause->indexcols = NIL; /* might get changed below */ ! { Expr *clause = rinfo->clause; Oid curFamily; Oid curCollation; *************** expand_indexqual_conditions(IndexOptInfo *** 3607,3616 **** index); if (boolqual) { ! indexquals = lappend(indexquals, ! make_simple_restrictinfo(boolqual)); ! indexqualcols = lappend_int(indexqualcols, indexcol); ! continue; } } --- 3624,3632 ---- index); if (boolqual) { ! iclause->indexquals = ! list_make1(make_simple_restrictinfo(boolqual)); ! return iclause; } } *************** expand_indexqual_conditions(IndexOptInfo *** 3620,3660 **** */ if (is_opclause(clause)) { ! indexquals = list_concat(indexquals, ! expand_indexqual_opclause(rinfo, ! curFamily, ! curCollation)); ! /* expand_indexqual_opclause can produce multiple clauses */ ! while (list_length(indexqualcols) < list_length(indexquals)) ! 
indexqualcols = lappend_int(indexqualcols, indexcol); } else if (IsA(clause, ScalarArrayOpExpr)) { /* no extra work at this time */ - indexquals = lappend(indexquals, rinfo); - indexqualcols = lappend_int(indexqualcols, indexcol); } else if (IsA(clause, RowCompareExpr)) { ! indexquals = lappend(indexquals, ! expand_indexqual_rowcompare(rinfo, ! index, ! indexcol)); ! indexqualcols = lappend_int(indexqualcols, indexcol); } else if (IsA(clause, NullTest)) { Assert(index->amsearchnulls); - indexquals = lappend(indexquals, rinfo); - indexqualcols = lappend_int(indexqualcols, indexcol); } else elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } ! *indexquals_p = indexquals; ! *indexqualcols_p = indexqualcols; } /* --- 3636,3698 ---- */ if (is_opclause(clause)) { ! /* ! * Check to see if the indexkey is on the right; if so, commute ! * the clause. The indexkey should be the side that refers to ! * (only) the base relation. ! */ ! if (!bms_equal(rinfo->left_relids, index->rel->relids)) ! { ! Oid opno = ((OpExpr *) clause)->opno; ! RestrictInfo *newrinfo; ! ! newrinfo = commute_restrictinfo(rinfo, ! get_commutator(opno)); ! ! /* ! * For now, assume it couldn't be any case that requires ! * expansion. (This is OK for the current capabilities of ! * expand_indexqual_opclause, but we'll need to remove the ! * restriction when we open this up for extensions.) ! */ ! indexquals = list_make1(newrinfo); ! } ! else ! indexquals = expand_indexqual_opclause(rinfo, ! curFamily, ! curCollation, ! &iclause->lossy); } else if (IsA(clause, ScalarArrayOpExpr)) { /* no extra work at this time */ } else if (IsA(clause, RowCompareExpr)) { ! RestrictInfo *newrinfo; ! ! newrinfo = expand_indexqual_rowcompare(rinfo, ! index, ! indexcol, ! &iclause->indexcols, ! &iclause->lossy); ! if (newrinfo != rinfo) ! { ! /* We need to report a derived expression */ ! indexquals = list_make1(newrinfo); ! 
} } else if (IsA(clause, NullTest)) { Assert(index->amsearchnulls); } else elog(ERROR, "unsupported indexqual type: %d", (int) nodeTag(clause)); } ! iclause->indexquals = indexquals; ! return iclause; } /* *************** expand_boolean_index_clause(Node *clause *** 3725,3737 **** * expand_indexqual_opclause --- expand a single indexqual condition * that is an operator clause * ! * The input is a single RestrictInfo, the output a list of RestrictInfos. * ! * In the base case this is just list_make1(), but we have to be prepared to * expand special cases that were accepted by match_special_index_operator(). */ static List * ! expand_indexqual_opclause(RestrictInfo *rinfo, Oid opfamily, Oid idxcollation) { Expr *clause = rinfo->clause; --- 3763,3777 ---- * expand_indexqual_opclause --- expand a single indexqual condition * that is an operator clause * ! * The input is a single RestrictInfo, the output a list of RestrictInfos, ! * or NIL if the RestrictInfo's clause can be used as-is. * ! * In the base case this is just "return NIL", but we have to be prepared to * expand special cases that were accepted by match_special_index_operator(). */ static List * ! expand_indexqual_opclause(RestrictInfo *rinfo, Oid opfamily, Oid idxcollation, ! 
bool *lossy) { Expr *clause = rinfo->clause; *************** expand_indexqual_opclause(RestrictInfo * *** 3760,3765 **** --- 3800,3806 ---- case OID_BYTEA_LIKE_OP: if (!op_in_opfamily(expr_op, opfamily)) { + *lossy = true; pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like, expr_coll, &prefix, NULL); return prefix_quals(leftop, opfamily, idxcollation, prefix, pstatus); *************** expand_indexqual_opclause(RestrictInfo * *** 3771,3776 **** --- 3812,3818 ---- case OID_NAME_ICLIKE_OP: if (!op_in_opfamily(expr_op, opfamily)) { + *lossy = true; /* the right-hand const is type text for all of these */ pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like_IC, expr_coll, &prefix, NULL); *************** expand_indexqual_opclause(RestrictInfo * *** 3783,3788 **** --- 3825,3831 ---- case OID_NAME_REGEXEQ_OP: if (!op_in_opfamily(expr_op, opfamily)) { + *lossy = true; /* the right-hand const is type text for all of these */ pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex, expr_coll, &prefix, NULL); *************** expand_indexqual_opclause(RestrictInfo * *** 3795,3800 **** --- 3838,3844 ---- case OID_NAME_ICREGEXEQ_OP: if (!op_in_opfamily(expr_op, opfamily)) { + *lossy = true; /* the right-hand const is type text for all of these */ pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex_IC, expr_coll, &prefix, NULL); *************** expand_indexqual_opclause(RestrictInfo * *** 3806,3901 **** case OID_INET_SUBEQ_OP: if (!op_in_opfamily(expr_op, opfamily)) { return network_prefix_quals(leftop, expr_op, opfamily, patt->constvalue); } break; } ! /* Default case: just make a list of the unmodified indexqual */ ! return list_make1(rinfo); } /* * expand_indexqual_rowcompare --- expand a single indexqual condition * that is a RowCompareExpr * - * This is a thin wrapper around adjust_rowcompare_for_index; we export the - * latter so that createplan.c can use it to re-discover which columns of the - * index are used by a row comparison indexqual. 
-  */
- static RestrictInfo *
- expand_indexqual_rowcompare(RestrictInfo *rinfo,
-                             IndexOptInfo *index,
-                             int indexcol)
- {
-     RowCompareExpr *clause = (RowCompareExpr *) rinfo->clause;
-     Expr *newclause;
-     List *indexcolnos;
-     bool var_on_left;
-
-     newclause = adjust_rowcompare_for_index(clause,
-                                             index,
-                                             indexcol,
-                                             &indexcolnos,
-                                             &var_on_left);
-
-     /*
-      * If we didn't have to change the RowCompareExpr, return the original
-      * RestrictInfo.
-      */
-     if (newclause == (Expr *) clause)
-         return rinfo;
-
-     /* Else we need a new RestrictInfo */
-     return make_simple_restrictinfo(newclause);
- }
-
- /*
-  * adjust_rowcompare_for_index --- expand a single indexqual condition
-  *      that is a RowCompareExpr
-  *
   * It's already known that the first column of the row comparison matches
   * the specified column of the index.  We can use additional columns of the
   * row comparison as index qualifications, so long as they match the index
   * in the "same direction", ie, the indexkeys are all on the same side of the
   * clause and the operators are all the same-type members of the opfamilies.
   * If all the columns of the RowCompareExpr match in this way, we just use it
!  * as-is.  Otherwise, we build a shortened RowCompareExpr (if more than one
   * column matches) or a simple OpExpr (if the first-column match is all
   * there is).  In these cases the modified clause is always "<=" or ">="
   * even when the original was "<" or ">" --- this is necessary to match all
!  * the rows that could match the original.  (We are essentially building a
!  * lossy version of the row comparison when we do this.)
   *
   * *indexcolnos receives an integer list of the index column numbers (zero
!  * based) used in the resulting expression.  The reason we need to return
!  * that is that if the index is selected for use, createplan.c will need to
!  * call this again to extract that list.  (This is a bit grotty, but row
!  * comparison indexquals aren't used enough to justify finding someplace to
!  * keep the information in the Path representation.)  Since createplan.c
!  * also needs to know which side of the RowCompareExpr is the index side,
!  * we also return *var_on_left_p rather than re-deducing that there.
   */
! Expr *
! adjust_rowcompare_for_index(RowCompareExpr *clause,
                              IndexOptInfo *index,
                              int indexcol,
                              List **indexcolnos,
!                             bool *var_on_left_p)
  {
      bool var_on_left;
      int op_strategy;
      Oid op_lefttype;
      Oid op_righttype;
      int matching_cols;
      Oid expr_op;
      List *opfamilies;
      List *lefttypes;
      List *righttypes;
      List *new_ops;
!     ListCell *largs_cell;
!     ListCell *rargs_cell;
      ListCell *opnos_cell;
      ListCell *collids_cell;
--- 3850,3914 ----
          case OID_INET_SUBEQ_OP:
              if (!op_in_opfamily(expr_op, opfamily))
              {
+                 *lossy = true;
                  return network_prefix_quals(leftop, expr_op, opfamily,
                                              patt->constvalue);
              }
              break;
      }

!     /* Default case: the clause can be used as-is. */
!     *lossy = false;
!     return NIL;
  }

  /*
   * expand_indexqual_rowcompare --- expand a single indexqual condition
   *      that is a RowCompareExpr
   *
   * It's already known that the first column of the row comparison matches
   * the specified column of the index.  We can use additional columns of the
   * row comparison as index qualifications, so long as they match the index
   * in the "same direction", ie, the indexkeys are all on the same side of the
   * clause and the operators are all the same-type members of the opfamilies.
+  *
   * If all the columns of the RowCompareExpr match in this way, we just use it
!  * as-is, except for possibly commuting it to put the indexkeys on the left.
!  *
!  * Otherwise, we build a shortened RowCompareExpr (if more than one
   * column matches) or a simple OpExpr (if the first-column match is all
   * there is).  In these cases the modified clause is always "<=" or ">="
   * even when the original was "<" or ">" --- this is necessary to match all
!  * the rows that could match the original.  (We are building a lossy version
!  * of the row comparison when we do this, so we set *lossy = true.)
   *
   * *indexcolnos receives an integer list of the index column numbers (zero
!  * based) used in the resulting expression.  We have to pass that back
!  * because createplan.c will need it.
   */
! static RestrictInfo *
! expand_indexqual_rowcompare(RestrictInfo *rinfo,
                              IndexOptInfo *index,
                              int indexcol,
                              List **indexcolnos,
!                             bool *lossy)
  {
+     RowCompareExpr *clause = castNode(RowCompareExpr, rinfo->clause);
      bool var_on_left;
      int op_strategy;
      Oid op_lefttype;
      Oid op_righttype;
      int matching_cols;
      Oid expr_op;
+     List *expr_ops;
      List *opfamilies;
      List *lefttypes;
      List *righttypes;
      List *new_ops;
!     List *var_args;
!     List *non_var_args;
!     ListCell *vargs_cell;
!     ListCell *nargs_cell;
      ListCell *opnos_cell;
      ListCell *collids_cell;
*************** adjust_rowcompare_for_index(RowCompareEx
*** 3905,3911 ****
      Assert(var_on_left ||
             match_index_to_operand((Node *) linitial(clause->rargs),
                                    indexcol, index));
!     *var_on_left_p = var_on_left;

      expr_op = linitial_oid(clause->opnos);
      if (!var_on_left)
--- 3918,3934 ----
      Assert(var_on_left ||
             match_index_to_operand((Node *) linitial(clause->rargs),
                                    indexcol, index));
!
!     if (var_on_left)
!     {
!         var_args = clause->largs;
!         non_var_args = clause->rargs;
!     }
!     else
!     {
!         var_args = clause->rargs;
!         non_var_args = clause->largs;
!     }

      expr_op = linitial_oid(clause->opnos);
      if (!var_on_left)
*************** adjust_rowcompare_for_index(RowCompareEx
*** 3918,3924 ****
      /* Initialize returned list of which index columns are used */
      *indexcolnos = list_make1_int(indexcol);

!     /* Build lists of the opfamilies and operator datatypes in case needed */
      opfamilies = list_make1_oid(index->opfamily[indexcol]);
      lefttypes = list_make1_oid(op_lefttype);
      righttypes = list_make1_oid(op_righttype);
--- 3941,3948 ----
      /* Initialize returned list of which index columns are used */
      *indexcolnos = list_make1_int(indexcol);

!     /* Build lists of ops, opfamilies and operator datatypes in case needed */
!     expr_ops = list_make1_oid(expr_op);
      opfamilies = list_make1_oid(index->opfamily[indexcol]);
      lefttypes = list_make1_oid(op_lefttype);
      righttypes = list_make1_oid(op_righttype);
*************** adjust_rowcompare_for_index(RowCompareEx
*** 3930,3956 ****
       * indexed relation.
       */
      matching_cols = 1;
!     largs_cell = lnext(list_head(clause->largs));
!     rargs_cell = lnext(list_head(clause->rargs));
      opnos_cell = lnext(list_head(clause->opnos));
      collids_cell = lnext(list_head(clause->inputcollids));

!     while (largs_cell != NULL)
      {
!         Node *varop;
!         Node *constop;
          int i;

          expr_op = lfirst_oid(opnos_cell);
!         if (var_on_left)
!         {
!             varop = (Node *) lfirst(largs_cell);
!             constop = (Node *) lfirst(rargs_cell);
!         }
!         else
          {
-             varop = (Node *) lfirst(rargs_cell);
-             constop = (Node *) lfirst(largs_cell);
              /* indexkey is on right, so commute the operator */
              expr_op = get_commutator(expr_op);
              if (expr_op == InvalidOid)
--- 3954,3973 ----
       * indexed relation.
       */
      matching_cols = 1;
!     vargs_cell = lnext(list_head(var_args));
!     nargs_cell = lnext(list_head(non_var_args));
      opnos_cell = lnext(list_head(clause->opnos));
      collids_cell = lnext(list_head(clause->inputcollids));

!     while (vargs_cell != NULL)
      {
!         Node *varop = (Node *) lfirst(vargs_cell);
!         Node *constop = (Node *) lfirst(nargs_cell);
          int i;

          expr_op = lfirst_oid(opnos_cell);
!         if (!var_on_left)
          {
              /* indexkey is on right, so commute the operator */
              expr_op = get_commutator(expr_op);
              if (expr_op == InvalidOid)
*************** adjust_rowcompare_for_index(RowCompareEx
*** 3980,4016 ****
          /* Add column number to returned list */
          *indexcolnos = lappend_int(*indexcolnos, i);

!         /* Add opfamily and datatypes to lists */
          get_op_opfamily_properties(expr_op, index->opfamily[i], false,
                                     &op_strategy,
                                     &op_lefttype,
                                     &op_righttype);
          opfamilies = lappend_oid(opfamilies, index->opfamily[i]);
          lefttypes = lappend_oid(lefttypes, op_lefttype);
          righttypes = lappend_oid(righttypes, op_righttype);

          /* This column matches, keep scanning */
          matching_cols++;
!         largs_cell = lnext(largs_cell);
!         rargs_cell = lnext(rargs_cell);
          opnos_cell = lnext(opnos_cell);
          collids_cell = lnext(collids_cell);
      }

!     /* Return clause as-is if it's all usable as index quals */
!     if (matching_cols == list_length(clause->opnos))
!         return (Expr *) clause;

      /*
!      * We have to generate a subset rowcompare (possibly just one OpExpr). The
!      * painful part of this is changing < to <= or > to >=, so deal with that
!      * first.
       */
!     if (op_strategy == BTLessEqualStrategyNumber ||
!         op_strategy == BTGreaterEqualStrategyNumber)
      {
!         /* easy, just use the same operators */
!         new_ops = list_truncate(list_copy(clause->opnos), matching_cols);
      }
      else
      {
--- 3997,4045 ----
          /* Add column number to returned list */
          *indexcolnos = lappend_int(*indexcolnos, i);

!         /* Add operator info to lists */
          get_op_opfamily_properties(expr_op, index->opfamily[i], false,
                                     &op_strategy,
                                     &op_lefttype,
                                     &op_righttype);
+         expr_ops = lappend_oid(expr_ops, expr_op);
          opfamilies = lappend_oid(opfamilies, index->opfamily[i]);
          lefttypes = lappend_oid(lefttypes, op_lefttype);
          righttypes = lappend_oid(righttypes, op_righttype);

          /* This column matches, keep scanning */
          matching_cols++;
!         vargs_cell = lnext(vargs_cell);
!         nargs_cell = lnext(nargs_cell);
          opnos_cell = lnext(opnos_cell);
          collids_cell = lnext(collids_cell);
      }

!     /* Result is non-lossy if all columns are usable as index quals */
!     *lossy = (matching_cols != list_length(clause->opnos));

      /*
!      * Return clause as-is if we have var on left and it's all usable as index
!      * quals
       */
!     if (var_on_left && !*lossy)
!         return rinfo;
!
!     /*
!      * We have to generate a modified rowcompare (possibly just one OpExpr).
!      * The painful part of this is changing < to <= or > to >=, so deal with
!      * that first.
!      */
!     if (!*lossy)
      {
!         /* very easy, just use the commuted operators */
!         new_ops = expr_ops;
!     }
!     else if (op_strategy == BTLessEqualStrategyNumber ||
!              op_strategy == BTGreaterEqualStrategyNumber)
!     {
!         /* easy, just use the same (possibly commuted) operators */
!         new_ops = list_truncate(expr_ops, matching_cols);
      }
      else
      {
*************** adjust_rowcompare_for_index(RowCompareEx
*** 4025,4033 ****
          else
              elog(ERROR, "unexpected strategy number %d", op_strategy);
          new_ops = NIL;
!         lefttypes_cell = list_head(lefttypes);
!         righttypes_cell = list_head(righttypes);
!         foreach(opfamilies_cell, opfamilies)
          {
              Oid opfam = lfirst_oid(opfamilies_cell);
              Oid lefttype = lfirst_oid(lefttypes_cell);
--- 4054,4062 ----
          else
              elog(ERROR, "unexpected strategy number %d", op_strategy);
          new_ops = NIL;
!         forthree(opfamilies_cell, opfamilies,
!                  lefttypes_cell, lefttypes,
!                  righttypes_cell, righttypes)
          {
              Oid opfam = lfirst_oid(opfamilies_cell);
              Oid lefttype = lfirst_oid(lefttypes_cell);
*************** adjust_rowcompare_for_index(RowCompareEx
*** 4038,4053 ****
              if (!OidIsValid(expr_op))   /* should not happen */
                  elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
                       op_strategy, lefttype, righttype, opfam);
-             if (!var_on_left)
-             {
-                 expr_op = get_commutator(expr_op);
-                 if (!OidIsValid(expr_op))   /* should not happen */
-                     elog(ERROR, "could not find commutator of operator %d(%u,%u) of opfamily %u",
-                          op_strategy, lefttype, righttype, opfam);
-             }
              new_ops = lappend_oid(new_ops, expr_op);
-             lefttypes_cell = lnext(lefttypes_cell);
-             righttypes_cell = lnext(righttypes_cell);
          }
      }

--- 4067,4073 ----
*************** adjust_rowcompare_for_index(RowCompareEx
*** 4056,4084 ****
      {
          RowCompareExpr *rc = makeNode(RowCompareExpr);

!         if (var_on_left)
!             rc->rctype = (RowCompareType) op_strategy;
!         else
!             rc->rctype = (op_strategy == BTLessEqualStrategyNumber) ?
!                 ROWCOMPARE_GE : ROWCOMPARE_LE;
          rc->opnos = new_ops;
          rc->opfamilies = list_truncate(list_copy(clause->opfamilies),
                                         matching_cols);
          rc->inputcollids = list_truncate(list_copy(clause->inputcollids),
                                           matching_cols);
!         rc->largs = list_truncate(copyObject(clause->largs),
!                                   matching_cols);
!         rc->rargs = list_truncate(copyObject(clause->rargs),
!                                   matching_cols);
!         return (Expr *) rc;
      }
      else
      {
!         return make_opclause(linitial_oid(new_ops), BOOLOID, false,
!                              copyObject(linitial(clause->largs)),
!                              copyObject(linitial(clause->rargs)),
!                              InvalidOid,
!                              linitial_oid(clause->inputcollids));
      }
  }

--- 4076,4106 ----
      {
          RowCompareExpr *rc = makeNode(RowCompareExpr);

!         rc->rctype = (RowCompareType) op_strategy;
          rc->opnos = new_ops;
          rc->opfamilies = list_truncate(list_copy(clause->opfamilies),
                                         matching_cols);
          rc->inputcollids = list_truncate(list_copy(clause->inputcollids),
                                           matching_cols);
!         rc->largs = list_truncate(copyObject(var_args),
!                                   matching_cols);
!         rc->rargs = list_truncate(copyObject(non_var_args),
!                                   matching_cols);
!         return make_simple_restrictinfo((Expr *) rc);
      }
      else
      {
!         Expr *op;
!
!         /* We don't report an index column list in this case */
!         *indexcolnos = NIL;
!
!         op = make_opclause(linitial_oid(new_ops), BOOLOID, false,
!                            copyObject(linitial(var_args)),
!                            copyObject(linitial(non_var_args)),
!                            InvalidOid,
!                            linitial_oid(clause->inputcollids));
!         return make_simple_restrictinfo(op);
      }
  }

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 1b4f7db..c7645ac 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** static MergeJoin *create_mergejoin_plan(
*** 152,159 ****
  static HashJoin *create_hashjoin_plan(PlannerInfo *root, HashPath *best_path);
  static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
  static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
! static List *fix_indexqual_references(PlannerInfo *root, IndexPath *index_path);
  static List *fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path);
  static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
  static List *get_switched_clauses(List *clauses, Relids outerrelids);
  static List *order_qual_clauses(PlannerInfo *root, List *clauses);
--- 152,164 ----
  static HashJoin *create_hashjoin_plan(PlannerInfo *root, HashPath *best_path);
  static Node *replace_nestloop_params(PlannerInfo *root, Node *expr);
  static Node *replace_nestloop_params_mutator(Node *node, PlannerInfo *root);
! static void fix_indexqual_references(PlannerInfo *root, IndexPath *index_path,
!                                      List **stripped_indexquals_p,
!                                      List **fixed_indexquals_p);
  static List *fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path);
+ static Node *fix_indexqual_clause(PlannerInfo *root,
+                                   IndexOptInfo *index, int indexcol,
+                                   Node *clause, List *indexcolnos);
  static Node *fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol);
  static List *get_switched_clauses(List *clauses, Relids outerrelids);
  static List *order_qual_clauses(PlannerInfo *root, List *clauses);
*************** create_indexscan_plan(PlannerInfo *root,
*** 2607,2613 ****
                        bool indexonly)
  {
      Scan *scan_plan;
!     List *indexquals = best_path->indexquals;
      List *indexorderbys = best_path->indexorderbys;
      Index baserelid = best_path->path.parent->relid;
      Oid indexoid = best_path->indexinfo->indexoid;
--- 2612,2618 ----
                        bool indexonly)
  {
      Scan *scan_plan;
!     List *indexclauses = best_path->indexclauses;
      List *indexorderbys = best_path->indexorderbys;
      Index baserelid = best_path->path.parent->relid;
      Oid indexoid = best_path->indexinfo->indexoid;
*************** create_indexscan_plan(PlannerInfo *root,
*** 2623,2638 ****
      Assert(best_path->path.parent->rtekind == RTE_RELATION);

      /*
!      * Build "stripped" indexquals structure (no RestrictInfos) to pass to
!      * executor as indexqualorig
!      */
!     stripped_indexquals = get_actual_clauses(indexquals);
!
!     /*
!      * The executor needs a copy with the indexkey on the left of each clause
!      * and with index Vars substituted for table ones.
       */
!     fixed_indexquals = fix_indexqual_references(root, best_path);

      /*
       * Likewise fix up index attr references in the ORDER BY expressions.
--- 2628,2641 ----
      Assert(best_path->path.parent->rtekind == RTE_RELATION);

      /*
!      * Extract the index qual expressions (stripped of RestrictInfos) from the
!      * IndexClauses list, and prepare a copy with index Vars substituted for
!      * table Vars.  (This step also does replace_nestloop_params on the
!      * fixed_indexquals.)
       */
!     fix_indexqual_references(root, best_path,
!                              &stripped_indexquals,
!                              &fixed_indexquals);

      /*
       * Likewise fix up index attr references in the ORDER BY expressions.
*************** create_indexscan_plan(PlannerInfo *root,
*** 2648,2661 ****
       * included in qpqual.  The upshot is that qpqual must contain
       * scan_clauses minus whatever appears in indexquals.
       *
!      * In normal cases simple pointer equality checks will be enough to spot
!      * duplicate RestrictInfos, so we try that first.
!      *
!      * Another common case is that a scan_clauses entry is generated from the
!      * same EquivalenceClass as some indexqual, and is therefore redundant
!      * with it, though not equal.  (This happens when indxpath.c prefers a
       * different derived equality than what generate_join_implied_equalities
!      * picked for a parameterized scan's ppi_clauses.)
       *
       * In some situations (particularly with OR'd index conditions) we may
       * have scan_clauses that are not equal to, but are logically implied by,
--- 2651,2664 ----
       * included in qpqual.  The upshot is that qpqual must contain
       * scan_clauses minus whatever appears in indexquals.
       *
!      * is_redundant_with_indexclauses() detects cases where a scan clause is
!      * present in the indexclauses list or is generated from the same
!      * EquivalenceClass as some indexclause, and is therefore redundant with
!      * it, though not equal.  (The latter happens when indxpath.c prefers a
       * different derived equality than what generate_join_implied_equalities
!      * picked for a parameterized scan's ppi_clauses.)  Note that it will not
!      * match to lossy index clauses, which is critical because we have to
!      * include the original clause in qpqual in that case.
       *
       * In some situations (particularly with OR'd index conditions) we may
       * have scan_clauses that are not equal to, but are logically implied by,
*************** create_indexscan_plan(PlannerInfo *root,
*** 2674,2685 ****

          if (rinfo->pseudoconstant)
              continue;           /* we may drop pseudoconstants here */
!         if (list_member_ptr(indexquals, rinfo))
!             continue;           /* simple duplicate */
!         if (is_redundant_derived_clause(rinfo, indexquals))
!             continue;           /* derived from same EquivalenceClass */
          if (!contain_mutable_functions((Node *) rinfo->clause) &&
!             predicate_implied_by(list_make1(rinfo->clause), indexquals, false))
              continue;           /* provably implied by indexquals */
          qpqual = lappend(qpqual, rinfo);
      }
--- 2677,2687 ----

          if (rinfo->pseudoconstant)
              continue;           /* we may drop pseudoconstants here */
!         if (is_redundant_with_indexclauses(rinfo, indexclauses))
!             continue;           /* dup or derived from same EquivalenceClass */
          if (!contain_mutable_functions((Node *) rinfo->clause) &&
!             predicate_implied_by(list_make1(rinfo->clause), stripped_indexquals,
!                                  false))
              continue;           /* provably implied by indexquals */
          qpqual = lappend(qpqual, rinfo);
      }
*************** create_bitmap_subplan(PlannerInfo *root,
*** 3040,3045 ****
--- 3042,3049 ----
      {
          IndexPath *ipath = (IndexPath *) bitmapqual;
          IndexScan *iscan;
+         List *subquals;
+         List *subindexquals;
          List *subindexECs;
          ListCell *l;

*************** create_bitmap_subplan(PlannerInfo *root,
*** 3060,3067 ****
          plan->plan_width = 0;   /* meaningless */
          plan->parallel_aware = false;
          plan->parallel_safe = ipath->path.parallel_safe;
!         *qual = get_actual_clauses(ipath->indexclauses);
!         *indexqual = get_actual_clauses(ipath->indexquals);
          foreach(l, ipath->indexinfo->indpred)
          {
              Expr *pred = (Expr *) lfirst(l);
--- 3064,3089 ----
          plan->plan_width = 0;   /* meaningless */
          plan->parallel_aware = false;
          plan->parallel_safe = ipath->path.parallel_safe;
!         /* Extract original index clauses, actual index quals, relevant ECs */
!         subquals = NIL;
!         subindexquals = NIL;
!         subindexECs = NIL;
!         foreach(l, ipath->indexclauses)
!         {
!             IndexClause *iclause = (IndexClause *) lfirst(l);
!             RestrictInfo *rinfo = iclause->rinfo;
!
!             Assert(!rinfo->pseudoconstant);
!             subquals = lappend(subquals, rinfo->clause);
!             if (iclause->indexquals)
!                 subindexquals = list_concat(subindexquals,
!                                             get_actual_clauses(iclause->indexquals));
!             else
!                 subindexquals = lappend(subindexquals, rinfo->clause);
!             if (rinfo->parent_ec)
!                 subindexECs = lappend(subindexECs, rinfo->parent_ec);
!         }
!         /* We can add any index predicate conditions, too */
          foreach(l, ipath->indexinfo->indpred)
          {
              Expr *pred = (Expr *) lfirst(l);
*************** create_bitmap_subplan(PlannerInfo *root,
*** 3072,3092 ****
               * the conditions that got pushed into the bitmapqual.  Avoid
               * generating redundant conditions.
               */
!             if (!predicate_implied_by(list_make1(pred), ipath->indexclauses,
!                                       false))
              {
!                 *qual = lappend(*qual, pred);
!                 *indexqual = lappend(*indexqual, pred);
              }
          }
!         subindexECs = NIL;
!         foreach(l, ipath->indexquals)
!         {
!             RestrictInfo *rinfo = (RestrictInfo *) lfirst(l);
!
!             if (rinfo->parent_ec)
!                 subindexECs = lappend(subindexECs, rinfo->parent_ec);
!         }
          *indexECs = subindexECs;
      }
      else
--- 3094,3107 ----
               * the conditions that got pushed into the bitmapqual.  Avoid
               * generating redundant conditions.
               */
!             if (!predicate_implied_by(list_make1(pred), subquals, false))
              {
!                 subquals = lappend(subquals, pred);
!                 subindexquals = lappend(subindexquals, pred);
              }
          }
!         *qual = subquals;
!         *indexqual = subindexquals;
          *indexECs = subindexECs;
      }
      else
*************** replace_nestloop_params_mutator(Node *no
*** 4446,4583 ****
   *    Adjust indexqual clauses to the form the executor's indexqual
   *    machinery needs.
   *
!  * We have four tasks here:
!  *    * Remove RestrictInfo nodes from the input clauses.
   *    * Replace any outer-relation Var or PHV nodes with nestloop Params.
   *      (XXX eventually, that responsibility should go elsewhere?)
   *    * Index keys must be represented by Var nodes with varattno set to the
   *      index's attribute number, not the attribute number in the original rel.
-  *    * If the index key is on the right, commute the clause to put it on the
-  *      left.
   *
!  * The result is a modified copy of the path's indexquals list --- the
!  * original is not changed.  Note also that the copy shares no substructure
!  * with the original; this is needed in case there is a subplan in it (we need
!  * two separate copies of the subplan tree, or things will go awry).
   */
! static List *
! fix_indexqual_references(PlannerInfo *root, IndexPath *index_path)
  {
      IndexOptInfo *index = index_path->indexinfo;
      List *fixed_indexquals;
!     ListCell *lcc,
!              *lci;

!     fixed_indexquals = NIL;

!     forboth(lcc, index_path->indexquals, lci, index_path->indexqualcols)
      {
!         RestrictInfo *rinfo = lfirst_node(RestrictInfo, lcc);
!         int indexcol = lfirst_int(lci);
!         Node *clause;
!
!         /*
!          * Replace any outer-relation variables with nestloop params.
!          *
!          * This also makes a copy of the clause, so it's safe to modify it
!          * in-place below.
!          */
!         clause = replace_nestloop_params(root, (Node *) rinfo->clause);

!         if (IsA(clause, OpExpr))
          {
!             OpExpr *op = (OpExpr *) clause;
!
!             if (list_length(op->args) != 2)
!                 elog(ERROR, "indexqual clause is not binary opclause");
!
!             /*
!              * Check to see if the indexkey is on the right; if so, commute
!              * the clause.  The indexkey should be the side that refers to
!              * (only) the base relation.
!              */
!             if (!bms_equal(rinfo->left_relids, index->rel->relids))
!                 CommuteOpExpr(op);

!             /*
!              * Now replace the indexkey expression with an index Var.
!              */
!             linitial(op->args) = fix_indexqual_operand(linitial(op->args),
!                                                        index,
!                                                        indexcol);
          }
!         else if (IsA(clause, RowCompareExpr))
          {
!             RowCompareExpr *rc = (RowCompareExpr *) clause;
!             Expr *newrc;
!             List *indexcolnos;
!             bool var_on_left;
!             ListCell *lca,
!                      *lcai;
!
!             /*
!              * Re-discover which index columns are used in the rowcompare.
!              */
!             newrc = adjust_rowcompare_for_index(rc,
!                                                 index,
!                                                 indexcol,
!                                                 &indexcolnos,
!                                                 &var_on_left);
!
!             /*
!              * Trouble if adjust_rowcompare_for_index thought the
!              * RowCompareExpr didn't match the index as-is; the clause should
!              * have gone through that routine already.
!              */
!             if (newrc != (Expr *) rc)
!                 elog(ERROR, "inconsistent results from adjust_rowcompare_for_index");
!
!             /*
!              * Check to see if the indexkey is on the right; if so, commute
!              * the clause.
!              */
!             if (!var_on_left)
!                 CommuteRowCompareExpr(rc);

!             /*
!              * Now replace the indexkey expressions with index Vars.
!              */
!             Assert(list_length(rc->largs) == list_length(indexcolnos));
!             forboth(lca, rc->largs, lcai, indexcolnos)
              {
!                 lfirst(lca) = fix_indexqual_operand(lfirst(lca),
!                                                     index,
!                                                     lfirst_int(lcai));
!             }
!         }
!         else if (IsA(clause, ScalarArrayOpExpr))
!         {
!             ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause;
!
!             /* Never need to commute... */
!
!             /* Replace the indexkey expression with an index Var. */
!             linitial(saop->args) = fix_indexqual_operand(linitial(saop->args),
!                                                          index,
!                                                          indexcol);
!         }
!         else if (IsA(clause, NullTest))
!         {
!             NullTest *nt = (NullTest *) clause;

!             /* Replace the indexkey expression with an index Var. */
!             nt->arg = (Expr *) fix_indexqual_operand((Node *) nt->arg,
!                                                      index,
!                                                      indexcol);
          }
-         else
-             elog(ERROR, "unsupported indexqual type: %d",
-                  (int) nodeTag(clause));
-
-         fixed_indexquals = lappend(fixed_indexquals, clause);
      }

!     return fixed_indexquals;
  }

  /*
--- 4461,4527 ----
   *    Adjust indexqual clauses to the form the executor's indexqual
   *    machinery needs.
   *
!  * We have three tasks here:
!  *    * Select the actual qual clauses out of the input IndexClause list,
!  *      and remove RestrictInfo nodes from the qual clauses.
   *    * Replace any outer-relation Var or PHV nodes with nestloop Params.
   *      (XXX eventually, that responsibility should go elsewhere?)
   *    * Index keys must be represented by Var nodes with varattno set to the
   *      index's attribute number, not the attribute number in the original rel.
   *
!  * *stripped_indexquals_p receives a list of the actual qual clauses.
!  *
!  * *fixed_indexquals_p receives a list of the adjusted quals.  This is a copy
!  * that shares no substructure with the original; this is needed in case there
!  * are subplans in it (we need two separate copies of the subplan tree, or
!  * things will go awry).
   */
! static void
! fix_indexqual_references(PlannerInfo *root, IndexPath *index_path,
!                          List **stripped_indexquals_p, List **fixed_indexquals_p)
  {
      IndexOptInfo *index = index_path->indexinfo;
+     List *stripped_indexquals;
      List *fixed_indexquals;
!     ListCell *lc;

!     stripped_indexquals = fixed_indexquals = NIL;

!     foreach(lc, index_path->indexclauses)
      {
!         IndexClause *iclause = lfirst_node(IndexClause, lc);
!         int indexcol = iclause->indexcol;

!         if (iclause->indexquals == NIL)
          {
!             /* rinfo->clause is directly usable as an indexqual */
!             Node *clause = (Node *) iclause->rinfo->clause;

!             stripped_indexquals = lappend(stripped_indexquals, clause);
!             clause = fix_indexqual_clause(root, index, indexcol,
!                                           clause, iclause->indexcols);
!             fixed_indexquals = lappend(fixed_indexquals, clause);
          }
!         else
          {
!             /* Process the derived indexquals */
!             ListCell *lc2;

!             foreach(lc2, iclause->indexquals)
              {
!                 RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc2);
!                 Node *clause = (Node *) rinfo->clause;

!                 stripped_indexquals = lappend(stripped_indexquals, clause);
!                 clause = fix_indexqual_clause(root, index, indexcol,
!                                               clause, iclause->indexcols);
!                 fixed_indexquals = lappend(fixed_indexquals, clause);
!             }
          }
      }

!     *stripped_indexquals_p = stripped_indexquals;
!     *fixed_indexquals_p = fixed_indexquals;
  }

  /*
*************** fix_indexqual_references(PlannerInfo *ro
*** 4585,4595 ****
   *    Adjust indexorderby clauses to the form the executor's index
   *    machinery needs.
   *
!  * This is a simplified version of fix_indexqual_references.  The input does
!  * not have RestrictInfo nodes, and we assume that indxpath.c already
!  * commuted the clauses to put the index keys on the left.  Also, we don't
!  * bother to support any cases except simple OpExprs, since nothing else
!  * is allowed for ordering operators.
   */
  static List *
  fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path)
--- 4529,4536 ----
   *    Adjust indexorderby clauses to the form the executor's index
   *    machinery needs.
   *
!  * This is a simplified version of fix_indexqual_references.  The input is
!  * bare clauses and a separate indexcol list, instead of IndexClauses.
   */
  static List *
  fix_indexorderby_references(PlannerInfo *root, IndexPath *index_path)
*************** fix_indexorderby_references(PlannerInfo
*** 4606,4641 ****
          Node *clause = (Node *) lfirst(lcc);
          int indexcol = lfirst_int(lci);

!         /*
!          * Replace any outer-relation variables with nestloop params.
!          *
!          * This also makes a copy of the clause, so it's safe to modify it
!          * in-place below.
!          */
!         clause = replace_nestloop_params(root, clause);

!         if (IsA(clause, OpExpr))
!         {
!             OpExpr *op = (OpExpr *) clause;

!             if (list_length(op->args) != 2)
!                 elog(ERROR, "indexorderby clause is not binary opclause");

!             /*
!              * Now replace the indexkey expression with an index Var.
!              */
!             linitial(op->args) = fix_indexqual_operand(linitial(op->args),
!                                                        index,
!                                                        indexcol);
          }
!         else
!             elog(ERROR, "unsupported indexorderby type: %d",
!                  (int) nodeTag(clause));

!         fixed_indexorderbys = lappend(fixed_indexorderbys, clause);
      }

!     return fixed_indexorderbys;
  }

  /*
--- 4547,4625 ----
          Node *clause = (Node *) lfirst(lcc);
          int indexcol = lfirst_int(lci);

!         clause = fix_indexqual_clause(root, index, indexcol, clause, NIL);
!         fixed_indexorderbys = lappend(fixed_indexorderbys, clause);
!     }

!     return fixed_indexorderbys;
! }

! /*
!  * fix_indexqual_clause
!  *    Convert a single indexqual clause to the form needed by the executor.
!  *
!  * We replace nestloop params here, and replace the index key variables
!  * or expressions by index Var nodes.
!  */
! static Node *
! fix_indexqual_clause(PlannerInfo *root, IndexOptInfo *index, int indexcol,
!                      Node *clause, List *indexcolnos)
! {
!     /*
!      * Replace any outer-relation variables with nestloop params.
!      *
!      * This also makes a copy of the clause, so it's safe to modify it
!      * in-place below.
!      */
!     clause = replace_nestloop_params(root, clause);

!     if (IsA(clause, OpExpr))
!     {
!         OpExpr *op = (OpExpr *) clause;
!
!         /* Replace the indexkey expression with an index Var. */
!         linitial(op->args) = fix_indexqual_operand(linitial(op->args),
!                                                    index,
!                                                    indexcol);
!     }
!     else if (IsA(clause, RowCompareExpr))
!     {
!         RowCompareExpr *rc = (RowCompareExpr *) clause;
!         ListCell *lca,
!                  *lcai;
!
!         /* Replace the indexkey expressions with index Vars. */
!         Assert(list_length(rc->largs) == list_length(indexcolnos));
!         forboth(lca, rc->largs, lcai, indexcolnos)
!         {
!             lfirst(lca) = fix_indexqual_operand(lfirst(lca),
!                                                 index,
!                                                 lfirst_int(lcai));
          }
!     }
!     else if (IsA(clause, ScalarArrayOpExpr))
!     {
!         ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause;

!         /* Replace the indexkey expression with an index Var. */
!         linitial(saop->args) = fix_indexqual_operand(linitial(saop->args),
!                                                      index,
!                                                      indexcol);
      }
+     else if (IsA(clause, NullTest))
+     {
+         NullTest *nt = (NullTest *) clause;

!         /* Replace the indexkey expression with an index Var. */
!         nt->arg = (Expr *) fix_indexqual_operand((Node *) nt->arg,
!                                                  index,
!                                                  indexcol);
!     }
!     else
!         elog(ERROR, "unsupported indexqual type: %d",
!              (int) nodeTag(clause));
!
!     return clause;
  }

  /*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b223972..ddb86bd 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*************** plan_cluster_use_sort(Oid tableOid, Oid
*** 6136,6142 ****

      /* Estimate the cost of index scan */
      indexScanPath = create_index_path(root, indexInfo,
!                                       NIL, NIL, NIL, NIL, NIL,
                                        ForwardScanDirection, false,
                                        NULL, 1.0, false);

--- 6136,6142 ----

      /* Estimate the cost of index scan */
      indexScanPath = create_index_path(root, indexInfo,
!                                       NIL, NIL, NIL, NIL,
                                        ForwardScanDirection, false,
                                        NULL, 1.0, false);

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 663fa7c..d7ff17c 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** CommuteOpExpr(OpExpr *clause)
*** 2157,2227 ****
  }

  /*
-  * CommuteRowCompareExpr: commute a RowCompareExpr clause
-  *
-  * XXX the clause is destructively modified!
-  */
- void
- CommuteRowCompareExpr(RowCompareExpr *clause)
- {
-     List *newops;
-     List *temp;
-     ListCell *l;
-
-     /* Sanity checks: caller is at fault if these fail */
-     if (!IsA(clause, RowCompareExpr))
-         elog(ERROR, "expected a RowCompareExpr");
-
-     /* Build list of commuted operators */
-     newops = NIL;
-     foreach(l, clause->opnos)
-     {
-         Oid opoid = lfirst_oid(l);
-
-         opoid = get_commutator(opoid);
-         if (!OidIsValid(opoid))
-             elog(ERROR, "could not find commutator for operator %u",
-                  lfirst_oid(l));
-         newops = lappend_oid(newops, opoid);
-     }
-
-     /*
-      * modify the clause in-place!
- */ - switch (clause->rctype) - { - case ROWCOMPARE_LT: - clause->rctype = ROWCOMPARE_GT; - break; - case ROWCOMPARE_LE: - clause->rctype = ROWCOMPARE_GE; - break; - case ROWCOMPARE_GE: - clause->rctype = ROWCOMPARE_LE; - break; - case ROWCOMPARE_GT: - clause->rctype = ROWCOMPARE_LT; - break; - default: - elog(ERROR, "unexpected RowCompare type: %d", - (int) clause->rctype); - break; - } - - clause->opnos = newops; - - /* - * Note: we need not change the opfamilies list; we assume any btree - * opfamily containing an operator will also contain its commutator. - * Collations don't change either. - */ - - temp = clause->largs; - clause->largs = clause->rargs; - clause->rargs = temp; - } - - /* * Helper for eval_const_expressions: check that datatype of an attribute * is still what it was when the expression was parsed. This is needed to * guard against improper simplification after ALTER COLUMN TYPE. (XXX we --- 2157,2162 ---- diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index b57de6b..442b44f 100644 *** a/src/backend/optimizer/util/pathnode.c --- b/src/backend/optimizer/util/pathnode.c *************** create_samplescan_path(PlannerInfo *root *** 1001,1010 **** * Creates a path node for an index scan. * * 'index' is a usable index. ! * 'indexclauses' is a list of RestrictInfo nodes representing clauses ! * to be used as index qual conditions in the scan. ! * 'indexclausecols' is an integer list of index column numbers (zero based) ! * the indexclauses can be used with. * 'indexorderbys' is a list of bare expressions (no RestrictInfos) * to be used as index ordering operators in the scan. * 'indexorderbycols' is an integer list of index column numbers (zero based) --- 1001,1008 ---- * Creates a path node for an index scan. * * 'index' is a usable index. ! * 'indexclauses' is a list of IndexClause nodes representing clauses ! * to be enforced as qual conditions in the scan. 
* 'indexorderbys' is a list of bare expressions (no RestrictInfos) * to be used as index ordering operators in the scan. * 'indexorderbycols' is an integer list of index column numbers (zero based) *************** IndexPath * *** 1025,1031 **** create_index_path(PlannerInfo *root, IndexOptInfo *index, List *indexclauses, - List *indexclausecols, List *indexorderbys, List *indexorderbycols, List *pathkeys, --- 1023,1028 ---- *************** create_index_path(PlannerInfo *root, *** 1037,1044 **** { IndexPath *pathnode = makeNode(IndexPath); RelOptInfo *rel = index->rel; - List *indexquals, - *indexqualcols; pathnode->path.pathtype = indexonly ? T_IndexOnlyScan : T_IndexScan; pathnode->path.parent = rel; --- 1034,1039 ---- *************** create_index_path(PlannerInfo *root, *** 1050,1064 **** pathnode->path.parallel_workers = 0; pathnode->path.pathkeys = pathkeys; - /* Convert clauses to indexquals the executor can handle */ - expand_indexqual_conditions(index, indexclauses, indexclausecols, - &indexquals, &indexqualcols); - - /* Fill in the pathnode */ pathnode->indexinfo = index; pathnode->indexclauses = indexclauses; - pathnode->indexquals = indexquals; - pathnode->indexqualcols = indexqualcols; pathnode->indexorderbys = indexorderbys; pathnode->indexorderbycols = indexorderbycols; pathnode->indexscandir = indexscandir; --- 1045,1052 ---- *************** do { \ *** 3712,3718 **** FLAT_COPY_PATH(ipath, path, IndexPath); ADJUST_CHILD_ATTRS(ipath->indexclauses); - ADJUST_CHILD_ATTRS(ipath->indexquals); new_path = (Path *) ipath; } break; --- 3700,3705 ---- diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c index 1c47c70..03e5f12 100644 *** a/src/backend/optimizer/util/restrictinfo.c --- b/src/backend/optimizer/util/restrictinfo.c *************** make_sub_restrictinfos(Expr *clause, *** 289,294 **** --- 289,358 ---- } /* + * commute_restrictinfo + * + * Given a RestrictInfo containing a binary opclause, produce a 
RestrictInfo
+  * representing the commutation of that clause.  The caller must pass the
+  * OID of the commutator operator (which it's presumably looked up, else
+  * it would not know this is valid).
+  *
+  * Beware that the result shares sub-structure with the given RestrictInfo.
+  * That's okay for the intended usage with derived index quals, but might
+  * be hazardous if the source is subject to change.  Also notice that we
+  * assume without checking that the commutator op is a member of the same
+  * btree and hash opclasses as the original op.
+  */
+ RestrictInfo *
+ commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op)
+ {
+     RestrictInfo *result;
+     OpExpr     *newclause;
+     OpExpr     *clause = castNode(OpExpr, rinfo->clause);
+
+     Assert(list_length(clause->args) == 2);
+
+     /* flat-copy all the fields of clause ... */
+     newclause = makeNode(OpExpr);
+     memcpy(newclause, clause, sizeof(OpExpr));
+
+     /* ... and adjust those we need to change to commute it */
+     newclause->opno = comm_op;
+     newclause->opfuncid = InvalidOid;
+     newclause->args = list_make2(lsecond(clause->args),
+                                  linitial(clause->args));
+
+     /* likewise, flat-copy all the fields of rinfo ... */
+     result = makeNode(RestrictInfo);
+     memcpy(result, rinfo, sizeof(RestrictInfo));
+
+     /*
+      * ... and adjust those we need to change.  Note in particular that we can
+      * preserve any cached selectivity or cost estimates, since those ought to
+      * be the same for the new clause.  Likewise we can keep the source's
+      * parent_ec.
+      */
+     result->clause = (Expr *) newclause;
+     result->left_relids = rinfo->right_relids;
+     result->right_relids = rinfo->left_relids;
+     Assert(result->orclause == NULL);
+     result->left_ec = rinfo->right_ec;
+     result->right_ec = rinfo->left_ec;
+     result->left_em = rinfo->right_em;
+     result->right_em = rinfo->left_em;
+     result->scansel_cache = NIL;    /* not worth updating this */
+     if (rinfo->hashjoinoperator == clause->opno)
+         result->hashjoinoperator = comm_op;
+     else
+         result->hashjoinoperator = InvalidOid;
+     result->left_bucketsize = rinfo->right_bucketsize;
+     result->right_bucketsize = rinfo->left_bucketsize;
+     result->left_mcvfreq = rinfo->right_mcvfreq;
+     result->right_mcvfreq = rinfo->left_mcvfreq;
+
+     return result;
+ }
+
+ /*
   * restriction_is_or_clause
   *
   * Returns t iff the restrictinfo node contains an 'or' clause.
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fb00504..74fafc6 100644
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*************** static Selectivity regex_selectivity(con
*** 226,231 ****
--- 226,233 ----
  static Datum string_to_datum(const char *str, Oid datatype);
  static Const *string_to_const(const char *str, Oid datatype);
  static Const *string_to_bytea_const(const char *str, size_t str_len);
+ static IndexQualInfo *deconstruct_indexqual(RestrictInfo *rinfo,
+                       IndexOptInfo *index, int indexcol);
  static List *add_predicate_to_quals(IndexOptInfo *index, List *indexQuals);
*************** string_to_bytea_const(const char *str, s
*** 6574,6594 ****
   *-------------------------------------------------------------------------
   */
  List *
  deconstruct_indexquals(IndexPath *path)
  {
      List       *result = NIL;
      IndexOptInfo *index = path->indexinfo;
!     ListCell   *lcc,
!                *lci;
!
forboth(lcc, path->indexquals, lci, path->indexqualcols) { - RestrictInfo *rinfo = lfirst_node(RestrictInfo, lcc); - int indexcol = lfirst_int(lci); Expr *clause; - Node *leftop, - *rightop; IndexQualInfo *qinfo; clause = rinfo->clause; --- 6576,6647 ---- *------------------------------------------------------------------------- */ + /* Extract the actual indexquals (as RestrictInfos) from an IndexClause list */ + static List * + get_index_quals(List *indexclauses) + { + List *result = NIL; + ListCell *lc; + + foreach(lc, indexclauses) + { + IndexClause *iclause = lfirst_node(IndexClause, lc); + + if (iclause->indexquals == NIL) + { + /* rinfo->clause is directly usable as an indexqual */ + result = lappend(result, iclause->rinfo); + } + else + { + /* report the derived indexquals */ + result = list_concat(result, list_copy(iclause->indexquals)); + } + } + return result; + } + List * deconstruct_indexquals(IndexPath *path) { List *result = NIL; IndexOptInfo *index = path->indexinfo; ! ListCell *lc; ! foreach(lc, path->indexclauses) ! { ! IndexClause *iclause = lfirst_node(IndexClause, lc); ! int indexcol = iclause->indexcol; ! IndexQualInfo *qinfo; ! ! if (iclause->indexquals == NIL) ! { ! /* rinfo->clause is directly usable as an indexqual */ ! qinfo = deconstruct_indexqual(iclause->rinfo, index, indexcol); ! result = lappend(result, qinfo); ! } ! else ! { ! /* Process the derived indexquals */ ! ListCell *lc2; ! ! foreach(lc2, iclause->indexquals) ! { ! RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc2); ! ! qinfo = deconstruct_indexqual(rinfo, index, indexcol); ! result = lappend(result, qinfo); ! } ! } ! } ! return result; ! } ! ! static IndexQualInfo * ! deconstruct_indexqual(RestrictInfo *rinfo, IndexOptInfo *index, int indexcol) ! { { Expr *clause; IndexQualInfo *qinfo; clause = rinfo->clause; *************** deconstruct_indexquals(IndexPath *path) *** 6600,6656 **** if (IsA(clause, OpExpr)) { qinfo->clause_op = ((OpExpr *) clause)->opno; ! 
leftop = get_leftop(clause); ! rightop = get_rightop(clause); ! if (match_index_to_operand(leftop, indexcol, index)) ! { ! qinfo->varonleft = true; ! qinfo->other_operand = rightop; ! } ! else ! { ! Assert(match_index_to_operand(rightop, indexcol, index)); ! qinfo->varonleft = false; ! qinfo->other_operand = leftop; ! } } else if (IsA(clause, RowCompareExpr)) { RowCompareExpr *rc = (RowCompareExpr *) clause; qinfo->clause_op = linitial_oid(rc->opnos); ! /* Examine only first columns to determine left/right sides */ ! if (match_index_to_operand((Node *) linitial(rc->largs), ! indexcol, index)) ! { ! qinfo->varonleft = true; ! qinfo->other_operand = (Node *) rc->rargs; ! } ! else ! { ! Assert(match_index_to_operand((Node *) linitial(rc->rargs), ! indexcol, index)); ! qinfo->varonleft = false; ! qinfo->other_operand = (Node *) rc->largs; ! } } else if (IsA(clause, ScalarArrayOpExpr)) { ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause; qinfo->clause_op = saop->opno; - /* index column is always on the left in this case */ - Assert(match_index_to_operand((Node *) linitial(saop->args), - indexcol, index)); - qinfo->varonleft = true; qinfo->other_operand = (Node *) lsecond(saop->args); } else if (IsA(clause, NullTest)) { qinfo->clause_op = InvalidOid; - Assert(match_index_to_operand((Node *) ((NullTest *) clause)->arg, - indexcol, index)); - qinfo->varonleft = true; qinfo->other_operand = NULL; } else --- 6653,6677 ---- if (IsA(clause, OpExpr)) { qinfo->clause_op = ((OpExpr *) clause)->opno; ! qinfo->other_operand = get_rightop(clause); } else if (IsA(clause, RowCompareExpr)) { RowCompareExpr *rc = (RowCompareExpr *) clause; qinfo->clause_op = linitial_oid(rc->opnos); ! 
qinfo->other_operand = (Node *) rc->rargs; } else if (IsA(clause, ScalarArrayOpExpr)) { ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause; qinfo->clause_op = saop->opno; qinfo->other_operand = (Node *) lsecond(saop->args); } else if (IsA(clause, NullTest)) { qinfo->clause_op = InvalidOid; qinfo->other_operand = NULL; } else *************** deconstruct_indexquals(IndexPath *path) *** 6659,6667 **** (int) nodeTag(clause)); } ! result = lappend(result, qinfo); } - return result; } /* --- 6680,6687 ---- (int) nodeTag(clause)); } ! return qinfo; } } /* *************** genericcostestimate(PlannerInfo *root, *** 6731,6737 **** GenericCosts *costs) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = path->indexquals; List *indexOrderBys = path->indexorderbys; Cost indexStartupCost; Cost indexTotalCost; --- 6751,6757 ---- GenericCosts *costs) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = get_index_quals(path->indexclauses); List *indexOrderBys = path->indexorderbys; Cost indexStartupCost; Cost indexTotalCost; *************** btcostestimate(PlannerInfo *root, IndexP *** 7052,7065 **** } } - /* - * We would need to commute the clause_op if not varonleft, except - * that we only care if it's equality or not, so that refinement is - * unnecessary. 
- */ - clause_op = qinfo->clause_op; - /* check for equality operator */ if (OidIsValid(clause_op)) { op_strategy = get_op_opfamily_strategy(clause_op, --- 7072,7079 ---- } } /* check for equality operator */ + clause_op = qinfo->clause_op; if (OidIsValid(clause_op)) { op_strategy = get_op_opfamily_strategy(clause_op, *************** gincost_opexpr(PlannerInfo *root, *** 7560,7571 **** Oid clause_op = qinfo->clause_op; Node *operand = qinfo->other_operand; - if (!qinfo->varonleft) - { - /* must commute the operator */ - clause_op = get_commutator(clause_op); - } - /* aggressively reduce to a constant, and look through relabeling */ operand = estimate_expression_value(root, operand); --- 7574,7579 ---- *************** gincostestimate(PlannerInfo *root, Index *** 7728,7734 **** double *indexPages) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = path->indexquals; List *indexOrderBys = path->indexorderbys; List *qinfos; ListCell *l; --- 7736,7742 ---- double *indexPages) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = get_index_quals(path->indexclauses); List *indexOrderBys = path->indexorderbys; List *qinfos; ListCell *l; *************** gincostestimate(PlannerInfo *root, Index *** 7831,7856 **** numEntries = 1; /* ! * Include predicate in selectivityQuals (should match ! * genericcostestimate) */ ! if (index->indpred != NIL) ! { ! List *predExtraQuals = NIL; ! ! foreach(l, index->indpred) ! { ! Node *predQual = (Node *) lfirst(l); ! List *oneQual = list_make1(predQual); ! ! if (!predicate_implied_by(oneQual, indexQuals, false)) ! predExtraQuals = list_concat(predExtraQuals, oneQual); ! } ! /* list_concat avoids modifying the passed-in indexQuals list */ ! selectivityQuals = list_concat(predExtraQuals, indexQuals); ! } ! else ! selectivityQuals = indexQuals; /* Estimate the fraction of main-table tuples that will be visited */ *indexSelectivity = clauselist_selectivity(root, selectivityQuals, --- 7839,7849 ---- numEntries = 1; /* ! 
* If the index is partial, AND the index predicate with the index-bound ! * quals to produce a more accurate idea of the number of rows covered by ! * the bound conditions. */ ! selectivityQuals = add_predicate_to_quals(index, indexQuals); /* Estimate the fraction of main-table tuples that will be visited */ *indexSelectivity = clauselist_selectivity(root, selectivityQuals, *************** brincostestimate(PlannerInfo *root, Inde *** 8053,8059 **** double *indexPages) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = path->indexquals; double numPages = index->pages; RelOptInfo *baserel = index->rel; RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root); --- 8046,8052 ---- double *indexPages) { IndexOptInfo *index = path->indexinfo; ! List *indexQuals = get_index_quals(path->indexclauses); double numPages = index->pages; RelOptInfo *baserel = index->rel; RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root); diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index e215ad4..3c003b0 100644 *** a/src/include/nodes/nodes.h --- b/src/include/nodes/nodes.h *************** typedef enum NodeTag *** 262,267 **** --- 262,268 ---- T_PathKey, T_PathTarget, T_RestrictInfo, + T_IndexClause, T_PlaceHolderVar, T_SpecialJoinInfo, T_AppendRelInfo, diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h index d3c477a..0b780b0 100644 *** a/src/include/nodes/pathnodes.h --- b/src/include/nodes/pathnodes.h *************** typedef struct Path *** 1123,1152 **** * * 'indexinfo' is the index to be scanned. * ! * 'indexclauses' is a list of index qualification clauses, with implicit ! * AND semantics across the list. Each clause is a RestrictInfo node from ! * the query's WHERE or JOIN conditions. An empty list implies a full ! * index scan. ! * ! * 'indexquals' has the same structure as 'indexclauses', but it contains ! * the actual index qual conditions that can be used with the index. ! 
* In simple cases this is identical to 'indexclauses', but when special ! * indexable operators appear in 'indexclauses', they are replaced by the ! * derived indexscannable conditions in 'indexquals'. ! * ! * 'indexqualcols' is an integer list of index column numbers (zero-based) ! * of the same length as 'indexquals', showing which index column each qual ! * is meant to be used with. 'indexquals' is required to be ordered by ! * index column, so 'indexqualcols' must form a nondecreasing sequence. ! * (The order of multiple quals for the same index column is unspecified.) * * 'indexorderbys', if not NIL, is a list of ORDER BY expressions that have * been found to be usable as ordering operators for an amcanorderbyop index. * The list must match the path's pathkeys, ie, one expression per pathkey * in the same order. These are not RestrictInfos, just bare expressions, ! * since they generally won't yield booleans. Also, unlike the case for ! * quals, it's guaranteed that each expression has the index key on the left ! * side of the operator. * * 'indexorderbycols' is an integer list of index column numbers (zero-based) * of the same length as 'indexorderbys', showing which index column each --- 1123,1138 ---- * * 'indexinfo' is the index to be scanned. * ! * 'indexclauses' is a list of IndexClause nodes, each representing one ! * index-checkable restriction, with implicit AND semantics across the list. ! * An empty list implies a full index scan. * * 'indexorderbys', if not NIL, is a list of ORDER BY expressions that have * been found to be usable as ordering operators for an amcanorderbyop index. * The list must match the path's pathkeys, ie, one expression per pathkey * in the same order. These are not RestrictInfos, just bare expressions, ! * since they generally won't yield booleans. It's guaranteed that each ! * expression has the index key on the left side of the operator. 
* * 'indexorderbycols' is an integer list of index column numbers (zero-based) * of the same length as 'indexorderbys', showing which index column each *************** typedef struct IndexPath *** 1172,1179 **** Path path; IndexOptInfo *indexinfo; List *indexclauses; - List *indexquals; - List *indexqualcols; List *indexorderbys; List *indexorderbycols; ScanDirection indexscandir; --- 1158,1163 ---- *************** typedef struct IndexPath *** 1182,1187 **** --- 1166,1221 ---- } IndexPath; /* + * Each IndexClause references a RestrictInfo node from the query's WHERE + * or JOIN conditions, and shows how that restriction can be applied to + * the particular index. We support both indexclauses that are directly + * usable by the index machinery, which are typically of the form + * "indexcol OP pseudoconstant", and those from which an indexable qual + * can be derived. The simplest such transformation is that a clause + * of the form "pseudoconstant OP indexcol" can be commuted to produce an + * indexable qual (the index machinery expects the indexcol to be on the + * left always). Another example is that we might be able to extract an + * indexable range condition from a LIKE condition, as in "x LIKE 'foo%bar'" + * giving rise to "x >= 'foo' AND x < 'fop'". Derivation of such lossy + * conditions is done by a planner support function attached to the + * indexclause's top-level function or operator. + * + * If indexquals is NIL, it means that rinfo->clause is directly usable as + * an indexqual. Otherwise indexquals contains one or more directly-usable + * indexqual conditions extracted from the given clause. The 'lossy' flag + * indicates whether the indexquals are semantically equivalent to the + * original clause, or form a weaker condition. 
+  *
+  * Currently, entries in indexquals are RestrictInfos, but they could perhaps
+  * be bare clauses instead; the only advantage of making them RestrictInfos
+  * is the possibility of caching cost and selectivity information across
+  * multiple uses, and it's not clear that that's really worth the price of
+  * constructing RestrictInfos for them.  Note however that the extended-stats
+  * machinery won't do anything with non-RestrictInfo clauses, so that would
+  * have to be fixed.
+  *
+  * Normally, indexcol is the index of the single index column the clause
+  * works on, and indexcols is NIL.  But if the clause is a RowCompareExpr,
+  * indexcol is the index of the leading column, and indexcols is a list of
+  * all the affected columns.  (Note that indexcols matches up with the
+  * columns of the actual indexable RowCompareExpr, which might be in
+  * indexquals rather than rinfo.)
+  *
+  * An IndexPath's IndexClause list is required to be ordered by index
+  * column, i.e. the indexcol values must form a nondecreasing sequence.
+  * (The order of multiple clauses for the same index column is unspecified.)
+  */
+ typedef struct IndexClause
+ {
+     NodeTag     type;
+     struct RestrictInfo *rinfo; /* original restriction or join clause */
+     List       *indexquals;     /* indexqual(s) derived from it, or NIL */
+     bool        lossy;          /* are indexquals a lossy version of clause? */
+     AttrNumber  indexcol;       /* index column the clause uses (zero-based) */
+     List       *indexcols;      /* multiple index columns, if RowCompare */
+ } IndexClause;
+
+ /*
   * BitmapHeapPath represents one or more indexscans that generate TID bitmaps
   * instead of directly accessing the heap, followed by AND/OR combinations
   * to produce a single bitmap, followed by a heap scan that uses the bitmap.
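(Side note, not part of the patch.) The IndexClause comment in this hunk mentions extracting an indexable range from a LIKE condition, with "x LIKE 'foo%bar'" giving rise to "x >= 'foo' AND x < 'fop'". As a rough illustration of that derivation only, here is a minimal standalone sketch. The function name like_prefix_range is hypothetical and appears nowhere in the patch or in PostgreSQL. It assumes single-byte characters compared bytewise, and it conservatively treats a backslash escape as the end of the usable prefix. Real planner code has to cope with collations and multibyte encodings, so this should not be read as what the support function actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 *
 * Given a LIKE pattern, copy its fixed literal prefix into "lo" and the
 * smallest string greater than every string with that prefix into "hi",
 * so that "x LIKE pattern" implies "x >= lo AND x < hi" under bytewise
 * comparison.  Returns false if no usable range can be derived.
 */
bool
like_prefix_range(const char *pattern, char *lo, char *hi, size_t bufsize)
{
    /* The prefix ends at the first wildcard or escape character. */
    size_t      len = strcspn(pattern, "%_\\");

    assert(lo != NULL && hi != NULL);

    if (len == 0 || len >= bufsize)
        return false;           /* no prefix, or it does not fit */

    memcpy(lo, pattern, len);
    lo[len] = '\0';

    /* Upper bound: drop trailing 0xFF bytes, then bump the last byte. */
    memcpy(hi, lo, len + 1);
    while (len > 0 && (unsigned char) hi[len - 1] == 0xFF)
        hi[--len] = '\0';
    if (len == 0)
        return false;           /* prefix was all 0xFF; no upper bound */
    hi[len - 1] = (char) ((unsigned char) hi[len - 1] + 1);

    return true;
}
```

The derived range is weaker than the original LIKE clause (it also admits, say, 'fooxyz'), which is the situation the 'lossy' flag in IndexClause is meant to describe: the original clause still has to be rechecked.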
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index 23073c0..95a78cf 100644
*** a/src/include/optimizer/clauses.h
--- b/src/include/optimizer/clauses.h
*************** extern bool is_pseudo_constant_clause_re
*** 51,57 ****
  extern int  NumRelids(Node *clause);
  extern void CommuteOpExpr(OpExpr *clause);
- extern void CommuteRowCompareExpr(RowCompareExpr *clause);
  extern Query *inline_set_returning_function(PlannerInfo *root,
                                RangeTblEntry *rte);
--- 51,56 ----
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index d0c8f99..f62acc3 100644
*** a/src/include/optimizer/pathnode.h
--- b/src/include/optimizer/pathnode.h
*************** extern Path *create_samplescan_path(Plan
*** 41,47 ****
  extern IndexPath *create_index_path(PlannerInfo *root,
                    IndexOptInfo *index,
                    List *indexclauses,
-                   List *indexclausecols,
                    List *indexorderbys,
                    List *indexorderbycols,
                    List *pathkeys,
--- 41,46 ----
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 1b02b3b..040335a 100644
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern bool indexcol_is_bool_constant_fo
*** 78,92 ****
                                  int indexcol);
  extern bool match_index_to_operand(Node *operand, int indexcol,
                         IndexOptInfo *index);
- extern void expand_indexqual_conditions(IndexOptInfo *index,
-                             List *indexclauses, List *indexclausecols,
-                             List **indexquals_p, List **indexqualcols_p);
  extern void check_index_predicates(PlannerInfo *root, RelOptInfo *rel);
- extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
-                             IndexOptInfo *index,
-                             int indexcol,
-                             List **indexcolnos,
-                             bool *var_on_left_p);
  /*
   * tidpath.h
--- 78,84 ----
*************** extern bool eclass_useful_for_merging(Pl
*** 175,180 ****
--- 167,174 ----
                            EquivalenceClass *eclass,
                            RelOptInfo *rel);
  extern bool is_redundant_derived_clause(RestrictInfo *rinfo, List *clauselist);
+ extern bool is_redundant_with_indexclauses(RestrictInfo *rinfo,
+
List *indexclauses); /* * pathkeys.c diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h index feeaf0e..c348760 100644 *** a/src/include/optimizer/restrictinfo.h --- b/src/include/optimizer/restrictinfo.h *************** extern RestrictInfo *make_restrictinfo(E *** 29,34 **** --- 29,35 ---- Relids required_relids, Relids outer_relids, Relids nullable_relids); + extern RestrictInfo *commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op); extern bool restriction_is_or_clause(RestrictInfo *restrictinfo); extern bool restriction_is_securely_promotable(RestrictInfo *restrictinfo, RelOptInfo *rel); diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h index 6b1ef91..087b56f 100644 *** a/src/include/utils/selfuncs.h --- b/src/include/utils/selfuncs.h *************** typedef struct *** 108,114 **** { RestrictInfo *rinfo; /* the indexqual itself */ int indexcol; /* zero-based index column number */ - bool varonleft; /* true if index column is on left of qual */ Oid clause_op; /* qual's operator OID, if relevant */ Node *other_operand; /* non-index operand of qual's operator */ } IndexQualInfo; --- 108,113 ---- diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 46deb55..4932869 100644 *** a/src/test/regress/expected/create_index.out --- b/src/test/regress/expected/create_index.out *************** SELECT count(*) FROM quad_point_tbl WHER *** 1504,1510 **** -> Bitmap Heap Scan on quad_point_tbl Recheck Cond: ('(1000,1000),(200,200)'::box @> p) -> Bitmap Index Scan on sp_quad_ind ! Index Cond: ('(1000,1000),(200,200)'::box @> p) (5 rows) SELECT count(*) FROM quad_point_tbl WHERE box '(200,200,1000,1000)' @> p; --- 1504,1510 ---- -> Bitmap Heap Scan on quad_point_tbl Recheck Cond: ('(1000,1000),(200,200)'::box @> p) -> Bitmap Index Scan on sp_quad_ind ! 
Index Cond: (p <@ '(1000,1000),(200,200)'::box) (5 rows) SELECT count(*) FROM quad_point_tbl WHERE box '(200,200,1000,1000)' @> p; *************** SELECT count(*) FROM kd_point_tbl WHERE *** 1623,1629 **** -> Bitmap Heap Scan on kd_point_tbl Recheck Cond: ('(1000,1000),(200,200)'::box @> p) -> Bitmap Index Scan on sp_kd_ind ! Index Cond: ('(1000,1000),(200,200)'::box @> p) (5 rows) SELECT count(*) FROM kd_point_tbl WHERE box '(200,200,1000,1000)' @> p; --- 1623,1629 ---- -> Bitmap Heap Scan on kd_point_tbl Recheck Cond: ('(1000,1000),(200,200)'::box @> p) -> Bitmap Index Scan on sp_kd_ind ! Index Cond: (p <@ '(1000,1000),(200,200)'::box) (5 rows) SELECT count(*) FROM kd_point_tbl WHERE box '(200,200,1000,1000)' @> p; *************** explain (costs off) *** 3181,3188 **** Limit -> Index Scan using boolindex_b_i_key on boolindex Index Cond: (b = true) ! Filter: b ! (4 rows) explain (costs off) select * from boolindex where b = true order by i desc limit 10; --- 3181,3187 ---- Limit -> Index Scan using boolindex_b_i_key on boolindex Index Cond: (b = true) ! (3 rows) explain (costs off) select * from boolindex where b = true order by i desc limit 10; *************** explain (costs off) *** 3191,3198 **** Limit -> Index Scan Backward using boolindex_b_i_key on boolindex Index Cond: (b = true) ! Filter: b ! (4 rows) explain (costs off) select * from boolindex where not b order by i limit 10; --- 3190,3196 ---- Limit -> Index Scan Backward using boolindex_b_i_key on boolindex Index Cond: (b = true) ! (3 rows) explain (costs off) select * from boolindex where not b order by i limit 10; *************** explain (costs off) *** 3201,3208 **** Limit -> Index Scan using boolindex_b_i_key on boolindex Index Cond: (b = false) ! Filter: (NOT b) ! (4 rows) -- -- Test for multilevel page deletion --- 3199,3205 ---- Limit -> Index Scan using boolindex_b_i_key on boolindex Index Cond: (b = false) ! 
(3 rows) -- -- Test for multilevel page deletion diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out index 2829878..fcc82a1 100644 *** a/src/test/regress/expected/join.out --- b/src/test/regress/expected/join.out *************** where q1 = thousand or q2 = thousand; *** 3078,3086 **** Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand)) -> BitmapOr -> Bitmap Index Scan on tenk1_thous_tenthous ! Index Cond: (q1.q1 = thousand) -> Bitmap Index Scan on tenk1_thous_tenthous ! Index Cond: (q2.q2 = thousand) -> Hash -> Seq Scan on int4_tbl (15 rows) --- 3078,3086 ---- Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand)) -> BitmapOr -> Bitmap Index Scan on tenk1_thous_tenthous ! Index Cond: (thousand = q1.q1) -> Bitmap Index Scan on tenk1_thous_tenthous ! Index Cond: (thousand = q2.q2) -> Hash -> Seq Scan on int4_tbl (15 rows) *************** where t1.unique1 = 1; *** 3332,3338 **** -> Bitmap Heap Scan on tenk1 t2 Recheck Cond: (t1.hundred = hundred) -> Bitmap Index Scan on tenk1_hundred ! Index Cond: (t1.hundred = hundred) -> Index Scan using tenk1_unique2 on tenk1 t3 Index Cond: (unique2 = t2.thousand) (11 rows) --- 3332,3338 ---- -> Bitmap Heap Scan on tenk1 t2 Recheck Cond: (t1.hundred = hundred) -> Bitmap Index Scan on tenk1_hundred ! Index Cond: (hundred = t1.hundred) -> Index Scan using tenk1_unique2 on tenk1 t3 Index Cond: (unique2 = t2.thousand) (11 rows) *************** where t1.unique1 = 1; *** 3352,3358 **** -> Bitmap Heap Scan on tenk1 t2 Recheck Cond: (t1.hundred = hundred) -> Bitmap Index Scan on tenk1_hundred ! Index Cond: (t1.hundred = hundred) -> Index Scan using tenk1_unique2 on tenk1 t3 Index Cond: (unique2 = t2.thousand) (11 rows) --- 3352,3358 ---- -> Bitmap Heap Scan on tenk1 t2 Recheck Cond: (t1.hundred = hundred) -> Bitmap Index Scan on tenk1_hundred ! 
Index Cond: (hundred = t1.hundred)
  -> Index Scan using tenk1_unique2 on tenk1 t3
  Index Cond: (unique2 = t2.thousand)
  (11 rows)
*************** select b.unique1 from
*** 3408,3414 ****
  -> Nested Loop
  -> Seq Scan on int4_tbl i1
  -> Index Scan using tenk1_thous_tenthous on tenk1 b
! Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
  -> Index Scan using tenk1_unique1 on tenk1 a
  Index Cond: (unique1 = b.unique2)
  -> Index Only Scan using tenk1_thous_tenthous on tenk1 c
--- 3408,3414 ----
  -> Nested Loop
  -> Seq Scan on int4_tbl i1
  -> Index Scan using tenk1_thous_tenthous on tenk1 b
! Index Cond: ((thousand = i1.f1) AND (tenthous = i2.f1))
  -> Index Scan using tenk1_unique1 on tenk1 a
  Index Cond: (unique1 = b.unique2)
  -> Index Only Scan using tenk1_thous_tenthous on tenk1 c
*************** order by fault;
*** 3444,3450 ****
  Filter: ((COALESCE(tenk1.unique1, '-1'::integer) + int8_tbl.q1) = 122)
  -> Seq Scan on int8_tbl
  -> Index Scan using tenk1_unique2 on tenk1
! Index Cond: (int8_tbl.q2 = unique2)
  (5 rows)

  select * from
--- 3444,3450 ----
  Filter: ((COALESCE(tenk1.unique1, '-1'::integer) + int8_tbl.q1) = 122)
  -> Seq Scan on int8_tbl
  -> Index Scan using tenk1_unique2 on tenk1
! Index Cond: (unique2 = int8_tbl.q2)
  (5 rows)

  select * from
*************** select q1, unique2, thousand, hundred
*** 3499,3505 ****
  Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
  -> Seq Scan on int8_tbl a
  -> Index Scan using tenk1_unique2 on tenk1 b
! Index Cond: (a.q1 = unique2)
  (5 rows)

  select q1, unique2, thousand, hundred
--- 3499,3505 ----
  Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
  -> Seq Scan on int8_tbl a
  -> Index Scan using tenk1_unique2 on tenk1 b
! Index Cond: (unique2 = a.q1)
  (5 rows)

  select q1, unique2, thousand, hundred
*************** explain (costs off)
*** 4586,4592 ****
  Nested Loop Left Join
  -> Seq Scan on int4_tbl x
  -> Index Scan using tenk1_unique1 on tenk1
! Index Cond: (x.f1 = unique1)
  (4 rows)

  -- check scoping of lateral versus parent references
--- 4586,4592 ----
  Nested Loop Left Join
  -> Seq Scan on int4_tbl x
  -> Index Scan using tenk1_unique1 on tenk1
! Index Cond: (unique1 = x.f1)
  (4 rows)

  -- check scoping of lateral versus parent references
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index c55de5d..bbdc373 100644
*** a/src/test/regress/expected/partition_join.out
--- b/src/test/regress/expected/partition_join.out
*************** SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3
*** 647,653 ****
  -> Seq Scan on prt1_e_p1 t3
  Filter: (c = 0)
  -> Index Scan using iprt2_p1_b on prt2_p1 t2
! Index Cond: (t1.a = b)
  -> Nested Loop Left Join
  -> Hash Right Join
  Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
--- 647,653 ----
  -> Seq Scan on prt1_e_p1 t3
  Filter: (c = 0)
  -> Index Scan using iprt2_p1_b on prt2_p1 t2
! Index Cond: (b = t1.a)
  -> Nested Loop Left Join
  -> Hash Right Join
  Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
*************** SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3
*** 656,662 ****
  -> Seq Scan on prt1_e_p2 t3_1
  Filter: (c = 0)
  -> Index Scan using iprt2_p2_b on prt2_p2 t2_1
! Index Cond: (t1_1.a = b)
  -> Nested Loop Left Join
  -> Hash Right Join
  Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
--- 656,662 ----
  -> Seq Scan on prt1_e_p2 t3_1
  Filter: (c = 0)
  -> Index Scan using iprt2_p2_b on prt2_p2 t2_1
! Index Cond: (b = t1_1.a)
  -> Nested Loop Left Join
  -> Hash Right Join
  Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
*************** SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3
*** 665,671 ****
  -> Seq Scan on prt1_e_p3 t3_2
  Filter: (c = 0)
  -> Index Scan using iprt2_p3_b on prt2_p3 t2_2
! Index Cond: (t1_2.a = b)
  (30 rows)

  SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3 ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
--- 665,671 ----
  -> Seq Scan on prt1_e_p3 t3_2
  Filter: (c = 0)
  -> Index Scan using iprt2_p3_b on prt2_p3 t2_2
! Index Cond: (b = t1_2.a)
  (30 rows)

  SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3 ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b;
*************** SELECT t1.a, t1.c, t2.b, t2.c FROM prt1
*** 1878,1888 ****
  -> Seq Scan on prt1_p3 t1_2
  -> Append
  -> Index Scan using iprt2_p1_b on prt2_p1 t2
! Index Cond: (t1.a < b)
  -> Index Scan using iprt2_p2_b on prt2_p2 t2_1
! Index Cond: (t1.a < b)
  -> Index Scan using iprt2_p3_b on prt2_p3 t2_2
! Index Cond: (t1.a < b)
  (12 rows)

  -- equi-join with join condition on partial keys does not qualify for
--- 1878,1888 ----
  -> Seq Scan on prt1_p3 t1_2
  -> Append
  -> Index Scan using iprt2_p1_b on prt2_p1 t2
! Index Cond: (b > t1.a)
  -> Index Scan using iprt2_p2_b on prt2_p2 t2_1
! Index Cond: (b > t1.a)
  -> Index Scan using iprt2_p3_b on prt2_p3 t2_2
! Index Cond: (b > t1.a)
  (12 rows)

  -- equi-join with join condition on partial keys does not qualify for
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 120b651..30946f7 100644
*** a/src/test/regress/expected/partition_prune.out
--- b/src/test/regress/expected/partition_prune.out
*************** select * from tbl1 join tprt on tbl1.col
*** 2635,2651 ****
  -> Seq Scan on tbl1 (actual rows=2 loops=1)
  -> Append (actual rows=3 loops=2)
  -> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt3_idx on tprt_3 (never executed)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt6_idx on tprt_6 (never executed)
! Index Cond: (tbl1.col1 > col1)
  (15 rows)

  explain (analyze, costs off, summary off, timing off)
--- 2635,2651 ----
  -> Seq Scan on tbl1 (actual rows=2 loops=1)
  -> Append (actual rows=3 loops=2)
  -> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=2)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt2_idx on tprt_2 (actual rows=2 loops=1)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt3_idx on tprt_3 (never executed)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt6_idx on tprt_6 (never executed)
! Index Cond: (col1 < tbl1.col1)
  (15 rows)

  explain (analyze, costs off, summary off, timing off)
*************** select * from tbl1 inner join tprt on tb
*** 2701,2717 ****
  -> Seq Scan on tbl1 (actual rows=5 loops=1)
  -> Append (actual rows=5 loops=5)
  -> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (tbl1.col1 > col1)
  -> Index Scan using tprt6_idx on tprt_6 (never executed)
! Index Cond: (tbl1.col1 > col1)
  (15 rows)

  explain (analyze, costs off, summary off, timing off)
--- 2701,2717 ----
  -> Seq Scan on tbl1 (actual rows=5 loops=1)
  -> Append (actual rows=5 loops=5)
  -> Index Scan using tprt1_idx on tprt_1 (actual rows=2 loops=5)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt2_idx on tprt_2 (actual rows=3 loops=4)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt3_idx on tprt_3 (actual rows=1 loops=2)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (col1 < tbl1.col1)
  -> Index Scan using tprt6_idx on tprt_6 (never executed)
! Index Cond: (col1 < tbl1.col1)
  (15 rows)

  explain (analyze, costs off, summary off, timing off)
*************** select * from tbl1 join tprt on tbl1.col
*** 2786,2802 ****
  -> Seq Scan on tbl1 (actual rows=1 loops=1)
  -> Append (actual rows=1 loops=1)
  -> Index Scan using tprt1_idx on tprt_1 (never executed)
! Index Cond: (tbl1.col1 < col1)
  -> Index Scan using tprt2_idx on tprt_2 (never executed)
! Index Cond: (tbl1.col1 < col1)
  -> Index Scan using tprt3_idx on tprt_3 (never executed)
! Index Cond: (tbl1.col1 < col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (tbl1.col1 < col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (tbl1.col1 < col1)
  -> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
! Index Cond: (tbl1.col1 < col1)
  (15 rows)

  select tbl1.col1, tprt.col1 from tbl1
--- 2786,2802 ----
  -> Seq Scan on tbl1 (actual rows=1 loops=1)
  -> Append (actual rows=1 loops=1)
  -> Index Scan using tprt1_idx on tprt_1 (never executed)
! Index Cond: (col1 > tbl1.col1)
  -> Index Scan using tprt2_idx on tprt_2 (never executed)
! Index Cond: (col1 > tbl1.col1)
  -> Index Scan using tprt3_idx on tprt_3 (never executed)
! Index Cond: (col1 > tbl1.col1)
  -> Index Scan using tprt4_idx on tprt_4 (never executed)
! Index Cond: (col1 > tbl1.col1)
  -> Index Scan using tprt5_idx on tprt_5 (never executed)
! Index Cond: (col1 > tbl1.col1)
  -> Index Scan using tprt6_idx on tprt_6 (actual rows=1 loops=1)
! Index Cond: (col1 > tbl1.col1)
  (15 rows)

  select tbl1.col1, tprt.col1 from tbl1
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index b486ef3..3403269 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3460,4 +3460,18 @@ supportfn(internal) returns internal
 This can be done by a support function that implements
 the <literal>SupportRequestRows</literal> request type.
 </para>
+
+ <para>
+ For target functions that return boolean, it may be possible to
+ convert a function call appearing in WHERE into an indexable operator
+ clause or clauses. The converted clauses might be exactly equivalent
+ to the function's condition, or they could be somewhat weaker (that is,
+ they might accept some values that the function condition does not).
+ In the latter case the index condition is said to
+ be <firstterm>lossy</firstterm>; it can still be used to scan an index,
+ but the function call will have to be executed for each row returned by
+ the index to see if it really passes the WHERE condition or not.
+ To create such conditions, the support function must implement
+ the <literal>SupportRequestIndexCondition</literal> request type.
+ </para> </sect1> diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c index 51d2da5..fdad50e 100644 --- a/src/backend/optimizer/path/indxpath.c +++ b/src/backend/optimizer/path/indxpath.c @@ -20,23 +20,19 @@ #include "access/stratnum.h" #include "access/sysattr.h" #include "catalog/pg_am.h" -#include "catalog/pg_collation.h" #include "catalog/pg_operator.h" #include "catalog/pg_opfamily.h" #include "catalog/pg_type.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" -#include "optimizer/clauses.h" +#include "nodes/supportnodes.h" #include "optimizer/cost.h" #include "optimizer/optimizer.h" #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/prep.h" #include "optimizer/restrictinfo.h" -#include "utils/builtins.h" -#include "utils/bytea.h" #include "utils/lsyscache.h" -#include "utils/pg_locale.h" #include "utils/selfuncs.h" @@ -136,7 +132,7 @@ static double adjust_rowcount_for_semijoins(PlannerInfo *root, Index outer_relid, double rowcount); static double approximate_joinrel_size(PlannerInfo *root, Relids relids); -static void match_restriction_clauses_to_index(RelOptInfo *rel, +static void match_restriction_clauses_to_index(PlannerInfo *root, IndexOptInfo *index, IndexClauseSet *clauseset); static void match_join_clauses_to_index(PlannerInfo *root, @@ -146,22 +142,45 @@ static void match_join_clauses_to_index(PlannerInfo *root, static void match_eclass_clauses_to_index(PlannerInfo *root, IndexOptInfo *index, IndexClauseSet *clauseset); -static void match_clauses_to_index(IndexOptInfo *index, +static void match_clauses_to_index(PlannerInfo *root, List *clauses, + IndexOptInfo *index, IndexClauseSet *clauseset); -static void match_clause_to_index(IndexOptInfo *index, +static void match_clause_to_index(PlannerInfo *root, RestrictInfo *rinfo, + IndexOptInfo *index, IndexClauseSet *clauseset); -static bool match_clause_to_indexcol(IndexOptInfo *index, +static IndexClause 
*match_clause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, int indexcol, - RestrictInfo *rinfo); -static bool is_indexable_operator(Oid expr_op, Oid opfamily, - bool indexkey_on_left); -static bool match_rowcompare_to_indexcol(IndexOptInfo *index, + IndexOptInfo *index); +static IndexClause *match_boolean_index_clause(RestrictInfo *rinfo, + int indexcol, IndexOptInfo *index); +static IndexClause *match_opclause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index); +static IndexClause *match_funcclause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index); +static IndexClause *get_index_clause_from_support(PlannerInfo *root, + RestrictInfo *rinfo, + Oid funcid, + int indexarg, + int indexcol, + IndexOptInfo *index); +static IndexClause *match_saopclause_to_indexcol(RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index); +static IndexClause *match_rowcompare_to_indexcol(RestrictInfo *rinfo, int indexcol, - Oid opfamily, - Oid idxcollation, - RowCompareExpr *clause); + IndexOptInfo *index); +static IndexClause *expand_indexqual_rowcompare(RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index, + Oid expr_op, + bool var_on_left); static void match_pathkeys_to_index(IndexOptInfo *index, List *pathkeys, List **orderby_clauses_p, List **clause_columns_p); @@ -170,30 +189,6 @@ static Expr *match_clause_to_ordering_op(IndexOptInfo *index, static bool ec_member_matches_indexcol(PlannerInfo *root, RelOptInfo *rel, EquivalenceClass *ec, EquivalenceMember *em, void *arg); -static bool match_boolean_index_clause(Node *clause, int indexcol, - IndexOptInfo *index); -static bool match_special_index_operator(Expr *clause, - Oid opfamily, Oid idxcollation, - bool indexkey_on_left); -static IndexClause *expand_indexqual_conditions(IndexOptInfo *index, - int indexcol, - RestrictInfo *rinfo); -static Expr *expand_boolean_index_clause(Node *clause, int indexcol, - IndexOptInfo *index); 
-static List *expand_indexqual_opclause(RestrictInfo *rinfo, - Oid opfamily, Oid idxcollation, - bool *lossy); -static RestrictInfo *expand_indexqual_rowcompare(RestrictInfo *rinfo, - IndexOptInfo *index, - int indexcol, - List **indexcolnos, - bool *lossy); -static List *prefix_quals(Node *leftop, Oid opfamily, Oid collation, - Const *prefix, Pattern_Prefix_Status pstatus); -static List *network_prefix_quals(Node *leftop, Oid expr_op, Oid opfamily, - Datum rightop); -static Datum string_to_datum(const char *str, Oid datatype); -static Const *string_to_const(const char *str, Oid datatype); /* @@ -272,7 +267,7 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel) * Identify the restriction clauses that can match the index. */ MemSet(&rclauseset, 0, sizeof(rclauseset)); - match_restriction_clauses_to_index(rel, index, &rclauseset); + match_restriction_clauses_to_index(root, index, &rclauseset); /* * Build index paths from the restriction clauses. These will be @@ -1224,7 +1219,7 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel, * Identify the restriction clauses that can match the index. */ MemSet(&clauseset, 0, sizeof(clauseset)); - match_clauses_to_index(index, clauses, &clauseset); + match_clauses_to_index(root, clauses, index, &clauseset); /* * If no matches so far, and the index predicate isn't useful, we @@ -1236,7 +1231,7 @@ build_paths_for_OR(PlannerInfo *root, RelOptInfo *rel, /* * Add "other" restriction clauses to the clauseset. */ - match_clauses_to_index(index, other_clauses, &clauseset); + match_clauses_to_index(root, other_clauses, index, &clauseset); /* * Construct paths if possible. @@ -2148,11 +2143,12 @@ approximate_joinrel_size(PlannerInfo *root, Relids relids) * Matching clauses are added to *clauseset. 
*/ static void -match_restriction_clauses_to_index(RelOptInfo *rel, IndexOptInfo *index, +match_restriction_clauses_to_index(PlannerInfo *root, + IndexOptInfo *index, IndexClauseSet *clauseset) { /* We can ignore clauses that are implied by the index predicate */ - match_clauses_to_index(index, index->indrestrictinfo, clauseset); + match_clauses_to_index(root, index->indrestrictinfo, index, clauseset); } /* @@ -2182,7 +2178,7 @@ match_join_clauses_to_index(PlannerInfo *root, if (restriction_is_or_clause(rinfo)) *joinorclauses = lappend(*joinorclauses, rinfo); else - match_clause_to_index(index, rinfo, clauseset); + match_clause_to_index(root, rinfo, index, clauseset); } } @@ -2220,7 +2216,7 @@ match_eclass_clauses_to_index(PlannerInfo *root, IndexOptInfo *index, * since for non-btree indexes the EC's equality operators might not * be in the index opclass (cf ec_member_matches_indexcol). */ - match_clauses_to_index(index, clauses, clauseset); + match_clauses_to_index(root, clauses, index, clauseset); } } @@ -2230,8 +2226,9 @@ match_eclass_clauses_to_index(PlannerInfo *root, IndexOptInfo *index, * Matching clauses are added to *clauseset. */ static void -match_clauses_to_index(IndexOptInfo *index, +match_clauses_to_index(PlannerInfo *root, List *clauses, + IndexOptInfo *index, IndexClauseSet *clauseset) { ListCell *lc; @@ -2240,7 +2237,7 @@ match_clauses_to_index(IndexOptInfo *index, { RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc); - match_clause_to_index(index, rinfo, clauseset); + match_clause_to_index(root, rinfo, index, clauseset); } } @@ -2262,8 +2259,9 @@ match_clauses_to_index(IndexOptInfo *index, * same clause multiple times with different index columns. 
*/ static void -match_clause_to_index(IndexOptInfo *index, +match_clause_to_index(PlannerInfo *root, RestrictInfo *rinfo, + IndexOptInfo *index, IndexClauseSet *clauseset) { int indexcol; @@ -2287,6 +2285,7 @@ match_clause_to_index(IndexOptInfo *index, /* OK, check each index key column for a match */ for (indexcol = 0; indexcol < index->nkeycolumns; indexcol++) { + IndexClause *iclause; ListCell *lc; /* Ignore duplicates */ @@ -2298,17 +2297,14 @@ match_clause_to_index(IndexOptInfo *index, return; } - /* - * XXX this should be changed so that we generate an IndexClause - * immediately upon matching, to avoid repeated work. To-do soon. - */ - if (match_clause_to_indexcol(index, - indexcol, - rinfo)) + /* OK, try to match the clause to the index column */ + iclause = match_clause_to_indexcol(root, + rinfo, + indexcol, + index); + if (iclause) { - IndexClause *iclause; - - iclause = expand_indexqual_conditions(index, indexcol, rinfo); + /* Success, so record it */ clauseset->indexclauses[indexcol] = lappend(clauseset->indexclauses[indexcol], iclause); clauseset->nonempty = true; @@ -2319,16 +2315,15 @@ match_clause_to_index(IndexOptInfo *index, /* * match_clause_to_indexcol() - * Determines whether a restriction clause matches a column of an index. + * Determine whether a restriction clause matches a column of an index, + * and if so, build an IndexClause node describing the details. * - * To match an index normally, the clause: + * To match an index normally, an operator clause: * * (1) must be in the form (indexkey op const) or (const op indexkey); * and - * (2) must contain an operator which is in the same family as the index - * operator for this column, or is a "special" operator as recognized - * by match_special_index_operator(); - * and + * (2) must contain an operator which is in the index's operator family + * for this column; and * (3) must match the collation of the index, if collation is relevant. 
* * Our definition of "const" is exceedingly liberal: we allow anything that @@ -2346,8 +2341,8 @@ match_clause_to_index(IndexOptInfo *index, * Presently, the executor can only deal with indexquals that have the * indexkey on the left, so we can only use clauses that have the indexkey * on the right if we can commute the clause to put the key on the left. - * We do not actually do the commuting here, but we check whether a - * suitable commutator operator is available. + * We handle that by generating an IndexClause with the correctly-commuted + * opclause as a derived indexqual. * * If the index has a collation, the clause must have the same collation. * For collation-less indexes, we assume it doesn't matter; this is @@ -2357,12 +2352,7 @@ match_clause_to_index(IndexOptInfo *index, * embodied in the macro IndexCollMatchesExprColl.) * * It is also possible to match RowCompareExpr clauses to indexes (but - * currently, only btree indexes handle this). In this routine we will - * report a match if the first column of the row comparison matches the - * target index column. This is sufficient to guarantee that some index - * condition can be constructed from the RowCompareExpr --- whether the - * remaining columns match the index too is considered in - * expand_indexqual_rowcompare(). + * currently, only btree indexes handle this). * * It is also possible to match ScalarArrayOpExpr clauses to indexes, when * the clause is of the form "indexkey op ANY (arrayconst)". @@ -2370,82 +2360,71 @@ match_clause_to_index(IndexOptInfo *index, * For boolean indexes, it is also possible to match the clause directly * to the indexkey; or perhaps the clause is (NOT indexkey). * - * 'index' is the index of interest. - * 'indexcol' is a column number of 'index' (counting from 0). + * And, last but not least, some operators and functions can be processed + * to derive (typically lossy) indexquals from a clause that isn't in + * itself indexable. 
If we see that any operand of an OpExpr or FuncExpr + * matches the index key, and the function has a planner support function + * attached to it, we'll invoke the support function to see if such an + * indexqual can be built. + * * 'rinfo' is the clause to be tested (as a RestrictInfo node). + * 'indexcol' is a column number of 'index' (counting from 0). + * 'index' is the index of interest. * - * Returns true if the clause can be used with this index key. + * Returns an IndexClause if the clause can be used with this index key, + * or NULL if not. * - * NOTE: returns false if clause is an OR or AND clause; it is the + * NOTE: returns NULL if clause is an OR or AND clause; it is the * responsibility of higher-level routines to cope with those. */ -static bool -match_clause_to_indexcol(IndexOptInfo *index, +static IndexClause * +match_clause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, int indexcol, - RestrictInfo *rinfo) + IndexOptInfo *index) { + IndexClause *iclause; Expr *clause = rinfo->clause; - Index index_relid = index->rel->relid; Oid opfamily; - Oid idxcollation; - Node *leftop, - *rightop; - Relids left_relids; - Relids right_relids; - Oid expr_op; - Oid expr_coll; - bool plain_op; Assert(indexcol < index->nkeycolumns); - opfamily = index->opfamily[indexcol]; - idxcollation = index->indexcollations[indexcol]; + /* + * Historically this code has coped with NULL clauses. That's probably + * not possible anymore, but we might as well continue to cope. + */ + if (clause == NULL) + return NULL; /* First check for boolean-index cases. */ + opfamily = index->opfamily[indexcol]; if (IsBooleanOpfamily(opfamily)) { - if (match_boolean_index_clause((Node *) clause, indexcol, index)) - return true; + iclause = match_boolean_index_clause(rinfo, indexcol, index); + if (iclause) + return iclause; } /* - * Clause must be a binary opclause, or possibly a ScalarArrayOpExpr - * (which is always binary, by definition). 
Or it could be a - * RowCompareExpr, which we pass off to match_rowcompare_to_indexcol(). - * Or, if the index supports it, we can handle IS NULL/NOT NULL clauses. + * Clause must be an opclause, funcclause, ScalarArrayOpExpr, or + * RowCompareExpr. Or, if the index supports it, we can handle IS + * NULL/NOT NULL clauses. */ - if (is_opclause(clause)) + if (IsA(clause, OpExpr)) + { + return match_opclause_to_indexcol(root, rinfo, indexcol, index); + } + else if (IsA(clause, FuncExpr)) { - leftop = get_leftop(clause); - rightop = get_rightop(clause); - if (!leftop || !rightop) - return false; - left_relids = rinfo->left_relids; - right_relids = rinfo->right_relids; - expr_op = ((OpExpr *) clause)->opno; - expr_coll = ((OpExpr *) clause)->inputcollid; - plain_op = true; + return match_funcclause_to_indexcol(root, rinfo, indexcol, index); } - else if (clause && IsA(clause, ScalarArrayOpExpr)) + else if (IsA(clause, ScalarArrayOpExpr)) { - ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) clause; - - /* We only accept ANY clauses, not ALL */ - if (!saop->useOr) - return false; - leftop = (Node *) linitial(saop->args); - rightop = (Node *) lsecond(saop->args); - left_relids = NULL; /* not actually needed */ - right_relids = pull_varnos(rightop); - expr_op = saop->opno; - expr_coll = saop->inputcollid; - plain_op = false; + return match_saopclause_to_indexcol(rinfo, indexcol, index); } - else if (clause && IsA(clause, RowCompareExpr)) + else if (IsA(clause, RowCompareExpr)) { - return match_rowcompare_to_indexcol(index, indexcol, - opfamily, idxcollation, - (RowCompareExpr *) clause); + return match_rowcompare_to_indexcol(rinfo, indexcol, index); } else if (index->amsearchnulls && IsA(clause, NullTest)) { @@ -2453,101 +2432,441 @@ match_clause_to_indexcol(IndexOptInfo *index, if (!nt->argisrow && match_index_to_operand((Node *) nt->arg, indexcol, index)) - return true; - return false; + { + iclause = makeNode(IndexClause); + iclause->rinfo = rinfo; + iclause->indexquals = 
NIL; + iclause->lossy = false; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + return iclause; + } + } + + return NULL; +} + +/* + * match_boolean_index_clause + * Recognize restriction clauses that can be matched to a boolean index. + * + * The idea here is that, for an index on a boolean column that supports the + * BooleanEqualOperator, we can transform a plain reference to the indexkey + * into "indexkey = true", or "NOT indexkey" into "indexkey = false", etc, + * so as to make the expression indexable using the index's "=" operator. + * Since Postgres 8.1, we must do this because constant simplification does + * the reverse transformation; without this code there'd be no way to use + * such an index at all. + * + * This should be called only when IsBooleanOpfamily() recognizes the + * index's operator family. We check to see if the clause matches the + * index's key, and if so, build a suitable IndexClause. + */ +static IndexClause * +match_boolean_index_clause(RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index) +{ + Node *clause = (Node *) rinfo->clause; + Expr *op = NULL; + + /* Direct match? */ + if (match_index_to_operand(clause, indexcol, index)) + { + /* convert to indexkey = TRUE */ + op = make_opclause(BooleanEqualOperator, BOOLOID, false, + (Expr *) clause, + (Expr *) makeBoolConst(true, false), + InvalidOid, InvalidOid); + } + /* NOT clause? */ + else if (is_notclause(clause)) + { + Node *arg = (Node *) get_notclausearg((Expr *) clause); + + if (match_index_to_operand(arg, indexcol, index)) + { + /* convert to indexkey = FALSE */ + op = make_opclause(BooleanEqualOperator, BOOLOID, false, + (Expr *) arg, + (Expr *) makeBoolConst(false, false), + InvalidOid, InvalidOid); + } + } + + /* + * Since we only consider clauses at top level of WHERE, we can convert + * indexkey IS TRUE and indexkey IS FALSE to index searches as well. The + * different meaning for NULL isn't important. 
+ */ + else if (clause && IsA(clause, BooleanTest)) + { + BooleanTest *btest = (BooleanTest *) clause; + Node *arg = (Node *) btest->arg; + + if (btest->booltesttype == IS_TRUE && + match_index_to_operand(arg, indexcol, index)) + { + /* convert to indexkey = TRUE */ + op = make_opclause(BooleanEqualOperator, BOOLOID, false, + (Expr *) arg, + (Expr *) makeBoolConst(true, false), + InvalidOid, InvalidOid); + } + else if (btest->booltesttype == IS_FALSE && + match_index_to_operand(arg, indexcol, index)) + { + /* convert to indexkey = FALSE */ + op = make_opclause(BooleanEqualOperator, BOOLOID, false, + (Expr *) arg, + (Expr *) makeBoolConst(false, false), + InvalidOid, InvalidOid); + } + } + + /* + * If we successfully made an operator clause from the given qual, we must + * wrap it in an IndexClause. It's not lossy. + */ + if (op) + { + IndexClause *iclause = makeNode(IndexClause); + + iclause->rinfo = rinfo; + iclause->indexquals = list_make1(make_simple_restrictinfo(op)); + iclause->lossy = false; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + return iclause; } - else - return false; + + return NULL; +} + +/* + * match_opclause_to_indexcol() + * Handles the OpExpr case for match_clause_to_indexcol(), + * which see for comments. + */ +static IndexClause * +match_opclause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index) +{ + IndexClause *iclause; + OpExpr *clause = (OpExpr *) rinfo->clause; + Node *leftop, + *rightop; + Oid expr_op; + Oid expr_coll; + Index index_relid; + Oid opfamily; + Oid idxcollation; + + /* + * Only binary operators need apply. (In theory, a planner support + * function could do something with a unary operator, but it seems + * unlikely to be worth the cycles to check.) 
+ */ + if (list_length(clause->args) != 2) + return NULL; + + leftop = (Node *) linitial(clause->args); + rightop = (Node *) lsecond(clause->args); + expr_op = clause->opno; + expr_coll = clause->inputcollid; + + index_relid = index->rel->relid; + opfamily = index->opfamily[indexcol]; + idxcollation = index->indexcollations[indexcol]; /* * Check for clauses of the form: (indexkey operator constant) or - * (constant operator indexkey). See above notes about const-ness. + * (constant operator indexkey). See match_clause_to_indexcol's notes + * about const-ness. + * + * Note that we don't ask the support function about clauses that don't + * have one of these forms. Again, in principle it might be possible to + * do something, but it seems unlikely to be worth the cycles to check. */ if (match_index_to_operand(leftop, indexcol, index) && - !bms_is_member(index_relid, right_relids) && + !bms_is_member(index_relid, rinfo->right_relids) && !contain_volatile_functions(rightop)) { if (IndexCollMatchesExprColl(idxcollation, expr_coll) && - is_indexable_operator(expr_op, opfamily, true)) - return true; + op_in_opfamily(expr_op, opfamily)) + { + iclause = makeNode(IndexClause); + iclause->rinfo = rinfo; + iclause->indexquals = NIL; + iclause->lossy = false; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + return iclause; + } /* - * If we didn't find a member of the index's opfamily, see whether it - * is a "special" indexable operator. + * If we didn't find a member of the index's opfamily, try the support + * function for the operator's underlying function. 
*/ - if (plain_op && - match_special_index_operator(clause, opfamily, - idxcollation, true)) - return true; - return false; + set_opfuncid(clause); /* make sure we have opfuncid */ + return get_index_clause_from_support(root, + rinfo, + clause->opfuncid, + 0, /* indexarg on left */ + indexcol, + index); } - if (plain_op && - match_index_to_operand(rightop, indexcol, index) && - !bms_is_member(index_relid, left_relids) && + if (match_index_to_operand(rightop, indexcol, index) && + !bms_is_member(index_relid, rinfo->left_relids) && !contain_volatile_functions(leftop)) { - if (IndexCollMatchesExprColl(idxcollation, expr_coll) && - is_indexable_operator(expr_op, opfamily, false)) - return true; + if (IndexCollMatchesExprColl(idxcollation, expr_coll)) + { + Oid comm_op = get_commutator(expr_op); + + if (OidIsValid(comm_op) && + op_in_opfamily(comm_op, opfamily)) + { + RestrictInfo *commrinfo; + + /* Build a commuted OpExpr and RestrictInfo */ + commrinfo = commute_restrictinfo(rinfo, comm_op); + + /* Make an IndexClause showing that as a derived qual */ + iclause = makeNode(IndexClause); + iclause->rinfo = rinfo; + iclause->indexquals = list_make1(commrinfo); + iclause->lossy = false; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + return iclause; + } + } /* - * If we didn't find a member of the index's opfamily, see whether it - * is a "special" indexable operator. + * If we didn't find a member of the index's opfamily, try the support + * function for the operator's underlying function. */ - if (match_special_index_operator(clause, opfamily, - idxcollation, false)) - return true; - return false; + set_opfuncid(clause); /* make sure we have opfuncid */ + return get_index_clause_from_support(root, + rinfo, + clause->opfuncid, + 1, /* indexarg on right */ + indexcol, + index); } - return false; + return NULL; } /* - * is_indexable_operator - * Does the operator match the specified index opfamily? 
- * - * If the indexkey is on the right, what we actually want to know - * is whether the operator has a commutator operator that matches - * the opfamily. + * match_funcclause_to_indexcol() + * Handles the FuncExpr case for match_clause_to_indexcol(), + * which see for comments. */ -static bool -is_indexable_operator(Oid expr_op, Oid opfamily, bool indexkey_on_left) +static IndexClause * +match_funcclause_to_indexcol(PlannerInfo *root, + RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index) { - /* Get the commuted operator if necessary */ - if (!indexkey_on_left) + FuncExpr *clause = (FuncExpr *) rinfo->clause; + int indexarg; + ListCell *lc; + + /* + * We have no built-in intelligence about function clauses, but if there's + * a planner support function, it might be able to do something. But, to + * cut down on wasted planning cycles, only call the support function if + * at least one argument matches the target index column. + * + * Note that we don't insist on the other arguments being pseudoconstants; + * the support function has to check that. This is to allow cases where + * only some of the other arguments need to be included in the indexqual. + */ + indexarg = 0; + foreach(lc, clause->args) { - expr_op = get_commutator(expr_op); - if (expr_op == InvalidOid) - return false; + Node *op = (Node *) lfirst(lc); + + if (match_index_to_operand(op, indexcol, index)) + { + return get_index_clause_from_support(root, + rinfo, + clause->funcid, + indexarg, + indexcol, + index); + } + + indexarg++; + } + + return NULL; +} + +/* + * get_index_clause_from_support() + * If the function has a planner support function, try to construct + * an IndexClause using indexquals created by the support function. 
+ */ +static IndexClause * +get_index_clause_from_support(PlannerInfo *root, + RestrictInfo *rinfo, + Oid funcid, + int indexarg, + int indexcol, + IndexOptInfo *index) +{ + Oid prosupport = get_func_support(funcid); + SupportRequestIndexCondition req; + List *sresult; + + if (!OidIsValid(prosupport)) + return NULL; + + req.type = T_SupportRequestIndexCondition; + req.root = root; + req.funcid = funcid; + req.node = (Node *) rinfo->clause; + req.indexarg = indexarg; + req.index = index; + req.indexcol = indexcol; + req.opfamily = index->opfamily[indexcol]; + req.indexcollation = index->indexcollations[indexcol]; + + req.lossy = true; /* default assumption */ + + sresult = (List *) + DatumGetPointer(OidFunctionCall1(prosupport, + PointerGetDatum(&req))); + + if (sresult != NIL) + { + IndexClause *iclause = makeNode(IndexClause); + List *indexquals = NIL; + ListCell *lc; + + /* + * The support function API says it should just give back bare + * clauses, so here we must wrap each one in a RestrictInfo. + */ + foreach(lc, sresult) + { + Expr *clause = (Expr *) lfirst(lc); + + indexquals = lappend(indexquals, make_simple_restrictinfo(clause)); + } + + iclause->rinfo = rinfo; + iclause->indexquals = indexquals; + iclause->lossy = req.lossy; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + + return iclause; + } + + return NULL; +} + +/* + * match_saopclause_to_indexcol() + * Handles the ScalarArrayOpExpr case for match_clause_to_indexcol(), + * which see for comments. 
+ */ +static IndexClause * +match_saopclause_to_indexcol(RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index) +{ + ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) rinfo->clause; + Node *leftop, + *rightop; + Relids right_relids; + Oid expr_op; + Oid expr_coll; + Index index_relid; + Oid opfamily; + Oid idxcollation; + + /* We only accept ANY clauses, not ALL */ + if (!saop->useOr) + return NULL; + leftop = (Node *) linitial(saop->args); + rightop = (Node *) lsecond(saop->args); + right_relids = pull_varnos(rightop); + expr_op = saop->opno; + expr_coll = saop->inputcollid; + + index_relid = index->rel->relid; + opfamily = index->opfamily[indexcol]; + idxcollation = index->indexcollations[indexcol]; + + /* + * We must have indexkey on the left and a pseudo-constant array argument. + */ + if (match_index_to_operand(leftop, indexcol, index) && + !bms_is_member(index_relid, right_relids) && + !contain_volatile_functions(rightop)) + { + if (IndexCollMatchesExprColl(idxcollation, expr_coll) && + op_in_opfamily(expr_op, opfamily)) + { + IndexClause *iclause = makeNode(IndexClause); + + iclause->rinfo = rinfo; + iclause->indexquals = NIL; + iclause->lossy = false; + iclause->indexcol = indexcol; + iclause->indexcols = NIL; + return iclause; + } + + /* + * We do not currently ask support functions about ScalarArrayOpExprs, + * though in principle we could. + */ } - /* OK if the (commuted) operator is a member of the index's opfamily */ - return op_in_opfamily(expr_op, opfamily); + return NULL; } /* * match_rowcompare_to_indexcol() * Handles the RowCompareExpr case for match_clause_to_indexcol(), * which see for comments. + * + * In this routine we check whether the first column of the row comparison + * matches the target index column. This is sufficient to guarantee that some + * index condition can be constructed from the RowCompareExpr --- the rest + * is handled by expand_indexqual_rowcompare(). 
*/ -static bool -match_rowcompare_to_indexcol(IndexOptInfo *index, +static IndexClause * +match_rowcompare_to_indexcol(RestrictInfo *rinfo, int indexcol, - Oid opfamily, - Oid idxcollation, - RowCompareExpr *clause) + IndexOptInfo *index) { - Index index_relid = index->rel->relid; + RowCompareExpr *clause = (RowCompareExpr *) rinfo->clause; + Index index_relid; + Oid opfamily; + Oid idxcollation; Node *leftop, *rightop; + bool var_on_left; Oid expr_op; Oid expr_coll; /* Forget it if we're not dealing with a btree index */ if (index->relam != BTREE_AM_OID) - return false; + return NULL; + + index_relid = index->rel->relid; + opfamily = index->opfamily[indexcol]; + idxcollation = index->indexcollations[indexcol]; /* * We could do the matching on the basis of insisting that the opfamily @@ -2566,16 +2885,17 @@ match_rowcompare_to_indexcol(IndexOptInfo *index, /* Collations must match, if relevant */ if (!IndexCollMatchesExprColl(idxcollation, expr_coll)) - return false; + return NULL; /* - * These syntactic tests are the same as in match_clause_to_indexcol() + * These syntactic tests are the same as in match_opclause_to_indexcol() */ if (match_index_to_operand(leftop, indexcol, index) && !bms_is_member(index_relid, pull_varnos(rightop)) && !contain_volatile_functions(rightop)) { /* OK, indexkey is on left */ + var_on_left = true; } else if (match_index_to_operand(rightop, indexcol, index) && !bms_is_member(index_relid, pull_varnos(leftop)) && @@ -2584,10 +2904,11 @@ match_rowcompare_to_indexcol(IndexOptInfo *index, /* indexkey is on right, so commute the operator */ expr_op = get_commutator(expr_op); if (expr_op == InvalidOid) - return false; + return NULL; + var_on_left = false; } else - return false; + return NULL; /* We're good if the operator is the right type of opfamily member */ switch (get_op_opfamily_strategy(expr_op, opfamily)) @@ -2596,26 +2917,266 @@ match_rowcompare_to_indexcol(IndexOptInfo *index, case BTLessEqualStrategyNumber: case 
BTGreaterEqualStrategyNumber: case BTGreaterStrategyNumber: - return true; + return expand_indexqual_rowcompare(rinfo, + indexcol, + index, + expr_op, + var_on_left); } - return false; + return NULL; } - -/**************************************************************************** - * ---- ROUTINES TO CHECK ORDERING OPERATORS ---- - ****************************************************************************/ - /* - * match_pathkeys_to_index - * Test whether an index can produce output ordered according to the - * given pathkeys using "ordering operators". + * expand_indexqual_rowcompare --- expand a single indexqual condition + * that is a RowCompareExpr * - * If it can, return a list of suitable ORDER BY expressions, each of the form - * "indexedcol operator pseudoconstant", along with an integer list of the - * index column numbers (zero based) that each clause would be used with. - * NIL lists are returned if the ordering is not achievable this way. + * It's already known that the first column of the row comparison matches + * the specified column of the index. We can use additional columns of the + * row comparison as index qualifications, so long as they match the index + * in the "same direction", ie, the indexkeys are all on the same side of the + * clause and the operators are all the same-type members of the opfamilies. + * + * If all the columns of the RowCompareExpr match in this way, we just use it + * as-is, except for possibly commuting it to put the indexkeys on the left. + * + * Otherwise, we build a shortened RowCompareExpr (if more than one + * column matches) or a simple OpExpr (if the first-column match is all + * there is). In these cases the modified clause is always "<=" or ">=" + * even when the original was "<" or ">" --- this is necessary to match all + * the rows that could match the original. (We are building a lossy version + * of the row comparison when we do this, so we set lossy = true.) 
+ * + * Note: this is really just the last half of match_rowcompare_to_indexcol, + * but we split it out for comprehensibility. + */ +static IndexClause * +expand_indexqual_rowcompare(RestrictInfo *rinfo, + int indexcol, + IndexOptInfo *index, + Oid expr_op, + bool var_on_left) +{ + IndexClause *iclause = makeNode(IndexClause); + RowCompareExpr *clause = (RowCompareExpr *) rinfo->clause; + int op_strategy; + Oid op_lefttype; + Oid op_righttype; + int matching_cols; + List *expr_ops; + List *opfamilies; + List *lefttypes; + List *righttypes; + List *new_ops; + List *var_args; + List *non_var_args; + ListCell *vargs_cell; + ListCell *nargs_cell; + ListCell *opnos_cell; + ListCell *collids_cell; + + iclause->rinfo = rinfo; + iclause->indexcol = indexcol; + + if (var_on_left) + { + var_args = clause->largs; + non_var_args = clause->rargs; + } + else + { + var_args = clause->rargs; + non_var_args = clause->largs; + } + + get_op_opfamily_properties(expr_op, index->opfamily[indexcol], false, + &op_strategy, + &op_lefttype, + &op_righttype); + + /* Initialize returned list of which index columns are used */ + iclause->indexcols = list_make1_int(indexcol); + + /* Build lists of ops, opfamilies and operator datatypes in case needed */ + expr_ops = list_make1_oid(expr_op); + opfamilies = list_make1_oid(index->opfamily[indexcol]); + lefttypes = list_make1_oid(op_lefttype); + righttypes = list_make1_oid(op_righttype); + + /* + * See how many of the remaining columns match some index column in the + * same way. As in match_clause_to_indexcol(), the "other" side of any + * potential index condition is OK as long as it doesn't use Vars from the + * indexed relation. 
+ */ + matching_cols = 1; + vargs_cell = lnext(list_head(var_args)); + nargs_cell = lnext(list_head(non_var_args)); + opnos_cell = lnext(list_head(clause->opnos)); + collids_cell = lnext(list_head(clause->inputcollids)); + + while (vargs_cell != NULL) + { + Node *varop = (Node *) lfirst(vargs_cell); + Node *constop = (Node *) lfirst(nargs_cell); + int i; + + expr_op = lfirst_oid(opnos_cell); + if (!var_on_left) + { + /* indexkey is on right, so commute the operator */ + expr_op = get_commutator(expr_op); + if (expr_op == InvalidOid) + break; /* operator is not usable */ + } + if (bms_is_member(index->rel->relid, pull_varnos(constop))) + break; /* no good, Var on wrong side */ + if (contain_volatile_functions(constop)) + break; /* no good, volatile comparison value */ + + /* + * The Var side can match any key column of the index. + */ + for (i = 0; i < index->nkeycolumns; i++) + { + if (match_index_to_operand(varop, i, index) && + get_op_opfamily_strategy(expr_op, + index->opfamily[i]) == op_strategy && + IndexCollMatchesExprColl(index->indexcollations[i], + lfirst_oid(collids_cell))) + break; + } + if (i >= index->nkeycolumns) + break; /* no match found */ + + /* Add column number to returned list */ + iclause->indexcols = lappend_int(iclause->indexcols, i); + + /* Add operator info to lists */ + get_op_opfamily_properties(expr_op, index->opfamily[i], false, + &op_strategy, + &op_lefttype, + &op_righttype); + expr_ops = lappend_oid(expr_ops, expr_op); + opfamilies = lappend_oid(opfamilies, index->opfamily[i]); + lefttypes = lappend_oid(lefttypes, op_lefttype); + righttypes = lappend_oid(righttypes, op_righttype); + + /* This column matches, keep scanning */ + matching_cols++; + vargs_cell = lnext(vargs_cell); + nargs_cell = lnext(nargs_cell); + opnos_cell = lnext(opnos_cell); + collids_cell = lnext(collids_cell); + } + + /* Result is non-lossy if all columns are usable as index quals */ + iclause->lossy = (matching_cols != list_length(clause->opnos)); + + /* + * We can
use rinfo->clause as-is if we have var on left and it's all + * usable as index quals. + */ + if (var_on_left && !iclause->lossy) + iclause->indexquals = NIL; + else + { + /* + * We have to generate a modified rowcompare (possibly just one + * OpExpr). The painful part of this is changing < to <= or > to >=, + * so deal with that first. + */ + if (!iclause->lossy) + { + /* very easy, just use the commuted operators */ + new_ops = expr_ops; + } + else if (op_strategy == BTLessEqualStrategyNumber || + op_strategy == BTGreaterEqualStrategyNumber) + { + /* easy, just use the same (possibly commuted) operators */ + new_ops = list_truncate(expr_ops, matching_cols); + } + else + { + ListCell *opfamilies_cell; + ListCell *lefttypes_cell; + ListCell *righttypes_cell; + + if (op_strategy == BTLessStrategyNumber) + op_strategy = BTLessEqualStrategyNumber; + else if (op_strategy == BTGreaterStrategyNumber) + op_strategy = BTGreaterEqualStrategyNumber; + else + elog(ERROR, "unexpected strategy number %d", op_strategy); + new_ops = NIL; + forthree(opfamilies_cell, opfamilies, + lefttypes_cell, lefttypes, + righttypes_cell, righttypes) + { + Oid opfam = lfirst_oid(opfamilies_cell); + Oid lefttype = lfirst_oid(lefttypes_cell); + Oid righttype = lfirst_oid(righttypes_cell); + + expr_op = get_opfamily_member(opfam, lefttype, righttype, + op_strategy); + if (!OidIsValid(expr_op)) /* should not happen */ + elog(ERROR, "missing operator %d(%u,%u) in opfamily %u", + op_strategy, lefttype, righttype, opfam); + new_ops = lappend_oid(new_ops, expr_op); + } + } + + /* If we have more than one matching col, create a subset rowcompare */ + if (matching_cols > 1) + { + RowCompareExpr *rc = makeNode(RowCompareExpr); + + rc->rctype = (RowCompareType) op_strategy; + rc->opnos = new_ops; + rc->opfamilies = list_truncate(list_copy(clause->opfamilies), + matching_cols); + rc->inputcollids = list_truncate(list_copy(clause->inputcollids), + matching_cols); + rc->largs = 
list_truncate(copyObject(var_args), + matching_cols); + rc->rargs = list_truncate(copyObject(non_var_args), + matching_cols); + iclause->indexquals = list_make1(make_simple_restrictinfo((Expr *) rc)); + } + else + { + Expr *op; + + /* We don't report an index column list in this case */ + iclause->indexcols = NIL; + + op = make_opclause(linitial_oid(new_ops), BOOLOID, false, + copyObject(linitial(var_args)), + copyObject(linitial(non_var_args)), + InvalidOid, + linitial_oid(clause->inputcollids)); + iclause->indexquals = list_make1(make_simple_restrictinfo(op)); + } + } + + return iclause; +} + + +/**************************************************************************** + * ---- ROUTINES TO CHECK ORDERING OPERATORS ---- + ****************************************************************************/ + +/* + * match_pathkeys_to_index + * Test whether an index can produce output ordered according to the + * given pathkeys using "ordering operators". + * + * If it can, return a list of suitable ORDER BY expressions, each of the form + * "indexedcol operator pseudoconstant", along with an integer list of the + * index column numbers (zero based) that each clause would be used with. + * NIL lists are returned if the ordering is not achievable this way. * * On success, the result list is ordered by pathkeys, and in fact is * one-to-one with the requested pathkeys. 
@@ -3233,7 +3794,7 @@ indexcol_is_bool_constant_for_query(IndexOptInfo *index, int indexcol) continue; /* See if we can match the clause's expression to the index column */ - if (match_boolean_index_clause((Node *) rinfo->clause, indexcol, index)) + if (match_boolean_index_clause(rinfo, indexcol, index)) return true; } @@ -3323,1057 +3884,3 @@ match_index_to_operand(Node *operand, return false; } - -/**************************************************************************** - * ---- ROUTINES FOR "SPECIAL" INDEXABLE OPERATORS ---- - ****************************************************************************/ - -/* - * These routines handle special optimization of operators that can be - * used with index scans even though they are not known to the executor's - * indexscan machinery. The key idea is that these operators allow us - * to derive approximate indexscan qual clauses, such that any tuples - * that pass the operator clause itself must also satisfy the simpler - * indexscan condition(s). Then we can use the indexscan machinery - * to avoid scanning as much of the table as we'd otherwise have to, - * while applying the original operator as a qpqual condition to ensure - * we deliver only the tuples we want. (In essence, we're using a regular - * index as if it were a lossy index.) - * - * An example of what we're doing is - * textfield LIKE 'abc%' - * from which we can generate the indexscanable conditions - * textfield >= 'abc' AND textfield < 'abd' - * which allow efficient scanning of an index on textfield. - * (In reality, character set and collation issues make the transformation - * from LIKE to indexscan limits rather harder than one might think ... - * but that's the basic idea.) - * - * Another thing that we do with this machinery is to provide special - * smarts for "boolean" indexes (that is, indexes on boolean columns - * that support boolean equality). 
We can transform a plain reference - * to the indexkey into "indexkey = true", or "NOT indexkey" into - * "indexkey = false", so as to make the expression indexable using the - * regular index operators. (As of Postgres 8.1, we must do this here - * because constant simplification does the reverse transformation; - * without this code there'd be no way to use such an index at all.) - * - * Three routines are provided here: - * - * match_special_index_operator() is just an auxiliary function for - * match_clause_to_indexcol(); after the latter fails to recognize a - * restriction opclause's operator as a member of an index's opfamily, - * it asks match_special_index_operator() whether the clause should be - * considered an indexqual anyway. - * - * match_boolean_index_clause() similarly detects clauses that can be - * converted into boolean equality operators. - * - * expand_indexqual_conditions() converts a RestrictInfo node - * into an IndexClause, which contains clauses - * that the executor can actually handle. For operators that are members of - * the index's opfamily this transformation is a no-op, but clauses recognized - * by match_special_index_operator() or match_boolean_index_clause() must be - * converted into one or more "regular" indexqual conditions. - */ - -/* - * match_boolean_index_clause - * Recognize restriction clauses that can be matched to a boolean index. - * - * This should be called only when IsBooleanOpfamily() recognizes the - * index's operator family. We check to see if the clause matches the - * index's key. - */ -static bool -match_boolean_index_clause(Node *clause, - int indexcol, - IndexOptInfo *index) -{ - /* Direct match? */ - if (match_index_to_operand(clause, indexcol, index)) - return true; - /* NOT clause? 
*/ - if (is_notclause(clause)) - { - if (match_index_to_operand((Node *) get_notclausearg((Expr *) clause), - indexcol, index)) - return true; - } - - /* - * Since we only consider clauses at top level of WHERE, we can convert - * indexkey IS TRUE and indexkey IS FALSE to index searches as well. The - * different meaning for NULL isn't important. - */ - else if (clause && IsA(clause, BooleanTest)) - { - BooleanTest *btest = (BooleanTest *) clause; - - if (btest->booltesttype == IS_TRUE || - btest->booltesttype == IS_FALSE) - if (match_index_to_operand((Node *) btest->arg, - indexcol, index)) - return true; - } - return false; -} - -/* - * match_special_index_operator - * Recognize restriction clauses that can be used to generate - * additional indexscanable qualifications. - * - * The given clause is already known to be a binary opclause having - * the form (indexkey OP pseudoconst) or (pseudoconst OP indexkey), - * but the OP proved not to be one of the index's opfamily operators. - * Return 'true' if we can do something with it anyway. - */ -static bool -match_special_index_operator(Expr *clause, Oid opfamily, Oid idxcollation, - bool indexkey_on_left) -{ - bool isIndexable = false; - Node *rightop; - Oid expr_op; - Oid expr_coll; - Const *patt; - Const *prefix = NULL; - Pattern_Prefix_Status pstatus = Pattern_Prefix_None; - - /* - * Currently, all known special operators require the indexkey on the - * left, but this test could be pushed into the switch statement if some - * are added that do not... 
- */ - if (!indexkey_on_left) - return false; - - /* we know these will succeed */ - rightop = get_rightop(clause); - expr_op = ((OpExpr *) clause)->opno; - expr_coll = ((OpExpr *) clause)->inputcollid; - - /* again, required for all current special ops: */ - if (!IsA(rightop, Const) || - ((Const *) rightop)->constisnull) - return false; - patt = (Const *) rightop; - - switch (expr_op) - { - case OID_TEXT_LIKE_OP: - case OID_BPCHAR_LIKE_OP: - case OID_NAME_LIKE_OP: - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like, expr_coll, - &prefix, NULL); - isIndexable = (pstatus != Pattern_Prefix_None); - break; - - case OID_BYTEA_LIKE_OP: - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like, expr_coll, - &prefix, NULL); - isIndexable = (pstatus != Pattern_Prefix_None); - break; - - case OID_TEXT_ICLIKE_OP: - case OID_BPCHAR_ICLIKE_OP: - case OID_NAME_ICLIKE_OP: - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like_IC, expr_coll, - &prefix, NULL); - isIndexable = (pstatus != Pattern_Prefix_None); - break; - - case OID_TEXT_REGEXEQ_OP: - case OID_BPCHAR_REGEXEQ_OP: - case OID_NAME_REGEXEQ_OP: - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex, expr_coll, - &prefix, NULL); - isIndexable = (pstatus != Pattern_Prefix_None); - break; - - case OID_TEXT_ICREGEXEQ_OP: - case OID_BPCHAR_ICREGEXEQ_OP: - case OID_NAME_ICREGEXEQ_OP: - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex_IC, expr_coll, - &prefix, NULL); - isIndexable = (pstatus != Pattern_Prefix_None); - break; - - case OID_INET_SUB_OP: - case OID_INET_SUBEQ_OP: - isIndexable = true; - break; - } - - if (prefix) - { - pfree(DatumGetPointer(prefix->constvalue)); - pfree(prefix); - } - - /* done if the expression doesn't look indexable */ - if (!isIndexable) - return 
false; - - /* - * Must also check that index's opfamily supports the operators we will - * want to apply. (A hash index, for example, will not support ">=".) - * Currently, only btree and spgist support the operators we need. - * - * Note: actually, in the Pattern_Prefix_Exact case, we only need "=" so a - * hash index would work. Currently it doesn't seem worth checking for - * that, however. - * - * We insist on the opfamily being the specific one we expect, else we'd - * do the wrong thing if someone were to make a reverse-sort opfamily with - * the same operators. - * - * The non-pattern opclasses will not sort the way we need in most non-C - * locales. We can use such an index anyway for an exact match (simple - * equality), but not for prefix-match cases. Note that here we are - * looking at the index's collation, not the expression's collation -- - * this test is *not* dependent on the LIKE/regex operator's collation. - */ - switch (expr_op) - { - case OID_TEXT_LIKE_OP: - case OID_TEXT_ICLIKE_OP: - case OID_TEXT_REGEXEQ_OP: - case OID_TEXT_ICREGEXEQ_OP: - case OID_NAME_LIKE_OP: - case OID_NAME_ICLIKE_OP: - case OID_NAME_REGEXEQ_OP: - case OID_NAME_ICREGEXEQ_OP: - isIndexable = - (opfamily == TEXT_PATTERN_BTREE_FAM_OID) || - (opfamily == TEXT_SPGIST_FAM_OID) || - (opfamily == TEXT_BTREE_FAM_OID && - (pstatus == Pattern_Prefix_Exact || - lc_collate_is_c(idxcollation))); - break; - - case OID_BPCHAR_LIKE_OP: - case OID_BPCHAR_ICLIKE_OP: - case OID_BPCHAR_REGEXEQ_OP: - case OID_BPCHAR_ICREGEXEQ_OP: - isIndexable = - (opfamily == BPCHAR_PATTERN_BTREE_FAM_OID) || - (opfamily == BPCHAR_BTREE_FAM_OID && - (pstatus == Pattern_Prefix_Exact || - lc_collate_is_c(idxcollation))); - break; - - case OID_BYTEA_LIKE_OP: - isIndexable = (opfamily == BYTEA_BTREE_FAM_OID); - break; - - case OID_INET_SUB_OP: - case OID_INET_SUBEQ_OP: - isIndexable = (opfamily == NETWORK_BTREE_FAM_OID); - break; - } - - return isIndexable; -} - -/* - * expand_indexqual_conditions - * Given a 
RestrictInfo node, create an IndexClause. - * - * Standard qual clauses (those in the index's opfamily) are passed through - * unchanged. Boolean clauses and "special" index operators are expanded - * into clauses that the indexscan machinery will know what to do with. - * RowCompare clauses are simplified if necessary to create a clause that is - * fully checkable by the index. - */ -static IndexClause * -expand_indexqual_conditions(IndexOptInfo *index, - int indexcol, - RestrictInfo *rinfo) -{ - IndexClause *iclause = makeNode(IndexClause); - List *indexquals = NIL; - - iclause->rinfo = rinfo; - iclause->lossy = false; /* might get changed below */ - iclause->indexcol = indexcol; - iclause->indexcols = NIL; /* might get changed below */ - - { - Expr *clause = rinfo->clause; - Oid curFamily; - Oid curCollation; - - Assert(indexcol < index->nkeycolumns); - - curFamily = index->opfamily[indexcol]; - curCollation = index->indexcollations[indexcol]; - - /* First check for boolean cases */ - if (IsBooleanOpfamily(curFamily)) - { - Expr *boolqual; - - boolqual = expand_boolean_index_clause((Node *) clause, - indexcol, - index); - if (boolqual) - { - iclause->indexquals = - list_make1(make_simple_restrictinfo(boolqual)); - return iclause; - } - } - - /* - * Else it must be an opclause (usual case), ScalarArrayOp, - * RowCompare, or NullTest - */ - if (is_opclause(clause)) - { - /* - * Check to see if the indexkey is on the right; if so, commute - * the clause. The indexkey should be the side that refers to - * (only) the base relation. - */ - if (!bms_equal(rinfo->left_relids, index->rel->relids)) - { - Oid opno = ((OpExpr *) clause)->opno; - RestrictInfo *newrinfo; - - newrinfo = commute_restrictinfo(rinfo, - get_commutator(opno)); - - /* - * For now, assume it couldn't be any case that requires - * expansion. (This is OK for the current capabilities of - * expand_indexqual_opclause, but we'll need to remove the - * restriction when we open this up for extensions.) 
- */ - indexquals = list_make1(newrinfo); - } - else - indexquals = expand_indexqual_opclause(rinfo, - curFamily, - curCollation, - &iclause->lossy); - } - else if (IsA(clause, ScalarArrayOpExpr)) - { - /* no extra work at this time */ - } - else if (IsA(clause, RowCompareExpr)) - { - RestrictInfo *newrinfo; - - newrinfo = expand_indexqual_rowcompare(rinfo, - index, - indexcol, - &iclause->indexcols, - &iclause->lossy); - if (newrinfo != rinfo) - { - /* We need to report a derived expression */ - indexquals = list_make1(newrinfo); - } - } - else if (IsA(clause, NullTest)) - { - Assert(index->amsearchnulls); - } - else - elog(ERROR, "unsupported indexqual type: %d", - (int) nodeTag(clause)); - } - - iclause->indexquals = indexquals; - return iclause; -} - -/* - * expand_boolean_index_clause - * Convert a clause recognized by match_boolean_index_clause into - * a boolean equality operator clause. - * - * Returns NULL if the clause isn't a boolean index qual. - */ -static Expr * -expand_boolean_index_clause(Node *clause, - int indexcol, - IndexOptInfo *index) -{ - /* Direct match? */ - if (match_index_to_operand(clause, indexcol, index)) - { - /* convert to indexkey = TRUE */ - return make_opclause(BooleanEqualOperator, BOOLOID, false, - (Expr *) clause, - (Expr *) makeBoolConst(true, false), - InvalidOid, InvalidOid); - } - /* NOT clause? 
*/ - if (is_notclause(clause)) - { - Node *arg = (Node *) get_notclausearg((Expr *) clause); - - /* It must have matched the indexkey */ - Assert(match_index_to_operand(arg, indexcol, index)); - /* convert to indexkey = FALSE */ - return make_opclause(BooleanEqualOperator, BOOLOID, false, - (Expr *) arg, - (Expr *) makeBoolConst(false, false), - InvalidOid, InvalidOid); - } - if (clause && IsA(clause, BooleanTest)) - { - BooleanTest *btest = (BooleanTest *) clause; - Node *arg = (Node *) btest->arg; - - /* It must have matched the indexkey */ - Assert(match_index_to_operand(arg, indexcol, index)); - if (btest->booltesttype == IS_TRUE) - { - /* convert to indexkey = TRUE */ - return make_opclause(BooleanEqualOperator, BOOLOID, false, - (Expr *) arg, - (Expr *) makeBoolConst(true, false), - InvalidOid, InvalidOid); - } - if (btest->booltesttype == IS_FALSE) - { - /* convert to indexkey = FALSE */ - return make_opclause(BooleanEqualOperator, BOOLOID, false, - (Expr *) arg, - (Expr *) makeBoolConst(false, false), - InvalidOid, InvalidOid); - } - /* Oops */ - Assert(false); - } - - return NULL; -} - -/* - * expand_indexqual_opclause --- expand a single indexqual condition - * that is an operator clause - * - * The input is a single RestrictInfo, the output a list of RestrictInfos, - * or NIL if the RestrictInfo's clause can be used as-is. - * - * In the base case this is just "return NIL", but we have to be prepared to - * expand special cases that were accepted by match_special_index_operator(). 
- */ -static List * -expand_indexqual_opclause(RestrictInfo *rinfo, Oid opfamily, Oid idxcollation, - bool *lossy) -{ - Expr *clause = rinfo->clause; - - /* we know these will succeed */ - Node *leftop = get_leftop(clause); - Node *rightop = get_rightop(clause); - Oid expr_op = ((OpExpr *) clause)->opno; - Oid expr_coll = ((OpExpr *) clause)->inputcollid; - Const *patt = (Const *) rightop; - Const *prefix = NULL; - Pattern_Prefix_Status pstatus; - - /* - * LIKE and regex operators are not members of any btree index opfamily, - * but they can be members of opfamilies for more exotic index types such - * as GIN. Therefore, we should only do expansion if the operator is - * actually not in the opfamily. But checking that requires a syscache - * lookup, so it's best to first see if the operator is one we are - * interested in. - */ - switch (expr_op) - { - case OID_TEXT_LIKE_OP: - case OID_BPCHAR_LIKE_OP: - case OID_NAME_LIKE_OP: - case OID_BYTEA_LIKE_OP: - if (!op_in_opfamily(expr_op, opfamily)) - { - *lossy = true; - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like, expr_coll, - &prefix, NULL); - return prefix_quals(leftop, opfamily, idxcollation, prefix, pstatus); - } - break; - - case OID_TEXT_ICLIKE_OP: - case OID_BPCHAR_ICLIKE_OP: - case OID_NAME_ICLIKE_OP: - if (!op_in_opfamily(expr_op, opfamily)) - { - *lossy = true; - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Like_IC, expr_coll, - &prefix, NULL); - return prefix_quals(leftop, opfamily, idxcollation, prefix, pstatus); - } - break; - - case OID_TEXT_REGEXEQ_OP: - case OID_BPCHAR_REGEXEQ_OP: - case OID_NAME_REGEXEQ_OP: - if (!op_in_opfamily(expr_op, opfamily)) - { - *lossy = true; - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex, expr_coll, - &prefix, NULL); - return prefix_quals(leftop, opfamily, idxcollation, prefix, pstatus); - } - break; - - case OID_TEXT_ICREGEXEQ_OP: - 
case OID_BPCHAR_ICREGEXEQ_OP: - case OID_NAME_ICREGEXEQ_OP: - if (!op_in_opfamily(expr_op, opfamily)) - { - *lossy = true; - /* the right-hand const is type text for all of these */ - pstatus = pattern_fixed_prefix(patt, Pattern_Type_Regex_IC, expr_coll, - &prefix, NULL); - return prefix_quals(leftop, opfamily, idxcollation, prefix, pstatus); - } - break; - - case OID_INET_SUB_OP: - case OID_INET_SUBEQ_OP: - if (!op_in_opfamily(expr_op, opfamily)) - { - *lossy = true; - return network_prefix_quals(leftop, expr_op, opfamily, - patt->constvalue); - } - break; - } - - /* Default case: the clause can be used as-is. */ - *lossy = false; - return NIL; -} - -/* - * expand_indexqual_rowcompare --- expand a single indexqual condition - * that is a RowCompareExpr - * - * It's already known that the first column of the row comparison matches - * the specified column of the index. We can use additional columns of the - * row comparison as index qualifications, so long as they match the index - * in the "same direction", ie, the indexkeys are all on the same side of the - * clause and the operators are all the same-type members of the opfamilies. - * - * If all the columns of the RowCompareExpr match in this way, we just use it - * as-is, except for possibly commuting it to put the indexkeys on the left. - * - * Otherwise, we build a shortened RowCompareExpr (if more than one - * column matches) or a simple OpExpr (if the first-column match is all - * there is). In these cases the modified clause is always "<=" or ">=" - * even when the original was "<" or ">" --- this is necessary to match all - * the rows that could match the original. (We are building a lossy version - * of the row comparison when we do this, so we set *lossy = true.) - * - * *indexcolnos receives an integer list of the index column numbers (zero - * based) used in the resulting expression. We have to pass that back - * because createplan.c will need it. 
- */ -static RestrictInfo * -expand_indexqual_rowcompare(RestrictInfo *rinfo, - IndexOptInfo *index, - int indexcol, - List **indexcolnos, - bool *lossy) -{ - RowCompareExpr *clause = castNode(RowCompareExpr, rinfo->clause); - bool var_on_left; - int op_strategy; - Oid op_lefttype; - Oid op_righttype; - int matching_cols; - Oid expr_op; - List *expr_ops; - List *opfamilies; - List *lefttypes; - List *righttypes; - List *new_ops; - List *var_args; - List *non_var_args; - ListCell *vargs_cell; - ListCell *nargs_cell; - ListCell *opnos_cell; - ListCell *collids_cell; - - /* We have to figure out (again) how the first col matches */ - var_on_left = match_index_to_operand((Node *) linitial(clause->largs), - indexcol, index); - Assert(var_on_left || - match_index_to_operand((Node *) linitial(clause->rargs), - indexcol, index)); - - if (var_on_left) - { - var_args = clause->largs; - non_var_args = clause->rargs; - } - else - { - var_args = clause->rargs; - non_var_args = clause->largs; - } - - expr_op = linitial_oid(clause->opnos); - if (!var_on_left) - expr_op = get_commutator(expr_op); - get_op_opfamily_properties(expr_op, index->opfamily[indexcol], false, - &op_strategy, - &op_lefttype, - &op_righttype); - - /* Initialize returned list of which index columns are used */ - *indexcolnos = list_make1_int(indexcol); - - /* Build lists of ops, opfamilies and operator datatypes in case needed */ - expr_ops = list_make1_oid(expr_op); - opfamilies = list_make1_oid(index->opfamily[indexcol]); - lefttypes = list_make1_oid(op_lefttype); - righttypes = list_make1_oid(op_righttype); - - /* - * See how many of the remaining columns match some index column in the - * same way. As in match_clause_to_indexcol(), the "other" side of any - * potential index condition is OK as long as it doesn't use Vars from the - * indexed relation. 
- */ - matching_cols = 1; - vargs_cell = lnext(list_head(var_args)); - nargs_cell = lnext(list_head(non_var_args)); - opnos_cell = lnext(list_head(clause->opnos)); - collids_cell = lnext(list_head(clause->inputcollids)); - - while (vargs_cell != NULL) - { - Node *varop = (Node *) lfirst(vargs_cell); - Node *constop = (Node *) lfirst(nargs_cell); - int i; - - expr_op = lfirst_oid(opnos_cell); - if (!var_on_left) - { - /* indexkey is on right, so commute the operator */ - expr_op = get_commutator(expr_op); - if (expr_op == InvalidOid) - break; /* operator is not usable */ - } - if (bms_is_member(index->rel->relid, pull_varnos(constop))) - break; /* no good, Var on wrong side */ - if (contain_volatile_functions(constop)) - break; /* no good, volatile comparison value */ - - /* - * The Var side can match any column of the index. - */ - for (i = 0; i < index->nkeycolumns; i++) - { - if (match_index_to_operand(varop, i, index) && - get_op_opfamily_strategy(expr_op, - index->opfamily[i]) == op_strategy && - IndexCollMatchesExprColl(index->indexcollations[i], - lfirst_oid(collids_cell))) - - break; - } - if (i >= index->ncolumns) - break; /* no match found */ - - /* Add column number to returned list */ - *indexcolnos = lappend_int(*indexcolnos, i); - - /* Add operator info to lists */ - get_op_opfamily_properties(expr_op, index->opfamily[i], false, - &op_strategy, - &op_lefttype, - &op_righttype); - expr_ops = lappend_oid(expr_ops, expr_op); - opfamilies = lappend_oid(opfamilies, index->opfamily[i]); - lefttypes = lappend_oid(lefttypes, op_lefttype); - righttypes = lappend_oid(righttypes, op_righttype); - - /* This column matches, keep scanning */ - matching_cols++; - vargs_cell = lnext(vargs_cell); - nargs_cell = lnext(nargs_cell); - opnos_cell = lnext(opnos_cell); - collids_cell = lnext(collids_cell); - } - - /* Result is non-lossy if all columns are usable as index quals */ - *lossy = (matching_cols != list_length(clause->opnos)); - - /* - * Return clause as-is if we 
have var on left and it's all usable as index - * quals - */ - if (var_on_left && !*lossy) - return rinfo; - - /* - * We have to generate a modified rowcompare (possibly just one OpExpr). - * The painful part of this is changing < to <= or > to >=, so deal with - * that first. - */ - if (!*lossy) - { - /* very easy, just use the commuted operators */ - new_ops = expr_ops; - } - else if (op_strategy == BTLessEqualStrategyNumber || - op_strategy == BTGreaterEqualStrategyNumber) - { - /* easy, just use the same (possibly commuted) operators */ - new_ops = list_truncate(expr_ops, matching_cols); - } - else - { - ListCell *opfamilies_cell; - ListCell *lefttypes_cell; - ListCell *righttypes_cell; - - if (op_strategy == BTLessStrategyNumber) - op_strategy = BTLessEqualStrategyNumber; - else if (op_strategy == BTGreaterStrategyNumber) - op_strategy = BTGreaterEqualStrategyNumber; - else - elog(ERROR, "unexpected strategy number %d", op_strategy); - new_ops = NIL; - forthree(opfamilies_cell, opfamilies, - lefttypes_cell, lefttypes, - righttypes_cell, righttypes) - { - Oid opfam = lfirst_oid(opfamilies_cell); - Oid lefttype = lfirst_oid(lefttypes_cell); - Oid righttype = lfirst_oid(righttypes_cell); - - expr_op = get_opfamily_member(opfam, lefttype, righttype, - op_strategy); - if (!OidIsValid(expr_op)) /* should not happen */ - elog(ERROR, "missing operator %d(%u,%u) in opfamily %u", - op_strategy, lefttype, righttype, opfam); - new_ops = lappend_oid(new_ops, expr_op); - } - } - - /* If we have more than one matching col, create a subset rowcompare */ - if (matching_cols > 1) - { - RowCompareExpr *rc = makeNode(RowCompareExpr); - - rc->rctype = (RowCompareType) op_strategy; - rc->opnos = new_ops; - rc->opfamilies = list_truncate(list_copy(clause->opfamilies), - matching_cols); - rc->inputcollids = list_truncate(list_copy(clause->inputcollids), - matching_cols); - rc->largs = list_truncate(copyObject(var_args), - matching_cols); - rc->rargs = 
list_truncate(copyObject(non_var_args), - matching_cols); - return make_simple_restrictinfo((Expr *) rc); - } - else - { - Expr *op; - - /* We don't report an index column list in this case */ - *indexcolnos = NIL; - - op = make_opclause(linitial_oid(new_ops), BOOLOID, false, - copyObject(linitial(var_args)), - copyObject(linitial(non_var_args)), - InvalidOid, - linitial_oid(clause->inputcollids)); - return make_simple_restrictinfo(op); - } -} - -/* - * Given a fixed prefix that all the "leftop" values must have, - * generate suitable indexqual condition(s). opfamily is the index - * operator family; we use it to deduce the appropriate comparison - * operators and operand datatypes. collation is the input collation to use. - */ -static List * -prefix_quals(Node *leftop, Oid opfamily, Oid collation, - Const *prefix_const, Pattern_Prefix_Status pstatus) -{ - List *result; - Oid ldatatype = exprType(leftop); - Oid rdatatype; - Oid oproid; - Expr *expr; - FmgrInfo ltproc; - Const *greaterstr; - - Assert(pstatus != Pattern_Prefix_None); - - switch (opfamily) - { - case TEXT_BTREE_FAM_OID: - case TEXT_PATTERN_BTREE_FAM_OID: - case TEXT_SPGIST_FAM_OID: - rdatatype = TEXTOID; - break; - - case BPCHAR_BTREE_FAM_OID: - case BPCHAR_PATTERN_BTREE_FAM_OID: - rdatatype = BPCHAROID; - break; - - case BYTEA_BTREE_FAM_OID: - rdatatype = BYTEAOID; - break; - - default: - /* shouldn't get here */ - elog(ERROR, "unexpected opfamily: %u", opfamily); - return NIL; - } - - /* - * If necessary, coerce the prefix constant to the right type. The given - * prefix constant is either text or bytea type. 
- */ - if (prefix_const->consttype != rdatatype) - { - char *prefix; - - switch (prefix_const->consttype) - { - case TEXTOID: - prefix = TextDatumGetCString(prefix_const->constvalue); - break; - case BYTEAOID: - prefix = DatumGetCString(DirectFunctionCall1(byteaout, - prefix_const->constvalue)); - break; - default: - elog(ERROR, "unexpected const type: %u", - prefix_const->consttype); - return NIL; - } - prefix_const = string_to_const(prefix, rdatatype); - pfree(prefix); - } - - /* - * If we found an exact-match pattern, generate an "=" indexqual. - */ - if (pstatus == Pattern_Prefix_Exact) - { - oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, - BTEqualStrategyNumber); - if (oproid == InvalidOid) - elog(ERROR, "no = operator for opfamily %u", opfamily); - expr = make_opclause(oproid, BOOLOID, false, - (Expr *) leftop, (Expr *) prefix_const, - InvalidOid, collation); - result = list_make1(make_simple_restrictinfo(expr)); - return result; - } - - /* - * Otherwise, we have a nonempty required prefix of the values. - * - * We can always say "x >= prefix". - */ - oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, - BTGreaterEqualStrategyNumber); - if (oproid == InvalidOid) - elog(ERROR, "no >= operator for opfamily %u", opfamily); - expr = make_opclause(oproid, BOOLOID, false, - (Expr *) leftop, (Expr *) prefix_const, - InvalidOid, collation); - result = list_make1(make_simple_restrictinfo(expr)); - - /*------- - * If we can create a string larger than the prefix, we can say - * "x < greaterstr". NB: we rely on make_greater_string() to generate - * a guaranteed-greater string, not just a probably-greater string. - * In general this is only guaranteed in C locale, so we'd better be - * using a C-locale index collation. 
- *------- - */ - oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, - BTLessStrategyNumber); - if (oproid == InvalidOid) - elog(ERROR, "no < operator for opfamily %u", opfamily); - fmgr_info(get_opcode(oproid), &ltproc); - greaterstr = make_greater_string(prefix_const, &ltproc, collation); - if (greaterstr) - { - expr = make_opclause(oproid, BOOLOID, false, - (Expr *) leftop, (Expr *) greaterstr, - InvalidOid, collation); - result = lappend(result, make_simple_restrictinfo(expr)); - } - - return result; -} - -/* - * Given a leftop and a rightop, and an inet-family sup/sub operator, - * generate suitable indexqual condition(s). expr_op is the original - * operator, and opfamily is the index opfamily. - */ -static List * -network_prefix_quals(Node *leftop, Oid expr_op, Oid opfamily, Datum rightop) -{ - bool is_eq; - Oid datatype; - Oid opr1oid; - Oid opr2oid; - Datum opr1right; - Datum opr2right; - List *result; - Expr *expr; - - switch (expr_op) - { - case OID_INET_SUB_OP: - datatype = INETOID; - is_eq = false; - break; - case OID_INET_SUBEQ_OP: - datatype = INETOID; - is_eq = true; - break; - default: - elog(ERROR, "unexpected operator: %u", expr_op); - return NIL; - } - - /* - * create clause "key >= network_scan_first( rightop )", or ">" if the - * operator disallows equality.
- */ - if (is_eq) - { - opr1oid = get_opfamily_member(opfamily, datatype, datatype, - BTGreaterEqualStrategyNumber); - if (opr1oid == InvalidOid) - elog(ERROR, "no >= operator for opfamily %u", opfamily); - } - else - { - opr1oid = get_opfamily_member(opfamily, datatype, datatype, - BTGreaterStrategyNumber); - if (opr1oid == InvalidOid) - elog(ERROR, "no > operator for opfamily %u", opfamily); - } - - opr1right = network_scan_first(rightop); - - expr = make_opclause(opr1oid, BOOLOID, false, - (Expr *) leftop, - (Expr *) makeConst(datatype, -1, - InvalidOid, /* not collatable */ - -1, opr1right, - false, false), - InvalidOid, InvalidOid); - result = list_make1(make_simple_restrictinfo(expr)); - - /* create clause "key <= network_scan_last( rightop )" */ - - opr2oid = get_opfamily_member(opfamily, datatype, datatype, - BTLessEqualStrategyNumber); - if (opr2oid == InvalidOid) - elog(ERROR, "no <= operator for opfamily %u", opfamily); - - opr2right = network_scan_last(rightop); - - expr = make_opclause(opr2oid, BOOLOID, false, - (Expr *) leftop, - (Expr *) makeConst(datatype, -1, - InvalidOid, /* not collatable */ - -1, opr2right, - false, false), - InvalidOid, InvalidOid); - result = lappend(result, make_simple_restrictinfo(expr)); - - return result; -} - -/* - * Handy subroutines for match_special_index_operator() and friends. - */ - -/* - * Generate a Datum of the appropriate type from a C string. - * Note that all of the supported types are pass-by-ref, so the - * returned value should be pfree'd if no longer needed. - */ -static Datum -string_to_datum(const char *str, Oid datatype) -{ - /* - * We cheat a little by assuming that CStringGetTextDatum() will do for - * bpchar and varchar constants too... 
- */ - if (datatype == NAMEOID) - return DirectFunctionCall1(namein, CStringGetDatum(str)); - else if (datatype == BYTEAOID) - return DirectFunctionCall1(byteain, CStringGetDatum(str)); - else - return CStringGetTextDatum(str); -} - -/* - * Generate a Const node of the appropriate type from a C string. - */ -static Const * -string_to_const(const char *str, Oid datatype) -{ - Datum conval = string_to_datum(str, datatype); - Oid collation; - int constlen; - - /* - * We only need to support a few datatypes here, so hard-wire properties - * instead of incurring the expense of catalog lookups. - */ - switch (datatype) - { - case TEXTOID: - case VARCHAROID: - case BPCHAROID: - collation = DEFAULT_COLLATION_OID; - constlen = -1; - break; - - case NAMEOID: - collation = C_COLLATION_OID; - constlen = NAMEDATALEN; - break; - - case BYTEAOID: - collation = InvalidOid; - constlen = -1; - break; - - default: - elog(ERROR, "unexpected datatype in string_to_const: %u", - datatype); - return NULL; - } - - return makeConst(datatype, -1, collation, constlen, - conval, false, false); -} diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index 20eead1..53e1bf6 100644 --- a/src/backend/utils/adt/Makefile +++ b/src/backend/utils/adt/Makefile @@ -17,7 +17,8 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \ float.o format_type.o formatting.o genfile.o \ geo_ops.o geo_selfuncs.o geo_spgist.o inet_cidr_ntop.o inet_net_pton.o \ int.o int8.o json.o jsonb.o jsonb_gin.o jsonb_op.o jsonb_util.o \ - jsonfuncs.o like.o lockfuncs.o mac.o mac8.o misc.o name.o \ + jsonfuncs.o like.o likesupport.o lockfuncs.o \ + mac.o mac8.o misc.o name.o \ network.o network_gist.o network_selfuncs.o network_spgist.o \ numeric.o numutils.o oid.o oracle_compat.o \ orderedsetaggs.o partitionfuncs.o pg_locale.o pg_lsn.o \ diff --git a/src/backend/utils/adt/likesupport.c b/src/backend/utils/adt/likesupport.c index e69de29..fa36310 100644 --- 
a/src/backend/utils/adt/likesupport.c +++ b/src/backend/utils/adt/likesupport.c @@ -0,0 +1,516 @@ +/*------------------------------------------------------------------------- + * + * likesupport.c + * Planner support functions for LIKE, regex, and related operators. + * + * These routines handle special optimization of operators that can be + * used with index scans even though they are not known to the executor's + * indexscan machinery. The key idea is that these operators allow us + * to derive approximate indexscan qual clauses, such that any tuples + * that pass the operator clause itself must also satisfy the simpler + * indexscan condition(s). Then we can use the indexscan machinery + * to avoid scanning as much of the table as we'd otherwise have to, + * while applying the original operator as a qpqual condition to ensure + * we deliver only the tuples we want. (In essence, we're using a regular + * index as if it were a lossy index.) + * + * An example of what we're doing is + * textfield LIKE 'abc%' + * from which we can generate the indexscanable conditions + * textfield >= 'abc' AND textfield < 'abd' + * which allow efficient scanning of an index on textfield. + * (In reality, character set and collation issues make the transformation + * from LIKE to indexscan limits rather harder than one might think ... + * but that's the basic idea.) 
+ * + * Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * + * IDENTIFICATION + * src/backend/utils/adt/likesupport.c + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "access/stratnum.h" +#include "catalog/pg_opfamily.h" +#include "catalog/pg_type.h" +#include "nodes/makefuncs.h" +#include "nodes/nodeFuncs.h" +#include "nodes/supportnodes.h" +#include "utils/builtins.h" +#include "utils/fmgroids.h" +#include "utils/lsyscache.h" +#include "utils/pg_locale.h" +#include "utils/selfuncs.h" + + +static Node *like_regex_support(Node *rawreq, Pattern_Type ptype); +static List *match_pattern_prefix(Node *leftop, + Node *rightop, + Pattern_Type ptype, + Oid expr_coll, + Oid opfamily, + Oid indexcollation); +static List *match_network_function(Node *leftop, + Node *rightop, + int indexarg, + Oid funcid, + Oid opfamily); +static List *match_network_subset(Node *leftop, + Node *rightop, + bool is_eq, + Oid opfamily); + + +/* + * Planner support functions for LIKE, regex, and related operators + */ +Datum +textlike_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + + PG_RETURN_POINTER(like_regex_support(rawreq, Pattern_Type_Like)); +} + +Datum +texticlike_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + + PG_RETURN_POINTER(like_regex_support(rawreq, Pattern_Type_Like_IC)); +} + +Datum +textregexeq_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + + PG_RETURN_POINTER(like_regex_support(rawreq, Pattern_Type_Regex)); +} + +Datum +texticregexeq_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + + PG_RETURN_POINTER(like_regex_support(rawreq, Pattern_Type_Regex_IC)); +} + +/* Common code for the above */ +static Node * +like_regex_support(Node *rawreq, Pattern_Type ptype) +{ + Node *ret = NULL; + + if 
(IsA(rawreq, SupportRequestIndexCondition)) + { + /* Try to convert operator/function call to index conditions */ + SupportRequestIndexCondition *req = (SupportRequestIndexCondition *) rawreq; + + /* + * Currently we have no "reverse" match operators with the pattern on + * the left, so we only need consider cases with the indexkey on the + * left. + */ + if (req->indexarg != 0) + return NULL; + + if (is_opclause(req->node)) + { + OpExpr *clause = (OpExpr *) req->node; + + Assert(list_length(clause->args) == 2); + ret = (Node *) + match_pattern_prefix((Node *) linitial(clause->args), + (Node *) lsecond(clause->args), + ptype, + clause->inputcollid, + req->opfamily, + req->indexcollation); + } + else if (is_funcclause(req->node)) /* be paranoid */ + { + FuncExpr *clause = (FuncExpr *) req->node; + + Assert(list_length(clause->args) == 2); + ret = (Node *) + match_pattern_prefix((Node *) linitial(clause->args), + (Node *) lsecond(clause->args), + ptype, + clause->inputcollid, + req->opfamily, + req->indexcollation); + } + } + + return ret; +} + +/* + * Planner support function for network subset/superset operators + */ +Datum +network_subset_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = NULL; + + if (IsA(rawreq, SupportRequestIndexCondition)) + { + /* Try to convert operator/function call to index conditions */ + SupportRequestIndexCondition *req = (SupportRequestIndexCondition *) rawreq; + + if (is_opclause(req->node)) + { + OpExpr *clause = (OpExpr *) req->node; + + Assert(list_length(clause->args) == 2); + ret = (Node *) + match_network_function((Node *) linitial(clause->args), + (Node *) lsecond(clause->args), + req->indexarg, + req->funcid, + req->opfamily); + } + else if (is_funcclause(req->node)) /* be paranoid */ + { + FuncExpr *clause = (FuncExpr *) req->node; + + Assert(list_length(clause->args) == 2); + ret = (Node *) + match_network_function((Node *) linitial(clause->args), + (Node *) lsecond(clause->args), + 
req->indexarg, + req->funcid, + req->opfamily); + } + } + + PG_RETURN_POINTER(ret); +} + + +/* + * match_pattern_prefix + * Try to generate an indexqual for a LIKE or regex operator. + */ +static List * +match_pattern_prefix(Node *leftop, + Node *rightop, + Pattern_Type ptype, + Oid expr_coll, + Oid opfamily, + Oid indexcollation) +{ + List *result; + Const *patt; + Const *prefix; + Pattern_Prefix_Status pstatus; + Oid ldatatype; + Oid rdatatype; + Oid oproid; + Expr *expr; + FmgrInfo ltproc; + Const *greaterstr; + + /* + * Can't do anything with a non-constant or NULL pattern argument. + * + * Note that since we restrict ourselves to cases with a hard constant on + * the RHS, it's a-fortiori a pseudoconstant, and we don't need to worry + * about verifying that. + */ + if (!IsA(rightop, Const) || + ((Const *) rightop)->constisnull) + return NIL; + patt = (Const *) rightop; + + /* + * Try to extract a fixed prefix from the pattern. + */ + pstatus = pattern_fixed_prefix(patt, ptype, expr_coll, + &prefix, NULL); + + /* fail if no fixed prefix */ + if (pstatus == Pattern_Prefix_None) + return NIL; + + /* + * Must also check that index's opfamily supports the operators we will + * want to apply. (A hash index, for example, will not support ">=".) + * Currently, only btree and spgist support the operators we need. + * + * Note: actually, in the Pattern_Prefix_Exact case, we only need "=" so a + * hash index would work. Currently it doesn't seem worth checking for + * that, however. + * + * We insist on the opfamily being one of the specific ones we expect, + * else we'd do the wrong thing if someone were to make a reverse-sort + * opfamily with the same operators. + * + * The non-pattern opclasses will not sort the way we need in most non-C + * locales. We can use such an index anyway for an exact match (simple + * equality), but not for prefix-match cases. 
Note that here we are + * looking at the index's collation, not the expression's collation -- + * this test is *not* dependent on the LIKE/regex operator's collation. + * + * While we're at it, identify the type the comparison constant(s) should + * have, based on the opfamily. + */ + switch (opfamily) + { + case TEXT_BTREE_FAM_OID: + if (!(pstatus == Pattern_Prefix_Exact || + lc_collate_is_c(indexcollation))) + return NIL; + rdatatype = TEXTOID; + break; + + case TEXT_PATTERN_BTREE_FAM_OID: + case TEXT_SPGIST_FAM_OID: + rdatatype = TEXTOID; + break; + + case BPCHAR_BTREE_FAM_OID: + if (!(pstatus == Pattern_Prefix_Exact || + lc_collate_is_c(indexcollation))) + return NIL; + rdatatype = BPCHAROID; + break; + + case BPCHAR_PATTERN_BTREE_FAM_OID: + rdatatype = BPCHAROID; + break; + + case BYTEA_BTREE_FAM_OID: + rdatatype = BYTEAOID; + break; + + default: + return NIL; + } + + /* OK, prepare to create the indexqual(s) */ + ldatatype = exprType(leftop); + + /* + * If necessary, coerce the prefix constant to the right type. The given + * prefix constant is either text or bytea type, therefore the only case + * where we need to do anything is when converting text to bpchar. Those + * two types are binary-compatible, so relabeling the Const node is + * sufficient. + */ + if (prefix->consttype != rdatatype) + { + Assert(prefix->consttype == TEXTOID && + rdatatype == BPCHAROID); + prefix->consttype = rdatatype; + } + + /* + * If we found an exact-match pattern, generate an "=" indexqual. + */ + if (pstatus == Pattern_Prefix_Exact) + { + oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, + BTEqualStrategyNumber); + if (oproid == InvalidOid) + elog(ERROR, "no = operator for opfamily %u", opfamily); + expr = make_opclause(oproid, BOOLOID, false, + (Expr *) leftop, (Expr *) prefix, + InvalidOid, indexcollation); + result = list_make1(expr); + return result; + } + + /* + * Otherwise, we have a nonempty required prefix of the values. 
+ * + * We can always say "x >= prefix". + */ + oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, + BTGreaterEqualStrategyNumber); + if (oproid == InvalidOid) + elog(ERROR, "no >= operator for opfamily %u", opfamily); + expr = make_opclause(oproid, BOOLOID, false, + (Expr *) leftop, (Expr *) prefix, + InvalidOid, indexcollation); + result = list_make1(expr); + + /*------- + * If we can create a string larger than the prefix, we can say + * "x < greaterstr". NB: we rely on make_greater_string() to generate + * a guaranteed-greater string, not just a probably-greater string. + * In general this is only guaranteed in C locale, so we'd better be + * using a C-locale index collation. + *------- + */ + oproid = get_opfamily_member(opfamily, ldatatype, rdatatype, + BTLessStrategyNumber); + if (oproid == InvalidOid) + elog(ERROR, "no < operator for opfamily %u", opfamily); + fmgr_info(get_opcode(oproid), &ltproc); + greaterstr = make_greater_string(prefix, &ltproc, indexcollation); + if (greaterstr) + { + expr = make_opclause(oproid, BOOLOID, false, + (Expr *) leftop, (Expr *) greaterstr, + InvalidOid, indexcollation); + result = lappend(result, expr); + } + + return result; +} + + +/* + * match_network_function + * Try to generate an indexqual for a network subset/superset function. + * + * This layer is just concerned with identifying the function and swapping + * the arguments if necessary.
+ */ +static List * +match_network_function(Node *leftop, + Node *rightop, + int indexarg, + Oid funcid, + Oid opfamily) +{ + switch (funcid) + { + case F_NETWORK_SUB: + /* indexkey must be on the left */ + if (indexarg != 0) + return NIL; + return match_network_subset(leftop, rightop, false, opfamily); + + case F_NETWORK_SUBEQ: + /* indexkey must be on the left */ + if (indexarg != 0) + return NIL; + return match_network_subset(leftop, rightop, true, opfamily); + + case F_NETWORK_SUP: + /* indexkey must be on the right */ + if (indexarg != 1) + return NIL; + return match_network_subset(rightop, leftop, false, opfamily); + + case F_NETWORK_SUPEQ: + /* indexkey must be on the right */ + if (indexarg != 1) + return NIL; + return match_network_subset(rightop, leftop, true, opfamily); + + default: + + /* + * We'd only get here if somebody attached this support function + * to an unexpected function. Maybe we should complain, but for + * now, do nothing. + */ + return NIL; + } +} + +/* + * match_network_subset + * Try to generate an indexqual for a network subset function. + */ +static List * +match_network_subset(Node *leftop, + Node *rightop, + bool is_eq, + Oid opfamily) +{ + List *result; + Datum rightopval; + Oid datatype = INETOID; + Oid opr1oid; + Oid opr2oid; + Datum opr1right; + Datum opr2right; + Expr *expr; + + /* + * Can't do anything with a non-constant or NULL comparison value. + * + * Note that since we restrict ourselves to cases with a hard constant on + * the RHS, it's a-fortiori a pseudoconstant, and we don't need to worry + * about verifying that. + */ + if (!IsA(rightop, Const) || + ((Const *) rightop)->constisnull) + return NIL; + rightopval = ((Const *) rightop)->constvalue; + + /* + * Must check that index's opfamily supports the operators we will want to + * apply. + * + * We insist on the opfamily being the specific one we expect, else we'd + * do the wrong thing if someone were to make a reverse-sort opfamily with + * the same operators. 
+ */ + if (opfamily != NETWORK_BTREE_FAM_OID) + return NIL; + + /* + * create clause "key >= network_scan_first( rightopval )", or ">" if the + * operator disallows equality. + * + * Note: seeing that this function supports only fixed values for opfamily + * and datatype, we could just hard-wire the operator OIDs instead of + * looking them up. But for now it seems better to be general. + */ + if (is_eq) + { + opr1oid = get_opfamily_member(opfamily, datatype, datatype, + BTGreaterEqualStrategyNumber); + if (opr1oid == InvalidOid) + elog(ERROR, "no >= operator for opfamily %u", opfamily); + } + else + { + opr1oid = get_opfamily_member(opfamily, datatype, datatype, + BTGreaterStrategyNumber); + if (opr1oid == InvalidOid) + elog(ERROR, "no > operator for opfamily %u", opfamily); + } + + opr1right = network_scan_first(rightopval); + + expr = make_opclause(opr1oid, BOOLOID, false, + (Expr *) leftop, + (Expr *) makeConst(datatype, -1, + InvalidOid, /* not collatable */ + -1, opr1right, + false, false), + InvalidOid, InvalidOid); + result = list_make1(expr); + + /* create clause "key <= network_scan_last( rightopval )" */ + + opr2oid = get_opfamily_member(opfamily, datatype, datatype, + BTLessEqualStrategyNumber); + if (opr2oid == InvalidOid) + elog(ERROR, "no <= operator for opfamily %u", opfamily); + + opr2right = network_scan_last(rightopval); + + expr = make_opclause(opr2oid, BOOLOID, false, + (Expr *) leftop, + (Expr *) makeConst(datatype, -1, + InvalidOid, /* not collatable */ + -1, opr2right, + false, false), + InvalidOid, InvalidOid); + result = lappend(result, expr); + + return result; +} diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 221ffbd..b8dede6 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -189,17 +189,21 @@ prosrc => 'i4tochar' }, { oid => '79', - proname => 'nameregexeq', prorettype => 'bool', proargtypes => 'name text', - prosrc => 'nameregexeq' }, + proname => 'nameregexeq', 
prosupport => 'textregexeq_support', + prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameregexeq' }, { oid => '1252', proname => 'nameregexne', prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameregexne' }, { oid => '1254', - proname => 'textregexeq', prorettype => 'bool', proargtypes => 'text text', - prosrc => 'textregexeq' }, + proname => 'textregexeq', prosupport => 'textregexeq_support', + prorettype => 'bool', proargtypes => 'text text', prosrc => 'textregexeq' }, { oid => '1256', proname => 'textregexne', prorettype => 'bool', proargtypes => 'text text', prosrc => 'textregexne' }, +{ oid => '1364', descr => 'planner support for textregexeq', + proname => 'textregexeq_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'textregexeq_support' }, + { oid => '1257', descr => 'length', proname => 'textlen', prorettype => 'int4', proargtypes => 'text', prosrc => 'textlen' }, @@ -1637,8 +1641,11 @@ proname => 'position', prorettype => 'int4', proargtypes => 'text text', prosrc => 'textpos' }, { oid => '850', - proname => 'textlike', prorettype => 'bool', proargtypes => 'text text', - prosrc => 'textlike' }, + proname => 'textlike', prosupport => 'textlike_support', prorettype => 'bool', + proargtypes => 'text text', prosrc => 'textlike' }, +{ oid => '1023', descr => 'planner support for textlike', + proname => 'textlike_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'textlike_support' }, { oid => '851', proname => 'textnlike', prorettype => 'bool', proargtypes => 'text text', prosrc => 'textnlike' }, @@ -1663,8 +1670,8 @@ proargtypes => 'int4 int8', prosrc => 'int48ge' }, { oid => '858', - proname => 'namelike', prorettype => 'bool', proargtypes => 'name text', - prosrc => 'namelike' }, + proname => 'namelike', prosupport => 'textlike_support', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namelike' }, { oid => '859', proname => 'namenlike', prorettype => 'bool', 
proargtypes => 'name text', prosrc => 'namenlike' }, @@ -2354,14 +2361,17 @@ prosrc => 'int8smaller' }, { oid => '1238', - proname => 'texticregexeq', prorettype => 'bool', proargtypes => 'text text', - prosrc => 'texticregexeq' }, + proname => 'texticregexeq', prosupport => 'texticregexeq_support', + prorettype => 'bool', proargtypes => 'text text', prosrc => 'texticregexeq' }, +{ oid => '1024', descr => 'planner support for texticregexeq', + proname => 'texticregexeq_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'texticregexeq_support' }, { oid => '1239', proname => 'texticregexne', prorettype => 'bool', proargtypes => 'text text', prosrc => 'texticregexne' }, { oid => '1240', - proname => 'nameicregexeq', prorettype => 'bool', proargtypes => 'name text', - prosrc => 'nameicregexeq' }, + proname => 'nameicregexeq', prosupport => 'texticregexeq_support', + prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameicregexeq' }, { oid => '1241', proname => 'nameicregexne', prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameicregexne' }, @@ -3130,14 +3140,14 @@ prosrc => 'bittypmodout' }, { oid => '1569', descr => 'matches LIKE expression', - proname => 'like', prorettype => 'bool', proargtypes => 'text text', - prosrc => 'textlike' }, + proname => 'like', prosupport => 'textlike_support', prorettype => 'bool', + proargtypes => 'text text', prosrc => 'textlike' }, { oid => '1570', descr => 'does not match LIKE expression', proname => 'notlike', prorettype => 'bool', proargtypes => 'text text', prosrc => 'textnlike' }, { oid => '1571', descr => 'matches LIKE expression', - proname => 'like', prorettype => 'bool', proargtypes => 'name text', - prosrc => 'namelike' }, + proname => 'like', prosupport => 'textlike_support', prorettype => 'bool', + proargtypes => 'name text', prosrc => 'namelike' }, { oid => '1572', descr => 'does not match LIKE expression', proname => 'notlike', prorettype => 'bool', proargtypes => 'name 
text', prosrc => 'namenlike' }, @@ -3301,21 +3311,24 @@ proargtypes => 'float8 interval', prosrc => 'mul_d_interval' }, { oid => '1631', - proname => 'bpcharlike', prorettype => 'bool', proargtypes => 'bpchar text', - prosrc => 'textlike' }, + proname => 'bpcharlike', prosupport => 'textlike_support', + prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'textlike' }, { oid => '1632', proname => 'bpcharnlike', prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'textnlike' }, { oid => '1633', - proname => 'texticlike', prorettype => 'bool', proargtypes => 'text text', - prosrc => 'texticlike' }, + proname => 'texticlike', prosupport => 'texticlike_support', + prorettype => 'bool', proargtypes => 'text text', prosrc => 'texticlike' }, +{ oid => '1025', descr => 'planner support for texticlike', + proname => 'texticlike_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'texticlike_support' }, { oid => '1634', proname => 'texticnlike', prorettype => 'bool', proargtypes => 'text text', prosrc => 'texticnlike' }, { oid => '1635', - proname => 'nameiclike', prorettype => 'bool', proargtypes => 'name text', - prosrc => 'nameiclike' }, + proname => 'nameiclike', prosupport => 'texticlike_support', + prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameiclike' }, { oid => '1636', proname => 'nameicnlike', prorettype => 'bool', proargtypes => 'name text', prosrc => 'nameicnlike' }, @@ -3324,20 +3337,21 @@ prosrc => 'like_escape' }, { oid => '1656', - proname => 'bpcharicregexeq', prorettype => 'bool', - proargtypes => 'bpchar text', prosrc => 'texticregexeq' }, + proname => 'bpcharicregexeq', prosupport => 'texticregexeq_support', + prorettype => 'bool', proargtypes => 'bpchar text', + prosrc => 'texticregexeq' }, { oid => '1657', proname => 'bpcharicregexne', prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'texticregexne' }, { oid => '1658', - proname => 'bpcharregexeq', prorettype => 'bool', - 
proargtypes => 'bpchar text', prosrc => 'textregexeq' }, + proname => 'bpcharregexeq', prosupport => 'textregexeq_support', + prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'textregexeq' }, { oid => '1659', proname => 'bpcharregexne', prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'textregexne' }, { oid => '1660', - proname => 'bpchariclike', prorettype => 'bool', proargtypes => 'bpchar text', - prosrc => 'texticlike' }, + proname => 'bpchariclike', prosupport => 'texticlike_support', + prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'texticlike' }, { oid => '1661', proname => 'bpcharicnlike', prorettype => 'bool', proargtypes => 'bpchar text', prosrc => 'texticnlike' }, @@ -3878,17 +3892,21 @@ proname => 'network_cmp', proleakproof => 't', prorettype => 'int4', proargtypes => 'inet inet', prosrc => 'network_cmp' }, { oid => '927', - proname => 'network_sub', prorettype => 'bool', proargtypes => 'inet inet', - prosrc => 'network_sub' }, + proname => 'network_sub', prosupport => 'network_subset_support', + prorettype => 'bool', proargtypes => 'inet inet', prosrc => 'network_sub' }, { oid => '928', - proname => 'network_subeq', prorettype => 'bool', proargtypes => 'inet inet', - prosrc => 'network_subeq' }, + proname => 'network_subeq', prosupport => 'network_subset_support', + prorettype => 'bool', proargtypes => 'inet inet', prosrc => 'network_subeq' }, { oid => '929', - proname => 'network_sup', prorettype => 'bool', proargtypes => 'inet inet', - prosrc => 'network_sup' }, + proname => 'network_sup', prosupport => 'network_subset_support', + prorettype => 'bool', proargtypes => 'inet inet', prosrc => 'network_sup' }, { oid => '930', - proname => 'network_supeq', prorettype => 'bool', proargtypes => 'inet inet', - prosrc => 'network_supeq' }, + proname => 'network_supeq', prosupport => 'network_subset_support', + prorettype => 'bool', proargtypes => 'inet inet', prosrc => 'network_supeq' }, +{ oid => '1173', descr => 
'planner support for network_sub/superset', + proname => 'network_subset_support', prorettype => 'internal', + proargtypes => 'internal', prosrc => 'network_subset_support' }, + { oid => '3551', proname => 'network_overlap', prorettype => 'bool', proargtypes => 'inet inet', prosrc => 'network_overlap' }, @@ -5482,14 +5500,14 @@ prosrc => 'select $1::pg_catalog.text || $2' }, { oid => '2005', - proname => 'bytealike', prorettype => 'bool', proargtypes => 'bytea bytea', - prosrc => 'bytealike' }, + proname => 'bytealike', prosupport => 'textlike_support', + prorettype => 'bool', proargtypes => 'bytea bytea', prosrc => 'bytealike' }, { oid => '2006', proname => 'byteanlike', prorettype => 'bool', proargtypes => 'bytea bytea', prosrc => 'byteanlike' }, { oid => '2007', descr => 'matches LIKE expression', - proname => 'like', prorettype => 'bool', proargtypes => 'bytea bytea', - prosrc => 'bytealike' }, + proname => 'like', prosupport => 'textlike_support', prorettype => 'bool', + proargtypes => 'bytea bytea', prosrc => 'bytealike' }, { oid => '2008', descr => 'does not match LIKE expression', proname => 'notlike', prorettype => 'bool', proargtypes => 'bytea bytea', prosrc => 'byteanlike' }, diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h index 453079a..f938925 100644 --- a/src/include/nodes/nodes.h +++ b/src/include/nodes/nodes.h @@ -510,7 +510,8 @@ typedef enum NodeTag T_SupportRequestSimplify, /* in nodes/supportnodes.h */ T_SupportRequestSelectivity, /* in nodes/supportnodes.h */ T_SupportRequestCost, /* in nodes/supportnodes.h */ - T_SupportRequestRows /* in nodes/supportnodes.h */ + T_SupportRequestRows, /* in nodes/supportnodes.h */ + T_SupportRequestIndexCondition /* in nodes/supportnodes.h */ } NodeTag; /* diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h index 1a3a36b..5778fcb 100644 --- a/src/include/nodes/supportnodes.h +++ b/src/include/nodes/supportnodes.h @@ -35,7 +35,8 @@ #include "nodes/primnodes.h" 
-struct PlannerInfo; /* avoid including relation.h here */ +struct PlannerInfo; /* avoid including pathnodes.h here */ +struct IndexOptInfo; struct SpecialJoinInfo; @@ -167,4 +168,41 @@ typedef struct SupportRequestRows double rows; /* number of rows expected to be returned */ } SupportRequestRows; +/* + * The IndexCondition request allows the support function to generate + * a directly-indexable condition based on a target function call that is + * not itself indexable. The target function call must appear at the top + * level of WHERE or JOIN/ON, so this applies only to functions returning + * boolean. + * + * The "node" argument is the parse node that is invoking the target function; + * currently this will always be a FuncExpr or OpExpr. The call is made + * only if at least one function argument matches an index column's variable + * or expression. "indexarg" identifies the matching argument (it's the + * zero-based position in the node's args list). + * + * If the transformation is possible, return a List of directly-indexable + * condition expressions, else return NULL. + * + * XXX much more to write here. 
+ */ +typedef struct SupportRequestIndexCondition +{ + NodeTag type; + + /* Input fields: */ + struct PlannerInfo *root; /* Planner's infrastructure */ + Oid funcid; /* function we are inquiring about */ + Node *node; /* parse node invoking function */ + int indexarg; /* index of function arg matching indexcol */ + struct IndexOptInfo *index; /* planner's info about target index */ + int indexcol; /* index of target index column (0-based) */ + Oid opfamily; /* index column's operator family */ + Oid indexcollation; /* index column's collation */ + + /* Output fields: */ + bool lossy; /* set to false if index condition is an exact + * equivalent of the function call */ +} SupportRequestIndexCondition; + #endif /* SUPPORTNODES_H */ diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out index 0bd48dc..b21298a 100644 --- a/src/test/regress/expected/btree_index.out +++ b/src/test/regress/expected/btree_index.out @@ -105,6 +105,15 @@ SELECT b.* set enable_seqscan to false; set enable_indexscan to true; set enable_bitmapscan to false; +explain (costs off) +select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; + QUERY PLAN +------------------------------------------------------------------------------ + Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc + Index Cond: ((proname >= 'RI_FKey'::text) AND (proname < 'RI_FKez'::text)) + Filter: (proname ~~ 'RI\_FKey%del'::text) +(3 rows) + select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; proname ------------------------ @@ -115,8 +124,42 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; RI_FKey_setnull_del (5 rows) +explain (costs off) +select proname from pg_proc where proname ilike '00%foo' order by 1; + QUERY PLAN +-------------------------------------------------------------------- + Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc + Index Cond: ((proname >= '00'::text) AND 
(proname < '01'::text)) + Filter: (proname ~~* '00%foo'::text) +(3 rows) + +select proname from pg_proc where proname ilike '00%foo' order by 1; + proname +--------- +(0 rows) + +explain (costs off) +select proname from pg_proc where proname ilike 'ri%foo' order by 1; + QUERY PLAN +----------------------------------------------------------------- + Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc + Filter: (proname ~~* 'ri%foo'::text) +(2 rows) + set enable_indexscan to false; set enable_bitmapscan to true; +explain (costs off) +select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; + QUERY PLAN +------------------------------------------------------------------------------------------ + Sort + Sort Key: proname + -> Bitmap Heap Scan on pg_proc + Filter: (proname ~~ 'RI\_FKey%del'::text) + -> Bitmap Index Scan on pg_proc_proname_args_nsp_index + Index Cond: ((proname >= 'RI_FKey'::text) AND (proname < 'RI_FKez'::text)) +(6 rows) + select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; proname ------------------------ @@ -127,6 +170,34 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; RI_FKey_setnull_del (5 rows) +explain (costs off) +select proname from pg_proc where proname ilike '00%foo' order by 1; + QUERY PLAN +-------------------------------------------------------------------------------- + Sort + Sort Key: proname + -> Bitmap Heap Scan on pg_proc + Filter: (proname ~~* '00%foo'::text) + -> Bitmap Index Scan on pg_proc_proname_args_nsp_index + Index Cond: ((proname >= '00'::text) AND (proname < '01'::text)) +(6 rows) + +select proname from pg_proc where proname ilike '00%foo' order by 1; + proname +--------- +(0 rows) + +explain (costs off) +select proname from pg_proc where proname ilike 'ri%foo' order by 1; + QUERY PLAN +----------------------------------------------------------------- + Index Only Scan using pg_proc_proname_args_nsp_index on pg_proc + Filter: (proname ~~* 
'ri%foo'::text) +(2 rows) + +reset enable_seqscan; +reset enable_indexscan; +reset enable_bitmapscan; -- -- Test B-tree page deletion. In particular, deleting a non-leaf page. -- diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out index 4932869..5d4eb59 100644 --- a/src/test/regress/expected/create_index.out +++ b/src/test/regress/expected/create_index.out @@ -3201,6 +3201,24 @@ explain (costs off) Index Cond: (b = false) (3 rows) +explain (costs off) + select * from boolindex where b is true order by i desc limit 10; + QUERY PLAN +---------------------------------------------------------------- + Limit + -> Index Scan Backward using boolindex_b_i_key on boolindex + Index Cond: (b = true) +(3 rows) + +explain (costs off) + select * from boolindex where b is false order by i desc limit 10; + QUERY PLAN +---------------------------------------------------------------- + Limit + -> Index Scan Backward using boolindex_b_i_key on boolindex + Index Cond: (b = false) +(3 rows) + -- -- Test for multilevel page deletion -- diff --git a/src/test/regress/expected/inet.out b/src/test/regress/expected/inet.out index be9427e..2420237 100644 --- a/src/test/regress/expected/inet.out +++ b/src/test/regress/expected/inet.out @@ -242,6 +242,15 @@ SELECT '' AS ten, set_masklen(inet(text(i)), 24) FROM INET_TBL; -- check that btree index works correctly CREATE INDEX inet_idx1 ON inet_tbl(i); SET enable_seqscan TO off; +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr; + QUERY PLAN +------------------------------------------------------------------------------- + Index Scan using inet_idx1 on inet_tbl + Index Cond: ((i > '192.168.1.0/24'::inet) AND (i <= '192.168.1.255'::inet)) + Filter: (i << '192.168.1.0/24'::inet) +(3 rows) + SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr; c | i ----------------+------------------ @@ -250,6 +259,15 @@ SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr; 
192.168.1.0/26 | 192.168.1.226 (3 rows) +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE i<<='192.168.1.0/24'::cidr; + QUERY PLAN +-------------------------------------------------------------------------------- + Index Scan using inet_idx1 on inet_tbl + Index Cond: ((i >= '192.168.1.0/24'::inet) AND (i <= '192.168.1.255'::inet)) + Filter: (i <<= '192.168.1.0/24'::inet) +(3 rows) + SELECT * FROM inet_tbl WHERE i<<='192.168.1.0/24'::cidr; c | i ----------------+------------------ @@ -261,6 +279,43 @@ SELECT * FROM inet_tbl WHERE i<<='192.168.1.0/24'::cidr; 192.168.1.0/26 | 192.168.1.226 (6 rows) +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >>= i; + QUERY PLAN +-------------------------------------------------------------------------------- + Index Scan using inet_idx1 on inet_tbl + Index Cond: ((i >= '192.168.1.0/24'::inet) AND (i <= '192.168.1.255'::inet)) + Filter: ('192.168.1.0/24'::inet >>= i) +(3 rows) + +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >>= i; + c | i +----------------+------------------ + 192.168.1.0/24 | 192.168.1.0/24 + 192.168.1.0/24 | 192.168.1.226/24 + 192.168.1.0/24 | 192.168.1.255/24 + 192.168.1.0/24 | 192.168.1.0/25 + 192.168.1.0/24 | 192.168.1.255/25 + 192.168.1.0/26 | 192.168.1.226 +(6 rows) + +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >> i; + QUERY PLAN +------------------------------------------------------------------------------- + Index Scan using inet_idx1 on inet_tbl + Index Cond: ((i > '192.168.1.0/24'::inet) AND (i <= '192.168.1.255'::inet)) + Filter: ('192.168.1.0/24'::inet >> i) +(3 rows) + +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >> i; + c | i +----------------+------------------ + 192.168.1.0/24 | 192.168.1.0/25 + 192.168.1.0/24 | 192.168.1.255/25 + 192.168.1.0/26 | 192.168.1.226 +(3 rows) + SET enable_seqscan TO on; DROP INDEX inet_idx1; -- check that gist index works correctly diff --git a/src/test/regress/expected/rowtypes.out 
b/src/test/regress/expected/rowtypes.out index 054faabb..6ff2fd3 100644 --- a/src/test/regress/expected/rowtypes.out +++ b/src/test/regress/expected/rowtypes.out @@ -294,6 +294,105 @@ order by thousand, tenthous; 999 | 9999 (25 rows) +explain (costs off) +select thousand, tenthous, four from tenk1 +where (thousand, tenthous, four) > (998, 5000, 3) +order by thousand, tenthous; + QUERY PLAN +----------------------------------------------------------------------- + Sort + Sort Key: thousand, tenthous + -> Bitmap Heap Scan on tenk1 + Filter: (ROW(thousand, tenthous, four) > ROW(998, 5000, 3)) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (ROW(thousand, tenthous) >= ROW(998, 5000)) +(6 rows) + +select thousand, tenthous, four from tenk1 +where (thousand, tenthous, four) > (998, 5000, 3) +order by thousand, tenthous; + thousand | tenthous | four +----------+----------+------ + 998 | 5998 | 2 + 998 | 6998 | 2 + 998 | 7998 | 2 + 998 | 8998 | 2 + 998 | 9998 | 2 + 999 | 999 | 3 + 999 | 1999 | 3 + 999 | 2999 | 3 + 999 | 3999 | 3 + 999 | 4999 | 3 + 999 | 5999 | 3 + 999 | 6999 | 3 + 999 | 7999 | 3 + 999 | 8999 | 3 + 999 | 9999 | 3 +(15 rows) + +explain (costs off) +select thousand, tenthous from tenk1 +where (998, 5000) < (thousand, tenthous) +order by thousand, tenthous; + QUERY PLAN +---------------------------------------------------------- + Index Only Scan using tenk1_thous_tenthous on tenk1 + Index Cond: (ROW(thousand, tenthous) > ROW(998, 5000)) +(2 rows) + +select thousand, tenthous from tenk1 +where (998, 5000) < (thousand, tenthous) +order by thousand, tenthous; + thousand | tenthous +----------+---------- + 998 | 5998 + 998 | 6998 + 998 | 7998 + 998 | 8998 + 998 | 9998 + 999 | 999 + 999 | 1999 + 999 | 2999 + 999 | 3999 + 999 | 4999 + 999 | 5999 + 999 | 6999 + 999 | 7999 + 999 | 8999 + 999 | 9999 +(15 rows) + +explain (costs off) +select thousand, hundred from tenk1 +where (998, 5000) < (thousand, hundred) +order by thousand, hundred; + QUERY PLAN 
+----------------------------------------------------------- + Sort + Sort Key: thousand, hundred + -> Bitmap Heap Scan on tenk1 + Filter: (ROW(998, 5000) < ROW(thousand, hundred)) + -> Bitmap Index Scan on tenk1_thous_tenthous + Index Cond: (thousand >= 998) +(6 rows) + +select thousand, hundred from tenk1 +where (998, 5000) < (thousand, hundred) +order by thousand, hundred; + thousand | hundred +----------+--------- + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 + 999 | 99 +(10 rows) + -- Test case for bug #14010: indexed row comparisons fail with nulls create temp table test_table (a text, b text); insert into test_table values ('a', 'b'); diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql index 21171f7..2b087be 100644 --- a/src/test/regress/sql/btree_index.sql +++ b/src/test/regress/sql/btree_index.sql @@ -59,11 +59,29 @@ SELECT b.* set enable_seqscan to false; set enable_indexscan to true; set enable_bitmapscan to false; +explain (costs off) select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; +select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; +explain (costs off) +select proname from pg_proc where proname ilike '00%foo' order by 1; +select proname from pg_proc where proname ilike '00%foo' order by 1; +explain (costs off) +select proname from pg_proc where proname ilike 'ri%foo' order by 1; set enable_indexscan to false; set enable_bitmapscan to true; +explain (costs off) +select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1; +explain (costs off) +select proname from pg_proc where proname ilike '00%foo' order by 1; +select proname from pg_proc where proname ilike '00%foo' order by 1; +explain (costs off) +select proname from pg_proc where proname ilike 'ri%foo' order by 1; + +reset enable_seqscan; +reset enable_indexscan; 
+reset enable_bitmapscan; -- -- Test B-tree page deletion. In particular, deleting a non-leaf page. diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql index 59da6b6..67ecad8 100644 --- a/src/test/regress/sql/create_index.sql +++ b/src/test/regress/sql/create_index.sql @@ -1135,6 +1135,10 @@ explain (costs off) select * from boolindex where b = true order by i desc limit 10; explain (costs off) select * from boolindex where not b order by i limit 10; +explain (costs off) + select * from boolindex where b is true order by i desc limit 10; +explain (costs off) + select * from boolindex where b is false order by i desc limit 10; -- -- Test for multilevel page deletion diff --git a/src/test/regress/sql/inet.sql b/src/test/regress/sql/inet.sql index 880e115..bbfa9d3 100644 --- a/src/test/regress/sql/inet.sql +++ b/src/test/regress/sql/inet.sql @@ -65,8 +65,18 @@ SELECT '' AS ten, set_masklen(inet(text(i)), 24) FROM INET_TBL; -- check that btree index works correctly CREATE INDEX inet_idx1 ON inet_tbl(i); SET enable_seqscan TO off; +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr; SELECT * FROM inet_tbl WHERE i<<'192.168.1.0/24'::cidr; +EXPLAIN (COSTS OFF) SELECT * FROM inet_tbl WHERE i<<='192.168.1.0/24'::cidr; +SELECT * FROM inet_tbl WHERE i<<='192.168.1.0/24'::cidr; +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >>= i; +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >>= i; +EXPLAIN (COSTS OFF) +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >> i; +SELECT * FROM inet_tbl WHERE '192.168.1.0/24'::cidr >> i; SET enable_seqscan TO on; DROP INDEX inet_idx1; diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql index 454d462..ea93347 100644 --- a/src/test/regress/sql/rowtypes.sql +++ b/src/test/regress/sql/rowtypes.sql @@ -119,6 +119,33 @@ select thousand, tenthous from tenk1 where (thousand, tenthous) >= (997, 5000) order by thousand, 
tenthous; +explain (costs off) +select thousand, tenthous, four from tenk1 +where (thousand, tenthous, four) > (998, 5000, 3) +order by thousand, tenthous; + +select thousand, tenthous, four from tenk1 +where (thousand, tenthous, four) > (998, 5000, 3) +order by thousand, tenthous; + +explain (costs off) +select thousand, tenthous from tenk1 +where (998, 5000) < (thousand, tenthous) +order by thousand, tenthous; + +select thousand, tenthous from tenk1 +where (998, 5000) < (thousand, tenthous) +order by thousand, tenthous; + +explain (costs off) +select thousand, hundred from tenk1 +where (998, 5000) < (thousand, hundred) +order by thousand, hundred; + +select thousand, hundred from tenk1 +where (998, 5000) < (thousand, hundred) +order by thousand, hundred; + -- Test case for bug #14010: indexed row comparisons fail with nulls create temp table test_table (a text, b text); insert into test_table values ('a', 'b');
On Mon, Jan 28, 2019 at 9:51 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> is people like PostGIS, who already cleared that bar.  I hope that
> we'll soon have a bunch of examples, like those in the 0004 patch,
> that people can look at to see how to do things in this area.  I see
> no reason to believe it'll be all that much harder than anything
> else extension authors have to do.

It's a little harder :)

So... trying to figure out how to use SupportRequestIndexCondition to
convert a call to Intersects() in to a call that also has the operator
&& as well.

Looking at the examples, they are making use of the opfamily that
comes in SupportRequestIndexCondition.opfamily.

That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
Here's where my thread of understanding fails to follow. I have, in
PostGIS, actually no operator families defined (CREATE OPERATOR
FAMILY). I do, however, have quite a few operator classes defined for
geometry: 10, actually!

btree_geometry_ops
hash_geometry_ops
gist_geometry_ops_2d
gist_geometry_ops_nd
brin_geometry_inclusion_ops_2d
brin_geometry_inclusion_ops_3d
brin_geometry_inclusion_ops_4d
spgist_geometry_ops_2d
spgist_geometry_ops_nd
spgist_geometry_ops_nd

Some of those are not useful to me (btree, hash) for sure.
Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
&& operator to indicate the lossy operation I would like to combine
with ST_Intersects.
Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
&&& operator to indicate the lossy operation I would like to combine
with ST_Intersects.

A given call to ST_Intersects(tbl1.geom, tbl2.geom) could have two
indexes to apply the problem, but
SupportRequestIndexCondition.opfamily will, I assume, only be exposing
one to me: which one?

Anyways, to true up how hard this is, I've been carefully reading the
implementations for network address types and LIKE, and I'm still
barely at the WTF stage. The selectivity and the number of rows
support modes I could do. The SupportRequestIndexCondition is based on
a detailed knowledge of what an operator family is, an operator class
is, how those relate to types... I think I can get there, but it's
going to be far from easy (for me). And it'll put a pretty high bar in
front of anyone who previously just whacked an inline SQL function in
place to get an index assisted function up and running.

P.
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> So... trying to figure out how to use SupportRequestIndexCondition to
> convert a call to Intersects() in to a call that also has the operator
> && as well.

OK.

> Looking at the examples, they are making use of the opfamily that
> comes in SupportRequestIndexCondition.opfamily.
> That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
> Here's where my thread of understanding fails to follow. I have, in
> PostGIS, actually no operator families defined (CREATE OPERATOR
> FAMILY). I do, however, have quite a few operator classes defined for
> geometry: 10, actually!

Yes, you do have operator families: there's no such thing as an operator
class without a containing operator family, and hasn't been for quite
a long time.  If you write CREATE OPERATOR CLASS without a FAMILY
clause, the command silently creates an opfamily with the same name you
specified for the opclass, and links the two together.

> Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
> && operator to indicate the lossy operation I would like to combine
> with ST_Intersects.
> Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
> &&& operator to indicate the lossy operation I would like to combine
> with ST_Intersects.

Right.  So the hard part here is to figure out whether the OID you're
handed matches one of these operator families.  As I mentioned (in
the other thread [1], maybe you didn't see it?) the best short-term
idea I've got for that is to look up the opfamily by OID (see the
OPFAMILYOID syscache) and check to see if its name matches one of
the above.  You might want to verify that the index AM's OID is what
you expect, too, just for a little extra safety.

> A given call to ST_Intersects(tbl1.geom, tbl2.geom) could have two
> indexes to apply the problem, but
> SupportRequestIndexCondition.opfamily will, I assume, only be exposing
> one to me: which one?

It's whichever one the index column's opclass belongs to.  Basically what
you're trying to do here is verify whether the index will support the
optimization you want to perform.

> Anyways, to true up how hard this is, I've been carefully reading the
> implementations for network address types and LIKE, and I'm still
> barely at the WTF stage. The selectivity and the number of rows
> support modes I could do. The SupportRequestIndexCondition is based on
> a detailed knowledge of what an operator family is, an operator class
> is, how those relate to types... I think I can get there, but it's
> going to be far from easy (for me).

You definitely want to read this:

https://www.postgresql.org/docs/devel/xindex.html#XINDEX-OPFAMILY

and maybe some of the surrounding sections.

> And it'll put a pretty high bar in
> front of anyone who previously just whacked an inline SQL function in
> place to get an index assisted function up and running.

Sure, but that was a pretty lame way of getting the optimization,
as you well know because you've been fighting its deficiencies for
so long.

Perhaps at some point we'll have some infrastructure that makes this
less painful, but it's not happening for v12.

regards, tom lane

[1] https://www.postgresql.org/message-id/22876.1550591107@sss.pgh.pa.us
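[Editorial sketch] The name-plus-AM check suggested in the reply above might look roughly like the following. This is a standalone model, not PostGIS or PostgreSQL code: the OpFamilyInfo struct and overlap_operator_for() are hypothetical names invented here, and the struct stands in for the two pg_opfamily fields a real support function would read after looking the OID up in the OPFAMILYOID syscache. The opfamily names are assumed to equal the opclass names listed earlier in the thread, which follows from the implicit-family behaviour of CREATE OPERATOR CLASS described above. The AM OIDs are the values assigned in PostgreSQL's pg_am catalog.

```c
#include <stddef.h>
#include <string.h>

/* Access method OIDs as assigned in PostgreSQL's pg_am catalog. */
#define GIST_AM_OID   783
#define SPGIST_AM_OID 4000

/*
 * Hypothetical stand-in for the two pg_opfamily fields of interest.
 * A real support function would fetch them via the OPFAMILYOID syscache.
 */
typedef struct OpFamilyInfo
{
    const char  *name;      /* opfamily name */
    unsigned int am_oid;    /* OID of the index access method */
} OpFamilyInfo;

/*
 * Return the lossy overlap operator to pair with ST_Intersects for this
 * opfamily, or NULL if an index using this opfamily cannot help.
 */
const char *
overlap_operator_for(const OpFamilyInfo *fam)
{
    if (fam->am_oid == GIST_AM_OID)
    {
        if (strcmp(fam->name, "gist_geometry_ops_2d") == 0)
            return "&&";
        if (strcmp(fam->name, "gist_geometry_ops_nd") == 0)
            return "&&&";
    }
    else if (fam->am_oid == SPGIST_AM_OID)
    {
        if (strcmp(fam->name, "spgist_geometry_ops_2d") == 0)
            return "&&";
        if (strcmp(fam->name, "spgist_geometry_ops_nd") == 0)
            return "&&&";
    }
    return NULL;    /* btree, hash, BRIN, or anything unrecognized */
}
```

Checking the AM first means a family that merely shares a name under a different access method is rejected, which is the "little extra safety" mentioned above.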
On Mon, Feb 25, 2019 at 3:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Looking at the examples, they are making use of the opfamily that
> > comes in SupportRequestIndexCondition.opfamily.
> > That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
> > Here's where my thread of understanding fails to follow. I have, in
> > PostGIS, actually no operator families defined (CREATE OPERATOR
> > FAMILY). I do, however, have quite a few operator classes defined for
> > geometry: 10, actually!
>
> Yes, you do have operator families: there's no such thing as an operator
> class without a containing operator family, and hasn't been for quite
> a long time.  If you write CREATE OPERATOR CLASS without a FAMILY
> clause, the command silently creates an opfamily with the same name you
> specified for the opclass, and links the two together.

OK, starting to understand...

> > Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
> > && operator to indicate the lossy operation I would like to combine
> > with ST_Intersects.
> > Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
> > &&& operator to indicate the lossy operation I would like to combine
> > with ST_Intersects.
>
> Right.  So the hard part here is to figure out whether the OID you're
> handed matches one of these operator families.  As I mentioned (in
> the other thread [1], maybe you didn't see it?) the best short-term
> idea I've got for that is to look up the opfamily by OID (see the
> OPFAMILYOID syscache) and check to see if its name matches one of
> the above.  You might want to verify that the index AM's OID is what
> you expect, too, just for a little extra safety.

I read it, I just didn't entirely understand it. I think maybe I do
know? I'm reading and re-reading everything and trying to build a
mental model that makes sense :)

Back to SupportRequestIndexCondition.opfamily though:

> It's whichever one the index column's opclass belongs to.  Basically what
> you're trying to do here is verify whether the index will support the
> optimization you want to perform.

* If I have tbl1.geom
* and I have built two indexes on it, a btree_geometry_ops and a
  gist_geometry_ops_2d, and
* and SupportRequestIndexCondition.opfamily returns me the btree family
* and I look and see, "damn there is no && operator in there"
* am I SOL, even though an appropriate index does exist?

> Sure, but that was a pretty lame way of getting the optimization,
> as you well know because you've been fighting its deficiencies for
> so long.

Hrm. :) I will agree to disagree. This is an intellectually
interesting journey, but most of its length is quite far removed from
our proximate goal of adding realistic costs to our functions, and the
code added will be quite a bit harder for folks to follow than what it
replaces. Reading your code is a pleasure and the comments are great,
it's just a hard slog up for someone who is still going "Node*, hm,
how does that work..."

ATB,
P

> Perhaps at some point we'll have some infrastructure that makes this
> less painful, but it's not happening for v12.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/22876.1550591107@sss.pgh.pa.us
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> On Mon, Feb 25, 2019 at 3:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It's whichever one the index column's opclass belongs to.  Basically what
>> you're trying to do here is verify whether the index will support the
>> optimization you want to perform.

> * If I have tbl1.geom
> * and I have built two indexes on it, a btree_geometry_ops and a
>   gist_geometry_ops_2d, and
> * and SupportRequestIndexCondition.opfamily returns me the btree family
> * and I look and see, "damn there is no && operator in there"
> * am I SOL, even though an appropriate index does exist?

No.  If there are two indexes matching your function's argument, you'll
get a separate call for each index.  The support function is only
responsible for thinking about one index at a time and seeing if it
can be used.  If more than one can be used, figuring out which
one is better is done by later cost comparisons.

regards, tom lane
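[Editorial sketch] The one-call-per-index behaviour described above can be pictured with a small standalone model. Everything in it is hypothetical scaffolding rather than planner code: CandidateIndex, index_supports_overlap() and collect_usable_indexes() are names made up for illustration, and the real planner passes a SupportRequestIndexCondition node instead of a bare struct. The point is only that rejecting the btree index says nothing about the GiST index, because each is asked about separately.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of one index on the column. */
typedef struct CandidateIndex
{
    const char *opfamily_name;
} CandidateIndex;

/* Support-function stand-in: judges exactly one index per call. */
int
index_supports_overlap(const CandidateIndex *idx)
{
    return strcmp(idx->opfamily_name, "gist_geometry_ops_2d") == 0;
}

/*
 * Planner stand-in: ask about every index separately and record the
 * positions of those that were accepted.  Returns how many were usable;
 * choosing among them is left to later cost comparison.
 */
size_t
collect_usable_indexes(const CandidateIndex *indexes, size_t n,
                       size_t *usable_out)
{
    size_t nusable = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (index_supports_overlap(&indexes[i]))
            usable_out[nusable++] = i;
    }
    return nusable;
}
```

With the two indexes from the question (btree_geometry_ops first, gist_geometry_ops_2d second), the model rejects position 0 and accepts position 1.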
On Mon, Feb 25, 2019 at 4:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
> > On Mon, Feb 25, 2019 at 3:01 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> It's whichever one the index column's opclass belongs to.  Basically what
> >> you're trying to do here is verify whether the index will support the
> >> optimization you want to perform.
>
> > * If I have tbl1.geom
> > * and I have built two indexes on it, a btree_geometry_ops and a
> >   gist_geometry_ops_2d, and
> > * and SupportRequestIndexCondition.opfamily returns me the btree family
> > * and I look and see, "damn there is no && operator in there"
> > * am I SOL, even though an appropriate index does exist?
>
> No.  If there are two indexes matching your function's argument, you'll
> get a separate call for each index.  The support function is only
> responsible for thinking about one index at a time and seeing if it
> can be used.  If more than one can be used, figuring out which
> one is better is done by later cost comparisons.

Ah, wonderful!

New line of questioning: under what conditions will the support
function be called in a T_SupportRequestIndexCondition mode? I have
created a table (foo) a geometry column (g) and an index (GIST on
foo(g)) and am running a query against foo using a noop function with
a support function bound to it.

The support function is called, twice, once in
T_SupportRequestSimplify mode and once in T_SupportRequestCost mode.

What triggers T_SupportRequestIndexCondition mode?

Thanks!
P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> New line of questioning: under what conditions will the support
> function be called in a T_SupportRequestIndexCondition mode?

It'll be called if the target function appears at top level of a
WHERE or JOIN condition and any one of the function's arguments
syntactically matches some column of an index.

If there's multiple arguments matching the same index column, say
index on "x" and we have "f(z, x, x)", you'll get one call and
it will tell you about the first match (req->indexarg == 1 in this
example).  Sorting out what to do in such a case is your responsibility.

If there's arguments matching more than one index column, say
index declared on (x, y) and we have "f(x, y)", you'll get a separate
call for each index column.  Again, sorting out what to do for each
one is your responsibility.

In most cases, multiple matching arguments are going to lead to
failure to construct any useful index condition, because your
comparison value has to be a pseudoconstant (ie, not a variable
from the same table, so in both of the above examples there's
no function argument you could compare to).  But we don't prejudge
that, because it's possible that a function with 3 or more arguments
could produce something useful anyway.  For instance, if what we've
got is "f(x, y, constant)" then it's possible that the semantics of
the function are such that y can be ignored and we can make something
indexable like "x && constant".  All this is the support function's
job to know.

> I have
> created a table (foo) a geometry column (g) and an index (GIST on
> foo(g)) and am running a query against foo using a noop function with
> a support function bound to it.

> The support function is called, twice, once in
> T_SupportRequestSimplify mode and once in T_SupportRequestCost mode.

What's the query look like exactly?  The other two calls will occur
anyway, but SupportRequestIndexCondition depends on the function
call's placement.

regards, tom lane
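[Editorial sketch] The indexarg rule described above (one call per index column, reporting the first matching argument, zero-based) can be checked against a toy model. This is not how the planner represents expressions: here arguments and index columns are plain strings and "syntactically matches" is reduced to string equality, and first_matching_arg() is a hypothetical helper name introduced only for this illustration.

```c
#include <string.h>

/*
 * Toy model of the matching rule: return the zero-based position of the
 * first function argument that matches the given index column, or -1 if
 * no argument matches (in which case no IndexCondition request would be
 * made for that column at all).
 */
int
first_matching_arg(const char *const *args, int nargs, const char *indexcol)
{
    for (int i = 0; i < nargs; i++)
    {
        if (strcmp(args[i], indexcol) == 0)
            return i;
    }
    return -1;
}
```

For f(z, x, x) with an index on x this yields 1, matching the req->indexarg == 1 example above. For f(x, y) with an index on (x, y), calling it once per index column yields 0 for x and 1 for y, mirroring the separate request made for each column.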
> On Feb 26, 2019, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> I have
>> created a table (foo) a geometry column (g) and an index (GIST on
>> foo(g)) and am running a query against foo using a noop function with
>> a support function bound to it.
>
>> The support function is called, twice, once in
>> T_SupportRequestSimplify mode and once in T_SupportRequestCost mode.
>
> What's the query look like exactly? The other two calls will occur
> anyway, but SupportRequestIndexCondition depends on the function
> call's placement.

select geos_intersects_new(g, 'POINT(0 0)') from foo;

>
> regards, tom lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> On Feb 26, 2019, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What's the query look like exactly? The other two calls will occur
>> anyway, but SupportRequestIndexCondition depends on the function
>> call's placement.

> select geos_intersects_new(g, 'POINT(0 0)') from foo;

Right, so that's not useful for an index scan. Try

select * from foo where geos_intersects_new(g, 'POINT(0 0)').

regards, tom lane
> On Feb 26, 2019, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> In most cases, multiple matching arguments are going to lead to
> failure to construct any useful index condition, because your
> comparison value has to be a pseudoconstant (ie, not a variable
> from the same table, so in both of the above examples there's
> no function argument you could compare to).

This term “pseudoconstant” has been causing me some worry as it crops up in your explanations a fair amount. I expect to have queries of the form

SELECT a.*, b.*
FROM a
JOIN b
ON ST_Intersects(a.geom, b.geom)

And I expect to be able to rewrite that in terms of having an additional call to the index operator (&&) and there won’t be a constant on either side of the operator. Am I mis-understanding the term, or are there issues with using this API in a join context?

P.

> But we don't prejudge
> that, because it's possible that a function with 3 or more arguments
> could produce something useful anyway. For instance, if what we've
> got is "f(x, y, constant)" then it's possible that the semantics of
> the function are such that y can be ignored and we can make something
> indexable like "x && constant". All this is the support function's
> job to know.
> regards, tom lane
Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Feb 26, 2019, at 2:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In most cases, multiple matching arguments are going to lead to
>> failure to construct any useful index condition, because your
>> comparison value has to be a pseudoconstant (ie, not a variable
>> from the same table, so in both of the above examples there's
>> no function argument you could compare to).

> This term “pseudoconstant” has been causing me some worry as it crops up
> in your explanations a fair amount.

It is defined in the documentation, but what it boils down to is that your comparison value can't contain either (1) variables from the same table the index is on or (2) volatile functions. There is a function defined in optimizer.h that can check that for you, so you don't have to worry too much about the details.

> I expect to have queries of the form
> SELECT a.*, b.*
> FROM a
> JOIN b
> ON ST_Intersects(a.geom, b.geom)

Sure, that's fine. If there are indexes on both a.geom and b.geom, you'll get separate opportunities to match to each of those, and what you'd be constructing in each case is an indexqual that has to be used on the inner side of a nestloop join (so that the outer side can provide the comparison value).

What's not fine is "WHERE ST_Intersects(a.geom, a.othergeom)" ... you can't make an indexscan out of that, at least not with the && operator.

regards, tom lane
A few more questions…

The documentation says that a support function should have a signature "supportfn(internal) returns internal", but doesn’t say which (if any) annotations should be provided. IMMUTABLE? PARALLEL SAFE? STRICT? None? All?

Variable SupportRequestCost is very exciting, but given that variable cost is usually driven by the complexity of arguments, what kind of argument is the SupportRequestCost call fed during the planning stage? Constant arguments are pretty straightforward, but what gets sent in when a column is one (or all) of the arguments?

Thanks,
P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> The documentation says that a support function should have a signature "supportfn(internal) returns internal", but doesn’t say which (if any) annotations should be provided. IMMUTABLE? PARALLEL SAFE? STRICT? None? All?

It doesn't matter much given that these things aren't callable from SQL. The builtin ones are marked immutable/safe/strict, but that's mostly because that's the default state for builtin functions. The only one I'd get excited about is marking it strict if you're not going to check for a null argument ... and even that is neatnik-ism, not something that will have any practical effect.

> Variable SupportRequestCost is very exciting, but given that variable cost is usually driven by the complexity of arguments, what kind of argument is the SupportRequestCost call fed during the planning stage? Constant arguments are pretty straightforward, but what gets sent in when a column is one (or all) of the arguments?

You'll see whatever is in the post-constant-folding parse tree. If it's a Const, you can look at the value. If it's a Var, you could perhaps look at the pg_statistic info for that column, though whether that would give you much of a leg up for cost estimation is hard to say. For any sort of expression, you're probably going to be reduced to using a default estimate. The core code generally doesn't try to be intelligent about anything beyond the Const and Var cases.

regards, tom lane
> On Feb 27, 2019, at 3:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Variable SupportRequestCost is very exciting, but given that variable cost is usually driven by the complexity of arguments, what kind of argument is the SupportRequestCost call fed during the planning stage? Constant arguments are pretty straightforward, but what gets sent in when a column is one (or all) of the arguments?
>
> You'll see whatever is in the post-constant-folding parse tree. If it's a
> Const, you can look at the value. If it's a Var, you could perhaps look
> at the pg_statistic info for that column, though whether that would give
> you much of a leg up for cost estimation is hard to say. For any sort of
> expression, you're probably going to be reduced to using a default
> estimate. The core code generally doesn't try to be intelligent about
> anything beyond the Const and Var cases.

Actually, this is interesting, maybe there’s something to be done looking at the vertex density of the area under consideration… would require gathering extra stats, but could be useful (maybe, at some point feeding costs into plans has to degenerate into wankery…)

Another question:

I added three indexes to my test table:

CREATE INDEX foo_g_gist_x ON foo USING GIST (g);
CREATE INDEX foo_g_gist_nd_x ON foo USING GIST (g gist_geometry_ops);
CREATE INDEX foo_g_spgist_x ON foo USING SPGIST (g);

They all support the overlaps (&&) operator.

So, SupportRequestIndexCondition happens three times, and each time I say “yep, sure, you can construct an index condition by putting the && operator between left_arg and right_arg”.

How does the planner end up deciding on which index to *actually* use? The selectivity is the same, the operator is the same. I found that I got the ND GIST one first, then the SPGIST and finally the 2d GIST, which is unfortunate, because the 2D and SPGIST are almost certainly faster than the ND GIST.

In practice, most people will just have one spatial index at a time, but I still wonder?

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> I added three indexes to my test table:
> CREATE INDEX foo_g_gist_x ON foo USING GIST (g);
> CREATE INDEX foo_g_gist_nd_x ON foo USING GIST (g gist_geometry_ops);
> CREATE INDEX foo_g_spgist_x ON foo USING SPGIST (g);
> They all support the overlaps (&&) operator.

> So, SupportRequestIndexCondition happens three times, and each time I say “yep, sure, you can construct an index condition by putting the && operator between left_arg and right_arg”.

Sounds right.

> How does the planner end up deciding on which index to *actually* use?

It's whichever has the cheapest cost estimate. In case of an exact tie, I believe it'll choose the index with lowest OID (or maybe highest OID, not sure).

> The selectivity is the same, the operator is the same. I found that I got the ND GIST one first, then the SPGIST and finally the 2d GIST, which is unfortunate, because the 2D and SPGIST are almost certainly faster than the ND GIST.

Given that it'll be the same selectivity, the cost preference is likely to go to whichever index is physically smallest, at least for indexes of the same type. When they're not the same type there might be an issue with the index AM cost estimators not being lined up very well as to what they account for and how.

I don't doubt that there's plenty of work to be done in making the cost estimates better in cases like this --- in particular, I don't think we have any way of accounting for the idea that one index opclass might be smarter than another one for the same query, unless that shakes out as a smaller index. But you'd have had the same issues with the old approach.

regards, tom lane
So I am getting much closer to a working implementation in PostGIS, but have just run into an issue which I am assuming is my misunderstanding something...

https://github.com/pramsey/postgis/blob/92268c94f3aa1fc63a2941f2b451be15b28662cf/postgis/gserialized_supportfn.c#L287

I had what seemed to be working code except for a couple rare cases, but when I fixed those cases it turned out that I had a major problem: building a <var> OP <const> expression works fine, but building a <const> OP <var> expression returns me an error.

create table f as select st_makepoint(200*random() - 100, 200*random() - 100) as g from generate_series(0, 100000);
create index f_g_x on f using gist (g);

explain select * from baz where st_coveredby('POINT(5 0)', geom);
explain select * from f where st_coveredby(g, 'POINT(5 0)');

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Bitmap Heap Scan on f  (cost=13.36..314.58 rows=33334 width=32)
   Filter: st_coveredby(g, '010100000000000000000014400000000000000000'::geometry)
   ->  Bitmap Index Scan on f_g_x  (cost=0.00..5.03 rows=100 width=0)
         Index Cond: (g @ '010100000000000000000014400000000000000000'::geometry)

postgis=# explain select * from f where st_coveredby('POINT(5 0)', g);
ERROR:  index key does not match expected index column

Any thoughts?

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> I had what seemed to be working code except for a couple rare cases,
> but when I fixed those cases it turned out that I had a major problem:
> building a <var> OP <const> expression works fine, but building a
> <const> OP <var> expression returns me an error.

Yup, you're not supposed to do that. The output expression *must* have the index key on the left, it's up to you to commute the operator if needed to make that happen.

regards, tom lane
> On Mar 4, 2019, at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> I had what seemed to be working code except for a couple rare cases,
>> but when I fixed those cases it turned out that I had a major problem:
>> building a <var> OP <const> expression works fine, but building a
>> <const> OP <var> expression returns me an error.
>
> Yup, you're not supposed to do that. The output expression *must* have
> the index key on the left, it's up to you to commute the operator if
> needed to make that happen.

Gotcha, done and now have an implementation that passes all our regression tests.

Thanks!

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Gotcha, done and now have an implementation that passes all our regression tests.

Very cool! So the next step, I guess, is to address your original problem by cranking up the cost estimates for these functions --- have you tried that yet? In principle you should be able to do that and not have any bad planning side-effects, but this is all pretty new territory so maybe some problems remain to be ironed out.

BTW, if you'd like me to review the code you added for this, I'd be happy to do so. I've never looked at PostGIS' innards, but probably I can make sense of the code for this despite that.

regards, tom lane
On Mar 4, 2019, at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> Gotcha, done and now have an implementation that passes all our regression tests.
>
> Very cool! So the next step, I guess, is to address your original problem
> by cranking up the cost estimates for these functions --- have you tried
> that yet? In principle you should be able to do that and not have any
> bad planning side-effects, but this is all pretty new territory so maybe
> some problems remain to be ironed out.
>
> BTW, if you'd like me to review the code you added for this, I'd be happy
> to do so. I've never looked at PostGIS' innards, but probably I can make
> sense of the code for this despite that.

I would be ecstatic for a review, I’m sure I’ve left a million loose threads dangling.

P.
Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Mar 4, 2019, at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, if you'd like me to review the code you added for this, I'd be happy
>> to do so. I've never looked at PostGIS' innards, but probably I can make
>> sense of the code for this despite that.

> I would be ecstatic for a review, I'm sure I've left a million loose threads dangling.

I took a look, and saw that you'd neglected to check pseudoconstantness of the non-index argument, so this'd fail on cases like ST_DWithin(x, y) where x is indexed and y is another column in the same table. Also I thought the handling of commutation could be done better. Attached is a suggested patch atop your f731c1b7022381dbf627cae311c3d37791bf40c3 to fix those and a couple of nitpicky other things. (I haven't tested this, mind you.)

One thing that makes me itch, but I didn't address in the attached, is that expandFunctionOid() is looking up a function by name without any schema-qualification. That'll fail outright if PostGIS isn't in the search path, and even if it is, you've got security issues there. One way to address this is to assume that the expandfn is in the same schema as the ST_XXX function you're attached to, so you could do "get_namespace_name(get_func_namespace(funcid))" and then include that in the list passed to LookupFuncName.

Also, this might be as-intended but I was wondering: I'd sort of expected you to make, eg, _ST_DWithin() and ST_DWithin() into exact synonyms. They aren't, since the former is not connected to the support function. Is that intentional? I guess if you had a situation where you wanted to force non-use of an index, being able to use _ST_DWithin() for that would be helpful.
regards, tom lane

diff --git a/postgis/gserialized_supportfn.c b/postgis/gserialized_supportfn.c
index 90c61f3..66b699b 100644
--- a/postgis/gserialized_supportfn.c
+++ b/postgis/gserialized_supportfn.c
@@ -60,7 +60,7 @@ Datum postgis_index_supportfn(PG_FUNCTION_ARGS);
  */
 typedef struct
 {
-    char *fn_name;
+    const char *fn_name;
     int strategy_number; /* Index strategy to add */
     int nargs;           /* Expected number of function arguments */
     int expand_arg;      /* Radius argument for "within distance" search */
@@ -230,22 +230,42 @@ Datum postgis_index_supportfn(PG_FUNCTION_ARGS)
             }
 
             /*
-             * Somehow an indexed third argument has slipped in. That
-             * should not happen.
+             * We can only do something with index matches on the first or
+             * second argument.
              */
             if (req->indexarg > 1)
                 PG_RETURN_POINTER((Node *)NULL);
 
             /*
-             * Need the argument types (which should always be geometry or geography
-             * since this function is only bound to those functions)
-             * to use in the operator function lookup
+             * Make sure we have enough arguments (just paranoia really).
              */
-            if (nargs < 2)
+            if (nargs < 2 || nargs < idxfn.expand_arg)
                 elog(ERROR, "%s: associated with function with %d arguments", __func__, nargs);
 
-            leftarg = linitial(clause->args);
-            rightarg = lsecond(clause->args);
+            /*
+             * Extract "leftarg" as the arg matching the index, and
+             * "rightarg" as the other one, even if they were in the
+             * opposite order in the call. NOTE: the functions we deal
+             * with here treat their first two arguments symmetrically
+             * enough that we needn't distinguish the two cases beyond
+             * this. This might need more work later.
+             */
+            if (req->indexarg == 0)
+            {
+                leftarg = linitial(clause->args);
+                rightarg = lsecond(clause->args);
+            }
+            else
+            {
+                rightarg = linitial(clause->args);
+                leftarg = lsecond(clause->args);
+            }
+
+            /*
+             * Need the argument types (which should always be geometry or geography
+             * since this function is only bound to those functions)
+             * to use in the operator function lookup.
+             */
             leftdatatype = exprType(leftarg);
             rightdatatype = exprType(rightarg);
 
@@ -267,20 +287,27 @@ Datum postgis_index_supportfn(PG_FUNCTION_ARGS)
              */
             if (idxfn.expand_arg)
             {
-                Node *indexarg = req->indexarg ? rightarg : leftarg;
-                Node *otherarg = req->indexarg ? leftarg : rightarg;
                 Node *radiusarg = (Node *) list_nth(clause->args, idxfn.expand_arg-1);
-                // Oid indexdatatype = exprType(indexarg);
-                Oid otherdatatype = exprType(otherarg);
-                Oid expandfn_oid = expandFunctionOid(otherdatatype);
+                Oid expandfn_oid = expandFunctionOid(rightdatatype);
+
+                FuncExpr *expandexpr = makeFuncExpr(expandfn_oid, rightdatatype,
+                    list_make2(rightarg, radiusarg),
+                    InvalidOid, InvalidOid, COERCE_EXPLICIT_CALL);
 
-                FuncExpr *expandexpr = makeFuncExpr(expandfn_oid, otherdatatype,
-                    list_make2(otherarg, radiusarg),
-                    InvalidOid, req->indexcollation, COERCE_EXPLICIT_CALL);
+                /*
+                 * The comparison expression has to be a pseudoconstant,
+                 * ie not volatile nor dependent on the target index's
+                 * table. (Including the expandfn itself in this test is
+                 * probably unnecessary, but let's be paranoid.)
+                 */
+                if (!is_pseudo_constant_for_index((Node *) expandexpr,
+                                                  req->index))
+                    PG_RETURN_POINTER((Node *)NULL);
 
+                /* OK, we can make the index condition */
                 Expr *expr = make_opclause(oproid, BOOLOID, false,
-                              (Expr *) indexarg, (Expr *) expandexpr,
-                              InvalidOid, req->indexcollation);
+                              (Expr *) leftarg, (Expr *) expandexpr,
+                              InvalidOid, InvalidOid);
 
                 ret = (Node *)(list_make1(expr));
             }
@@ -289,28 +316,24 @@ Datum postgis_index_supportfn(PG_FUNCTION_ARGS)
              * an index OpExpr with the original arguments on each
              * side.
              * st_intersects(g1, g2) yields: g1 && g2
+             * if g1 is the indexable expression (otherwise g2 && g1)
              */
             else
             {
-                Expr *expr;
                 /*
-                 * PgSQL wants the left-hand side to be the non-const
-                 * term, so if we have a const left we swap with
-                 * the right
+                 * The comparison expression has to be a pseudoconstant,
+                 * ie not volatile nor dependent on the target index's
+                 * table.
                  */
-                if (IsA(leftarg, Const))
-                {
-                    Node *tmp;
-                    oproid = get_commutator(oproid);
-                    if (!OidIsValid(oproid))
-                        PG_RETURN_POINTER((Node *)NULL);
-                    tmp = leftarg;
-                    leftarg = rightarg;
-                    rightarg = tmp;
-                }
-                expr = make_opclause(oproid, BOOLOID, false,
+                if (!is_pseudo_constant_for_index(rightarg,
+                                                  req->index))
+                    PG_RETURN_POINTER((Node *)NULL);
+
+                /* OK, we can make the index condition */
+                Expr *expr = make_opclause(oproid, BOOLOID, false,
                               (Expr *) leftarg, (Expr *) rightarg,
                               InvalidOid, InvalidOid);
+
                 ret = (Node *)(list_make1(expr));
             }
> On Mar 4, 2019, at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>>> On Mar 4, 2019, at 2:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> BTW, if you'd like me to review the code you added for this, I'd be happy
>>> to do so. I've never looked at PostGIS' innards, but probably I can make
>>> sense of the code for this despite that.
>
>> I would be ecstatic for a review, I'm sure I've left a million loose threads dangling.
>
> I took a look, and saw that you'd neglected to check pseudoconstantness
> of the non-index argument, so this'd fail on cases like ST_DWithin(x, y)
> where x is indexed and y is another column in the same table. Also
> I thought the handling of commutation could be done better. Attached is
> a suggested patch atop your f731c1b7022381dbf627cae311c3d37791bf40c3 to
> fix those and a couple of nitpicky other things. (I haven't tested this,
> mind you.)
>
> One thing that makes me itch, but I didn't address in the attached,
> is that expandFunctionOid() is looking up a function by name without
> any schema-qualification. That'll fail outright if PostGIS isn't in
> the search path, and even if it is, you've got security issues there.
> One way to address this is to assume that the expandfn is in the same
> schema as the ST_XXX function you're attached to, so you could
> do "get_namespace_name(get_func_namespace(funcid))" and then include
> that in the list passed to LookupFuncName.

Thanks for the patch, I’ve applied and smoothed and taken your advice on schema-qualified lookups as well.

> Also, this might be as-intended but I was wondering: I'd sort of expected
> you to make, eg, _ST_DWithin() and ST_DWithin() into exact synonyms.
> They aren't, since the former is not connected to the support function.
> Is that intentional? I guess if you had a situation where you wanted to
> force non-use of an index, being able to use _ST_DWithin() for that would
> be helpful.

Yes, this is by design. Other parts of the internal code base still like access to _ST_Functions, and there’s a non-zero chance that some 3rd party callers still want them too. There’s a certain utility in having “guaranteed not indexed” things so you can combine them with your other indexes of choice (particularly given the insane zoo of indexes built against geometry).

Again, many many thanks for your help! Next stop, costing.

P.
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> Thanks for the patch, I’ve applied and smoothed and taken your advice on schema-qualified lookups as well.

Hm, I think your addition of this bit is wrong:

+    /*
+     * Arguments were swapped to put the index value on the
+     * left, so we need the commutated operator for
+     * the OpExpr
+     */
+    if (swapped)
+    {
+        oproid = get_commutator(oproid);
+        if (!OidIsValid(oproid))
             PG_RETURN_POINTER((Node *)NULL);
+    }

We already did the operator lookup with the argument types in the desired order, so this is introducing an extra swap. The only reason it appears to work, I suspect, is that all your index operators are self-commutators.

regards, tom lane
> On Mar 5, 2019, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> Thanks for the patch, I’ve applied and smoothed and taken your advice on schema-qualified lookups as well.
>
> Hm, I think your addition of this bit is wrong:
>
> +    /*
> +     * Arguments were swapped to put the index value on the
> +     * left, so we need the commutated operator for
> +     * the OpExpr
> +     */
> +    if (swapped)
> +    {
> +        oproid = get_commutator(oproid);
> +        if (!OidIsValid(oproid))
>              PG_RETURN_POINTER((Node *)NULL);
> +    }
>
> We already did the operator lookup with the argument types in the desired
> order, so this is introducing an extra swap. The only reason it appears
> to work, I suspect, is that all your index operators are self-commutators.

I was getting regression failures until I re-swapped the operator…

SELEcT * FROM foobar WHERE ST_Within(ConstA, VarB)

Place the indexed operator in the Left, now:

Left == VarB
Right == ConstA
Strategy == Within
get_opfamily_member(opfamilyoid, Left, Right, Within)

Unless we change the strategy number when we assign the left/right we’re looking up an operator for “B within A”, so we’re backwards.

I feel OK about it, if for no other reason than it passes all the tests :)

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
> On Mar 5, 2019, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hm, I think your addition of this bit is wrong:
>>
>> +    /*
>> +     * Arguments were swapped to put the index value on the
>> +     * left, so we need the commutated operator for
>> +     * the OpExpr
>> +     */
>> +    if (swapped)
>> +    {
>> +        oproid = get_commutator(oproid);
>> +        if (!OidIsValid(oproid))
>>              PG_RETURN_POINTER((Node *)NULL);
>> +    }
>>
>> We already did the operator lookup with the argument types in the desired
>> order, so this is introducing an extra swap. The only reason it appears
>> to work, I suspect, is that all your index operators are self-commutators.

> I was getting regression failures until I re-swapped the operator…
> SELEcT * FROM foobar WHERE ST_Within(ConstA, VarB)

Ah ... so the real problem here is that *not* all of your functions treat their first two inputs alike, and the hypothetical future improvement I commented about is needed right now. I should've looked more closely at the strategies in your table; then I would've realized the patch as I proposed it didn't work.

But this code isn't right either. I'm surprised you're not getting crashes --- perhaps there aren't cases where the first and second args are of incompatible types? Also, it's certainly wrong to be doing this sort of swap in only one of the two code paths.

There's more than one way you could handle this, but the way that I was vaguely imagining was to have two strategy entries in each IndexableFunction entry, one to apply if the first function argument is the indexable one, and the other to apply if the second function argument is the indexable one. If you leave the operator lookup as I had it (using the already-swapped data types) then you'd have to make sure that the latter set of strategy entries are written as if the arguments get swapped before selecting the strategy, which would be confusing perhaps :-( --- for instance, st_within would use RTContainedByStrategyNumber in the first case but RTContainsStrategyNumber in the second. But otherwise you need the separate get_commutator step, which seems like one more catalog lookup than you really need.

> I feel OK about it, if for no other reason than it passes all the tests :)

Then you're at least missing adequate tests for the 3-arg functions... 3 args with the index column second will not work as this stands.

regards, tom lane
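[Editorial illustration, not part of the thread.] The "two strategy entries per function" idea can be sketched as a standalone C lookup table. This is a toy under stated assumptions, not the PostGIS code: ToyIndexableFunction, toy_functions and toy_strategy_for are hypothetical names, the three strategy constants are copied locally with the values used in PostgreSQL's access/stratnum.h so the block compiles on its own, and the st_contains row is an extrapolation of the st_within example given in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copies of R-tree strategy numbers (values as in stratnum.h). */
#define RTOverlapStrategyNumber     3
#define RTContainsStrategyNumber    7
#define RTContainedByStrategyNumber 8

typedef struct
{
    const char *fn_name;
    int strategy_arg0;  /* strategy if the first argument is the index key */
    int strategy_arg1;  /* strategy if the second argument is the index key */
} ToyIndexableFunction;

/* Entries are written from the point of view "index key OP other value". */
static const ToyIndexableFunction toy_functions[] = {
    {"st_intersects", RTOverlapStrategyNumber, RTOverlapStrategyNumber},
    {"st_within", RTContainedByStrategyNumber, RTContainsStrategyNumber},
    {"st_contains", RTContainsStrategyNumber, RTContainedByStrategyNumber},
};

/*
 * Pick the strategy for a function given which argument (0 or 1) matched
 * the index column.  Returns 0 (no usable strategy) for unknown functions
 * or for matches on any other argument position.
 */
int
toy_strategy_for(const char *fn_name, int indexarg)
{
    assert(fn_name != NULL);

    if (indexarg != 0 && indexarg != 1)
        return 0;

    for (size_t i = 0; i < sizeof(toy_functions) / sizeof(toy_functions[0]); i++)
    {
        if (strcmp(toy_functions[i].fn_name, fn_name) == 0)
            return indexarg == 0 ? toy_functions[i].strategy_arg0
                                 : toy_functions[i].strategy_arg1;
    }
    return 0;
}
```

Because the second column already encodes the swapped meaning, no separate commutator lookup is needed in this arrangement, which is the trade-off the message describes.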
> On Mar 5, 2019, at 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Mar 5, 2019, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Hm, I think your addition of this bit is wrong:
>>>
>>> +    /*
>>> +     * Arguments were swapped to put the index value on the
>>> +     * left, so we need the commutated operator for
>>> +     * the OpExpr
>>> +     */
>>> +    if (swapped)
>>> +    {
>>> +        oproid = get_commutator(oproid);
>>> +        if (!OidIsValid(oproid))
>>>              PG_RETURN_POINTER((Node *)NULL);
>>> +    }
>>>
>>> We already did the operator lookup with the argument types in the desired
>>> order, so this is introducing an extra swap. The only reason it appears
>>> to work, I suspect, is that all your index operators are self-commutators.
>
>> I was getting regression failures until I re-swapped the operator…
>> SELEcT * FROM foobar WHERE ST_Within(ConstA, VarB)
>
> Ah ... so the real problem here is that *not* all of your functions
> treat their first two inputs alike, and the hypothetical future
> improvement I commented about is needed right now. I should've
> looked more closely at the strategies in your table; then I would've
> realized the patch as I proposed it didn't work.
>
> But this code isn't right either. I'm surprised you're not getting
> crashes --- perhaps there aren't cases where the first and second args
> are of incompatible types? Also, it's certainly wrong to be doing this
> sort of swap in only one of the two code paths.
>
> There's more than one way you could handle this, but the way that
> I was vaguely imagining was to have two strategy entries in each
> IndexableFunction entry, one to apply if the first function argument
> is the indexable one, and the other to apply if the second function
> argument is the indexable one. If you leave the operator lookup as
> I had it (using the already-swapped data types) then you'd have to
> make sure that the latter set of strategy entries are written as
> if the arguments get swapped before selecting the strategy, which
> would be confusing perhaps :-( --- for instance, st_within would
> use RTContainedByStrategyNumber in the first case but
> RTContainsStrategyNumber in the second. But otherwise you need the
> separate get_commutator step, which seems like one more catalog lookup
> than you really need.
>
>> I feel OK about it, if for no other reason than it passes all the tests :)
>
> Then you're at least missing adequate tests for the 3-arg functions...
> 3 args with the index column second will not work as this stands.

Some of the operators are indifferent to order (&&, overlaps) and others are not (@, within) (~, contains).

The 3-arg functions fortunately all have && strategies.

The types on either side of the operators are always the same (geometry && geometry), ST_Intersects(geometry, geometry).

I could simply be getting a free pass from the simplicity of my setup?

P.
Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Mar 5, 2019, at 3:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Then you're at least missing adequate tests for the 3-arg functions...
>> 3 args with the index column second will not work as this stands.

> Some of the operators are indifferent to order (&&, overlaps) and others are not (@, within) (~, contains).

Right.

> The 3-arg functions fortunately all have && strategies.

Hm ... that probably explains why it's okay to apply the "expand" behavior to the non-indexed argument regardless of which one that is. I imagine the official definition of those functions isn't really symmetrical about which argument the expansion applies to, though?

> The types on either side of the operators are always the same (geometry && geometry), ST_Intersects(geometry, geometry).
> I could simply be getting a free pass from the simplicity of my setup?

Yeah, seems so. The real reason I'm pestering you about this is that, since you're the first outside user of the support-function infrastructure, other people are likely to be looking at your code to see how to do things. So I'd like your code to not contain unnecessary dependencies on accidents like that ...

regards, tom lane