Обсуждение: Why is FOR ORDER BY function getting called when the index is handling ordering?

Поиск
Список
Период
Сортировка
Sorry to have to ask for help here, but no amount of stepping through code is giving me the answer.

I'm building an index access method which supports an ordering operator:

    CREATE OPERATOR pg_catalog.<<=>> (
        FUNCTION = rdb.rank_match,
        LEFTARG = record,
        RIGHTARG = rdb.RankSpec
    );

    CREATE OPERATOR CLASS rdb_ops DEFAULT FOR TYPE record USING rdb AS
        OPERATOR 1 pg_catalog.<===> (record, rdb.userqueryspec),
        OPERATOR 2 pg_catalog.<<=>> (record, rdb.rankspec) FOR ORDER BY pg_catalog.float_ops;

Here is the supporting code (in Rust):

    #[derive(Serialize, Deserialize, PostgresType, Debug)]
    pub struct RankSpec {
        pub fastrank: String,
        pub slowrank: String,
        pub candidates: i32,
    }

    #[pg_extern(
    sql = "CREATE FUNCTION rdb.rank_match(rec record, rankspec rdb.RankSpec) \
    RETURNS real IMMUTABLE STRICT PARALLEL SAFE LANGUAGE c \
    AS 'MODULE_PATHNAME', 'rank_match_wrapper';",
    requires = [RankSpec],
    no_guard
    )]
    fn rank_match(_fcinfo: pg_sys::FunctionCallInfo) -> f32 {
        // todo make this an error
        pgrx::log!("Error -- this ranking method can only be called when there is an RDB full-text search in the WHERE clause.");
        42.0
    }

    #[pg_extern(immutable, strict, parallel_safe)]
    fn rank(fastrank: String, slowrank: String, candidates: i32) -> RankSpec {
        RankSpec {
            fastrank,
            slowrank,
            candidates,
        }
    }

The index access method works fine. It successfully gets the keys and the orderbys in the amrescan() method, and it dutifully returns the appropriate scan.xs_heaptid in the amgettuple() method. It returns the tids in the proper order. It also sets:

scandesc.xs_recheck = false;
scandesc.xs_recheckorderby = false;

so there's no reason the system needs to know the actual float value returned by rank_match(), the ordering operator distance function. In any case, that value can only be calculated based on information in the index itself, and can't be calculated by rank_match().

Nevertheless, the system calls rank_match() after every call to amgettuple(), and I can't figure out why. I've stepped through the code, and it looks like it has something to do with ScanState.ps.ps.ps_ProjInfo, but I can't figure out where or why it's getting set.

Here's a sample query. I have not found a query that does *not* call rank_match():

SELECT *
FROM products
WHERE products <===> rdb.userquery('teddy')
ORDER BY products <<=>> rdb.rank();

I'd be grateful for any help or insights.


--
Chris Cleveland
312-339-2677 mobile

Re: Why is FOR ORDER BY function getting called when the index is handling ordering?

От
Matthias van de Meent
Дата:
On Thu, 2 May 2024 at 18:50, Chris Cleveland <ccleveland@dieselpoint.com> wrote:
>
> Sorry to have to ask for help here, but no amount of stepping through code is giving me the answer.
>
> I'm building an index access method which supports an ordering operator:
[...]
> so there's no reason the system needs to know the actual float value returned by rank_match(), the ordering operator
distancefunction. In any case, that value can only be calculated based on information in the index itself, and can't be
calculatedby rank_match(). 
>
> Nevertheless, the system calls rank_match() after every call to amgettuple(), and I can't figure out why. I've
steppedthrough the code, and it looks like it has something to do with ScanState.ps.ps.ps_ProjInfo, but I can't figure
outwhere or why it's getting set. 
>
> Here's a sample query. I have not found a query that does *not* call rank_match():
[...]
> I'd be grateful for any help or insights.

The ordering clause produces a junk column that's used to keep track
of the ordering, and because it's a projected column (not the indexed
value, but an expression over that column) the executor will execute
that projection. This happens regardless of it's use in downstream
nodes due to planner or executor limitations.

See also Heikki's thread over at [0], and a comment of me about the
same issue over at pgvector's issue board at [1].

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://www.postgresql.org/message-id/2ca5865b-4693-40e5-8f78-f3b45d5378fb%40iki.fi
[1] https://github.com/pgvector/pgvector/issues/359#issuecomment-1840786021



Chris Cleveland <ccleveland@dieselpoint.com> writes:
> I'm building an index access method which supports an ordering operator:

>     CREATE OPERATOR pg_catalog.<<=>> (
>         FUNCTION = rdb.rank_match,
>         LEFTARG = record,
>         RIGHTARG = rdb.RankSpec
>     );

Okay ...

> ... there's no reason the system needs to know the actual float value
> returned by rank_match(), the ordering operator distance function. In any
> case, that value can only be calculated based on information in the index
> itself, and can't be calculated by rank_match().

This seems to me to be a very poorly designed concept.  An index
ordering operator is an optimization that the planner may or may
not choose to employ.  If you've designed your code on the assumption
that that's the only possible plan, it will break for any but the
most trivial queries.

> Nevertheless, the system calls rank_match() after every call to
> amgettuple(), and I can't figure out why.

The ORDER BY value is included in the set of values that the plan
is expected to output.  This is so because it's set up to still
work if the planner needs to use an explicit sort step.  For
instance, using a trivial table with a couple of bigint columns:

regression=# explain verbose select * from int8_tbl order by q1/2;
                              QUERY PLAN
----------------------------------------------------------------------
 Sort  (cost=1.12..1.13 rows=5 width=24)
   Output: q1, q2, ((q1 / 2))
   Sort Key: ((int8_tbl.q1 / 2))
   ->  Seq Scan on public.int8_tbl  (cost=0.00..1.06 rows=5 width=24)
         Output: q1, q2, (q1 / 2)
(5 rows)

The q1/2 column is marked "resjunk", so it doesn't actually get
sent to the client, but it's computed so that the sort step can
use it.  Even if I do this:

regression=# create index on int8_tbl ((q1/2));
CREATE INDEX
regression=# set enable_seqscan TO 0;
SET
regression=# explain verbose select * from int8_tbl order by q1/2;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Index Scan using int8_tbl_expr_idx on public.int8_tbl  (cost=0.13..12.22 rows=5 width=24)
   Output: q1, q2, (q1 / 2)
(2 rows)

... it's still there, because the set of columns that the scan node is
expected to emit doesn't change based on the plan type.  We could make
it change perhaps, if we tried hard enough, but so far nobody has
wanted to invest work in that.  Note that even if this indexscan
is chosen, that doesn't ensure that we won't need an explicit sort
later, since the query might need joins or aggregation on top of
the scan node.  So it's far from trivial to decide that the scan
node doesn't need to emit the sort column.

In any case, I'm uninterested in making the world safe for a
design that's going to fail if the planner doesn't choose an
indexscan on a specific index.  That's too fragile.

            regards, tom lane




> ... there's no reason the system needs to know the actual float value
> returned by rank_match(), the ordering operator distance function. In any
> case, that value can only be calculated based on information in the index
> itself, and can't be calculated by rank_match().

This seems to me to be a very poorly designed concept.  An index
ordering operator is an optimization that the planner may or may
not choose to employ.  If you've designed your code on the assumption
that that's the only possible plan, it will break for any but the
most trivial queries.

So the use-case here is an enterprise search engine built using an index access method. A search engine ranks its result set based on many, many factors. Among them: token-specific weights, token statistics calculated across the entire index, field lens, average field lens calculated across the index, various kinds of linguistic analysis (verbs? nouns?), additional terms added to the query drawn from other parts of the index, fuzzy terms based on ngrams from across the index, and a great many other magic tricks. There are also index-specific parameters that are specified at index time, both as parameters to the op classes attached to each column, and options specified by CREATE INDEX ... WITH (...).

If the system must generate a score for ranking using a simple ORDER BY column_op_constant, it can't, because all that information within the index itself isn't available.

In any case, I'm uninterested in making the world safe for a
design that's going to fail if the planner doesn't choose an
indexscan on a specific index.  That's too fragile.


But that's the reality of search engines. It's also the reason that the built-in pg full-text search has real limitations.

This isn't just a search engine problem. *Any* index type that depends on whole-table statistics, or index options, or knowledge about other items in the index will not be able to calculate a proper score without access to the index. This applies to certain types of vector search. It could also apply to a recommendation engine.

In general, it hasn't been a problem for my project because I adjust costs to force the use of the index. (Yes, I know that doing this is controversial, but I have little choice, and it works.)

The only area where it has become a problem is in the creation of the junk column.

I do understand the need for the index to report the value of the sort key up the food chain, because yes, all kinds of arbitrary re-ordering could occur. We already have a mechanism for that, though: IndexScanDesc.xs_orderbyvals. If there were a way for the system to use that instead of a call to the ordering function, we'd be all set.

It would also be nice if the orderbyval could be made available in the projection. That way we could report the score() in the result set.

Matthias' response and links touch on some of these issues.

--
Chris Cleveland
312-339-2677 mobile