Обсуждение: Returning multiple rows in materialized mode inside the extension

Поиск
Список
Период
Сортировка

Returning multiple rows in materialized mode inside the extension

От
Piotr Styczyński
Дата:

Hi,

I represent a small group of developers. We are working on an open-source PostgreSQL extension to enable processing of genomic data inside Postgres. We have an extensive knowledge of molecular biology or data science and none of the Postgres internals.

I don’t know if this mailing list is a good place to ask this question, but if it’s not, just correct me.

The problem:

We currently have a one-to-many function (an operation that produces multiple rows per one one input row). Now we would like to translate that functionality to a sensible many-to-many. We need to know how we are constrained by the internals of Postgres itself and what syntax we should use.

Also, the operation we are implementing requires knowing the full set of inputs before it can be computed.

Current solution:

There is ValuePerCall (1/0 returned rows) or Materialize mode (any number of returned rows), however the second one does not offer any invocation counter (like ValuePerCall does). Hence to provide any persistence between subcalls we introduced the following syntax:

SELECT _ FROM table t, my_function(t.a, t.b, t.c, number_of_rows);

Where by FROM a, b we mean cartesian product a times b. And my_function for first (number_of_rows - 1) invocations returns an empty set and the full result set for the last one.

Sadly this syntax requires us to enter a number of rows which is not very convenient.

Do you know how to handle this situation correctly? We looked for example at the code of tablefunc but the syntax there requires a full SQL query as an input, so that wasn’t useful.

Re: Returning multiple rows in materialized mode inside the extension

От
"David G. Johnston"
Дата:
On Wed, Mar 30, 2022 at 9:01 AM Piotr Styczyński <piotr@styczynski.in> wrote:

I don’t know if this mailing list is a good place to ask this question, but if it’s not, just correct me.

pgsql-general is probably better
 

The problem:

We currently have a one-to-many function (an operation that produces multiple rows per one one input row).

Now we would like to translate that functionality to a sensible many-to-many.

This seems like a big gap.

Input Situation Rows:
1
2
3
What is the expected output
1 A
1 B
1 C
2 A
2 B
2 C
3 A
3 B
3 C

I really don't know how you would change the internals to handle this - I'm doubting it would even be possible.  If asked to accomplish this using just standard PostgreSQL I would turn the inputs into an array

{1,2,3} 

and pass that array into a set-returning function.  Now I have:

{1,2,3} A
{1,2,3} B
{1,2,3} C

as an output, and I can just unnest the array column to produce the final result.

Something like (not tested):

SELECT unnest(arr_input.arr), func_call
FROM
(SELECT array_agg(inputvals) AS arr FROM tbl) AS arr_input
LATERAL func_call(arr_input.arr)
;

David J.