Discussion: Re: [GENERAL] Bug and/or feature? Complex data types in tables...

Re: [GENERAL] Bug and/or feature? Complex data types in tables...

From
Tom Lane
Date:
[ moved to pg-hackers, since it's *way* off topic for -general ]

Michael Glaesemann <grzm@myrealbox.com> writes:
> On Jan 3, 2004, at 2:31 PM, Tom Lane wrote:
>> The thing we are missing (i.e., what makes it crash) is an internal
>> representation that allows a tuple to be embedded as a field of a larger
>> tuple.  I've looked at this a couple of times, and each time concluded
>> that it was more work than I could afford to spend at the moment.  The
>> support-such-as-it-is for tuple return values uses a structure that has
>> embedded pointers, and it doesn't make any effort to get rid of
>> out-of-line TOAST pointers within the tuple.  Neither one of those
>> things are acceptable for a tuple that's trying to act like a Datum.

> Would you mind explaining this a little more, or pointing me to where I 
> can learn more about this?

Well, to make composite data types into real first-class citizens, we
have to be able to represent their values as ordinary Datums that don't
act differently from run-of-the-mill Datums, except when some operation
that actually wants to understand the contents of the value is invoked.
I think the only workable representation is as a variable-length datum
along the lines of
    int32 length word (overall length of datum)
    OID type indicator (OID of the composite type)
    header fields similar to a normal on-disk tuple
    null bitmap if needed
    values of fields (themselves also Datums)

It's possible we could leave out the type OID, but given that we found
it useful to include an element type OID in array headers, I'm betting
we want one for composite-type values too.  Without it, we must always
know the exact composite type makeup from context.  (But see below.)
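
To make the proposed layout concrete, here is a minimal sketch in plain C of a header followed by an optional null bitmap. Everything in it (CompositeHeader, has_nulls, the helper names) is hypothetical and is not an existing PostgreSQL structure. The only thing borrowed from the backend is the heap-tuple bitmap convention, where a set bit means the field is present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header for a self-contained composite datum.
 * This is an illustration of the layout above, not a PostgreSQL struct. */
typedef struct CompositeHeader
{
    int32_t  length;     /* overall length of datum, header included */
    uint32_t type_oid;   /* OID of the composite type */
    int16_t  natts;      /* number of fields */
    uint8_t  has_nulls;  /* nonzero if a null bitmap follows the header */
} CompositeHeader;

/* Bytes needed for a bitmap holding one bit per field. */
size_t bitmap_bytes(int natts)
{
    return ((size_t) natts + 7) / 8;
}

/* Offset from the start of the datum to the first field value. */
size_t composite_data_offset(const CompositeHeader *hdr)
{
    size_t off = sizeof(CompositeHeader);

    if (hdr->has_nulls)
        off += bitmap_bytes(hdr->natts);
    return off;
}

/* Returns 1 if field attno (0-based) is null, else 0.
 * A set bit means "not null", as in on-disk heap tuples. */
int composite_field_is_null(const unsigned char *datum, int attno)
{
    CompositeHeader hdr;
    const unsigned char *bits;

    memcpy(&hdr, datum, sizeof hdr);   /* avoid alignment assumptions */
    assert(attno >= 0 && attno < hdr.natts);
    if (!hdr.has_nulls)
        return 0;
    bits = datum + sizeof hdr;
    return !(bits[attno >> 3] & (1 << (attno & 7)));
}
```

Because the type OID travels with the value, code that is handed such a datum can look up the field layout without knowing it from context, which is the trade-off being discussed here.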

Now, this structure could be TOASTed as a whole, since it's just a
varlena data type.  But we cannot expect the toasting routines to look
inside it --- that would imply that it's not like other varlena data
types after all.  That means that the contained fields had better not be
out-of-line TOAST value references, because there's no way to keep track
of them and keep from deleting the referenced value too soon.  (It would
be workable for them to be compressed inline, but I suspect we don't
really want that either, since it'd interfere with attempts to compress
the overall datum.)  So somehow we'd need to expand out any toasted
component fields, at least before attempting to store such a datum on
disk.  Not sure where to do that cleanly.
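
As a rough illustration of "expanding out" toasted component fields before storage, the toy model below treats each field as either inline bytes or a reference into an out-of-line store, and flattens everything into one contiguous buffer. ToyField, the store array, and toy_flatten_fields are all made-up names; the real TOAST machinery is considerably more involved and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of one field value (hypothetical, not PostgreSQL code). */
typedef struct ToyField
{
    int         is_external;  /* 1 = value lives in the out-of-line store */
    const char *bytes;        /* inline payload; unused when external */
    size_t      len;          /* payload length once expanded */
    int         external_id;  /* index into the out-of-line store */
} ToyField;

/* Copy every field into one malloc'd buffer, fetching external values
 * from 'store' so that no out-of-line reference survives in the result.
 * Returns the buffer (caller frees) and sets *out_len; NULL on failure. */
char *toy_flatten_fields(const ToyField *fields, int nfields,
                         const char *const *store, size_t *out_len)
{
    size_t total = 0;
    size_t off = 0;
    char  *buf;
    int    i;

    assert(fields != NULL && out_len != NULL);

    for (i = 0; i < nfields; i++)
        total += fields[i].len;

    buf = malloc(total ? total : 1);
    if (buf == NULL)
        return NULL;

    for (i = 0; i < nfields; i++)
    {
        const char *src = fields[i].is_external
            ? store[fields[i].external_id]
            : fields[i].bytes;

        if (src == NULL)
        {
            free(buf);
            return NULL;
        }
        memcpy(buf + off, src, fields[i].len);
        off += fields[i].len;
    }
    *out_len = total;
    return buf;
}
```

The open question in the text is where such a step would be invoked, not how to write it. The sketch only shows that the result no longer depends on anything the toaster would have to track.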

The other point was that what's actually returned at the moment from a
function-returning-tuple is a Datum that contains a pointer to a
TupleTableSlot, not a pointer to a datum of this kind.  (Look in
executor/functions.c and executor/execQual.c to see the related code.)
We'd need to change that API, which would be a good thing to do, but
it'd no doubt break some user-written functions.

In particular I do not know how we'd handle functions declared to return
RECORD --- in general, they may need to return composite types that are
defined on-the-fly and don't have any associated type OID.  The
TupleTableSlot convention works for this since it can include a
tupledescriptor that was built on the fly.  We can't have tupdescs
embedded in datums stored on disk, however.
        regards, tom lane


Re: [GENERAL] Bug and/or feature? Complex data types in

From
Joe Conway
Date:
Tom Lane wrote:
>         int32 length word (overall length of datum)
>         OID type indicator (OID of the composite type)
>         header fields similar to a normal on-disk tuple
>         null bitmap if needed
>         values of fields (themselves also Datums)
> 
> It's possible we could leave out the type OID, but given that we found
> it useful to include an element type OID in array headers, I'm betting
> we want one for composite-type values too.  Without it, we must always
> know the exact composite type makeup from context.  (But see below.)

Makes sense. But see below...

> Now, this structure could be TOASTed as a whole, since it's just a
> varlena data type.  But we cannot expect the toasting routines to look
> inside it --- that would imply that it's not like other varlena data
> types after all.  That means that the contained fields had better not be
> out-of-line TOAST value references, because there's no way to keep track
> of them and keep from deleting the referenced value too soon.

Why wouldn't we handle this just like we do when we build an array from 
elemental datums (i.e. allocate sufficient space and copy the individual 
datums into the structure)?

Continuing the analogy:
    int32   size;      /* overall length of datum */
    int     flags;     /* null-bitmap indicator, others reserved */
    Oid     relid;     /* OID of the composite type */
    int16   t_natts;   /* number of attributes */
    bits8   t_bits[1]; /* null bitmap if needed */
    Datum  *values;    /* values of fields */

values would be built similarly to how it's done in
construct_md_array/CopyArrayEls/ArrayCastAndSet.
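
A hedged sketch of that "allocate sufficient space and copy" approach, in standalone C rather than backend code: one pass sizes the result, a second pass writes the header, the null bitmap, and each non-null field as a length-prefixed copy. RowField and build_row_datum are invented for illustration, and the byte layout is an assumption, not the layout of the array routines named above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One input field (hypothetical type, for illustration only). */
typedef struct RowField
{
    const void *data;     /* field bytes; ignored when is_null */
    int32_t     len;      /* number of bytes in data */
    int         is_null;  /* nonzero if the field is NULL */
} RowField;

/* Assumed layout: int32 total length, uint32 type OID, int16 natts,
 * null bitmap (set bit = present), then for each non-null field an
 * int32 length followed by its bytes. Caller frees the result. */
unsigned char *build_row_datum(uint32_t type_oid, const RowField *fields,
                               int16_t natts, int32_t *out_len)
{
    size_t bitmap_len = ((size_t) natts + 7) / 8;
    size_t total = sizeof(int32_t) + sizeof(uint32_t) + sizeof(int16_t)
                   + bitmap_len;
    unsigned char *buf, *p, *bits;
    int32_t total32;
    int i;

    assert(fields != NULL && out_len != NULL && natts >= 0);

    /* Pass 1: compute the space needed. */
    for (i = 0; i < natts; i++)
        if (!fields[i].is_null)
            total += sizeof(int32_t) + (size_t) fields[i].len;

    buf = calloc(1, total);            /* zeroed, so bitmap starts all-null */
    if (buf == NULL)
        return NULL;

    /* Header. */
    total32 = (int32_t) total;
    p = buf;
    memcpy(p, &total32, sizeof total32);   p += sizeof total32;
    memcpy(p, &type_oid, sizeof type_oid); p += sizeof type_oid;
    memcpy(p, &natts, sizeof natts);       p += sizeof natts;
    bits = p;                              p += bitmap_len;

    /* Pass 2: copy each non-null field into place. */
    for (i = 0; i < natts; i++)
    {
        if (fields[i].is_null)
            continue;
        bits[i >> 3] |= (unsigned char) (1 << (i & 7));
        memcpy(p, &fields[i].len, sizeof(int32_t));
        p += sizeof(int32_t);
        if (fields[i].len > 0)
            memcpy(p, fields[i].data, (size_t) fields[i].len);
        p += fields[i].len;
    }

    *out_len = total32;
    return buf;
}
```

Copying every field eagerly is simple, but it also forces each value to be fully materialized at construction time.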

The overlying datatype would be similar to anyarray.

AFAICS SQL2003 (and SQL99) defines something similar to this as a "row 
type". It looks like this:
  ROW ( column definition list )

But it also seems to equate a table's-row type to a "row type" in 
section 4.8 (Row types):
  "A row type is a sequence of (<field name> <data type>) pairs, called 
fields. It is described by a row type descriptor. A row type descriptor 
consists of the field descriptor of every field of the row type.
  The most specific type of a row of a table is a row type. In this 
case, each column of the row corresponds to the field of the row type 
that has the same ordinal position as the column."

So maybe as an extension to the standard, we could allow something like:
  ROW composite_type_name

Example:

CREATE TABLE foo (id int, tup ROW (f1 int, f2 text));

or ...

CREATE TABLE bar (f1 int, f2 text);
CREATE TABLE foo (id int, tup ROW bar);

> The other point was that what's actually returned at the moment from a
> function-returning-tuple is a Datum that contains a pointer to a
> TupleTableSlot, not a pointer to a datum of this kind.

If you had something akin to arrayin/arrayout, would this still need to 
be changed?

Joe



Re: [GENERAL] Bug and/or feature? Complex data types in tables...

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> ... That means that the contained fields had better not be
>> out-of-line TOAST value references, because there's no way to keep track
>> of them and keep from deleting the referenced value too soon.

> Why wouldn't we handle this just like we do when we build an array from 
> elemental datums (i.e. allocate sufficient space and copy the individual 
> datums into the structure)?

Well, the problem is that there are tuples and there are tuples.  We do
*not* want to force expansion of TOAST references every time we build an
intermediate tuple to return up one level in a plan.  That would cost
gobs of memory, and it's possible the expanded value will never actually
be used at all (eg, the row might fail a join qual further up the plan).
Ideally the forced expansion should only occur if a composite-type tuple
is actually about to be stored on disk.  Maybe this says that the
toaster routines are the right place to take care of it after all, but
I'm not quite sure where it should go.

BTW, you could argue that TOAST references in a constructed array ought
not be expanded until/unless the array gets written to disk, too.  But
the expense of scanning a large array on the off chance there's some
TOAST references in there might dissuade us from doing that.  (Hmm ...
maybe use a flag bit in the array flag word?)
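
The flag-bit idea might look roughly like the following toy C, where the flag is set while the array is being built (each element is already being touched at that point) and the later flattening step can skip the scan entirely when the flag is clear. TOYARR_HAS_EXTERNAL, ToyArray, and both functions are hypothetical; no such bit is being claimed for the real array header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOYARR_HAS_EXTERNAL 0x01u   /* hypothetical flag bit */

typedef struct ToyElem
{
    int     is_external;  /* 1 = element is an out-of-line reference */
    int32_t value;        /* stand-in for the element payload */
} ToyElem;

typedef struct ToyArray
{
    uint32_t flags;       /* flag word; only one bit is used here */
    size_t   nelems;
    ToyElem *elems;
} ToyArray;

/* Build step: record whether any element is an external reference. */
void toy_array_init(ToyArray *arr, ToyElem *elems, size_t nelems)
{
    size_t i;

    assert(arr != NULL);
    arr->flags = 0;
    arr->nelems = nelems;
    arr->elems = elems;
    for (i = 0; i < nelems; i++)
    {
        if (elems[i].is_external)
        {
            arr->flags |= TOYARR_HAS_EXTERNAL;
            break;
        }
    }
}

/* Store step: expand external elements, but only scan if the flag says
 * there is something to find. Returns the number of elements expanded. */
size_t toy_array_flatten(ToyArray *arr)
{
    size_t expanded = 0;
    size_t i;

    assert(arr != NULL);
    if (!(arr->flags & TOYARR_HAS_EXTERNAL))
        return 0;                      /* cheap exit: no scan needed */

    for (i = 0; i < arr->nelems; i++)
    {
        if (arr->elems[i].is_external)
        {
            arr->elems[i].is_external = 0;   /* pretend we fetched it */
            expanded++;
        }
    }
    arr->flags &= ~TOYARR_HAS_EXTERNAL;
    return expanded;
}
```

With this shape, the common case of an array with no external references costs a single flag test at store time.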

>> The other point was that what's actually returned at the moment from a
>> function-returning-tuple is a Datum that contains a pointer to a
>> TupleTableSlot, not a pointer to a datum of this kind.

> If you had something akin to arrayin/arrayout, would this still need to 
> be changed?

I don't see the connection.  This is an internal representation either
way, and there's no point at which one would want to invoke an I/O
routine.
        regards, tom lane