Обсуждение: Re: [GENERAL] contrib module intagg crashing the backend

Поиск
Список
Период
Сортировка

Re: [GENERAL] contrib module intagg crashing the backend

От
Michael Fuhr
Дата:
On Tue, Mar 22, 2005 at 05:21:32PM -0800, Ron Mayer wrote:
>
> If one of the contrib modules (int_array_aggregate in contrib/intagg) is
> crashing my backend, where's a good place to report a bug or get hints
> where to look further?

Bug reports should go to the pgsql-bugs mailing list (I've cc'd it
on this message).  Searching the list archives for similar problems
can be helpful.

If the backend crashed then there might be a core dump somewhere
under the $PGDATA directory.  Using a debugger like gdb to get a
stack trace from the core dump can help pinpoint the problem.

> It seems to work fine on small arrays, but crashes on large ones.
> The second query would have put about 500 rows in the aggregate.

I can duplicate the crash in 8.0.1 (REL8_0_STABLE) with the following
(512 works, 513 crashes):

CREATE TABLE foo (id integer, name text);
INSERT INTO foo (id, name)
  SELECT x, 'test' FROM generate_series(1, 513) AS g(x);
SELECT name, int_array_aggregate(id) FROM foo GROUP BY name;

Here's a partial stack trace:

#0  0x81e003b in pfree (pointer=0x839f130) at mcxt.c:583
#1  0x810d044 in advance_transition_function (aggstate=0x839d188, peraggstate=0x839d908,
    pergroupstate=0x83a1aa0, newVal=513, isNull=0 '\000') at nodeAgg.c:395
#2  0x810d0f4 in advance_aggregates (aggstate=0x839d188, pergroup=0x83a1aa0) at nodeAgg.c:439
#3  0x810d782 in agg_fill_hash_table (aggstate=0x839d188) at nodeAgg.c:935
#4  0x810d448 in ExecAgg (node=0x839d188) at nodeAgg.c:674

Line 583 in mcxt.c is:

    (*header->context->methods->free_p) (header->context, pointer);

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: [GENERAL] contrib module intagg crashing the backend

От
Tom Lane
Дата:
Michael Fuhr <mike@fuhr.org> writes:
> I can duplicate the crash in 8.0.1 (REL8_0_STABLE) with the following

Grumble ... I seem to have managed to promote intagg from
broken-on-64bit-platforms to broken-on-every-platform ...
will look into a fix tomorrow.

            regards, tom lane

Re: [GENERAL] contrib module intagg crashing the backend

От
Tom Lane
Дата:
I wrote:
> Grumble ... I seem to have managed to promote intagg from
> broken-on-64bit-platforms to broken-on-every-platform ...
> will look into a fix tomorrow.

Ron's problem is essentially a double-free bug.  In this patch:

2005-01-27 16:35  tgl

    * contrib/intagg/: int_aggregate.c, int_aggregate.sql.in
    (REL7_3_STABLE), int_aggregate.c, int_aggregate.sql.in
    (REL7_4_STABLE), int_aggregate.c, int_aggregate.sql.in
    (REL8_0_STABLE), int_aggregate.c, int_aggregate.sql.in: Fix
    security and 64-bit issues in contrib/intagg.  This code could
    stand to be rewritten altogether, but for now just stick a finger
    in the dike.

I modified intagg to declare its transition data type as int4[] (which
is what it really is) rather than int4.  Unfortunately that means that
nodeAgg.c is now aware that the transition value is pass-by-reference,
and so it thinks it needs to manage the memory used for it; which
intagg.c is also trying to do; so they both free the same bit of memory.

There is already a "proper" fix for this problem in CVS tip, but it's
too invasive to consider back-patching; not least because nodeAgg's
memory management strategy has changed since 7.3 and the fix would
probably not work that far back.

What I'm thinking I have to do is revert intagg in the back branches to
lie about its transition data type, but still have it pull the pointer
out of the passed Datum with DatumGetPointer (as opposed to the old,
definitely 64-bit-broken method of DatumGetInt32 and then cast to pointer).
This should work because nodeAgg doesn't inquire into the actual
contents of any Datum it doesn't think is pass-by-reference; so it will
never discard the upper bits of the pointer.

Ugh.  Glad we have a cleaner solution to go forward with.

            regards, tom lane