Обсуждение: BUG #10976: Two memory leaks in regcomp cleanup

Поиск
Список
Период
Сортировка

BUG #10976: Two memory leaks in regcomp cleanup

От
arthur.j.odwyer@gmail.com
Дата:
The following bug has been logged on the website:

Bug reference:      10976
Logged by:          Arthur O'Dwyer
Email address:      arthur.j.odwyer@gmail.com
PostgreSQL version: 9.3.0
Operating system:   Ubuntu Linux
Description:

When MALLOC fails, pg_regcomp leaks memory in at least two places:

(A) In freev(), the line
    freesubre(info, v, v->tree);
should be
    freesubre(info, NULL, v->tree);
as otherwise the "freed" subres will end up on v->treefree, which is leaked
by the cleanst() two lines later.
That is, given the precondition that there are things in v->tree that aren't
in v->treechain.
This precondition is invariably true if we are being called because
nfatree() has run out of memory here:

    markst(v->tree);
    cleanst(info, v);  /* clears v->treechain without clearing v->tree */
    [...some comments...]
    re->re_info |= nfatree(info, v, v->tree, debug);
    CNOERR();  /* calls freev() */

(B) newlacon() leaks memory if REALLOC returns NULL on this line:

    v->lacons = (struct subre *) REALLOC(v->lacons,
        (v->nlacons + 1) * sizeof(struct subre));

The fix is to use the same idiom already used everywhere else REALLOC is
called in this module.

Re: BUG #10976: Two memory leaks in regcomp cleanup

От
Tom Lane
Дата:
arthur.j.odwyer@gmail.com writes:
> When MALLOC fails, pg_regcomp leaks memory in at least two places:

> (A) In freev(), the line
>     freesubre(info, v, v->tree);
> should be
>     freesubre(info, NULL, v->tree);
> as otherwise the "freed" subres will end up on v->treefree, which is leaked
> by the cleanst() two lines later.

Hmm ... what version of the code are you looking at exactly?  There's no
"info" argument here in Postgres.

> That is, given the precondition that there are things in v->tree that aren't
> in v->treechain.

The problem with this proposal is that if there are subres in v->tree
that *are* in the treechain, we'll possibly try to free them twice
(if they're not marked INUSE), and definitely will be accessing
already-freed memory when cleanst looks at them.

It looks to me like there are two different operating regimes in this
code: one where all the subres are still in the treechain, and one where
we've marked as INUSE all the ones that are reachable from v->tree and
garbage-collected the rest.  The markst/cleanst steps in pg_regcomp are
where the conversion is made.  freev needs to work correctly either
before or after that.

In short, I think you're right that there's a bug here, but this is
not a good fix for it.  I'm not sure freev has enough info to do the
right thing; we may need to rethink the data structure invariants,
so that there's not two different ways to clean up.

> (B) newlacon() leaks memory if REALLOC returns NULL on this line:

>     v->lacons = (struct subre *) REALLOC(v->lacons,
>         (v->nlacons + 1) * sizeof(struct subre));

> The fix is to use the same idiom already used everywhere else REALLOC is
> called in this module.

A temp variable, you mean.  Yeah, that's a bug.

            regards, tom lane

Re: BUG #10976: Two memory leaks in regcomp cleanup

От
"Arthur O'Dwyer"
Дата:
On Thu, Jul 17, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> arthur.j.odwyer@gmail.com writes:
>> When MALLOC fails, pg_regcomp leaks memory in at least two places:
>
>> (A) In freev(), the line
>>     freesubre(info, v, v->tree);
>> should be
>>     freesubre(info, NULL, v->tree);
>> as otherwise the "freed" subres will end up on v->treefree, which is leaked
>> by the cleanst() two lines later.
>
> Hmm ... what version of the code are you looking at exactly?  There's no
> "info" argument here in Postgres.

Ah, yes, we piped an "info" argument through all the places that call
MALLOC/FREE, so that compiling a regex wouldn't have to touch global
state. You can safely pretend I didn't write "info," there.

>> That is, given the precondition that there are things in v->tree that aren't
>> in v->treechain.
>
> The problem with this proposal is that if there are subres in v->tree
> that *are* in the treechain, we'll possibly try to free them twice
> (if they're not marked INUSE), and definitely will be accessing
> already-freed memory when cleanst looks at them.

Hmm. I think you're right --- I *think* the subres in v->tree are
INUSE by definition, so double-free isn't an issue, but cleanst will
definitely be looking at them after they've been freed, which is still
a bug. What if you just swap the order that freev() does cleanst() and
freesubre() so that the cleanst() happens first?

> It looks to me like there are two different operating regimes in this
> code: one where all the subres are still in the treechain, and one where
> we've marked as INUSE all the ones that are reachable from v->tree and
> garbage-collected the rest.  The markst/cleanst steps in pg_regcomp are
> where the conversion is made.  freev needs to work correctly either
> before or after that.
>
> In short, I think you're right that there's a bug here, but this is
> not a good fix for it.  I'm not sure freev has enough info to do the
> right thing; we may need to rethink the data structure invariants,
> so that there's not two different ways to clean up.

Please keep me cc'ed and/or send me an email when there's an
"official" patch for this leak.

Thanks,
Arthur

Re: BUG #10976: Two memory leaks in regcomp cleanup

От
Tom Lane
Дата:
"Arthur O'Dwyer" <arthur.j.odwyer@gmail.com> writes:
> On Thu, Jul 17, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The problem with this proposal is that if there are subres in v->tree
>> that *are* in the treechain, we'll possibly try to free them twice
>> (if they're not marked INUSE), and definitely will be accessing
>> already-freed memory when cleanst looks at them.

> Hmm. I think you're right --- I *think* the subres in v->tree are
> INUSE by definition, so double-free isn't an issue, but cleanst will
> definitely be looking at them after they've been freed, which is still
> a bug. What if you just swap the order that freev() does cleanst() and
> freesubre() so that the cleanst() happens first?

No, the INUSE marking doesn't happen till pg_regcomp runs markst(), so
that would break cleanup of failures occurring before that.  There's
somewhat of a narrow window for this case, since v->tree doesn't get
set until parse() returns, but failures there certainly are possible.

After some reflection I decided that what we need is to teach freesubre
to stick things back into the treefree list if and only if treechain is
non-NULL.  This guarantees we can't corrupt the treechain list with a
premature free of a subre, and it preserves the existing not-broken
logic for cleaning up after an error occuring before we reach markst().
So it ends up being one line of code change, though I added a bunch of
commentary as well:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1567e659a877d35ab4b85dafff41b2845d50990f

            regards, tom lane

Re: BUG #10976: Two memory leaks in regcomp cleanup

От
"Arthur O'Dwyer"
Дата:
Thanks! Your patch successfully passes my tests (the ones that found
the original freev() issue =E2=80=94 the REALLOC bug was found by reading t=
he
code).

=E2=80=93Arthur

On Fri, Jul 18, 2014 at 10:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Arthur O'Dwyer" <arthur.j.odwyer@gmail.com> writes:
>> On Thu, Jul 17, 2014 at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The problem with this proposal is that if there are subres in v->tree
>>> that *are* in the treechain, we'll possibly try to free them twice
>>> (if they're not marked INUSE), and definitely will be accessing
>>> already-freed memory when cleanst looks at them.
>
>> Hmm. I think you're right --- I *think* the subres in v->tree are
>> INUSE by definition, so double-free isn't an issue, but cleanst will
>> definitely be looking at them after they've been freed, which is still
>> a bug. What if you just swap the order that freev() does cleanst() and
>> freesubre() so that the cleanst() happens first?
>
> No, the INUSE marking doesn't happen till pg_regcomp runs markst(), so
> that would break cleanup of failures occurring before that.  There's
> somewhat of a narrow window for this case, since v->tree doesn't get
> set until parse() returns, but failures there certainly are possible.
>
> After some reflection I decided that what we need is to teach freesubre
> to stick things back into the treefree list if and only if treechain is
> non-NULL.  This guarantees we can't corrupt the treechain list with a
> premature free of a subre, and it preserves the existing not-broken
> logic for cleaning up after an error occuring before we reach markst().
> So it ends up being one line of code change, though I added a bunch of
> commentary as well:
>
> http://git.postgresql.org/gitweb/?p=3Dpostgresql.git;a=3Dcommitdiff;h=3D1=
567e659a877d35ab4b85dafff41b2845d50990f
>
>                         regards, tom lane