Thread: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Ranier Vilela
Date:
Hi.

Per Coverity.

2. returned_null: SearchSysCacheAttName returns NULL (checked 20 out of 21 times).
3. var_assigned: Assigning: ptup = NULL return value from SearchSysCacheAttName.
 964                ptup = SearchSysCacheAttName(relid, attname);
CID 1545986: (#1 of 1): Dereference null return value (NULL_RETURNS)
4. dereference: Dereferencing ptup, which is known to be NULL.

The results of the functions SearchSysCacheAttNum and SearchSysCacheAttName
need to be checked.

Commit 5091995 left an oversight.

Fixed by the attached patch; unfortunately, a change of style was necessary.
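
A minimal, standalone sketch of the pattern Coverity is asking for, assuming nothing about the real patch: a lookup that can return NULL is checked before its result is dereferenced. The names DemoAttr, demo_search_attname and demo_attnum_or_zero are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached catalog row; not a PostgreSQL type. */
typedef struct DemoAttr
{
    const char *name;
    int         attnum;
} DemoAttr;

static const DemoAttr demo_attrs[] = {
    {"id", 1},
    {"payload", 2},
};

/* Hypothetical lookup: like a syscache search, it returns NULL on a miss. */
static const DemoAttr *
demo_search_attname(const char *attname)
{
    size_t      n = sizeof(demo_attrs) / sizeof(demo_attrs[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(demo_attrs[i].name, attname) == 0)
            return &demo_attrs[i];
    }
    return NULL;
}

/* Returns the attribute number, or 0 when the lookup finds nothing. */
static int
demo_attnum_or_zero(const char *attname)
{
    const DemoAttr *tup = demo_search_attname(attname);

    if (tup == NULL)            /* the kind of check Coverity found missing */
        return 0;
    return tup->attnum;         /* safe: tup is known to be non-NULL here */
}
```

Calling demo_attnum_or_zero() with a name that is not in the table returns 0 instead of dereferencing a NULL pointer.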

best regards,
Ranier Vilela
Attachments

On Wed, May 22, 2024 at 11:44, Ranier Vilela <ranier.vf@gmail.com> wrote:
> Hi.
>
> Per Coverity.
>
> 2. returned_null: SearchSysCacheAttName returns NULL (checked 20 out of 21 times).
> 3. var_assigned: Assigning: ptup = NULL return value from SearchSysCacheAttName.
>  964                ptup = SearchSysCacheAttName(relid, attname);
> CID 1545986: (#1 of 1): Dereference null return value (NULL_RETURNS)
> 4. dereference: Dereferencing ptup, which is known to be NULL.
>
> The results of the functions SearchSysCacheAttNum and SearchSysCacheAttName
> need to be checked.
>
> Commit 5091995 left an oversight.
>
> Fixed by the attached patch; unfortunately, a change of style was necessary.

v1 attached; it fixes the wrong column variable name in the error report.

best regards,
Ranier Vilela
Attachments
On Wed, May 22, 2024 at 13:09, Ranier Vilela <ranier.vf@gmail.com> wrote:
> On Wed, May 22, 2024 at 11:44, Ranier Vilela <ranier.vf@gmail.com> wrote:
>> Hi.
>>
>> Per Coverity.
>>
>> 2. returned_null: SearchSysCacheAttName returns NULL (checked 20 out of 21 times).
>> 3. var_assigned: Assigning: ptup = NULL return value from SearchSysCacheAttName.
>>  964                ptup = SearchSysCacheAttName(relid, attname);
>> CID 1545986: (#1 of 1): Dereference null return value (NULL_RETURNS)
>> 4. dereference: Dereferencing ptup, which is known to be NULL.
>>
>> The results of the functions SearchSysCacheAttNum and SearchSysCacheAttName
>> need to be checked.
>>
>> Commit 5091995 left an oversight.
>>
>> Fixed by the attached patch; unfortunately, a change of style was necessary.
>
> v1 attached; it fixes the wrong column variable name in the error report.

1. Another concern is the function *get_partition_ancestors*,
which may return NIL, which may affect *llast_oid*, which does not handle NIL entries.

2. Is checking *relispartition* enough?
There is a function *check_rel_can_be_partition* (src/backend/utils/adt/partitionfuncs.c),
which performs a much more robust check, would it be worth using it?

With the v2 attached, 1 is handled, but, in this case,
will it be the most correct?

best regards,
Ranier Vilela
Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Michael Paquier
Date:
On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:
> 1. Another concern is the function *get_partition_ancestors*,
> which may return NIL, which may affect *llast_oid*, which does not handle
> NIL entries.

Hm?  We already know in the code path that the relation we are dealing
with when calling get_partition_ancestors() *is* a partition thanks to
the check on relispartition, no?  In this case, calling
get_partition_ancestors() is valid and there should be a top-most
parent in any case all the time.  So I don't get the point of checking
get_partition_ancestors() for NIL-ness just for the sake of assuming
that it would be possible.

> 2. Is checking *relispartition* enough?
> There is a function *check_rel_can_be_partition*
> (src/backend/utils/adt/partitionfuncs.c),
> which performs a much more robust check, would it be worth using it?
>
> With the v2 attached, 1 is handled, but, in this case,
> will it be the most correct?

Saying that, your point about the result of SearchSysCacheAttName not
checked if it is a valid tuple is right.  We paint errors in these
cases even if they should not happen as that's useful when it comes to
debugging, at least.
--
Michael

Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Ashutosh Bapat
Date:

On Thu, May 23, 2024 at 5:52 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:
>> 1. Another concern is the function *get_partition_ancestors*,
>> which may return NIL, which may affect *llast_oid*, which does not handle
>> NIL entries.
>
> Hm?  We already know in the code path that the relation we are dealing
> with when calling get_partition_ancestors() *is* a partition thanks to
> the check on relispartition, no?  In this case, calling
> get_partition_ancestors() is valid and there should be a top-most
> parent in any case all the time.  So I don't get the point of checking
> get_partition_ancestors() for NIL-ness just for the sake of assuming
> that it would be possible.

+1.

>> 2. Is checking *relispartition* enough?
>> There is a function *check_rel_can_be_partition*
>> (src/backend/utils/adt/partitionfuncs.c),
>> which performs a much more robust check, would it be worth using it?
>>
>> With the v2 attached, 1 is handled, but, in this case,
>> will it be the most correct?
>
> Saying that, your point about the result of SearchSysCacheAttName not
> checked if it is a valid tuple is right.  We paint errors in these
> cases even if they should not happen as that's useful when it comes to
> debugging, at least.

I think an Assert would do instead of a whole ereport(). The callers have already resolved the attribute name to an attribute number. Hence the attribute *should* exist in both the partition and the topmost partitioned table.

   relid = llast_oid(ancestors);
+
   ptup = SearchSysCacheAttName(relid, attname);
+  if (!HeapTupleIsValid(ptup))
+      ereport(ERROR,
+              (errcode(ERRCODE_UNDEFINED_COLUMN),
+               errmsg("column \"%s\" of relation \"%s\" does not exist",
+                      attname, RelationGetRelationName(rel))));

We changed relid from the OID of the partition to that of the topmost partitioned table but didn't change rel, which still points to the partition relation. We would have to invoke relation_open() with the new relid in order to use rel in the error message. I don't think all that is worth it, unless we find a scenario where SearchSysCacheAttName() returns NULL.
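
To make the mismatch concrete, here is a small standalone sketch, under the assumption that the message should name the relation whose columns were actually searched. DemoRel and demo_missing_column_msg are hypothetical names for illustration only; the relation names and OIDs are made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical relation descriptor; not PostgreSQL's Relation. */
typedef struct DemoRel
{
    unsigned int oid;
    const char *name;
} DemoRel;

/* Format the message for the relation whose columns were actually searched. */
static void
demo_missing_column_msg(char *buf, size_t buflen,
                        const char *attname, const DemoRel *searched)
{
    snprintf(buf, buflen,
             "column \"%s\" of relation \"%s\" does not exist",
             attname, searched->name);
}
```

If the lookup ran against the topmost parent but the partition's descriptor is passed in, the message names the partition, which is the misleading outcome described above.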

--
Best Wishes,
Ashutosh Bapat
Hi Michael,

On Wed, May 22, 2024 at 21:21, Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:
>> 1. Another concern is the function *get_partition_ancestors*,
>> which may return NIL, which may affect *llast_oid*, which does not handle
>> NIL entries.
>
> Hm?  We already know in the code path that the relation we are dealing
> with when calling get_partition_ancestors() *is* a partition thanks to
> the check on relispartition, no?  In this case, calling
> get_partition_ancestors() is valid and there should be a top-most
> parent in any case all the time.  So I don't get the point of checking
> get_partition_ancestors() for NIL-ness just for the sake of assuming
> that it would be possible.

I don't have strong feelings about this.
But analyzing the function *pg_partition_root* (src/backend/utils/adt/partitionfuncs.c),
we see that checking whether the relation is a partition is done by
check_rel_can_be_partition.
And it doesn't trust get_partition_ancestors: it checks
whether the return value is NIL.
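
A rough sketch of the defensive variant being described, assuming a flat array in place of PostgreSQL's List: the last element is read only when the list is non-empty. DemoOid, DemoOidList, DEMO_INVALID_OID and demo_topmost_ancestor are hypothetical names, and the OIDs in the usage are made up.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int DemoOid;
#define DEMO_INVALID_OID 0u

/* Hypothetical flat list of ancestor OIDs, nearest parent first. */
typedef struct DemoOidList
{
    const DemoOid *items;
    size_t      length;         /* 0 plays the role of NIL */
} DemoOidList;

/* Return the last element (top-most ancestor), or the invalid OID if empty. */
static DemoOid
demo_topmost_ancestor(DemoOidList ancestors)
{
    if (ancestors.length == 0)
        return DEMO_INVALID_OID;    /* never index into an empty list */
    return ancestors.items[ancestors.length - 1];
}
```

Whether such a check is needed depends on whether the caller already guarantees a non-empty list, which is the point under debate here.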

>> 2. Is checking *relispartition* enough?
>> There is a function *check_rel_can_be_partition*
>> (src/backend/utils/adt/partitionfuncs.c),
>> which performs a much more robust check, would it be worth using it?
>>
>> With the v2 attached, 1 is handled, but, in this case,
>> will it be the most correct?
>
> Saying that, your point about the result of SearchSysCacheAttName not
> checked if it is a valid tuple is right.  We paint errors in these
> cases even if they should not happen as that's useful when it comes to
> debugging, at least.

Thanks.

best regards,
Ranier Vilela

> --
> Michael
On Thu, May 23, 2024 at 06:27, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> On Thu, May 23, 2024 at 5:52 AM Michael Paquier <michael@paquier.xyz> wrote:
>> On Wed, May 22, 2024 at 03:28:48PM -0300, Ranier Vilela wrote:
>>> 1. Another concern is the function *get_partition_ancestors*,
>>> which may return NIL, which may affect *llast_oid*, which does not handle
>>> NIL entries.
>>
>> Hm?  We already know in the code path that the relation we are dealing
>> with when calling get_partition_ancestors() *is* a partition thanks to
>> the check on relispartition, no?  In this case, calling
>> get_partition_ancestors() is valid and there should be a top-most
>> parent in any case all the time.  So I don't get the point of checking
>> get_partition_ancestors() for NIL-ness just for the sake of assuming
>> that it would be possible.
>
> +1.
>
>>> 2. Is checking *relispartition* enough?
>>> There is a function *check_rel_can_be_partition*
>>> (src/backend/utils/adt/partitionfuncs.c),
>>> which performs a much more robust check, would it be worth using it?
>>>
>>> With the v2 attached, 1 is handled, but, in this case,
>>> will it be the most correct?
>>
>> Saying that, your point about the result of SearchSysCacheAttName not
>> checked if it is a valid tuple is right.  We paint errors in these
>> cases even if they should not happen as that's useful when it comes to
>> debugging, at least.
>
> I think an Assert would do instead of a whole ereport().

IMO, an Assert is not a better solution here.

> The callers have already resolved the attribute name to an attribute number. Hence the attribute *should* exist in both the partition and the topmost partitioned table.
>
>    relid = llast_oid(ancestors);
> +
>    ptup = SearchSysCacheAttName(relid, attname);
> +  if (!HeapTupleIsValid(ptup))
> +      ereport(ERROR,
> +              (errcode(ERRCODE_UNDEFINED_COLUMN),
> +               errmsg("column \"%s\" of relation \"%s\" does not exist",
> +                      attname, RelationGetRelationName(rel))));
>
> We changed relid from the OID of the partition to that of the topmost partitioned table but didn't change rel, which still points to the partition relation. We would have to invoke relation_open() with the new relid in order to use rel in the error message. I don't think all that is worth it, unless we find a scenario where SearchSysCacheAttName() returns NULL.

All calls to functions like SearchSysCacheAttName, in the whole codebase,
check whether the return values are valid.
There must be a very strong reason for such a style.

So v3 implements it this way.

best regards,
Ranier Vilela
Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Michael Paquier
Date:
On Thu, May 23, 2024 at 08:54:12AM -0300, Ranier Vilela wrote:
> All calls to functions like SearchSysCacheAttName, in the whole codebase,
> check whether the return values are valid.
> There must be a very strong reason for such a style.

Usually good practice, as I've outlined once upthread, because we do
expect the attributes to exist in this case.  Or if you want, an error
is better than a crash if a concurrent path causes this area to lead
to inconsistent lookups, which is something I've seen in the past
while hacking on my own stuff, or just fix other things causing
syscache lookup inconsistencies.  You'd be surprised to hear that
dropped attributes being mishandled is not that uncommon, especially
in out-of-core code, as one example.  FWIW, I don't see much a point
in using ereport(), the two checks ought to be elog()s pointing to an
internal error as these two errors should never happen.  Still, it is
a good idea to check that they never happen: aka an internal
error state is better than a crash if a problem arises.
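
As a rough illustration of "an internal error is better than a crash", the standalone sketch below reports a failed lookup through an error buffer instead of dereferencing a NULL pointer. Unlike assert(), which is compiled out when NDEBUG is defined, this check stays active in production builds. DemoTuple and demo_fetch_attnum are hypothetical names, and the message wording is only an assumption, not the text of any committed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tuple stand-in; not PostgreSQL's HeapTuple. */
typedef struct DemoTuple
{
    int         attnum;
} DemoTuple;

/*
 * Runtime check: on a failed lookup (tup == NULL), fill errbuf with an
 * internal-error style message and return false; otherwise store the
 * attribute number in *attnum and return true.
 */
static bool
demo_fetch_attnum(const DemoTuple *tup, unsigned int relid,
                  const char *attname, int *attnum,
                  char *errbuf, size_t errlen)
{
    if (tup == NULL)
    {
        snprintf(errbuf, errlen,
                 "cache lookup failed for attribute \"%s\" of relation %u",
                 attname, relid);
        return false;
    }
    *attnum = tup->attnum;
    return true;
}
```

The caller decides how to surface the message; the important part is that the NULL case is handled on every build, not only on assert-enabled ones.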

> So v3 implements it this way.

I don't understand the point behind the open/close of attrelation,
TBH.  That's not needed.

Except for these two points, this is just moving the calls to make
sure that we have valid tuples from the syscache, which is a better
practice.  509199587df7 is recent enough that this should be fixed now
rather than later.
--
Michael

Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Ashutosh Bapat
Date:

On Fri, May 24, 2024 at 11:03 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Thu, May 23, 2024 at 08:54:12AM -0300, Ranier Vilela wrote:
>> All calls to functions like SearchSysCacheAttName, in the whole codebase,
>> check whether the return values are valid.
>> There must be a very strong reason for such a style.
>
> Usually good practice, as I've outlined once upthread, because we do
> expect the attributes to exist in this case.  Or if you want, an error
> is better than a crash if a concurrent path causes this area to lead
> to inconsistent lookups, which is something I've seen in the past
> while hacking on my own stuff, or just fix other things causing
> syscache lookup inconsistencies.  You'd be surprised to hear that
> dropped attributes being mishandled is not that uncommon, especially
> in out-of-core code, as one example.  FWIW, I don't see much a point
> in using ereport(), the two checks ought to be elog()s pointing to an
> internal error as these two errors should never happen.  Still, it is
> a good idea to check that they never happen: aka an internal
> error state is better than a crash if a problem arises.
>
>> So v3 implements it this way.
>
> I don't understand the point behind the open/close of attrelation,
> TBH.  That's not needed.
>
> Except for these two points, this is just moving the calls to make
> sure that we have valid tuples from the syscache, which is a better
> practice.  509199587df7 is recent enough that this should be fixed now
> rather than later.

If we are looking to avoid a segfault and get a message that helps debugging, using get_attname and get_attnum might be better options. get_attname throws an error. get_attnum doesn't throw an error and returns InvalidAttrNumber, which won't match any valid identity sequence, so a NIL sequence list is returned, which is already handled in that function. Using these two functions will avoid the clutter as well as the segfault. If that's acceptable, I will provide a patch.
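
A standalone sketch of the two-step translation this proposal relies on, assuming a partition and its parent may number the same column differently: number to name on the partition, then name to number on the parent, with an invalid sentinel on a miss. All names here (DemoColumn, demo_attname, demo_attnum, demo_parent_attnum, DEMO_INVALID_ATTNUM) are hypothetical stand-ins; in particular, demo_attname returns NULL on a miss because this sketch cannot raise an error the way get_attname is described to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_INVALID_ATTNUM 0   /* plays the role of InvalidAttrNumber */

typedef struct DemoColumn
{
    const char *name;
    int         attnum;
} DemoColumn;

/* Number -> name; returns NULL on a miss (the sketch cannot raise an error). */
static const char *
demo_attname(const DemoColumn *cols, size_t ncols, int attnum)
{
    for (size_t i = 0; i < ncols; i++)
        if (cols[i].attnum == attnum)
            return cols[i].name;
    return NULL;
}

/* Name -> number; returns the invalid sentinel on a miss. */
static int
demo_attnum(const DemoColumn *cols, size_t ncols, const char *name)
{
    for (size_t i = 0; i < ncols; i++)
        if (strcmp(cols[i].name, name) == 0)
            return cols[i].attnum;
    return DEMO_INVALID_ATTNUM;
}

/* Translate a partition's column number into the parent's, via the name. */
static int
demo_parent_attnum(const DemoColumn *part, size_t npart,
                   const DemoColumn *parent, size_t nparent,
                   int part_attnum)
{
    const char *name = demo_attname(part, npart, part_attnum);

    if (name == NULL)
        return DEMO_INVALID_ATTNUM;
    return demo_attnum(parent, nparent, name);
}
```

A caller that receives DEMO_INVALID_ATTNUM can simply find no matching sequence and return an empty list, which mirrors the behaviour described above.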

--
Best Wishes,
Ashutosh Bapat

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Michael Paquier
Date:
On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:
> If we are looking to avoid a segfault and get a message that helps
> debugging, using get_attname and get_attnum might be better options.
> get_attname throws an error. get_attnum doesn't throw an error and returns
> InvalidAttrNumber, which won't match any valid identity sequence, so a NIL
> sequence list is returned, which is already handled in that function. Using
> these two functions will avoid the clutter as well as the segfault. If that's
> acceptable, I will provide a patch.

Yeah, you could do that with these two routines as well.  The result
would be the same in terms of runtime validity checks.
--
Michael

Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Ashutosh Bapat
Date:


On Fri, May 24, 2024 at 12:16 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:
>> If we are looking to avoid a segfault and get a message that helps
>> debugging, using get_attname and get_attnum might be better options.
>> get_attname throws an error. get_attnum doesn't throw an error and returns
>> InvalidAttrNumber, which won't match any valid identity sequence, so a NIL
>> sequence list is returned, which is already handled in that function. Using
>> these two functions will avoid the clutter as well as the segfault. If that's
>> acceptable, I will provide a patch.
>
> Yeah, you could do that with these two routines as well.  The result
> would be the same in terms of runtime validity checks.

Please find attached a patch using those two routines.

--
Best Wishes,
Ashutosh Bapat
Attachments

On Fri, May 24, 2024 at 08:48, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> On Fri, May 24, 2024 at 12:16 PM Michael Paquier <michael@paquier.xyz> wrote:
>> On Fri, May 24, 2024 at 11:58:51AM +0530, Ashutosh Bapat wrote:
>>> If we are looking to avoid a segfault and get a message that helps
>>> debugging, using get_attname and get_attnum might be better options.
>>> get_attname throws an error. get_attnum doesn't throw an error and returns
>>> InvalidAttrNumber, which won't match any valid identity sequence, so a NIL
>>> sequence list is returned, which is already handled in that function. Using
>>> these two functions will avoid the clutter as well as the segfault. If that's
>>> acceptable, I will provide a patch.
>>
>> Yeah, you could do that with these two routines as well.  The result
>> would be the same in terms of runtime validity checks.
>
> Please find attached a patch using those two routines.

The function *get_attname* pallocs the result name (pstrdup).
Isn't it necessary to free the memory here (pfree)?

best regards,
Ranier Vilela

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Michael Paquier
Date:
On Fri, May 24, 2024 at 09:05:35AM -0300, Ranier Vilela wrote:
> The function *get_attname* pallocs the result name (pstrdup).
> Isn't it necessary to free the memory here (pfree)?

This is going to be freed with the current memory context, and all the
callers of getIdentitySequence() are in query execution paths, so I
don't see much the point.  A second thing was a missing check on the
attnum returned by get_attnum() with InvalidAttrNumber.  I'd be
tempted to introduce a missing_ok to this routine after looking at the
callers in all the tree, as some of them want to fail still would not
expect it, so that would reduce a bit the elog churn.  That's a story
for a different day, though.
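
The "freed with the current memory context" argument can be sketched with a toy allocator, assuming a simple linked list of chunks in place of PostgreSQL's real memory contexts: every allocation is tracked by its context and released in one reset, so individual frees are unnecessary for short-lived data. DemoContext, demo_alloc, demo_strdup and demo_context_reset are hypothetical names and are only suitable for byte strings as written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the context can find it. */
typedef struct DemoChunk
{
    struct DemoChunk *next;
} DemoChunk;

/* Hypothetical memory context: a list of chunks plus a counter. */
typedef struct DemoContext
{
    DemoChunk  *chunks;
    size_t      nchunks;
} DemoContext;

/* Allocate size bytes owned by cxt; returns NULL if malloc fails. */
static void *
demo_alloc(DemoContext *cxt, size_t size)
{
    DemoChunk  *chunk = malloc(sizeof(DemoChunk) + size);

    if (chunk == NULL)
        return NULL;
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return chunk + 1;           /* usable bytes start after the header */
}

/* Copy a string into the context, similar in spirit to pstrdup. */
static char *
demo_strdup(DemoContext *cxt, const char *s)
{
    size_t      len = strlen(s) + 1;
    char       *copy = demo_alloc(cxt, len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Release everything the context owns in one pass. */
static void
demo_context_reset(DemoContext *cxt)
{
    while (cxt->chunks != NULL)
    {
        DemoChunk  *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
    }
    cxt->nchunks = 0;
}
```

A string copied with demo_strdup() never needs its own free call; it disappears when the owning context is reset.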
--
Michael

Attachments

Re: Avoid possible dereference null pointer (src/backend/catalog/pg_depend.c)

From
Ashutosh Bapat
Date:
Thanks a lot Michael.

On Sun, May 26, 2024 at 4:40 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, May 24, 2024 at 09:05:35AM -0300, Ranier Vilela wrote:
>> The function *get_attname* pallocs the result name (pstrdup).
>> Isn't it necessary to free the memory here (pfree)?
>
> This is going to be freed with the current memory context, and all the
> callers of getIdentitySequence() are in query execution paths, so I
> don't see much the point.  A second thing was a missing check on the
> attnum returned by get_attnum() with InvalidAttrNumber.  I'd be
> tempted to introduce a missing_ok to this routine after looking at the
> callers in all the tree, as some of them want to fail still would not
> expect it, so that would reduce a bit the elog churn.  That's a story
> for a different day, though.
> --
> Michael

--
Best Wishes,
Ashutosh Bapat