Обсуждение: Inconsistency in vacuum behavior

Поиск
Список
Период
Сортировка

Inconsistency in vacuum behavior

От
Alexander Pyhalov
Дата:
Hi.

We've run regress isolation tests on partitioned tables and found 
interesting VACUUM behavior. I'm not sure, if it's intended.

In the following example, partitioned tables and regular tables behave 
differently:

CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a);
CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH 
(MODULUS 2, REMAINDER 0);
CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH 
(MODULUS 2, REMAINDER 1);
CREATE ROLE regress_vacuum_conflict;

In the first session:

begin;
  LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE;

In the second:
SET ROLE regress_vacuum_conflict;
  VACUUM vacuum_tab;
  WARNING:  permission denied to vacuum "vacuum_tab", skipping it <---- 
hangs here, trying to lock vacuum_tab_1

In non-partitioned case second session exits after emitting warning. In 
partitioned case, it hangs, trying to get locks.
This is due to the fact that in expand_vacuum_rel() we skip parent table 
if vacuum_is_permitted_for_relation(), but don't perform such check for 
its child.
The check will be performed later in vacuum_rel(), but after 
vacuum_open_relation(), which leads to hang in the second session.

Is it intended? Why don't we perform vacuum_is_permitted_for_relation() 
check for inheritors in expand_vacuum_rel()?

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi!

I've checked this expand_vacuum_rel() and made a quick fix for this.Here's the result of the test:

postgres@postgres=# set role regress_vacuum_conflict;
SET
Time: 0.369 ms
postgres@postgres=> vacuum vacuum_tab;
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_1", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_2", skipping it
VACUUM
Time: 0.936 ms
postgres@postgres=>

Looks like it's a subject for a patch.

On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.

We've run regress isolation tests on partitioned tables and found
interesting VACUUM behavior. I'm not sure, if it's intended.

In the following example, partitioned tables and regular tables behave
differently:

CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a);
CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH
(MODULUS 2, REMAINDER 0);
CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH
(MODULUS 2, REMAINDER 1);
CREATE ROLE regress_vacuum_conflict;

In the first session:

begin;
  LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE;

In the second:
SET ROLE regress_vacuum_conflict;
  VACUUM vacuum_tab;
  WARNING:  permission denied to vacuum "vacuum_tab", skipping it <----
hangs here, trying to lock vacuum_tab_1

In non-partitioned case second session exits after emitting warning. In
partitioned case, it hangs, trying to get locks.
This is due to the fact that in expand_vacuum_rel() we skip parent table
if vacuum_is_permitted_for_relation(), but don't perform such check for
its child.
The check will be performed later in vacuum_rel(), but after
vacuum_open_relation(), which leads to hang in the second session.

Is it intended? Why don't we perform vacuum_is_permitted_for_relation()
check for inheritors in expand_vacuum_rel()?

--
Best regards,
Alexander Pyhalov,
Postgres Professional




--
Regards,
Nikita Malakhov
Postgres Professional 

Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi!

Here's the patch that fixes this case, please check it out.
The patch adds vacuum_is_permitted_for_relation() check before adding
partition relation to the vacuum list, and if permission is denied the relation
is not added, so it is not passed to vacuum_rel() and there are no try to
acquire the lock.

Cheers!

On Mon, Jan 16, 2023 at 4:48 PM Nikita Malakhov <hukutoc@gmail.com> wrote:
Hi!

I've checked this expand_vacuum_rel() and made a quick fix for this.Here's the result of the test:

postgres@postgres=# set role regress_vacuum_conflict;
SET
Time: 0.369 ms
postgres@postgres=> vacuum vacuum_tab;
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_1", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_2", skipping it
VACUUM
Time: 0.936 ms
postgres@postgres=>

Looks like it's a subject for a patch.

On Mon, Jan 16, 2023 at 11:18 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Hi.

We've run regress isolation tests on partitioned tables and found
interesting VACUUM behavior. I'm not sure, if it's intended.

In the following example, partitioned tables and regular tables behave
differently:

CREATE TABLE vacuum_tab (a int) PARTITION BY HASH (a);
CREATE TABLE vacuum_tab_1 PARTITION OF vacuum_tab FOR VALUES WITH
(MODULUS 2, REMAINDER 0);
CREATE TABLE vacuum_tab_2 PARTITION OF vacuum_tab FOR VALUES WITH
(MODULUS 2, REMAINDER 1);
CREATE ROLE regress_vacuum_conflict;

In the first session:

begin;
  LOCK vacuum_tab IN SHARE UPDATE EXCLUSIVE MODE;

In the second:
SET ROLE regress_vacuum_conflict;
  VACUUM vacuum_tab;
  WARNING:  permission denied to vacuum "vacuum_tab", skipping it <----
hangs here, trying to lock vacuum_tab_1

In non-partitioned case second session exits after emitting warning. In
partitioned case, it hangs, trying to get locks.
This is due to the fact that in expand_vacuum_rel() we skip parent table
if vacuum_is_permitted_for_relation(), but don't perform such check for
its child.
The check will be performed later in vacuum_rel(), but after
vacuum_open_relation(), which leads to hang in the second session.

Is it intended? Why don't we perform vacuum_is_permitted_for_relation()
check for inheritors in expand_vacuum_rel()?

--
Best regards,
Alexander Pyhalov,
Postgres Professional




--
Regards,
Nikita Malakhov
Postgres Professional 


--
Regards,
Nikita Malakhov
Postgres Professional 
Вложения

Re: Inconsistency in vacuum behavior

От
Alexander Pyhalov
Дата:
Nikita Malakhov писал 2023-01-16 17:26:
> Hi!
> 
> Here's the patch that fixes this case, please check it out.
> The patch adds vacuum_is_permitted_for_relation() check before adding
> partition relation to the vacuum list, and if permission is denied the
> relation
> is not added, so it is not passed to vacuum_rel() and there are no try
> to
> acquire the lock.
> 
> Cheers!

Hi.

The patch seems to solve the issue.
Two minor questions I have:
1) should we error out if HeapTupleIsValid(part_tuple) is false?
2) comment "Check partition relations for vacuum permit" seems to be 
broken in some way.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi,

Currently there is no error in this case, so additional thrown error would require a new test.
Besides, throwing an error here does not make sense - it is just a check for a vacuum
permission, I think the right way is to just skip a relation that is not suitable for vacuum.
Any thoughts or objections?

On Mon, Jan 16, 2023 at 7:46 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Nikita Malakhov писал 2023-01-16 17:26:
> Hi!
>
> Here's the patch that fixes this case, please check it out.
> The patch adds vacuum_is_permitted_for_relation() check before adding
> partition relation to the vacuum list, and if permission is denied the
> relation
> is not added, so it is not passed to vacuum_rel() and there are no try
> to
> acquire the lock.
>
> Cheers!

Hi.

The patch seems to solve the issue.
Two minor questions I have:
1) should we error out if HeapTupleIsValid(part_tuple) is false?
2) comment "Check partition relations for vacuum permit" seems to be
broken in some way.

--
Best regards,
Alexander Pyhalov,
Postgres Professional


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: Inconsistency in vacuum behavior

От
Alexander Pyhalov
Дата:
Nikita Malakhov писал 2023-01-16 20:12:
> Hi,
> 
> Currently there is no error in this case, so additional thrown error
> would require a new test.
> Besides, throwing an error here does not make sense - it is just a
> check for a vacuum
> permission, I think the right way is to just skip a relation that is
> not suitable for vacuum.
> Any thoughts or objections?
> 

No objections for not throwing an error.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi hackers!

Alexander found a very good issue.
Please check the solution above. Any objections? It's a production case, please review,
any thoughts and objections are welcome.

On Mon, Jan 16, 2023 at 8:15 PM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Nikita Malakhov писал 2023-01-16 20:12:
> Hi,
>
> Currently there is no error in this case, so additional thrown error
> would require a new test.
> Besides, throwing an error here does not make sense - it is just a
> check for a vacuum
> permission, I think the right way is to just skip a relation that is
> not suitable for vacuum.
> Any thoughts or objections?
>

No objections for not throwing an error.

--
Best regards,
Alexander Pyhalov,
Postgres Professional


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: Inconsistency in vacuum behavior

От
Justin Pryzby
Дата:
On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote:
> Hi,
> 
> Currently there is no error in this case, so additional thrown error would
> require a new test.
> Besides, throwing an error here does not make sense - it is just a check
> for a vacuum
> permission, I think the right way is to just skip a relation that is not
> suitable for vacuum.
> Any thoughts or objections?

Could you check if this is consistent between the behavior of VACUUM
FULL and CLUSTER ?  See also Nathan's patches.

-- 
Justin



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi!

I've found the discussion you'd mentioned before, checking now.

On Thu, Jan 19, 2023 at 4:49 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote:
> Hi,
>
> Currently there is no error in this case, so additional thrown error would
> require a new test.
> Besides, throwing an error here does not make sense - it is just a check
> for a vacuum
> permission, I think the right way is to just skip a relation that is not
> suitable for vacuum.
> Any thoughts or objections?

Could you check if this is consistent between the behavior of VACUUM
FULL and CLUSTER ?  See also Nathan's patches.

--
Justin


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: Inconsistency in vacuum behavior

От
Alexander Pyhalov
Дата:
Justin Pryzby писал 2023-01-19 04:49:
> On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote:
>> Hi,
>> 
>> Currently there is no error in this case, so additional thrown error 
>> would
>> require a new test.
>> Besides, throwing an error here does not make sense - it is just a 
>> check
>> for a vacuum
>> permission, I think the right way is to just skip a relation that is 
>> not
>> suitable for vacuum.
>> Any thoughts or objections?
> 
> Could you check if this is consistent between the behavior of VACUUM
> FULL and CLUSTER ?  See also Nathan's patches.

Hi.

Cluster behaves in a different way - it errors out immediately if 
relation is not owned by user. For partitioned rel it would anyway raise 
error later.
VACUUM and VACUUM FULL behave consistently after applying Nikita's patch 
(for partitioned and regular tables) - issue warning "skipping 
TABLE_NAME --- only table or database owner can vacuum it" and return 
success status.

-- 
Best regards,
Alexander Pyhalov,
Postgres Professional



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi!

For the test Alexander described in the beginning of the discussion - the results are
postgres@postgres=# set role regress_vacuum_conflict;
SET
Time: 0.324 ms
postgres@postgres=> vacuum vacuum_tab;
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_1", skipping it
WARNING:  permission denied to vacuum "vacuum_tab_2", skipping it
VACUUM
Time: 1.782 ms
postgres@postgres=> vacuum full;
WARNING:  permission denied to vacuum "pg_statistic", skipping it
WARNING:  permission denied to vacuum "vacuum_tab", skipping it
...
postgres@postgres=> cluster vacuum_tab;
ERROR:  must be owner of table vacuum_tab
Time: 0.384 ms

For the standard role "Postgres" the behavior is the same as before patch.

On Thu, Jan 19, 2023 at 10:37 AM Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
Justin Pryzby писал 2023-01-19 04:49:
> On Mon, Jan 16, 2023 at 08:12:18PM +0300, Nikita Malakhov wrote:
>> Hi,
>>
>> Currently there is no error in this case, so additional thrown error
>> would
>> require a new test.
>> Besides, throwing an error here does not make sense - it is just a
>> check
>> for a vacuum
>> permission, I think the right way is to just skip a relation that is
>> not
>> suitable for vacuum.
>> Any thoughts or objections?
>
> Could you check if this is consistent between the behavior of VACUUM
> FULL and CLUSTER ?  See also Nathan's patches.

Hi.

Cluster behaves in a different way - it errors out immediately if
relation is not owned by user. For partitioned rel it would anyway raise
error later.
VACUUM and VACUUM FULL behave consistently after applying Nikita's patch
(for partitioned and regular tables) - issue warning "skipping
TABLE_NAME --- only table or database owner can vacuum it" and return
success status.

--
Best regards,
Alexander Pyhalov,
Postgres Professional


--
Regards,
Nikita Malakhov
Postgres Professional 

Re: Inconsistency in vacuum behavior

От
Nathan Bossart
Дата:
On Mon, Jan 16, 2023 at 11:18:08AM +0300, Alexander Pyhalov wrote:
> Is it intended? Why don't we perform vacuum_is_permitted_for_relation()
> check for inheritors in expand_vacuum_rel()?

Since no lock is held on the partition, the calls to functions like
object_ownercheck() and pg_class_aclcheck() in
vacuum_is_permitted_for_relation() will produce cache lookup ERRORs if the
relation is concurrently dropped.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Inconsistency in vacuum behavior

От
Nikita Malakhov
Дата:
Hi!

Yes, I've checked that. What would be desirable behavior in the case above?
Anyway, waiting for table unlock seems to be not quite right.

On Sat, Jan 21, 2023 at 4:12 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Jan 16, 2023 at 11:18:08AM +0300, Alexander Pyhalov wrote:
> Is it intended? Why don't we perform vacuum_is_permitted_for_relation()
> check for inheritors in expand_vacuum_rel()?

Since no lock is held on the partition, the calls to functions like
object_ownercheck() and pg_class_aclcheck() in
vacuum_is_permitted_for_relation() will produce cache lookup ERRORs if the
relation is concurrently dropped.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




--
Regards,
Nikita Malakhov
Postgres Professional