Обсуждение: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

Поиск
Список
Период
Сортировка

pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
David Rowley
Дата:
Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers.  Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.

Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command.  This defaults to 256KB, which is the same size as
the access strategy was prior to this change.  This setting also
controls how large to make the buffer access strategy for autovacuum.

Per idea by Andres Freund.

Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1cbbee03385763b066ae3961fc61f2cd01a0d0d7

Modified Files
--------------
doc/src/sgml/config.sgml                      | 30 +++++++++
doc/src/sgml/ref/analyze.sgml                 | 20 ++++++
doc/src/sgml/ref/vacuum.sgml                  | 24 +++++++
src/backend/commands/vacuum.c                 | 95 ++++++++++++++++++++++++++-
src/backend/commands/vacuumparallel.c         | 14 +++-
src/backend/postmaster/autovacuum.c           | 18 +++--
src/backend/storage/buffer/README             | 12 ++--
src/backend/storage/buffer/freelist.c         | 61 ++++++++++++++---
src/backend/utils/init/globals.c              |  5 +-
src/backend/utils/misc/guc_tables.c           | 11 ++++
src/backend/utils/misc/postgresql.conf.sample |  3 +
src/bin/psql/tab-complete.c                   |  4 +-
src/include/miscadmin.h                       |  9 +++
src/include/storage/bufmgr.h                  |  5 ++
src/include/utils/guc_hooks.h                 |  2 +
src/test/regress/expected/vacuum.out          | 19 ++++++
src/test/regress/sql/vacuum.sql               | 15 +++++
17 files changed, 322 insertions(+), 25 deletions(-)


Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
Kyotaro Horiguchi
Дата:
Hello.

> Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

This commit added the following error message.

>         errmsg("value: \"%s\": is invalid for buffer_usage_limit",

It looks as the follows on terminal.

postgres=# vacuum (buffer_usage_limit 'x');
ERROR:  value: "x": is invalid for buffer_usage_limit

I'm not sure why the message has two colons.  [1] talks about the
message but doesn't really explain the reason for this.

[1] https://www.postgresql.org/message-id/CAApHDvqs%2BFcw9Yrn4ORCAX0xKK1-SiCC0w1j7Dhg%3DG9hrvag7g%40mail.gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
David Rowley
Дата:
On Tue, 11 Apr 2023 at 13:23, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> This commit added the following error message.
>
> >                errmsg("value: \"%s\": is invalid for buffer_usage_limit",
>
> It looks as the follows on terminal.
>
> postgres=# vacuum (buffer_usage_limit 'x');
> ERROR:  value: "x": is invalid for buffer_usage_limit
>
> I'm not sure why the message has two colons.  [1] talks about the
> message but doesn't really explain the reason for this.

Probably we can use what a GUC like work_mem does for this case:

postgres=# set work_mem = 'x';
ERROR:  invalid value for parameter "work_mem": "x"

with the attached, what you tried becomes:

postgres=# vacuum (buffer_usage_limit 'x');
ERROR:  invalid value for "buffer_usage_limit": "x"

How's that?

David

Вложения

Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
Kyotaro Horiguchi
Дата:
I apologize for writing mails in such a sporadic manner..

At Tue, 11 Apr 2023 10:23:35 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Hello.
> 
> > Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option
> 
> This commit added the following error message.
> 
> >         errmsg("value: \"%s\": is invalid for buffer_usage_limit",
> 
> It looks as the follows on terminal.
> 
> postgres=# vacuum (buffer_usage_limit 'x');
> ERROR:  value: "x": is invalid for buffer_usage_limit
> 
> I'm not sure why the message has two colons.  [1] talks about the
> message but doesn't really explain the reason for this.
> 
> [1] https://www.postgresql.org/message-id/CAApHDvqs%2BFcw9Yrn4ORCAX0xKK1-SiCC0w1j7Dhg%3DG9hrvag7g%40mail.gmail.com

vacuum.c:198
>      if (opt->arg == NULL)
>      {
>        ereport(ERROR,
>            (errcode(ERRCODE_SYNTAX_ERROR),
>             errmsg("buffer_usage_limit option requires a valid value"),
>             parser_errposition(pstate, opt->location)));

> postgres=# vacuum (buffer_usage_limit);
> ERROR:  buffer_usage_limit requires a valid value

The error message doesn't really make much sense to me.  In the same
context, most of the code seems to use ("%s requires a parameter",
buffer_usage_limit) without "valid".

vacuum.c:347
>         errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));

It seems like that the vacuum options are usually spelled in uppercase
at least for vacuum and analyze. In any case, shouldn't we need to
unify them in a certain area?  (For example, command options for
CREATE SUBSCRIPTION is shown in lowercase in the documentation. But,
I'm not exactly sure what we should do about it.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
David Rowley
Дата:
On Tue, 11 Apr 2023 at 15:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > postgres=# vacuum (buffer_usage_limit);
> > ERROR:  buffer_usage_limit requires a valid value
>
> The error message doesn't really make much sense to me.  In the same
> context, most of the code seems to use ("%s requires a parameter",
> buffer_usage_limit) without "valid".

The only other option I see that requires a value in this context is
"parallel", and what you say is not true for that, so I'm not sure I
follow what you're referring to with "most of the code". Can you quote
the code you mean?

> vacuum.c:347
> >                errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));
>
> It seems like that the vacuum options are usually spelled in uppercase
> at least for vacuum and analyze. In any case, shouldn't we need to
> unify them in a certain area?  (For example, command options for
> CREATE SUBSCRIPTION is shown in lowercase in the documentation. But,
> I'm not exactly sure what we should do about it.)

I wouldn't object to making things more consistent in this area with
regards to the casing of the options in the ERROR messages.  However,
it doesn't really seem like there is much consistency to follow that
this new code is breaking.

For example, both of these are not upper casing the option name:

ereport(ERROR,
        (errcode(ERRCODE_SYNTAX_ERROR),
        errmsg("parallel option requires a value between 0 and %d",
        MAX_PARALLEL_WORKER_LIMIT),
        parser_errposition(pstate, opt->location)));

ereport(ERROR,
        (errcode(ERRCODE_SYNTAX_ERROR),
        errmsg("parallel workers for vacuum must be between 0 and %d",
        MAX_PARALLEL_WORKER_LIMIT),
        parser_errposition(pstate, opt->location)));

But if you wanted to change that, you'll need to raise a thread on
-hackers as that code has been there for a while.

(actually, it seems pointless to have two error messages for that any
not just one)

I do agree that it's not very good that I pushed the lowercase version
of buffer_usage_limit and the same in uppercase for the VACUUM FULL
conflict.  I'll hold back from fixing that until we figure out the
other stuff.

David



Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
Kyotaro Horiguchi
Дата:
At Tue, 11 Apr 2023 16:53:42 +1200, David Rowley <dgrowleyml@gmail.com> wrote in 
> On Tue, 11 Apr 2023 at 15:02, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> > > postgres=# vacuum (buffer_usage_limit);
> > > ERROR:  buffer_usage_limit requires a valid value
> >
> > The error message doesn't really make much sense to me.  In the same
> > context, most of the code seems to use ("%s requires a parameter",
> > buffer_usage_limit) without "valid".
> 
> The only other option I see that requires a value in this context is
> "parallel", and what you say is not true for that, so I'm not sure I
> follow what you're referring to with "most of the code". Can you quote
> the code you mean?

My point was just that "valid" seems redundant. (Sorry, but..)
However, looking again, many of the "cases" are like "requires a
<type> value", which looks womewhat similar to "requires a valid
value" when the <type> cannot be represented by a word. Some remaining
example of these "caes" are:

./commands/define.c�54:                 errmsg("%s requires a parameter",
  (the "parameter" means the option value for type name)

./libpq/hba.c�1319:                 errmsg("authentication method \"%s\" requires argument \"%s\" to be set", \

But, since I'm not a native speaker, so I don't stress this further
more.


> > vacuum.c:347
> > >                errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));
> >
> > It seems like that the vacuum options are usually spelled in uppercase
> > at least for vacuum and analyze. In any case, shouldn't we need to
> > unify them in a certain area?  (For example, command options for
> > CREATE SUBSCRIPTION is shown in lowercase in the documentation. But,
> > I'm not exactly sure what we should do about it.)
> 
> I wouldn't object to making things more consistent in this area with
> regards to the casing of the options in the ERROR messages.  However,
> it doesn't really seem like there is much consistency to follow that
> this new code is breaking.

...

> I do agree that it's not very good that I pushed the lowercase version
> of buffer_usage_limit and the same in uppercase for the VACUUM FULL
> conflict.  I'll hold back from fixing that until we figure out the
> other stuff.

Thanks for the opinion! I agree to you.

Thanks!

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pgsql: Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option

От
David Rowley
Дата:
On Tue, 11 Apr 2023 at 18:29, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> My point was just that "valid" seems redundant. (Sorry, but..)
> However, looking again, many of the "cases" are like "requires a
> <type> value", which looks womewhat similar to "requires a valid
> value" when the <type> cannot be represented by a word. Some remaining
> example of these "caes" are:
>
> ./commands/define.c 54:                          errmsg("%s requires a parameter",
>   (the "parameter" means the option value for type name)

OK, that makes sense. I see that defGetString() already checks for
NULL args, so it seems better to leave it to that to check for
consistency with other ERROR messages.

I've pushed that plus the uppercasing of buffer_usage_limit that you mentioned.

I additionally adjusted the two remaining ERRORs to become one. I
thought it seemed better to instead of complaining that 'x' is an
invalid value for BUFFER_USAGE_LIMIT, just to go ahead and give more
details about which are valid values.

Thanks for looking at this change and reporting your findings.

David