Re: [PROPOSAL] VACUUM Progress Checker.

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: [PROPOSAL] VACUUM Progress Checker.
Дата
Msg-id 56DF83E2.3050105@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [PROPOSAL] VACUUM Progress Checker.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
Horiguchi-san,

Thanks for the review!

On 2016/03/08 18:19, Kyotaro HORIGUCHI wrote:
>> Updated versions attached.
>>
>> * changed st_progress_param to int64 and so did the argument of
>> pgstat_progress_update_param().  Likewise changed param1..param10 of
>> pg_stat_get_progress_info()'s output columns to bigint.
>>
>> * Added back the Oid field st_command_target and corresponding function
>> pgstat_progress_set_command_target(Oid).
> 
> +    beentry->st_command = COMMAND_INVALID;
> +    MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));
> 
> The MemSet seems useless since it gets the same initialization on
> setting st_command.

Right, every backend start should not have to pay that price.  Fixed in
the latest version.

> 
> +        /*
> +         * Report values for only those backends which are running the given
> +         * command.  XXX - privilege check is maybe dubious.
> +         */
> +        if (!beentry ||
> +            beentry->st_command != cmdtype ||
> +            !has_privs_of_role(GetUserId(), beentry->st_userid))
> +            continue;
> 
> We can simplly ignore unpriviledged backends, or all zeroz or
> nulls to signal that the caller has no priviledge.

As suggested by Robert, used pg_stat_get_activity() style.  In this case,
show 'pid' to all but the rest only to role members.

> 0002
> 
> +   FROM pg_stat_get_progress_info(1) AS S;
> 
> Ah... This magick number seems quite bad.. The function should
> take the command type in maybe string type.
> 
> +   FROM pg_stat_get_progress_info('lazy vacuum') AS S;
> 
> Using an array of the names would be acceptable, maybe.
> 
> | char *progress_command_names[] = {'lazy vacuum', NULL};

Hm, I think the way it is *may* be OK the way it is, but...

As done in the patch, the way we identify commands is with the enum
BackendCommandType:

+typedef enum BackendCommandType
+{
+    COMMAND_INVALID = 0,
+    COMMAND_LAZY_VACUUM
+} BackendCommandType;

Perhaps we could create a struct:

typedef struct PgStatProgressCommand
{   char                 *cmd_name;   BackendCommandType    cmd_type;
} PgStatProgressCommand;

static const struct PgStatProgressCommand commands[] = {   {"vacuum", COMMAND_LAZY_VACUUM},   {NULL, COMMAND_INVALID}
};

> However the numbers for the phases ('scanning heap' and so..) is
> acceptable for me for reasons uncertain to me, it also could be
> represented in names but is might be rahter bothersome..

In initial versions of the patch, it used to be char * that was passed for
identifying phases.  But, then we got rid of char * progress parameters
altogether. So, there are no longer any text columns in
pg_stat_get_progress_info()'s result.  It may not work out well in long
run to forever not have those (your recent example comes to mind).

> 
> +              WHEN 0 THEN 100::numeric(5, 2)
> +              ELSE ((S.param3 + 1)::numeric / S.param2 * 100)::numeric(5, 2)
> 
> This usage of numeric seems overkill to me.

Hmm, how could this rather be written?

>> * I attempted to implement a method to report index blocks done from
>> lazy_tid_reaped() albeit with some limitations. Patch 0003 is that
>> attempt.  In summary, I modified the index bulk delete callback interface
>> to receive a BlockNumber argument index_blkno:

[ snip ]

>> way of traversing the index pages. I dared only touch the btree case to
>> make it pass current block number to the callback. It finishes with
>> index_blks_done << total_index_blks since I guess the callback is called
>> only on the leaf pages. Any ideas?
> 
> To the contrary, I suppose it counts one index page more than
> once for the cases of uncorrelated heaps. index_blks_vacuumd can
> exceed RelationGetNumberOfBlocks() in extreme cases. If I'm not
> missing something, it stands on a quite fragile graound.

Yeah, the method is not entirely foolproof yet.

>> * I am also tempted to add num_dead_tuples and dead_tuples_vacuumed to add
>> granularity to 'vacuuming heap' phase but didn't in this version. Should we?
> 
> How do you think they are to be used?

I just realized there are objections to some columns be counters for pages
and others counting tuples.  So, I guess I withdraw.  I am just worried
that 'vacuuming heap' phase may take arbitrarily long if dead tuples array
is big.  Since we were thinking of adding more granularity to 'vacuuming
indexes' phase, I thought we should do for the former too.

Thanks,
Amit





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Managing a long-held TupleDesc reference
Следующее
От: Andreas Karlsson
Дата:
Сообщение: Re: syslog configurable line splitting behavior