Re: Logging parallel worker draught

From: Imseih (AWS), Sami
Subject: Re: Logging parallel worker draught
Date:
Msg-id: 10D5C31A-A303-4440-BCA2-921440A9E1CC@amazon.com
In response to: Re: Logging parallel worker draught  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Logging parallel worker draught  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List: pgsql-hackers
Hi,

This thread has been quiet for a while, but I'd like to share some
thoughts.

+1 to the idea of improving visibility into parallel worker saturation.
More broadly, though, we should improve parallel processing visibility in
general, so DBAs can detect trends in parallel usage (is the workload doing
more parallel work, or less?) and have enough data to either tune the
workload or adjust the parallel GUCs.

>> We can output this at the LOG level to avoid running the server at
>> DEBUG1 level. There are a few other cases where we are not able to
>> spawn the worker or process and those are logged at the LOG level. For
>> example, "could not fork autovacuum launcher process .." or "too many
>> background workers". So, not sure, if this should get a separate
>> treatment. If we fear this can happen frequently enough that it can
>> spam the LOG then a GUC may be worthwhile.


> I think we should definitely be afraid of that. I am in favor of a separate GUC.

Currently, EXPLAIN (ANALYZE) reports "Workers Planned" and
"Workers Launched". Logging this via auto_explain is already possible, so I
am not sure we need additional GUCs or debug levels for this information.

   ->  Gather  (cost=10430.00..10430.01 rows=2 width=8) (actual time=131.826..134.325 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
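For reference, capturing those plan lines through auto_explain only takes a
couple of settings (the values below are illustrative; "Workers Launched"
requires log_analyze to be enabled):

```ini
# postgresql.conf -- illustrative auto_explain configuration
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '100ms'   # log plans for queries slower than this
auto_explain.log_analyze = on             # include actual rows and Workers Launched
```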


>> What I was wondering was whether we would be better off putting this
>> into the statistics collector, vs. doing it via logging. Both
>> approaches seem to have pros and cons.
>>
>> I think it could be easier for users to process the information if it
>> is available via some view, so there is a benefit in putting this into
>> the stats subsystem.


> Unless we do this instead.

Adding cumulative stats is a much better idea.

Three new columns could be added to pg_stat_database:
workers_planned,
workers_launched,
parallel_operations - there could be more than one parallel operation
per query, for example when a plan contains multiple Gather
nodes.

With these columns, monitoring tools can track whether more or less
parallel work is happening over time (by looking at parallel_operations)
and whether the workload is suffering from parallel worker saturation:
workers_launched/workers_planned < 1 means there is a shortage
of available worker processes.
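Assuming the proposed columns existed, a monitoring query could compute that
launch ratio per database. This is only a sketch: workers_planned,
workers_launched, and parallel_operations are the hypothetical columns
suggested above and do not exist in pg_stat_database today.

```sql
-- Sketch only: the three worker columns are hypothetical.
SELECT datname,
       parallel_operations,
       workers_planned,
       workers_launched,
       CASE WHEN workers_planned > 0
            THEN round(workers_launched::numeric / workers_planned, 2)
       END AS launch_ratio   -- < 1.0 indicates worker saturation
  FROM pg_stat_database
 WHERE datname IS NOT NULL;
```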

Also, we could expose this information at the per-query level in
pg_stat_statements as well, but that can be taken up in a separate
discussion.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)








