Discussion: optimizing pg_upgrade's once-in-each-database steps


optimizing pg_upgrade's once-in-each-database steps

From:
Nathan Bossart
Date: Thu, 2024-05-16 16:16 -0500
A number of pg_upgrade steps require connecting to each database and
running a query.  When there are many databases, these steps are
particularly time-consuming, especially since this is done sequentially in
a single process.  At a quick glance, I see the following such steps:

    * create_logical_replication_slots
    * check_for_data_types_usage
    * check_for_isn_and_int8_passing_mismatch
    * check_for_user_defined_postfix_ops
    * check_for_incompatible_polymorphics
    * check_for_tables_with_oids
    * check_for_user_defined_encoding_conversions
    * check_old_cluster_subscription_state
    * get_loadable_libraries
    * get_db_rel_and_slot_infos
    * old_9_6_invalidate_hash_indexes
    * report_extension_updates

I set out to parallelize these kinds of steps via multiple threads or
processes, but I ended up realizing that we could likely achieve much of
the same gain with libpq's asynchronous APIs.  Specifically, both
establishing the connections and running the queries can be done without
blocking, so we can just loop over a handful of slots and advance a simple
state machine for each.  The attached is a proof-of-concept-grade patch that
does this for get_db_rel_and_slot_infos(); it yielded the following results
on my laptop for "pg_upgrade --link --sync-method=syncfs --jobs 8" on a
cluster with 10K empty databases.

    total pg_upgrade time:
    * HEAD:  14m 8s
    * patch: 10m 58s

    get_db_rel_and_slot_infos() on old cluster:
    * HEAD:  2m 45s
    * patch: 36s

    get_db_rel_and_slot_infos() on new cluster:
    * HEAD:  1m 46s
    * patch: 29s

I am posting this early to get thoughts on the general approach.  If we
proceeded with this strategy, I'd probably create some generic tooling to
which each relevant step would provide a set of callback functions.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments

Re: optimizing pg_upgrade's once-in-each-database steps

From:
Jeff Davis
Date: Thu, 2024-05-16 17:09:55 -0700
On Thu, 2024-05-16 at 16:16 -0500, Nathan Bossart wrote:
> I am posting this early to get thoughts on the general approach.  If we
> proceeded with this strategy, I'd probably create some generic tooling
> that each relevant step would provide a set of callback functions.

The documentation states:

"pg_dump -j uses multiple database connections; it connects to the
database once with the leader process and once again for each worker
job."

That might need to be adjusted.

How much complexity do you avoid by using async instead of multiple
processes?

Also, did you consider connecting once to each database and running
many queries? Most of those seem like just checks.

Regards,
    Jeff Davis

Re: optimizing pg_upgrade's once-in-each-database steps

From:
Nathan Bossart
Date:
On Thu, May 16, 2024 at 05:09:55PM -0700, Jeff Davis wrote:
> How much complexity do you avoid by using async instead of multiple
> processes?

If we didn't want to use async, my guess is we'd want to use threads to
avoid complicated IPC.  And if we followed pgbench's example for using
threads, it might end up at a comparable level of complexity, although I'd
bet that threading would be the more complex of the two.  It's hard to say
definitively without coding it up both ways, which might be worth doing.

> Also, did you consider connecting once to each database and running
> many queries? Most of those seem like just checks.

This was the idea behind 347758b.  It may be possible to do more along
these lines.  IMO parallelizing will still be useful even if we do combine
more of the steps.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com