RE: reindex option for tuning load large data

From	James Pang (chaolpan)
Subject	RE: reindex option for tuning load large data
Date	June 18, 2022, 07:00:14
Msg-id	PH0PR11MB5191E06BA17577A5BFE77ED8D6AE9@PH0PR11MB5191.namprd11.prod.outlook.com
In reply to	Re: reindex option for tuning load large data (Vitalii Tymchyshyn <vit@tym.im>)
List	pgsql-performance

We have more than 8500 indexes and over 1000 tables, many of them partitioned. Is it safe to update pg_index, setting indisready=false and indisvalid=false, and then run reindexdb with parallel jobs? As I understand it, a parallel reindexdb run is done by multiple sessions, with each session reindexing one index at a time. Does that mean each individual index is still built serially rather than in parallel?

Compared with setting max_parallel_maintenance_workers and running “CREATE INDEX …”, which is faster?

Thanks,

From: Vitalii Tymchyshyn <vit@tym.im>
Sent: Saturday, June 18, 2022 11:49 AM
To: James Pang (chaolpan) <chaolpan@cisco.com>
Cc: pgsql-performance@lists.postgresql.org
Subject: Re: reindex option for tuning load large data

I believe you should be able to use reindexdb with parallel jobs:

https://www.postgresql.org/docs/13/app-reindexdb.html

It will still create multiple connections, but you won't need to run multiple commands.
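As a rough illustration of the suggestion above, the sketch below assembles a reindexdb command line that uses the --jobs option. The helper name build_reindexdb_command, the database name mydb, and the job count 8 are placeholders for illustration, not values from this thread. It only prints the command; it does not connect to a server.

```python
import shlex
from typing import List


def build_reindexdb_command(dbname: str, jobs: int) -> List[str]:
    """Return an argv list for a reindexdb run with several parallel jobs.

    Hypothetical helper: dbname and jobs are supplied by the caller.
    """
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    # --jobs opens that many connections; --echo prints the commands sent.
    return ["reindexdb", f"--jobs={jobs}", f"--dbname={dbname}", "--echo"]


# Placeholder values, chosen only for the example.
cmd = build_reindexdb_command("mydb", 8)
print(shlex.join(cmd))
```

Per the PostgreSQL 13 reindexdb documentation, --jobs cannot be combined with the --index or --system options, so a run like this works at the database, schema, or table level.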

On Thu, Jun 16, 2022 at 22:34, James Pang (chaolpan) <chaolpan@cisco.com> wrote:

Hi,
We plan to migrate a large database from Oracle to Postgres (version 13.6, OS Red Hat Enterprise Linux 8), and we are checking options to make the data load in Postgres fast. The data volume is several TB, with thousands of indexes and many large partitioned tables. We want the data load to run fast and to avoid missing any indexes when reindexing. There are 2 options for the reindex step. Could you give some suggestions on which of the 2 options is better?

1). Create tables and indexes (empty database), update pg_index set indisready=false and indisvalid=false, then load data using COPY from CSV, then reindex table …
Reindex on Postgres 13.6 does not support parallel, right? So we need to start multiple sessions to reindex multiple tables/indexes in parallel.

2). Use pg_dump to dump metadata only, then copy out the “CREATE INDEX …” SQL
       Drop indexes before data load
       After data load, increase max_parallel_maintenance_workers and maintenance_work_mem
       Run the CREATE INDEX … SQL to leverage the parallel create index feature.

Thanks,

James
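
Option 2 above could be scripted roughly as follows. This is a minimal sketch, assuming the CREATE INDEX statements have already been extracted from the pg_dump output. The helper name build_index_script, the index and table names, and the values 4 and 2GB are all hypothetical. It only generates the SQL text to run after the data load.

```python
from typing import Iterable, List


def build_index_script(create_index_sql: Iterable[str],
                       parallel_workers: int,
                       maintenance_work_mem: str) -> str:
    """Build a SQL script that raises the session settings, then creates indexes.

    Hypothetical helper: the statements and setting values come from the caller.
    """
    lines: List[str] = [
        f"SET max_parallel_maintenance_workers = {parallel_workers};",
        f"SET maintenance_work_mem = '{maintenance_work_mem}';",
    ]
    for stmt in create_index_sql:
        # Normalize each statement so it ends with exactly one semicolon.
        stmt = stmt.strip().rstrip(";")
        if stmt:
            lines.append(stmt + ";")
    return "\n".join(lines) + "\n"


# Made-up statements standing in for the ones copied from the schema dump.
statements = [
    "CREATE INDEX orders_customer_idx ON orders (customer_id)",
    "CREATE INDEX orders_created_idx ON orders (created_at);",
]
print(build_index_script(statements, parallel_workers=4, maintenance_work_mem="2GB"))
```

The resulting script can be passed to psql once the COPY load has finished. The effective number of workers is also capped by the server-level max_worker_processes and max_parallel_workers settings.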
