Re: Disabling/Enabling index before bulk loading

Поиск
Список
Период
Сортировка
От Tim Cross
Тема Re: Disabling/Enabling index before bulk loading
Дата
Msg-id 877em2racj.fsf@gmail.com
обсуждение исходный текст
Ответ на Disabling/Enabling index before bulk loading  (Ravi Krishna <srkrishna@yahoo.com>)
Ответы Re: Disabling/Enabling index before bulk loading  (Melvin Davidson <melvin6925@gmail.com>)
Список pgsql-general
Ravi Krishna <srkrishna@yahoo.com> writes:

> We recently did a test on COPY and found that on large tables (47 million rows , 20GB of raw data) the 
> difference in COPY with 16 indexes and COPY without any index is 1:14. That is, COPY is 14 times slower 
> when data is ingested with all indexes as opposed to COPY first without index and then create all index.
>

This is very system dependent. On our system, when we tested a similar
approach, we found that the time saved through dropping the indexes
before copy was lost when rebuilding the indexes afterwards. In fact, it
ended up being slightly slower.

I suspect a lot depends on the number and types of indexes you
have. Your example had a lot more indexes than ours. We were loading 22
batches with 2.5M records per batch. While copy was significantly faster
than transaction based inserts (minutes vs hours), the differences between
indexes and no indexes was measured in minutes. We only had 3 or 4
indexes.   

> I googled for earlier posting on this and it looks like this has been asked before too.  
>
> This is what I am thinking to do:
>
> 1 - Extract index definition and save it as a SQL somewhere, either a file or a table.
> 2 - Drop all indexes.
> 3 - Ingest data via COPY
> 4 - Recreate all indexes saved in (1).
>
> Is there a generic sql or script or tool to accomplish (1).
>

We are loading data via Javascript using pg and pg-copy-streams modules. It is
pretty straight forward to drop the indexes and recreate them afterwards
via sql, so we didn't look for a tool as such.

As data is only inserted into this table and only by this process, we
also turned off autovacuum for this table, performing vacuum and analyze
manually after load. 

Tim

-- 
Tim Cross


В списке pgsql-general по дате отправления:

Предыдущее
От: Rich Shepard
Дата:
Сообщение: Re: Open Source tool to deploy/promote PostgreSQL DDL
Следующее
От: Melvin Davidson
Дата:
Сообщение: Re: Disabling/Enabling index before bulk loading