Re: Performance issue with cross table updates

From: Jeff Janes
Subject: Re: Performance issue with cross table updates
Date:
Msg-id: CAMkU=1zLUykQs9FuJhLOg6wWQNm8LRJgU-c-mjzhd1CZvLE-jw@mail.gmail.com
In response to: Performance issue with cross table updates  (Craig Gibson <craiggib@gmail.com>)
List: pgsql-general
On Mon, Sep 10, 2012 at 9:34 AM, Craig Gibson <craiggib@gmail.com> wrote:
> Hi all
>
> I am no database wizard so I am hoping someone may be able to assist me :)
>
> I get a daily CSV file of 6.5 million records. I create a temporary
> table and COPY them in. On completion I create an index on the mdnid
> column. This column is also indexed in table 2. This part is very
> fast. I had some 'checkpoint too often' issues, but that I have
> resolved.
>
> I then use the following procedure to update all the records, and if a
> record does not exist, insert it instead:
>
...
>
> From my understanding, a for loop is encapsulated in a cursor anyway
> so no need to do that. Am I fundamentally doing something wrong as the
> operation is slow as molasses?
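
(For reference, I'm picturing the load step as roughly the following --
the staging table name, column list, and file path below are just
placeholders, not taken from your message:)

    CREATE TEMP TABLE staging (mdnid bigint, some_col text);
    COPY staging FROM '/path/to/daily.csv' WITH (FORMAT csv);
    CREATE INDEX ON staging (mdnid);
    ANALYZE staging;   -- give the planner statistics on the fresh load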

How big is the parent table?  Are you CPU limited or IO limited?
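
One quick way to check the sizes involved -- the table and index names
here are guesses, substitute your real ones:

    SELECT pg_size_pretty(pg_total_relation_size('e_lookup'));      -- table plus its indexes
    SELECT pg_size_pretty(pg_relation_size('e_lookup_mdnid_idx'));  -- the mdnid index alone
    SHOW shared_buffers;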

If you are not CPU limited, then I would guess that the indexes on the
parent table do not fit in RAM or shared_buffers, and that maintaining
those indexes during the per-row updates/inserts is the bottleneck.
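
If that is the case, it is usually much faster to do the whole pass as
two set-based statements rather than a per-row loop, so each index is
maintained in bulk instead of once per record.  A rough sketch -- the
table and column names are guesses, since the procedure text was
snipped:

    BEGIN;

    UPDATE e_lookup e
       SET some_col = s.some_col        -- whichever columns the feed refreshes
      FROM staging s
     WHERE e.mdnid = s.mdnid;

    INSERT INTO e_lookup (mdnid, some_col)
    SELECT s.mdnid, s.some_col
      FROM staging s
     WHERE NOT EXISTS (SELECT 1 FROM e_lookup e
                        WHERE e.mdnid = s.mdnid);

    COMMIT;

Ordinary UPDATE and INSERT do not block concurrent selects, so the
table stays usable by readers while this runs.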

> Maybe there is a better way altogether
> that I have not thought of? The bottom line is that at no point can
> the e_lookup table be unavailable to clients,

Does it always have to be available for updates, or just for selects?

> else I would have just
> done a simple drop and rename post the COPY.

Maybe you can just do an atomic rename.
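
That is, build the fresh copy under a temporary name and then swap it
in inside one transaction.  A sketch, with e_lookup_new standing in for
whatever name the freshly loaded table was built under:

    BEGIN;
    ALTER TABLE e_lookup RENAME TO e_lookup_old;
    ALTER TABLE e_lookup_new RENAME TO e_lookup;
    COMMIT;

    DROP TABLE e_lookup_old;

Queries that arrive during the swap just wait briefly on the lock; the
name e_lookup never stops existing.  (One caveat: views and foreign
keys that point at the table follow it through the rename, so this
works best when e_lookup is only referenced by name in queries.)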

Cheers,

Jeff

