Re: COPY enhancements

Поиск
Список
Период
Сортировка
От Rod Taylor
Тема Re: COPY enhancements
Дата
Msg-id 751261b20910080912md3176e8ga671c768b4fee055@mail.gmail.com
обсуждение исходный текст
Ответ на Re: COPY enhancements  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: COPY enhancements  (Greg Smith <gsmith@gregsmith.com>)
Список pgsql-hackers

Yeah.  I think it's going to be hard to make this work without having
standalone transactions.  One idea would be to start a subtransaction,
insert tuples until one fails, then rollback the subtransaction and
start a new one, and continue on until the error limit is reached.

I've found performance is reasonable, for data with low numbers of errors (say 1 per 100,000 records or less) doing the following:

SAVEPOINT bulk;
Insert 1000 records using COPY.

If there is an error, rollback to bulk, and step through each line individually within its own "individual" subtransaction. All good lines are kept and bad lines are logged; client side control makes logging trivial.

The next set of 1000 records is done in bulk again.

1000 records per savepoint seems to be a good point for my data without too much time lost to overhead or too many records to retry due to a failing record. Of course, it is controlled by the client side rather than server side so reporting back broken records is trivial.


It may be possible to boost performance by:

1) Having copy remember which specific line caused the error. So it can replace lines 1 through 487 in a subtransaction since it knows those are successful. Run 488 in its on subtransaction. Run 489 through ... in a new subtransaction.
2) Increasing the number of records per subtransaction if data is clean. It wouldn't take long until you were inserting millions of records per subtransaction for a large data set. This should make the subtransaction overhead minimal. Small imports would still run slower but very large imports of clean data should be essentially the same speed in the end.

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Joshua D. Drake"
Дата:
Сообщение: Re: COPY enhancements
Следующее
От: Tom Lane
Дата:
Сообщение: Re: COPY enhancements