Обсуждение: COPY ... FROM and index usage

Поиск
Список
Период
Сортировка

COPY ... FROM and index usage

От
Reg Me Please
Дата:
Hi all.

I'd like to know whether the indexes on a table are updated or not during
a "COPY ... FROM" request.

That is, should I drop all indexes during a "COPY ... FROM" in order to gain
the maximum speed to load data?

Thanks.

--
Reg me Please

Re: COPY ... FROM and index usage

От
"Josh Tolley"
Дата:
On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:
> Hi all.
>
> I'd like to know whether the indexes on a table are updated or not during
> a "COPY ... FROM" request.
>
> That is, should I drop all indexes during a "COPY ... FROM" in order to gain
> the maximum speed to load data?
>
> Thanks.

Although questions of "which is faster" often depend very heavily on
the data involved, the database schema, the hardware, etc., typically
people find it best to drop all indexes during a large import and
recreate them afterward.

- Josh/eggyknap

Re: COPY ... FROM and index usage

От
Reg Me Please
Дата:
Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:
> On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:
> > Hi all.
> >
> > I'd like to know whether the indexes on a table are updated or not during
> > a "COPY ... FROM" request.
> >
> > That is, should I drop all indexes during a "COPY ... FROM" in order to
> > gain the maximum speed to load data?
> >
> > Thanks.
>
> Although questions of "which is faster" often depend very heavily on
> the data involved, the database schema, the hardware, etc., typically
> people find it best to drop all indexes during a large import and
> recreate them afterward.
>
> - Josh/eggyknap

This sounds very reasonable to me.
But the first question remains unanswered:

Are the indexes updated during the COPY ... FROM ?

Thanks again.

--
Reg me Please

Re: COPY ... FROM and index usage

От
Erik Jones
Дата:
On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote:

> Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:
>> On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:
>>> Hi all.
>>>
>>> I'd like to know whether the indexes on a table are updated or
>>> not during
>>> a "COPY ... FROM" request.
>>>
>>> That is, should I drop all indexes during a "COPY ... FROM" in
>>> order to
>>> gain the maximum speed to load data?
>>>
>>> Thanks.
>>
>> Although questions of "which is faster" often depend very heavily on
>> the data involved, the database schema, the hardware, etc., typically
>> people find it best to drop all indexes during a large import and
>> recreate them afterward.
>>
>> - Josh/eggyknap
>
> This sounds very reasonable to me.
> But the first question remains unanswered:
>
> Are the indexes updated during the COPY ... FROM ?

Of course.  Why would think that data could be inserted into a table
by any means without it updating the table's indexes?  That would
make the index worthless.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: COPY ... FROM and index usage

От
Reg Me Please
Дата:
Il Sunday 04 November 2007 16:21:41 Erik Jones ha scritto:
> On Nov 4, 2007, at 9:15 AM, Reg Me Please wrote:
> > Il Sunday 04 November 2007 14:59:10 Josh Tolley ha scritto:
> >> On 11/4/07, Reg Me Please <regmeplease@gmail.com> wrote:
> >>> Hi all.
> >>>
> >>> I'd like to know whether the indexes on a table are updated or
> >>> not during
> >>> a "COPY ... FROM" request.
> >>>
> >>> That is, should I drop all indexes during a "COPY ... FROM" in
> >>> order to
> >>> gain the maximum speed to load data?
> >>>
> >>> Thanks.
> >>
> >> Although questions of "which is faster" often depend very heavily on
> >> the data involved, the database schema, the hardware, etc., typically
> >> people find it best to drop all indexes during a large import and
> >> recreate them afterward.
> >>
> >> - Josh/eggyknap
> >
> > This sounds very reasonable to me.
> > But the first question remains unanswered:
> >
> > Are the indexes updated during the COPY ... FROM ?
>
> Of course.  Why would think that data could be inserted into a table
> by any means without it updating the table's indexes?  That would
> make the index worthless.
>
> Erik Jones
>
> Software Developer | Emma®
> erik@myemma.com
> 800.595.4401 or 615.292.5888
> 615.292.0777 (fax)
>
> Emma helps organizations everywhere communicate & market in style.
> Visit us online at http://www.myemma.com

I do understand your remarks and would agree.
But I was thinking about the COPY...FROM request not
as a normal INSERT INTO.
If this were the case I could have been running a REINDEX TABLE.

--
Reg me Please

Re: COPY ... FROM and index usage

От
Tom Lane
Дата:
"Josh Tolley" <eggyknap@gmail.com> writes:
> Although questions of "which is faster" often depend very heavily on
> the data involved, the database schema, the hardware, etc., typically
> people find it best to drop all indexes during a large import and
> recreate them afterward.

See also the extensive discussion of this topic at

http://www.postgresql.org/docs/8.2/static/populate.html

            regards, tom lane

Re: COPY ... FROM and index usage

От
Dimitri Fontaine
Дата:
Hi,

Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :
> That is, should I drop all indexes during a "COPY ... FROM" in order to
> gain the maximum speed to load data?

When looking for a way to speed up data loading, you may want to consider
pgbulkload, a project which optimizes index creation while loading data, and
bypass constraints: it's useful when you know you trust input.
As I've never used it myself, I can only provides following links:

  http://pgfoundry.org/projects/pgbulkload
  http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf

Regards,
--
dim

Re: COPY ... FROM and index usage

От
Toru SHIMOGAKI
Дата:
Dimitri, thank you for your quoting. I'm a pg_bulkload author.

pg_bulkload is optimized especially for appending data to table with indexes.
If you use it, you don't need to drop index before loading data. But you have to
consider conditions carefully as Dimitri said below. See also pg_bulkload README:

http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt

Best regards,

Dimitri Fontaine wrote:
> Hi,
>
> Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :
>> That is, should I drop all indexes during a "COPY ... FROM" in order to
>> gain the maximum speed to load data?
>
> When looking for a way to speed up data loading, you may want to consider
> pgbulkload, a project which optimizes index creation while loading data, and
> bypass constraints: it's useful when you know you trust input.
> As I've never used it myself, I can only provides following links:
>
>   http://pgfoundry.org/projects/pgbulkload
>   http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf
>
> Regards,

--
Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp>
NTT Open Source Software Center


Re: COPY ... FROM and index usage

От
Lincoln Yeoh
Дата:
Hi,

Anyone have comparisons/benchmarks to give some
idea of the potential performance gains?

Say compared to doing the stuff here:
http://www.postgresql.org/docs/8.2/static/populate.html

Regards,
Link.

At 09:35 AM 11/5/2007, Toru SHIMOGAKI wrote:
>Dimitri, thank you for your quoting. I'm a pg_bulkload author.
>
>pg_bulkload is optimized especially for appending data to table with indexes.
>If you use it, you don't need to drop index
>before loading data. But you have to consider
>conditions carefully as Dimitri said below. See also pg_bulkload README:
>
>http://pgfoundry.org/docman/view.php/1000261/473/README.pg_bulkload-2.2.0.txt
>
>Best regards,
>
>Dimitri Fontaine wrote:
>>Hi,
>>Le Sunday 04 November 2007 11:22:19 Reg Me Please, vous avez écrit :
>>>That is, should I drop all indexes during a "COPY ... FROM" in order to
>>>gain the maximum speed to load data?
>>When looking for a way to speed up data
>>loading, you may want to consider pgbulkload, a
>>project which optimizes index creation while
>>loading data, and bypass constraints: it's
>>useful when you know you trust input.
>>As I've never used it myself, I can only provides following links:
>>   http://pgfoundry.org/projects/pgbulkload
>>   http://pgfoundry.org/docman/view.php/1000261/456/20060709_pg_bulkload.pdf
>>Regards,
>
>--
>Toru SHIMOGAKI<shimogaki.toru@oss.ntt.co.jp>
>NTT Open Source Software Center
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: explain analyze is your friend