Обсуждение: Yet another Performance Question

Поиск
Список
Период
Сортировка

Yet another Performance Question

От
elwood@agouros.de (Konstantinos Agouros)
Дата:
Hi,

I once read in Oracle Performance Tuning, that if one inserts or changes large
amounts of data in a table, it might be better to drop indices before doing
this and recreating them afterwards. Could someone give a hint on how this
is in Postgres 7.1? Currently I am experiencing a massive slowdown in importing
data.

Konstantin
--
Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de
Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185
----------------------------------------------------------------------------
"Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres

Re: Yet another Performance Question

От
"Len Morgan"
Дата:
>I once read in Oracle Performance Tuning, that if one inserts or changes
large
>amounts of data in a table, it might be better to drop indices before doing
>this and recreating them afterwards. Could someone give a hint on how this
>is in Postgres 7.1? Currently I am experiencing a massive slowdown in
importing
>data.

Postgres "suffers" the same problem which is very logical if you think about
it.  Inserts must adjust the indexes for every record and don't really know
that there are a lot of other rows comming.  Also, due to the multiuser
nature of Postgres, other users could be accessing data between your rows of
inserts and that data must be valid at that time.

A non-indexed version of a table is just as accessible as an indexed one
(though not as fast) so you have to decide if it's better to slow down a
query or two while you insert/index or spend much more time having a good
index after each insert.

What would be nice is a simple "disable indexes on this (these) tables"
command.  The enable indexes.... command would then do a vacuum analyze on
the effected tables when you were done.  This would make sure that ALL of
the indexes got rebuilt (I occationally forget an index when doing it "by
hand").

len morgan


Re: Yet another Performance Question

От
Konstantinos Agouros
Дата:
On Wed, Apr 18, 2001 at 07:18:40AM -0500, Len Morgan wrote:
> >I once read in Oracle Performance Tuning, that if one inserts or changes
> large
> >amounts of data in a table, it might be better to drop indices before doing
> >this and recreating them afterwards. Could someone give a hint on how this
> >is in Postgres 7.1? Currently I am experiencing a massive slowdown in
> importing
> >data.
>
> Postgres "suffers" the same problem which is very logical if you think about
> it.  Inserts must adjust the indexes for every record and don't really know
> that there are a lot of other rows comming.  Also, due to the multiuser
> nature of Postgres, other users could be accessing data between your rows of
> inserts and that data must be valid at that time.
>
> A non-indexed version of a table is just as accessible as an indexed one
> (though not as fast) so you have to decide if it's better to slow down a
> query or two while you insert/index or spend much more time having a good
> index after each insert.
Frankly what matters is the time it takes alltogether. I have a script that
first does a few million inserts, and than queries on this data... these queries
don't terminate in a reasonable (<10days) time, if I don't use indices.

So I will add some drop/create index-commands to the script...

Konstantin
--
Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de
Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185
----------------------------------------------------------------------------
"Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres

Re: Yet another Performance Question

От
DaVinci
Дата:
On Wed, Apr 18, 2001 at 12:55:07PM +0200, Konstantinos Agouros wrote:

> I once read in Oracle Performance Tuning, that if one inserts or changes large
> amounts of data in a table, it might be better to drop indices before doing
> this and recreating them afterwards. Could someone give a hint on how this
> is in Postgres 7.1? Currently I am experiencing a massive slowdown in importing
> data.

 I use a little script that create two sql files for dropping and recreating
 indexes of a database. Sxript is for "es" shell, but it is easily adaptable
 to sh:

------------------------------------------------------------------------

#!/usr/bin/es
#
# David Espada 2000.
#

BD = multi  # Name of database.
QUERY = 'select indexdef from pg_indexes;'
F_CREA = create_indices.sql
F_BORRA = drop_indices.sql

psql $BD -c $QUERY | grep -i 'create' | grep -v 'pg_.*_index' | awk '{print $0, ";"}' > $F_CREA

awk '{ gsub(/CREATE.*INDEX/, "DROP INDEX"); print $1, $2, $3, ";" }' $F_CREA > $F_BORRA

-------------------------------------------------------------------------

 When executing script, you have two files that can use like:

     $ psql -f drop_indices.sql
    $ psql -f create_indices.sql

 I hope it helps you.

 Greets.

                                                 David

Re: Yet another Performance Question

От
Patrik Kudo
Дата:
Hi!

I'm not 100% sure, but I think it would be much faster if you use COPY
instead of INSERT when you read in a lot of data from a file.

Regards,
Patrik Kudo

--
ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
Känns det oklart? Fråga på!

On Wed, 18 Apr 2001, Konstantinos Agouros wrote:

> Frankly what matters is the time it takes alltogether. I have a script that
> first does a few million inserts, and than queries on this data... these queries
> don't terminate in a reasonable (<10days) time, if I don't use indices.
>
> So I will add some drop/create index-commands to the script...
>
> Konstantin


Re: Yet another Performance Question

От
Konstantinos Agouros
Дата:
On Wed, Apr 18, 2001 at 02:59:53PM +0200, Patrik Kudo wrote:
> Hi!
>
> I'm not 100% sure, but I think it would be much faster if you use COPY
> instead of INSERT when you read in a lot of data from a file.
Well unless I am the database owner I only can do \copy and the script does
some manipulation before it inserts...

Konstantin
>
> Regards,
> Patrik Kudo
>
> --
> ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
> Känns det oklart? Fråga på!
>
> On Wed, 18 Apr 2001, Konstantinos Agouros wrote:
>
> > Frankly what matters is the time it takes alltogether. I have a script that
> > first does a few million inserts, and than queries on this data... these queries
> > don't terminate in a reasonable (<10days) time, if I don't use indices.
> >
> > So I will add some drop/create index-commands to the script...
> >
> > Konstantin
>

--
Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de
Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185
----------------------------------------------------------------------------
"Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres

Re: Yet another Performance Question

От
"Richard Huxton"
Дата:
Konstantinos Agouros <elwood@agouros.de> said:

> On Wed, Apr 18, 2001 at 02:59:53PM +0200, Patrik Kudo wrote:
> > Hi!
> >
> > I'm not 100% sure, but I think it would be much faster if you use COPY
> > instead of INSERT when you read in a lot of data from a file.
> Well unless I am the database owner I only can do copy and the script does
> some manipulation before it inserts...

Don't forget you can do \copy from stdin so you can pipe the output of your
script to psql.

Failing that, try batching inserts in transactions, a few hundred at a time.
That can speed things up. You'll need to find what figure works best for you.

- Richard Huxton

Re: Yet another Performance Question

От
martin.chantler@convergys.com
Дата:
Have you tried running PostGres with the -F option so that
disk flushes are not performed every time. Maybe this will make a
difference
to insert performance.

NB In 7.1 this is replaced by WAL which I don't know much about

MC


--

NOTICE:  The information contained in this electronic mail transmission is
intended by Convergys Corporation for the use of the named individual or
entity to which it is directed and may contain information that is
privileged or otherwise confidential.  If you have received this electronic
mail transmission in error, please delete it from your system without
copying or forwarding it, and notify the sender of the error by reply email
or by telephone (collect), so that the sender's address records can be
corrected.



Re: Yet another Performance Question

От
elwood@agouros.de (Konstantinos Agouros)
Дата:
In <OFFC4DA3D2.A3056D98-ON80256A32.00517540@cbis.com> martin.chantler@convergys.com writes:


>Have you tried running PostGres with the -F option so that
>disk flushes are not performed every time. Maybe this will make a
>difference
>to insert performance.
In the DBI-connect I still have a -F in the options, but now I am running
7.1RC2 (didn't have the time to do the upgrade to final yet). If this is
called differently now, please let me know

>NB In 7.1 this is replaced by WAL which I don't know much about

>MC


>--

>NOTICE:  The information contained in this electronic mail transmission is
>intended by Convergys Corporation for the use of the named individual or
>entity to which it is directed and may contain information that is
>privileged or otherwise confidential.  If you have received this electronic
>mail transmission in error, please delete it from your system without
>copying or forwarding it, and notify the sender of the error by reply email
>or by telephone (collect), so that the sender's address records can be
>corrected.



>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?

>http://www.postgresql.org/users-lounge/docs/faq.html

--
Dipl-Inf. Konstantin Agouros aka Elwood Blues. Internet: elwood@agouros.de
Otkerstr. 28, 81547 Muenchen, Germany. Tel +49 89 69370185
----------------------------------------------------------------------------
"Captain, this ship will not sustain the forming of the cosmos." B'Elana Torres