Discussion: Performance Problem


Performance Problem

From:
roopa perumalraja
Date:
I am new to postgres and I have 4 doubts.
 
1) I have a performance problem: I am trying to insert around 60 million rows into a table which is partitioned. First I copied the .csv file containing the data into a temp table with the COPY command, which was quick; it took only 15 to 20 minutes. Now I am inserting the data from the temp table into the original table using insert into org_table (select * from temp_table);, which has been running for more than an hour and is still inserting. Is there a faster way to do this? (The workflow is sketched below, after this list.)
 
2) I want to improve the performance of the database, which I find very slow with more than 60 million rows in one table. I increased the shared_buffers parameter in the postgresql.conf file to 20000, but that doesn't help much.
 
3) I have partitioned a parent table into 100 child tables, so that when I insert data into the parent table it is automatically routed into a child table. I followed http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html. When I did this, I noticed that viewing the data of the parent table shows the rows of the child tables, so the parent is not empty, yet the child tables also contain those rows. I don't understand this.
 
4) I want to use materialized views, but I don't understand them from http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html. Can anyone explain them to me with a simple example?
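
For reference, the load described in question 1 amounts to roughly the following; the CSV path and the staging-table name are placeholders:

    -- Fast step: bulk-load the CSV into a staging table (placeholder path).
    COPY temp_table FROM '/path/to/data.csv' WITH CSV;

    -- Slow step: push the ~60 million rows into the partitioned table.
    INSERT INTO org_table SELECT * FROM temp_table;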
 
Thanks in advance.
 
Roopa



Re: Performance Problem

From:
Martijn van Oosterhout
Date:
On Thu, Oct 12, 2006 at 10:26:28PM -0700, roopa perumalraja wrote:
> I am new to postgres and I have 4 doubts.
>
>   1) I have a performance problem as I am trying to insert around 60
>   million rows to a table which is partitioned. So first I copied the
>   .csv file which contains data, with COPY command to a temp table
>   which was quick. It took only 15 to 20 minutes. Now I am inserting
>   data from temp table to original table using insert into org_table
>   (select * from temp_table); which is taking more than an hour & is
>   still inserting. Is there an easy way to do this?

Does the table you're inserting into have indexes or foreign keys?
Either of those slows down loading considerably. One common workaround
is to drop the indexes and constraints, load the data, and re-add them afterwards.
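
A minimal sketch of that workaround; the index, constraint, and column names here are hypothetical and would need to match your schema (with inheritance partitioning the indexes live on the individual child tables, so this applies per child):

    -- Drop indexes and foreign keys before the bulk insert (hypothetical names).
    DROP INDEX org_table_ts_idx;
    ALTER TABLE org_table DROP CONSTRAINT org_table_instrument_fk;

    -- Load the data in one statement.
    INSERT INTO org_table SELECT * FROM temp_table;

    -- Re-create the indexes and constraints afterwards.
    CREATE INDEX org_table_ts_idx ON org_table (ts);
    ALTER TABLE org_table ADD CONSTRAINT org_table_instrument_fk
        FOREIGN KEY (instrument_id) REFERENCES instruments (id);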

>   2) I want to increase the performance of database as I find it very
>   slow which has more than 60 million rows in one table. I increased
>   the shared_buffer parameter in postgres.conf file to 20000 but that
>   does help much.

Find out which queries are slow and use EXPLAIN on them to identify
possibly useful indexes.
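
For example (the table and column are placeholders; EXPLAIN ANALYZE actually runs the query and reports where the time goes):

    -- See how a slow query is executed.
    EXPLAIN ANALYZE
    SELECT * FROM org_table WHERE trade_date = '2006-10-12';

    -- If the plan shows a sequential scan over all 60 million rows,
    -- an index on the filtered column may help.
    CREATE INDEX org_table_trade_date_idx ON org_table (trade_date);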

>   3) I have partitioned a parent table into 100 child tables so when
>   I insert data to parent table, it automatically inserts to child
>   table. I have followed
>   http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html.
>   When I did this, I noticed that when I viewed data of parent table,
>   it had the rows of the child table and is not empty. But the child
>   tables do have the rows in it. I don’t understand.

When you select from a parent table, it shows the rows of the child
tables as well; that's kind of the point. To see only the rows stored in
the parent itself, you can say: SELECT * FROM ONLY parent;
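
A small illustration of that behaviour, using the 8.1-style inheritance partitioning (table names are made up):

    CREATE TABLE parent (id int, val text);
    CREATE TABLE child_1 (CHECK (id < 1000))  INHERITS (parent);
    CREATE TABLE child_2 (CHECK (id >= 1000)) INHERITS (parent);

    INSERT INTO child_1 VALUES (1, 'a');

    SELECT * FROM parent;       -- shows the row stored in child_1
    SELECT * FROM ONLY parent;  -- shows nothing: the parent itself holds no rows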

Incidentally, the partitioning itself may explain the slow loading: routing each row into the right child table through rules or triggers adds per-row overhead.

>   4) I want to use materialized views, I don’t understand it from
>   http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html,
>   can anyone explain me with a simple example.

Can't help you there...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


Re: Performance Problem

From:
"Uwe C. Schroeder"
Date:
On Friday 13 October 2006 01:22, Martijn van Oosterhout wrote:
> >   1) I have a performance problem as I am trying to insert around 60
> >   million rows to a table which is partitioned. So first I copied the
> >   .csv file which contains data, with COPY command to a temp table
> >   which was quick. It took only 15 to 20 minutes. Now I am inserting
> >   data from temp table to original table using insert into org_table
> >   (select * from temp_table); which is taking more than an hour & is
> >   still inserting. Is there an easy way to do this?
>
> Does the table you're inserting into have indexes or foreign keys?
> Either of those slow down loading considerably. One commen workaround
> is to drop the indexes and constraints, load the data and re-add them.

Why do you COPY the data into a temporary table just to do an "insert into
org_table (select * from temp_table);"? Since you're copying ALL records
anyway, why don't you just copy the data into "org_table" directly?

Also look at the "autocommit" setting. If autocommit is on, every insert is a
transaction on its own, leading to a lot of overhead. Turning autocommit
off and running the inserts in batches of, say, 1000 inserts per transaction
will increase speed considerably.
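
A sketch of the first suggestion (the file path is a placeholder). One caveat: COPY does not fire rewrite rules, so if the partitions are routed with rules, as in the 8.1 documentation's example, a COPY into the parent leaves all rows in the parent; trigger-based routing, by contrast, does apply to COPY.

    -- Load the CSV straight into the target table, skipping the staging table.
    COPY org_table FROM '/path/to/data.csv' WITH CSV;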


    UC

--
Open Source Solutions 4U, LLC    1618 Kelly St
Phone:  +1 707 568 3056        Santa Rosa, CA 95401
Cell:   +1 650 302 2405        United States
Fax:    +1 707 568 6416

Re: Performance Problem

From:
"A. Kretschmer"
Date:
On Fri, 13 Oct 2006 at 01:55:06 -0700, Uwe C. Schroeder wrote the following:
> > Does the table you're inserting into have indexes or foreign keys?
> > Either of those slow down loading considerably. One commen workaround
> > is to drop the indexes and constraints, load the data and re-add them.
>
> Why do you COPY the data into a temporary table just to do a "insert into
> org_table  (select * from temp_table);" ? Since you're copying ALL records
> anyways, why don't you just copy the data into the "org_table" directly?

Perhaps he wants to modify the data in the temp table first?


>
> Also look for the "autocommit" setting. If autocommit is on, every insert is a
> transaction on it's own - leading to a lot of overhead. Turning autocommit
> off and running the inserts in batches of - say 1000 inserts per transaction
> - will increase speed considerably.

An 'insert into org_table (select * from temp_table);' is only ONE
transaction.
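
In other words, the statement in question is already atomic. A sketch of the distinction; explicit batching only pays off when a client sends many separate single-row INSERT statements:

    -- Already a single transaction on its own:
    INSERT INTO org_table SELECT * FROM temp_table;

    -- Batching helps in this situation instead:
    BEGIN;
    -- ... roughly 1000 single-row INSERT statements ...
    COMMIT;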


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47215,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: Performance Problem

From:
Richard Broersma Jr
Date:
>   4) I want to use materialized views, I don't understand it from
> http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html, can anyone explain me
> with a simple example.

The following link helps to describe the idea behind each of the approaches to materialized views.

http://jarednevans.typepad.com/technoblog/2006/03/materialized_vi.html
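
As a rough sketch of the "snapshot" flavour described in those write-ups (all table, column, and function names here are made up): a materialized view is just an ordinary table that stores the result of a query and is refreshed on demand. Recent PostgreSQL releases (9.3 and later) provide CREATE MATERIALIZED VIEW natively, but at the time of this thread it had to be emulated along these lines:

    -- Hypothetical source table with raw data.
    CREATE TABLE trades (instrument text, price numeric, traded_at timestamp);

    -- "Materialized view": a plain table holding precomputed results.
    CREATE TABLE daily_avg_price AS
        SELECT instrument,
               date_trunc('day', traded_at) AS trade_day,
               avg(price) AS avg_price
        FROM trades
        GROUP BY instrument, date_trunc('day', traded_at);

    -- Refresh by rebuilding the snapshot, e.g. from a nightly cron job.
    CREATE OR REPLACE FUNCTION refresh_daily_avg_price() RETURNS void AS $$
        DELETE FROM daily_avg_price;
        INSERT INTO daily_avg_price
            SELECT instrument,
                   date_trunc('day', traded_at),
                   avg(price)
            FROM trades
            GROUP BY instrument, date_trunc('day', traded_at);
    $$ LANGUAGE sql;

    SELECT refresh_daily_avg_price();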

Regards,

Richard Broersma Jr.