Discussion: importing large files

importing large files

From:
"olivier.scalbert@algosyn.com"
Date:
Hello,

I need to import between 100 million and one billion records into a
table. Each record is composed of two char(16) fields. The input format
is a huge CSV file. I am running on a Linux box with 4 GB of RAM.
First I create the table. Second I 'copy from' the CSV file. Third I
create the index on the first field.
The overall process takes several hours. The CPU seems to be the
limitation, not the memory or the I/O.
Are there any tips to improve the speed?
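For illustration, the three steps above might look like this in SQL; the
table, column, and file names are hypothetical, not taken from the post:

  -- Step 1: a table with two fixed-width fields, as described
  CREATE TABLE big_records (
      field_a char(16),
      field_b char(16)
  );

  -- Step 2: bulk-load the CSV file; COPY avoids per-row INSERT overhead
  COPY big_records FROM '/data/records.csv' WITH CSV;

  -- Step 3: build the index only after the load, so it is built once
  -- over the full data set
  CREATE INDEX big_records_field_a_idx ON big_records (field_a);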

Thanks very much,

Olivier


Re: importing large files

From:
Dimitri Fontaine
Date:
Hi,

On Friday 28 September 2007 10:22:49, olivier.scalbert@algosyn.com
wrote:
> I need to import between 100 million and one billion records into a
> table. Each record is composed of two char(16) fields. The input format
> is a huge CSV file. I am running on a Linux box with 4 GB of RAM.
> First I create the table. Second I 'copy from' the CSV file. Third I
> create the index on the first field.
> The overall process takes several hours. The CPU seems to be the
> limitation, not the memory or the I/O.
> Are there any tips to improve the speed?

If you don't need to fire any triggers and you trust the input data, you may
benefit from the pgbulkload project:
  http://pgbulkload.projects.postgresql.org/

The "conditions of usage" may be lighter than what I think they are, though.

Regards,
--
dim