Обсуждение: batch insertion

Поиск
Список
Период
Сортировка

batch insertion

От
Korisk
Дата:
Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint,
$6::float),($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL); 
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?

Thank you.


Re: batch insertion

От
Allan Kamau
Дата:



On Sun, Aug 25, 2013 at 3:15 AM, Korisk <korisk@yandex.ru> wrote:
Hi!
I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
I tried batch insert like this:

insert into triplets values (1,1,1);
insert into triplets values (1,1,1), (3,2,5), (4,5,5);
...
insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;

The more triplets I use the quicker operation is.
With preparation it looks like this:

res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
...
res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint, $5::bigint, $6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL);
...

The question:
Is there any way to prepare query with any number of triplets without casting such a long string?

Thank you.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


You may still use COPY as follows:
Let every thread that is generating data create a file on disk (at some predetermined directory) to which the tread would write data in a format (such as CSV, TSV) that COPY can use.
Use some sort of counter that would be updated for each write of a record to this file.
Then provide a counter threshold which when the value of your counter reaches (surpasses) your thread will first generate a COPY command and send out the contents of this file via the COPY command to your DB. Now delete the contents of the file and write the record into it.
You may also write some code to do on demand writing of the contents of this file to the DB when some event such as an indication to terminate the application happens.
All the above steps are to be performed within each thread of you application in isolation.

You may also write some clean up code that would look for the existence of these files when you application starts and writes the the contents to the DB (followed by the deletion of the files), this is done to cater for situations where your application may not have gracefully shutdown.



Allan.




Re: batch insertion

От
Steve Atkins
Дата:
On Aug 24, 2013, at 5:15 PM, Korisk <Korisk@yandex.ru> wrote:

> Hi!
> I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.

COPY works just fine for dynamically generated data, and it's probably the right thing to use if you're bulk loading
data(it's about as fast as you can get for a single threaded load). 

Take a look at the PQputCopyData() and PQputCopyEnd() functions.

Cheers,
  Steve

> I tried batch insert like this:
>
> insert into triplets values (1,1,1);
> insert into triplets values (1,1,1), (3,2,5), (4,5,5);
> ...
> insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;
>
> The more triplets I use the quicker operation is.
> With preparation it looks like this:
>
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
> ...
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint,
$5::bigint,$6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL); 
> ...
>
> The question:
> Is there any way to prepare query with any number of triplets without casting such a long string?
>
> Thank you.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



Re: batch insertion

От
Adrian Klaver
Дата:
On 08/24/2013 05:15 PM, Korisk wrote:
> Hi!
> I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
> I tried batch insert like this:
>
> insert into triplets values (1,1,1);
> insert into triplets values (1,1,1), (3,2,5), (4,5,5);
> ...
> insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;
>
> The more triplets I use the quicker operation is.
> With preparation it looks like this:
>
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
> ...
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint,
$5::bigint,$6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL); 
> ...
>
> The question:
> Is there any way to prepare query with any number of triplets without casting such a long string?,

Others have already posted about using COPY, so I will go another route.
The increased speed you see is probably a result of more data being
included in each transaction. From your example it is not clear if you
are batching your INSERTs. If not that is another way go, prepare your
statement then loop through your data in batches where a batch is
between a BEGIN and a COMMIT. This is one of the reasons COPY is so
fast, the data is dumped inside a single transaction. Unfortunately I do
not use libpq so I cannot provide an example.

>
> Thank you.
>
>


--
Adrian Klaver
adrian.klaver@gmail.com


Re: batch insertion

От
Merlin Moncure
Дата:
On Sat, Aug 24, 2013 at 7:15 PM, Korisk <korisk@yandex.ru> wrote:
> Hi!
> I want quick insert into db a lot of data (in form of triplets). Data is formed dynamical so "COPY" is not suitable.
> I tried batch insert like this:
>
> insert into triplets values (1,1,1);
> insert into triplets values (1,1,1), (3,2,5), (4,5,5);
> ...
> insert into triplets values (1,1,1), (3,2,5), (4,5,5) .... ;
>
> The more triplets I use the quicker operation is.
> With preparation it looks like this:
>
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float);",3, NULL);
> ...
> res = PQprepare(conn, "qu", "insert into triplets values ($1::bigint, $2::bigint, $3::float), ($4::bigint,
$5::bigint,$6::float), ($7::bigint, $8::bigint, $9::float), ($10::bigint, $11::bigint, $12::float);",12, NULL); 
> ...
>
> The question:
> Is there any way to prepare query with any number of triplets without casting such a long string?

yes.  you can use COPY with some tricks, or use arrays.

CREATE TYPE triplet_t AS (a bigint, b bigint, c bigint);

WITH data AS (SELECT UNNEST($1::triplet_t[]) INSERT INTO triplets
SELECT * FROM data;

Also see libpqtypes: http://libpqtypes.esilo.com/man3/pqt-composites.html

merlin