Re: Batch updates to 1 column using python-pgsql in multiple rows

Поиск
Список
Период
Сортировка
От Josh Kupershmidt
Тема Re: Batch updates to 1 column using python-pgsql in multiple rows
Дата
Msg-id CAK3UJRGu7=Opksb4mvgBenO=JdaSCwfzXxkS0+bfa2J+wOTKFA@mail.gmail.com
обсуждение исходный текст
Ответ на Batch updates to 1 column using python-pgsql in multiple rows  (Abhijeet R <abhijeet.1989@gmail.com>)
Ответы Re: Batch updates to 1 column using python-pgsql in multiple rows  (Abhijeet R <abhijeet.1989@gmail.com>)
Список pgsql-novice
On Wed, Aug 29, 2012 at 6:58 AM, Abhijeet R <abhijeet.1989@gmail.com> wrote:
> I have to update a certain column of a table at many rows. (order of
> thousands).
>
> For now, I am using a prepare statement and executing the queries like:
>
> query = "UPDATE mytable SET myvar = $1 WHERE myvar2=$2"
> db.execute(query, (var, var1)) #db is my connection object

Are you able to reformulate the multiple individual UPDATEs into a
single bulk UPDATE statement? That will almost certainly be the
fastest way.

> But, I still feel like I can make it faster by using db.executemany()
> method.  The thing is I am not able to get the syntax of it.

That's more of a question for your specific database adapter
(python-pgsql, according to your message subject) rather than this
list. But if you are able to construct a list of tuples, e.g.
  list_to_insert = [(var, var1) for (var, var1) in ... ]

Then you should be able to use executemany() like this:

  db.executemany(query, list_to_insert)

BTW, is there a reason you're using python-pgsql instead of, say,
psycopg2? It seems like the former may be a dead project.

> Can anyone please guide me as to how do I do batch updates in the fastest
> possible manner? Also, will executemany() really be faster? If so, how does
> that result in faster batch updates?

Again, the fastest way to perform these UPDATEs would be to
reformulate your individual queries into a single bulk UPDATE. If you
really are setting each value of "myvar" to something different, and
not a value derived from some other column within its row, and you
can't reasonably formulate that logic in a single bulk UPDATE query,
then I'd bet the next fastest way would be to use a COPY statement to
load a temporary table with your (myvar, myvar2) pairs as rows, then
perform a single:

  UPDATE mytable
    FROM my_temp_table
    SET myvar = my_temp_table.myvar,
    WHERE myvar2 = my_temp_table.myvar2;

Josh


В списке pgsql-novice по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: using index "pg_toast_..." despite IgnoreSystemIndexes
Следующее
От: Abhijeet R
Дата:
Сообщение: Re: Batch updates to 1 column using python-pgsql in multiple rows