Discussion: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Shaozhong SHI
Date:

Has anyone tested this one?
A Fast Method to Bulk Insert a Pandas DataFrame into Postgres · Ellis Valentiner
<https://ellisvalentiner.com/post/a-fast-method-to-insert-a-pandas-dataframe-into-postgres/>

I tried the psql_insert_copy method, but I got the following error message.

to_sql() got an unexpected keyword argument 'method'

Can anyone shed light on this?

Regards,

David

Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Adrian Klaver
Date:
On 10/4/21 8:44 AM, Shaozhong SHI wrote:
> 
> Has anyone tested this one?
> A Fast Method to Bulk Insert a Pandas DataFrame into Postgres · Ellis 
> Valentiner 
> <https://ellisvalentiner.com/post/a-fast-method-to-insert-a-pandas-dataframe-into-postgres/>
> 
> I tried psql_insert_copy method, but I got the following error message.
> 
> to_sql() got an unexpected keyword argument 'method'
> 
> 
> Can anyone shed light on this?

method is a legitimate keyword:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html?highlight=to_sql#pandas.DataFrame.to_sql

So there must be something about how you used it.

Pandas version?

The actual code you used when the error occurred?
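
For reference, the docs describe method as None, 'multi', or a callable with
signature (table, conn, keys, data_iter). A minimal sketch of the documented
call on pandas >= 0.24, assuming a placeholder connection string and a toy
DataFrame:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql+psycopg2://user:password@localhost:5432/postgres')  # placeholder DSN
df = pd.DataFrame({'a': [1, 2, 3]})

# Built-in option: batch several rows into each INSERT statement.
df.to_sql('test1', engine, if_exists='append', index=False, method='multi')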


> 
> 
> Regards,
> 
> 
> David
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Shaozhong SHI
Date:
Hello, Adrian Klaver,

Pandas version is 0.23.0.

I used the following code:

import csv
from io import StringIO

from sqlalchemy import create_engine

def psql_insert_copy(table, conn, keys, data_iter):
    # gets a DBAPI connection that can provide a cursor
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        # write the rows into an in-memory CSV buffer
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)

        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name

        # stream the buffer into the table with COPY
        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
            table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)

engine = create_engine('postgresql+psycopg2://:5432/postgres')
try:
    # df is the DataFrame to load (built earlier)
    df.to_sql('test1', engine, schema='public', if_exists='append', index=False, method=psql_insert_copy)
except Exception as e:
    print(e)

I could not find obvious reasons.

Regards,

David

On Mon, 4 Oct 2021 at 17:06, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 10/4/21 8:44 AM, Shaozhong SHI wrote:
>
> Has anyone tested this one?
> A Fast Method to Bulk Insert a Pandas DataFrame into Postgres · Ellis
> Valentiner
> <https://ellisvalentiner.com/post/a-fast-method-to-insert-a-pandas-dataframe-into-postgres/>
>
> I tried psql_insert_copy method, but I got the following error message.
>
> to_sql() got an unexpected keyword argument 'method'
>
>
> Can anyone shed light on this?

method is a legitimate keyword:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html?highlight=to_sql#pandas.DataFrame.to_sql

So there must be something about how you used it.

Pandas version?

The actual code you used when the error occurred?


>
>
> Regards,
>
>
> David
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Rob Sargent
Date:


On Oct 4, 2021, at 10:20 AM, Shaozhong SHI <shishaozhong@gmail.com> wrote:

Hello, Adrian Klaver,

Pandas version is 0.23.0.

I used the following code:

def psql_insert_copy(table, conn, keys, data_iter):
    # gets a DBAPI connection that can provide a cursor
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)

        columns = ', '.join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = '{}.{}'.format(table.schema, table.name)
        else:
            table_name = table.name

        sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
            table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)
        
engine = create_engine('postgresql+psycopg2://:5432/postgres')
try:
    df.to_sql('test1', engine, schema='public', if_exists='append', index=False, method=psql_insert_copy) 

you need to quote 'psql_insert_copy'

Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Adrian Klaver
Date:
On 10/4/21 9:20 AM, Shaozhong SHI wrote:
> Hello, Adrian Klaver,
> 
> Pandas version is 0.23.0.

The reason the code below does not work is that the method keyword did not
show up until pandas 0.24.0.
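
If you need to stay on 0.23 for now, one workaround (a sketch, assuming a
small wrapper around your existing call; the packaging module is an extra
dependency here) is to pass method only when the installed pandas supports it:

import pandas as pd
from packaging.version import Version

def to_sql_compat(df, name, engine, insert_method, **kwargs):
    # The method keyword only exists on pandas >= 0.24.0.
    if Version(pd.__version__) >= Version('0.24.0'):
        df.to_sql(name, engine, method=insert_method, **kwargs)
    else:
        # Older pandas: fall back to the default row-by-row INSERT path.
        df.to_sql(name, engine, **kwargs)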

> 
> I used the following code:
> 
> def psql_insert_copy(table, conn, keys, data_iter):
>      # gets a DBAPI connection that can provide a cursor
>      dbapi_conn = conn.connection
>      with dbapi_conn.cursor() as cur:
>          s_buf = StringIO()
>          writer = csv.writer(s_buf)
>          writer.writerows(data_iter)
>          s_buf.seek(0)
> 
>          columns = ', '.join('"{}"'.format(k) for k in keys)
>          if table.schema:
>              table_name = '{}.{}'.format(table.schema, table.name)
>          else:
>              table_name = table.name
> 
>          sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
>              table_name, columns)
>          cur.copy_expert(sql=sql, file=s_buf)
> engine = create_engine('postgresql+psycopg2://:5432/postgres')
> try:
>      df.to_sql('test1', engine, schema='public', if_exists='append', 
> index=False, method=psql_insert_copy)
> 
> I could not find obvious reasons.


> 
> Regards,
> 
> David
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Shaozhong SHI
Date:
Hello, Adrian Klaver,
What is the robust way to upgrade Pandas?
Regards,
David

On Monday, 4 October 2021, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 10/4/21 9:20 AM, Shaozhong SHI wrote:
Hello, Adrian Klaver,

Pandas version is 0.23.0.

The reason the code below does not work is that the method keyword did not show up until pandas 0.24.0.


I used the following code:

def psql_insert_copy(table, conn, keys, data_iter):
     # gets a DBAPI connection that can provide a cursor
     dbapi_conn = conn.connection
     with dbapi_conn.cursor() as cur:
         s_buf = StringIO()
         writer = csv.writer(s_buf)
         writer.writerows(data_iter)
         s_buf.seek(0)

         columns = ', '.join('"{}"'.format(k) for k in keys)
         if table.schema:
             table_name = '{}.{}'.format(table.schema, table.name)
         else:
             table_name = table.name

         sql = 'COPY {} ({}) FROM STDIN WITH CSV'.format(
             table_name, columns)
         cur.copy_expert(sql=sql, file=s_buf)
engine = create_engine('postgresql+psycopg2://:5432/postgres')
try:
     df.to_sql('test1', engine, schema='public', if_exists='append', index=False, method=psql_insert_copy)

I could not find obvious reasons.



Regards,

David



--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Adrian Klaver
Date:
On 10/4/21 10:10 AM, Shaozhong SHI wrote:
> Hello, Adrian Klaver,
> What is the robust way to upgrade Pandas?

Carefully.

The most recent version is 1.3.3, which is approximately 5 versions 
ahead of where you are now. The big jump is when Pandas went from 0.25 
to 1.0. See docs here:

https://pandas.pydata.org/docs/whatsnew/v1.0.0.html?highlight=upgrade

So the process should be 0.23 -> 0.24, verify, 0.24 -> 0.25, verify, 0.25 -> 
1.0, verify. Then on to wherever you want to end up, a step at a time.

Before each step spend time here:

https://pandas.pydata.org/docs/whatsnew/

to see what the gotchas are.
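
For the "verify" part, a quick smoke test after each step helps; a sketch,
assuming a throwaway table and a placeholder connection string (run it once
you are on 0.24 or later):

import pandas as pd
from sqlalchemy import create_engine

print('pandas', pd.__version__)

engine = create_engine('postgresql+psycopg2://user:password@localhost:5432/postgres')  # placeholder DSN
df = pd.DataFrame({'a': [1, 2, 3]})

# Exercise the code path in question: to_sql with the method keyword.
df.to_sql('upgrade_smoke_test', engine, if_exists='replace', index=False, method='multi')
print('to_sql with method= OK')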


> Regards,
> David
> 



-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Testing of a fast method to bulk insert a Pandas DataFrame into Postgres

From: Adrian Klaver
Date:
On 10/4/21 10:28 AM, Adrian Klaver wrote:
> On 10/4/21 10:10 AM, Shaozhong SHI wrote:
>> Hello, Adrian Klaver,
>> What is the robust way to upgrade Pandas?
> 
> Carefully.
> 
> The most recent version is 1.3.3, which is approximately 5 versions 
> ahead of where you are now. The big jump is when Pandas went from 0.25 
> to 1.0. See docs here:
> 
> https://pandas.pydata.org/docs/whatsnew/v1.0.0.html?highlight=upgrade
> 
> So the process should be 0.23 -> 0.24, verify, 0.24 -> 0.25, verify, 0.25 -> 
> 1.0, verify. Then on to wherever you want to end up, a step at a time.
> 
> Before each step spend time here:
> 
> https://pandas.pydata.org/docs/whatsnew/
> 
> to see what the gotchas are.

Should have added:

If you are not already working in a virtualenv, it would be a good idea 
to do the above in one or more.
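
A sketch of one way to set that up from Python itself, with one throwaway
environment per upgrade step (directory names here are just examples):

import venv

# Create an isolated environment for each pandas version you want to test.
for name in ('pandas-0.25-test', 'pandas-1.0-test'):
    venv.create(name, with_pip=True)
# Activate each one and pip install the matching pandas version before re-running your tests.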

> 
> 
>> Regards,
>> David
>>
> 
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com