Обсуждение: Problem using COPY command to load data

Поиск
Список
Период
Сортировка

Problem using COPY command to load data

От
Glen Beane
Дата:
I'm not sure if this is the write place to post this question or not, but I
hope someone can help me out.

I am using the copy_from command from the python psycopg2 library to do some
bulk data loading of a postgres database.  This had been working OK, until
my script barfed because I was being careless, and it seemed to leave the
database in a strange state that I can't recover from.  Here is what
happened:

I am connecting to postgres via TCP, but I happen to be running the script
on the same host ast the server right now. I had forgotten to clear the
contents of one table before loading it from a file and the copy_from
command exited with error because of conflicting primary keys.  After this
any time I attempted to execute copy_from it would just hang (after
successfully connecting to the database and doing a series of DELETE FROM
SQL commands).

I just see something like this if I look through the output of ps:

postgres 30432  1.7  1.2  42884 12512 ?        Ss   13:16   0:00 postgres:
biomart mgi_biomart_intermediate 127.0.1.1(41831) COPY

The script hangs forever, and I don't see anything in the postgres log.

If I switched my script to connect through a unix socket (remove
host='hostname' from the psycopg2.connect() call), then it works again.
I've restarted postgresql many times, the TCP connection always hangs on the
COPY from my script now.

I have fixed my script to be more careful and handle the copy_from() error
in a more sane way, but I'd like to fix whatever is now wrong with postgres.

Here are some entries from pg_hba.conf:

host all all 10.0.0.0/8 md5
host all all 127.0.0.1/32 md5
host all all  127.0.1.1/32 md5
local all biomart md5

Everything else in pg_hba.conf is default
The only change to postgresql.conf is to change the listen_address to '*'



Any insight would be appreciated.




--
Glen L. Beane
Software Engineer
The Jackson Laboratory
Phone (207) 288-6153



Re: Problem using COPY command to load data

От
Tom Lane
Дата:
Glen Beane <Glen.Beane@jax.org> writes:
> I am using the copy_from command from the python psycopg2 library to do some
> bulk data loading of a postgres database.  This had been working OK, until
> my script barfed because I was being careless, and it seemed to leave the
> database in a strange state that I can't recover from.

What PG version is this exactly?  What does pg_stat_activity show?  Is
there any indication of un-granted locks in pg_locks?

The most direct evidence about why it's stuck would probably be had by
attaching to the backend process with gdb and getting a stack trace.

            regards, tom lane

Re: Problem using COPY command to load data

От
Glen Beane
Дата:


On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Glen Beane <Glen.Beane@jax.org> writes:
>> I am using the copy_from command from the python psycopg2 library to do some
>> bulk data loading of a postgres database.  This had been working OK, until
>> my script barfed because I was being careless, and it seemed to leave the
>> database in a strange state that I can't recover from.
>
> What PG version is this exactly?
8.3.3


> What does pg_stat_activity show?

This is the only non-idle connection:

 16498 | mgi_biomart_intermediate |   31356 |    16386 | biomart  | COPY
markers FROM stdin USING DELIMITERS '      ' | f       | 2008-11-11
17:10:36.124919-05 | 2008-11-11 17:10:36.125522-05 | 2008-11-11
17:10:32.213159-05 | 127.0.1.1   |       35021


>  Is
> there any indication of un-granted locks in pg_locks?

gbeane=# select * from pg_locks;
   locktype    | database | relation | page | tuple | virtualxid |
transactionid | classid | objid | objsubid | virtualtransaction |  pid  |
mode       | granted
---------------+----------+----------+------+-------+------------+----------
-----+---------+-------+----------+--------------------+-------+------------
------+---------
 virtualxid    |          |          |      |       | 5/3        |
|         |       |          | 5/3                | 31361 | ExclusiveLock
| t
 relation      |    16385 |    10969 |      |       |            |
|         |       |          | 5/3                | 31361 | AccessShareLock
| t
 relation      |    16498 |    16507 |      |       |            |
|         |       |          | 3/1151             | 31356 | RowExclusiveLock
| t
 virtualxid    |          |          |      |       | 3/1151     |
|         |       |          | 3/1151             | 31356 | ExclusiveLock
| t
 transactionid |          |          |      |       |            |
637 |         |       |          | 3/1151             | 31356 |
ExclusiveLock    | t
 relation      |    16498 |    16510 |      |       |            |
|         |       |          | 3/1151             | 31356 | RowExclusiveLock
| t


> The most direct evidence about why it's stuck would probably be had by
> attaching to the backend process with gdb and getting a stack trace.
>

I'll take a look with gdb when I have a moment.

>                         regards, tom lane

--
Glen L. Beane
Software Engineer
The Jackson Laboratory
Phone (207) 288-6153



Re: Problem using COPY command to load data

От
Glen Beane
Дата:


On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> The most direct evidence about why it's stuck would probably be had by
> attaching to the backend process with gdb and getting a stack trace.

It wasn't built with debugging symbols so there is some missing info, but
here is what I get if I attach with gdb:

(gdb) where
#0  0xb7f63410 in __kernel_vsyscall ()
#1  0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
#3  0x081a1226 in ?? ()
#4  0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
#5  0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8
#6  0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8
#7  0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
#8  0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
#9  0x081a128a in secure_read ()
#10 0x081a94c1 in ?? ()
#11 0x081a95d9 in pq_getbytes ()
#12 0x081a9744 in pq_getmessage ()
#13 0x0813853c in ?? ()
#14 0x0813883a in ?? ()
#15 0x08138a05 in ?? ()
#16 0x0813b456 in DoCopy ()
#17 0x082302e4 in ProcessUtility ()
#18 0x0822cc6b in ?? ()
#19 0x0822d8e3 in ?? ()
#20 0x0822e104 in PortalRun ()
#21 0x08228eb5 in ?? ()
#22 0x08229f6b in PostgresMain ()
#23 0x081fa39d in ?? ()
#24 0x081fb3cb in PostmasterMain ()
#25 0x081ab902 in main ()

--
Glen L. Beane
Software Engineer
The Jackson Laboratory
Phone (207) 288-6153



Re: Problem using COPY command to load data

От
Tom Lane
Дата:
Glen Beane <Glen.Beane@jax.org> writes:
> On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> The most direct evidence about why it's stuck would probably be had by
>> attaching to the backend process with gdb and getting a stack trace.

> It wasn't built with debugging symbols so there is some missing info, but
> here is what I get if I attach with gdb:

> (gdb) where
> #0  0xb7f63410 in __kernel_vsyscall ()
> #1  0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6
> #2  0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
> #3  0x081a1226 in ?? ()
> #4  0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
> #5  0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8
> #6  0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8
> #7  0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
> #8  0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
> #9  0x081a128a in secure_read ()
> #10 0x081a94c1 in ?? ()
> #11 0x081a95d9 in pq_getbytes ()
> #12 0x081a9744 in pq_getmessage ()
> #13 0x0813853c in ?? ()
> #14 0x0813883a in ?? ()
> #15 0x08138a05 in ?? ()
> #16 0x0813b456 in DoCopy ()

Well, this is sufficient to draw a conclusion: it's waiting for the
client to send it some more COPY data.  So next you need to look into
what the client's state is.  What software are you using on the client
side, anyway?

            regards, tom lane

Re: Problem using COPY command to load data

От
Glen Beane
Дата:


On 11/11/08 9:09 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Glen Beane <Glen.Beane@jax.org> writes:
>> On 11/11/08 2:25 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> The most direct evidence about why it's stuck would probably be had by
>>> attaching to the backend process with gdb and getting a stack trace.
>
>> It wasn't built with debugging symbols so there is some missing info, but
>> here is what I get if I attach with gdb:
>
>> (gdb) where
>> #0  0xb7f63410 in __kernel_vsyscall ()
>> #1  0xb7ac9273 in read () from /lib/tls/i686/cmov/libc.so.6
>> #2  0xb7d42d37 in ?? () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
>> #3  0x081a1226 in ?? ()
>> #4  0xb7d40db1 in BIO_read () from /usr/lib/i686/cmov/libcrypto.so.0.9.8
>> #5  0xb7e05362 in ssl3_read_n () from /usr/lib/i686/cmov/libssl.so.0.9.8
>> #6  0xb7e05b2e in ssl3_read_bytes () from /usr/lib/i686/cmov/libssl.so.0.9.8
>> #7  0xb7e03096 in ssl3_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
>> #8  0xb7e13b78 in SSL_read () from /usr/lib/i686/cmov/libssl.so.0.9.8
>> #9  0x081a128a in secure_read ()
>> #10 0x081a94c1 in ?? ()
>> #11 0x081a95d9 in pq_getbytes ()
>> #12 0x081a9744 in pq_getmessage ()
>> #13 0x0813853c in ?? ()
>> #14 0x0813883a in ?? ()
>> #15 0x08138a05 in ?? ()
>> #16 0x0813b456 in DoCopy ()
>
> Well, this is sufficient to draw a conclusion: it's waiting for the
> client to send it some more COPY data.  So next you need to look into
> what the client's state is.  What software are you using on the client
> side, anyway?

Python with the psycopg2 library.  I swear this was working earlier today.
Maybe I am imagining things :)
It does work with a unix socket, and I have a deadline to meet, so for now I
just need to make sure I am running this on the same host as the postgresql
server


Re: Problem using COPY command to load data

От
Adrian Klaver
Дата:
On Tuesday 11 November 2008 6:20:09 pm Glen Beane wrote:

>
> Python with the psycopg2 library.  I swear this was working earlier today.
> Maybe I am imagining things :)
> It does work with a unix socket, and I have a deadline to meet, so for now
> I just need to make sure I am running this on the same host as the
>postgresqlserver

See these two posts for potential solution:
http://lists.initd.org/pipermail/psycopg/2008-October/006224.html
http://lists.initd.org/pipermail/psycopg/2008-October/006225.html

They point to a problem with ssl and psycopg2 COPY. Also I know there where
some problems with psycopg2 and copy that where fixed in psycopg2 version
2.0.7.
--
Adrian Klaver
aklaver@comcast.net

Re: Problem using COPY command to load data

От
Adrian Klaver
Дата:
On Tuesday 11 November 2008 6:31:04 pm Adrian Klaver wrote:
> On Tuesday 11 November 2008 6:20:09 pm Glen Beane wrote:
> > Python with the psycopg2 library.  I swear this was working earlier
> > today. Maybe I am imagining things :)
> > It does work with a unix socket, and I have a deadline to meet, so for
> > now I just need to make sure I am running this on the same host as the
> > postgresqlserver
>
> See these two posts for potential solution:
> http://lists.initd.org/pipermail/psycopg/2008-October/006224.html
> http://lists.initd.org/pipermail/psycopg/2008-October/006225.html
>
> They point to a problem with ssl and psycopg2 COPY. Also I know there where
> some problems with psycopg2 and copy that where fixed in psycopg2 version
> 2.0.7.
> --
> Adrian Klaver
> aklaver@comcast.net

That should have been version 2.0.8.

--
Adrian Klaver
aklaver@comcast.net