Обсуждение: Limitations on 7.0.3?

Поиск
Список
Период
Сортировка

Limitations on 7.0.3?

От
"ARTEAGA Jose"
Дата:
I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
without any major problems, until about a month ago. We are now
experiencing crashes on the backend (connection lost to backend) while
running queries (inserts, etc). Anyway I don't want to make this too
technical but I'd just like to understand if maybe I'm running into some
sort of limit on the size of the database. My biggest table is currently
at 1.5B tuples.

Would appreciate if anyone could let me know or is aware of any limits
with 7.0 version.

Thx,
Jose

Re: Limitations on 7.0.3?

От
Richard Huxton
Дата:
ARTEAGA Jose wrote:
> I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
> without any major problems, until about a month ago. We are now
> experiencing crashes on the backend (connection lost to backend) while
> running queries (inserts, etc). Anyway I don't want to make this too
> technical but I'd just like to understand if maybe I'm running into some
> sort of limit on the size of the database. My biggest table is currently
> at 1.5B tuples.
>
> Would appreciate if anyone could let me know or is aware of any limits
> with 7.0 version.

I don't remember any specific size limitations on the 7.0 series. For
more detailed help you'll have to provide some specific error messages.

Is there any chance you could move to a more recent version in the near
future? You'll get much better support with a more recent version
(there's just that many more users). You should also notice some
substantial performance improvements compared to 7.0.

--
   Richard Huxton
   Archonet Ltd

Re: Limitations on 7.0.3?

От
Alvaro Herrera
Дата:
Richard Huxton wrote:
> ARTEAGA Jose wrote:
> >I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
> >without any major problems, until about a month ago. We are now
> >experiencing crashes on the backend (connection lost to backend) while
> >running queries (inserts, etc). Anyway I don't want to make this too
> >technical but I'd just like to understand if maybe I'm running into some
> >sort of limit on the size of the database. My biggest table is currently
> >at 1.5B tuples.
> >
> >Would appreciate if anyone could let me know or is aware of any limits
> >with 7.0 version.
>
> I don't remember any specific size limitations on the 7.0 series. For
> more detailed help you'll have to provide some specific error messages.

7.0 didn't have any protection against Xid wraparound.  As soon as you
hit the 4 billion transactions mark, your data suddenly disappeared.
That's what I heard at least -- I didn't have much experience with such
an old beast.  We switched rather quickly to 7.1.

--
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Before you were born your parents weren't as boring as they are now. They
got that way paying your bills, cleaning up your room and listening to you
tell them how idealistic you are."  -- Charles J. Sykes' advice to teenagers

Re: Limitations on 7.0.3?

От
Andrew Sullivan
Дата:
On Wed, Jun 06, 2007 at 02:40:08PM -0500, ARTEAGA Jose wrote:
> I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
> without any major problems, until about a month ago. We are now
> experiencing crashes on the backend (connection lost to backend) while
> running queries (inserts, etc). Anyway I don't want to make this too
> technical but I'd just like to understand if maybe I'm running into some
> sort of limit on the size of the database. My biggest table is currently
> at 1.5B tuples.

I don't believe it's a size limit.  But 7.0 is pre-WAL.  Are you
running with the -F switch turned on, for performance?  It could well
be that you are running into data corruption that crashes the
database.  (The lack of WAL is not the only reason you might be
running into that.  7.0 is a long time ago, and there are a lot of
bugs that have been squished since then.)

A
--
Andrew Sullivan  | ajs@crankycanuck.ca
However important originality may be in some fields, restraint and
adherence to procedure emerge as the more significant virtues in a
great many others.   --Alain de Botton

Re: Limitations on 7.0.3?

От
"ARTEAGA Jose"
Дата:
I've looked at the pg_index table and we are currently at 15Mill
entries, which should be OK. After 2-3 days runtime I just get a
disconnect error from backend while doing an insert. After I restore the
DB and insert the same entries it runs fine. Following is the error I
get:
"Query pgsql8.1: PGRES_FATAL_ERROR, pqReadData() -- backend closed the
channel unexpectedly.\n\tThis probably means the backend terminated
abnormally\n\tbefore or while processing the request.\n
      INSERT INTO teststeprun (description, stepunqid, stepid,
stepstarttime, duration, parentsn, stepresult) VALUES ('Qos Reset',
'24920757/3267', ' 95.  4', '2007-06-02 19:02:05', '0 0:0:25',
'5311955', '0')"

Also worth mentioning is that I just this week found out about a very,
very important parameter "shared buffers". Ever since the original
person setup our PG (individual no longer with us) this DB had been
running without any major glitches, albeit slow. All this time the
shared buffers were running at default of "64" (8192 block size). Once I
have got this back up and running I have since set this to 1600 shared
buffers (~125MB). I've since noticed a dramatic performance improvement,
I hope that I've striked gold. But cannot claim victory yet it's only
been up for 2 days.

-Jose


-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: Thursday, June 07, 2007 7:02 AM
To: Richard Huxton
Cc: ARTEAGA Jose; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Limitations on 7.0.3?

Richard Huxton wrote:
> ARTEAGA Jose wrote:
> >I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
> >without any major problems, until about a month ago. We are now
> >experiencing crashes on the backend (connection lost to backend)
while
> >running queries (inserts, etc). Anyway I don't want to make this too
> >technical but I'd just like to understand if maybe I'm running into
some
> >sort of limit on the size of the database. My biggest table is
currently
> >at 1.5B tuples.
> >
> >Would appreciate if anyone could let me know or is aware of any
limits
> >with 7.0 version.
>
> I don't remember any specific size limitations on the 7.0 series. For
> more detailed help you'll have to provide some specific error
messages.

7.0 didn't have any protection against Xid wraparound.  As soon as you
hit the 4 billion transactions mark, your data suddenly disappeared.
That's what I heard at least -- I didn't have much experience with such
an old beast.  We switched rather quickly to 7.1.

--
Alvaro Herrera
http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Before you were born your parents weren't as boring as they are now.
They
got that way paying your bills, cleaning up your room and listening to
you
tell them how idealistic you are."  -- Charles J. Sykes' advice to
teenagers

Re: Limitations on 7.0.3?

От
Martijn van Oosterhout
Дата:
On Thu, Jun 07, 2007 at 02:32:09PM -0500, ARTEAGA Jose wrote:
> I've looked at the pg_index table and we are currently at 15Mill
> entries, which should be OK. After 2-3 days runtime I just get a
> disconnect error from backend while doing an insert. After I restore the
> DB and insert the same entries it runs fine. Following is the error I
> get:

Note: Version 7.0 *will* eat your data eventually. It has no protection
against XID wraparound, you'll just start noticing data disappearing.

From the documentation:
Prior to PostgreSQL 7.2, the only defense against XID wraparound was to
re-initdb at least every 4 billion transactions.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: Limitations on 7.0.3?

От
"Joshua D. Drake"
Дата:
Martijn van Oosterhout wrote:
> On Thu, Jun 07, 2007 at 02:32:09PM -0500, ARTEAGA Jose wrote:
>> I've looked at the pg_index table and we are currently at 15Mill
>> entries, which should be OK. After 2-3 days runtime I just get a
>> disconnect error from backend while doing an insert. After I restore the
>> DB and insert the same entries it runs fine. Following is the error I
>> get:
>
> Note: Version 7.0 *will* eat your data eventually. It has no protection
> against XID wraparound, you'll just start noticing data disappearing.
>
> From the documentation:
> Prior to PostgreSQL 7.2, the only defense against XID wraparound was to
> re-initdb at least every 4 billion transactions.

This is a very kind way of saying you are completely nuts for running
7.0 anything. The minimum you should be running is 7.4 and preferably 8.2.4.

Joshua D. Drake


>
> Have a nice day,


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Limitations on 7.0.3?

От
Scott Marlowe
Дата:
ARTEAGA Jose wrote:
> Also worth mentioning is that I just this week found out about a very,
> very important parameter "shared buffers". Ever since the original
> person setup our PG (individual no longer with us) this DB had been
> running without any major glitches, albeit slow. All this time the
> shared buffers were running at default of "64" (8192 block size). Once I
> have got this back up and running I have since set this to 1600 shared
> buffers (~125MB). I've since noticed a dramatic performance improvement,
> I hope that I've striked gold. But cannot claim victory yet it's only
> been up for 2 days.
If you think increasing shared buffers in 7.0 improves things, you
should see what upgrading to 8.2.4 will do.

Seriously, you'll freak at the increase in speed.

7.0 = handcar: http://www.handcar.net/
8.2 = TGV:  http://en.wikipedia.org/wiki/TGV

(p.s. please upgrade before 7.0 eats your data...)

Re: Limitations on 7.0.3?

От
Richard Huxton
Дата:
ARTEAGA Jose wrote:
> I have spent the last month battling and looking deeper into the issue,
> here's a summary of were I'm at:
> - Increasing shared buffers improved performance but did not resolve the
> backend FATAL disconnect error.
> - Dumping and recreating entire database also did not resolve the issue.

OK, so it's not a corrupted index/file then.

> - re-initializing the DB and recreating from the dump also did not
> resolve the issue.
> On both cases above the issue re-occurred within 2-3 days of run-time
> (insert of new records).
>
> I got the issue narrowed down to the point were I was able to re-create
> the issue at will by just inserting enough data, the data content did
> not matter. The issue always occurred while inserting into my
> "teststeprun" table, which is the largest of my tables (~15 Mill rows).
> The issue is that once I got this table to a certain size, then the
> backend system would crash.
>
> Since I was able to reproduce, I then decided to analyze the core dumps.
> Looking at the core dumps I immediately began to see a pattern, even the
> same patter was there from the initial core dumps I had when the problem
> began occurring back two months ago. In every case the dump indicated
> the last instruction was always in the call to tag_hash(). I also
> noticed that each time the values passed to tag_hash which are used to
> generate the key were just below the 32-bit max value, and tag_hash
> should be returning a uint32 value. Now I'm really suspecting that there
> is some issue with this. Below are the traces of the four core dumps
> which point to the issue I'm suspecting.

I think tag_hash (in /backend/utils/hash/hashfn.c) is responsible for
internal hash-tables (rather than hash indexes). It takes a pointer to a
key to hash and a keysize (in bytes), so either the pointer is bad or
the size is too long and it's reading off the end.

At the other end of your call, _bt_insertonpg
(/backend/access/nbtree/nbtinsert.c) is inserting into a btree index. In
one case it's splitting the index page it tries to insert into (because
it's full) but not in the others.

If it's not a hardware related problem, then it's a bug, but you're
unlikely to get a fix given how old the code is. If an upgrade to 8.2
looks like it will take a lot of effort, perhaps consider an
intermediate upgrade to 7.2 - I think schemas were introduced in 7.3 so
before that should be easier.

There is a chance that you might reduce the problem by REINDEXing the
table concerned every night. That's just a guess though, and you're real
solution will be to upgrade to something more recent.

--
   Richard Huxton
   Archonet Ltd

Re: Limitations on 7.0.3?

От
"ARTEAGA Jose"
Дата:
I have spent the last month battling and looking deeper into the issue,
here's a summary of were I'm at:
- Increasing shared buffers improved performance but did not resolve the
backend FATAL disconnect error.
- Dumping and recreating entire database also did not resolve the issue.

- re-initializing the DB and recreating from the dump also did not
resolve the issue.
On both cases above the issue re-occurred within 2-3 days of run-time
(insert of new records).

I got the issue narrowed down to the point were I was able to re-create
the issue at will by just inserting enough data, the data content did
not matter. The issue always occurred while inserting into my
"teststeprun" table, which is the largest of my tables (~15 Mill rows).
The issue is that once I got this table to a certain size, then the
backend system would crash.

Since I was able to reproduce, I then decided to analyze the core dumps.
Looking at the core dumps I immediately began to see a pattern, even the
same patter was there from the initial core dumps I had when the problem
began occurring back two months ago. In every case the dump indicated
the last instruction was always in the call to tag_hash(). I also
noticed that each time the values passed to tag_hash which are used to
generate the key were just below the 32-bit max value, and tag_hash
should be returning a uint32 value. Now I'm really suspecting that there
is some issue with this. Below are the traces of the four core dumps
which point to the issue I'm suspecting.

This is likely already resolved in new versions, which BTW I am planning
to upgrade to the 8.2 in the coming months. But I'd still like to know
if anyone out there knows about this issue and could confirm my
suspicions.

DUMP1
core 'pg_core.20070603' of 22757:
/home/ttadmin/postgresql/bin/postmaster -D /home/ttadmin/postgresql/da
 001db000 tag_hash (ff4001d8, c, 1daff4, fef8ccd0, fef8d00c, 1) + c
 001da134 call_hash (263370, ff4001d8, 0, 1, 301688, 300740) + 2c
 001da1f8 hash_search (263370, ff4001d8, 0, ff40015b, 301688, 300740) +
64
 00144ae8 BufTableLookup (ff4001d8, ff4001d8, 125aa000, 244c00, 92d4,
fefe3c) + 3c
 001458e4 BufferAlloc (2d23e8, 3, ff40027f, 0, 92d3, ff1764) + 94
 00145420 ReadBufferWithBufferLock (2d23e8, 3, 0, 0, 92d2, ff00dc) + f0
 00145314 ReadBuffer (2d23e8, 3, 0, fef8d360, fef8e020, ff001c) + 18
 0005fdec _bt_getbuf (2d23e8, 3, 2, 10000, fef8ecb0, 3) + 28
 000601fc _bt_getstackbuf (2d23e8, 3003c8, 2, fef8ccd0, fef8d00c, 1) +
28
 0005bc8c _bt_insertonpg (2d23e8, 1, 3003c8, 1, 301688, 300740) + d04


DUMP2
core 'core' of 9640:    /home/ttadmin/postgresql/bin/postmaster -B 18000
-D /home/ttadmin/post
 001db000 tag_hash (ff4001c8, c, 1daff4, fd4f13d0, fd4f170c, 1) + c
 001da134 call_hash (262b50, ff4001c8, 0, 1, 465cf8, 464db0) + 2c
 001da1f8 hash_search (262b50, ff4001c8, 0, ff40014b, 465cf8, 464db0) +
64
 00144ae8 BufTableLookup (ff4001c8, ff4001c8, 124f6000, 244c00, 927a,
10efd64) + 3c
 001458e4 BufferAlloc (428980, 3, ff40026f, 0, 9279, 10efd7c) + 94
 00145420 ReadBufferWithBufferLock (428980, 3, 0, 0, 9278, 10efd94) + f0
 00145314 ReadBuffer (428980, 3, 0, fd4f1a60, fd4f2720, 10efdac) + 18
 0005fdec _bt_getbuf (428980, 3, 2, 10000, fd4f33b0, 3) + 28
 000601fc _bt_getstackbuf (428980, 464a38, 2, fd4f13d0, fd4f170c, 1) +
28


DUMP3
core 'core' of 10466:   /home/ttadmin/postgresql/bin/postmaster -B 18000
-D /home/ttadmin/post
 001db000 tag_hash (ff4001c8, c, 1daff4, fd6093d0, fd60970c, 1) + c
 001da134 call_hash (262b50, ff4001c8, 0, 1, 463be0, 462c98) + 2c
 001da1f8 hash_search (262b50, ff4001c8, 0, ff40014b, 463be0, 462c98) +
64
 00144ae8 BufTableLookup (ff4001c8, ff4001c8, 124f6000, 244c00, 927a,
10ef044) + 3c
 001458e4 BufferAlloc (428848, 3, ff40026f, 0, 9279, 10ef05c) + 94
 00145420 ReadBufferWithBufferLock (428848, 3, 0, 0, 9278, 10ef074) + f0
 00145314 ReadBuffer (428848, 3, 0, fd609a60, fd60a720, 10ef08c) + 18
 0005fdec _bt_getbuf (428848, 3, 2, 10000, fd60b3b0, 3) + 28
 000601fc _bt_getstackbuf (428848, 462920, 2, fd6093d0, fd60970c, 1) +
28

DUMP4
core 'core' of 12400:   /home/ttadmin/postgresql/bin/postmaster -B 18000
-D /home/ttadmin/post
 001db000 tag_hash (ff4001f0, c, 1daff4, 19c7, ce, 1) + c
 001da134 call_hash (262b50, ff4001f0, 0, 22, ffffffff, 455bd0) + 2c
 001da1f8 hash_search (262b50, ff4001f0, 0, ff400173, f6000000, 0) + 64
 00144ae8 BufTableLookup (ff4001f0, ff4001f0, 0, 7, 246000, 1146554) +
3c
 001458e4 BufferAlloc (4273d8, ffffffff, ff400297, 0, 0, 0) + 94
 00145420 ReadBufferWithBufferLock (4273d8, ffffffff, 0, 0, 0, 0) + f0
 00145314 ReadBuffer (4273d8, ffffffff, 6, 260000, 0, 0) + 18
 0005fe2c _bt_getbuf (4273d8, ffffffff, 2, 455f18, f9583a60, ffbaa0e8) +
68
 0005cb8c _bt_split (4273d8, 1, 456c48, 19c7, ce, 1) + 28
 0005bd38 _bt_insertonpg (4273d8, 19c7, 455988, 1, 456c48, 455bd0) + db0

Rgds/
Jose Arteaga

-----Original Message-----
From: ARTEAGA Jose
Sent: Thursday, June 07, 2007 12:32 PM
To: Alvaro Herrera; Richard Huxton
Cc: pgsql-general@postgresql.org
Subject: RE: [GENERAL] Limitations on 7.0.3?

I've looked at the pg_index table and we are currently at 15Mill
entries, which should be OK. After 2-3 days runtime I just get a
disconnect error from backend while doing an insert. After I restore the
DB and insert the same entries it runs fine. Following is the error I
get:
"Query pgsql8.1: PGRES_FATAL_ERROR, pqReadData() -- backend closed the
channel unexpectedly.\n\tThis probably means the backend terminated
abnormally\n\tbefore or while processing the request.\n
      INSERT INTO teststeprun (description, stepunqid, stepid,
stepstarttime, duration, parentsn, stepresult) VALUES ('Qos Reset',
'24920757/3267', ' 95.  4', '2007-06-02 19:02:05', '0 0:0:25',
'5311955', '0')"

Also worth mentioning is that I just this week found out about a very,
very important parameter "shared buffers". Ever since the original
person setup our PG (individual no longer with us) this DB had been
running without any major glitches, albeit slow. All this time the
shared buffers were running at default of "64" (8192 block size). Once I
have got this back up and running I have since set this to 1600 shared
buffers (~125MB). I've since noticed a dramatic performance improvement,
I hope that I've striked gold. But cannot claim victory yet it's only
been up for 2 days.

-Jose


-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: Thursday, June 07, 2007 7:02 AM
To: Richard Huxton
Cc: ARTEAGA Jose; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Limitations on 7.0.3?

Richard Huxton wrote:
> ARTEAGA Jose wrote:
> >I have been using postgres 7.0.3 (Solaris 2.9) for the past 4 years
> >without any major problems, until about a month ago. We are now
> >experiencing crashes on the backend (connection lost to backend)
while
> >running queries (inserts, etc). Anyway I don't want to make this too
> >technical but I'd just like to understand if maybe I'm running into
some
> >sort of limit on the size of the database. My biggest table is
currently
> >at 1.5B tuples.
> >
> >Would appreciate if anyone could let me know or is aware of any
limits
> >with 7.0 version.
>
> I don't remember any specific size limitations on the 7.0 series. For
> more detailed help you'll have to provide some specific error
messages.

7.0 didn't have any protection against Xid wraparound.  As soon as you
hit the 4 billion transactions mark, your data suddenly disappeared.
That's what I heard at least -- I didn't have much experience with such
an old beast.  We switched rather quickly to 7.1.

--
Alvaro Herrera
http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Before you were born your parents weren't as boring as they are now.
They
got that way paying your bills, cleaning up your room and listening to
you
tell them how idealistic you are."  -- Charles J. Sykes' advice to
teenagers

Re: Limitations on 7.0.3?

От
Tom Lane
Дата:
Richard Huxton <dev@archonet.com> writes:
> I think tag_hash (in /backend/utils/hash/hashfn.c) is responsible for
> internal hash-tables (rather than hash indexes). It takes a pointer to a
> key to hash and a keysize (in bytes), so either the pointer is bad or
> the size is too long and it's reading off the end.

Those stack traces hardly seem credible --- the key being hashed is a
local variable in BufferAlloc, so the pointer can't really be bad,
and the length apparently is 12 as it should be.  I was wondering if
maybe the hashp->hash function pointer got corrupted, but that seems
unlikely too seeing that it's part of a struct that never changes.

> If it's not a hardware related problem, then it's a bug, but you're
> unlikely to get a fix given how old the code is.

If you can reproduce the problem in something reasonably recent, we'd be
interested in taking a look.  Nobody is going to spend any time on 7.0.x
though.  It *will* eat your data someday ... you need to put more time
into getting off it and less into studying its bugs.

            regards, tom lane