Обсуждение: reserving space in a rec for future update

Поиск
Список
Период
Сортировка

reserving space in a rec for future update

От
"Gauthier, Dave"
Дата:

Hi:

 

I have a situation where I will be inserting thousands of records into a table but leaving 2 of it’s columns null.  Later on, I will be updating most of those records and putting real values in  place of those 2 nulls. As for the ones that do not get updated, I want to leave them null.   My concern has to do with record fragmentation at the time of update because there’s no room to “expand” them to accept the non-null data.  (BTW, the columns are floating point).

 

Is there a way to initially insert nulls, but reserve space for the future update (and avoid record fragmentation)? 

Is my record fragmentation concern unfounded?

 

Thanks in Advance !

 

Re: reserving space in a rec for future update

От
"Scott Marlowe"
Дата:
On Nov 14, 2007 9:28 AM, Gauthier, Dave <dave.gauthier@intel.com> wrote:
>
> Hi:
>
> I have a situation where I will be inserting thousands of records into a
> table but leaving 2 of it's columns null.  Later on, I will be updating most
> of those records and putting real values in  place of those 2 nulls. As for
> the ones that do not get updated, I want to leave them null.   My concern
> has to do with record fragmentation at the time of update because there's no
> room to "expand" them to accept the non-null data.  (BTW, the columns are
> floating point).

I don't think you really understand how PostgreSQL storage works.
every update is the exact same as a delete / insert in terms of
storage.  So, you're worrying about a problem that doesn't exist.
Read up

Read up on it here:  http://www.postgresql.org/docs/8.2/static/mvcc.html

Re: reserving space in a rec for future update

От
Andrew Sullivan
Дата:
On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
> null.   My concern has to do with record fragmentation at the time of
> update because there's no room to "expand" them to accept the non-null
> data.  (BTW, the columns are floating point).

You have a mistaken idea about how this works.

> Is there a way to initially insert nulls, but reserve space for the
> future update (and avoid record fragmentation)?

No.

> Is my record fragmentation concern unfounded?

Sort of.

The way this will work in Postgres is that, when you UPDATE the row, the old
row will be marked dead, and a _new_ row will be written out with the new
data.  You will need to perform VACUUM in order to keep the table from
bloating.  You'll want to read the manual carefully about this topic, in
order to keep your table from getting so bloated that your free space map
becomes useless.  One of the weakest areas for PostgreSQL is its behaviour
under this sort of "most rows updated" scenario, and it is wise to plan
carefully how you will accomplish these sorts of activities without causing
yourself extreme pain.

A

--
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

Re: reserving space in a rec for future update

От
"Gauthier, Dave"
Дата:
OK, I didn't know Postgres did it this way. I was hoping it would retain
the old rec and update in place (if the updated values could fit).  I
guess not.

I can rewrite the DB loading algorithm to get those values in advance,
load into program memory, and reference at the time of the initial load.


Thanks for the advanced warning about problems with vaccuum !

-dave

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Andrew Sullivan
Sent: Wednesday, November 14, 2007 11:13 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] reserving space in a rec for future update

On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
> null.   My concern has to do with record fragmentation at the time of
> update because there's no room to "expand" them to accept the non-null
> data.  (BTW, the columns are floating point).

You have a mistaken idea about how this works.

> Is there a way to initially insert nulls, but reserve space for the
> future update (and avoid record fragmentation)?

No.

> Is my record fragmentation concern unfounded?

Sort of.

The way this will work in Postgres is that, when you UPDATE the row, the
old
row will be marked dead, and a _new_ row will be written out with the
new
data.  You will need to perform VACUUM in order to keep the table from
bloating.  You'll want to read the manual carefully about this topic, in
order to keep your table from getting so bloated that your free space
map
becomes useless.  One of the weakest areas for PostgreSQL is its
behaviour
under this sort of "most rows updated" scenario, and it is wise to plan
carefully how you will accomplish these sorts of activities without
causing
yourself extreme pain.

A

--
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: reserving space in a rec for future update

От
Mike Charnoky
Дата:
In this usage scenario, doesn't the new HOT (heap only tuples) feature
of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

I am similarly performing a huge number of inserts, followed by a huge
number of updates to fill in a few null fields.  The data is indexed by
insert time.  My problem is, selects using that index degrade over time
as updates are performed, presumably because data is no longer ordered
sequentially across pages after updates are performed.  I was hoping
that HOT would help here and am actually installing PG8.3 now in order
to perform some testing...


Mike

Andrew Sullivan wrote:
> On Wed, Nov 14, 2007 at 10:28:30AM -0500, Gauthier, Dave wrote:
>> null.   My concern has to do with record fragmentation at the time of
>> update because there's no room to "expand" them to accept the non-null
>> data.  (BTW, the columns are floating point).
>
> You have a mistaken idea about how this works.
>
>> Is there a way to initially insert nulls, but reserve space for the
>> future update (and avoid record fragmentation)?
>
> No.
>
>> Is my record fragmentation concern unfounded?
>
> Sort of.
>
> The way this will work in Postgres is that, when you UPDATE the row, the old
> row will be marked dead, and a _new_ row will be written out with the new
> data.  You will need to perform VACUUM in order to keep the table from
> bloating.  You'll want to read the manual carefully about this topic, in
> order to keep your table from getting so bloated that your free space map
> becomes useless.  One of the weakest areas for PostgreSQL is its behaviour
> under this sort of "most rows updated" scenario, and it is wise to plan
> carefully how you will accomplish these sorts of activities without causing
> yourself extreme pain.
>
> A
>

Re: reserving space in a rec for future update

От
Andrew Sullivan
Дата:
On Wed, Nov 14, 2007 at 11:31:11AM -0500, Gauthier, Dave wrote:
> Thanks for the advanced warning about problems with vaccuum !

Note this isn't a _problem_ with vacuum, exactly, it's just the set of
compromises that PostgreSQL has settled on.  There are other ways of
cleaning up the system ("defrag", of the sort you were implying is one, "pay
the cost during transaction" is another one, &c.).  The piper has to be
paid, and all we're doing is arguing about what currency we'll use :)

A

--
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

Re: reserving space in a rec for future update

От
Andrew Sullivan
Дата:
On Wed, Nov 14, 2007 at 11:44:55AM -0500, Mike Charnoky wrote:
> In this usage scenario, doesn't the new HOT (heap only tuples) feature
> of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?

It should, yes.  We'll probably know more once 8.3 is in the field.  For new
work, though, I would certainly suggest trying this on 8.3.  I know that's
the _point_ of the feature.  But if you've already got an application you
need to field today, doing it on a beta is risky.

A

--
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

Re: reserving space in a rec for future update

От
Erik Jones
Дата:
On Nov 14, 2007, at 10:44 AM, Mike Charnoky wrote:

> In this usage scenario, doesn't the new HOT (heap only tuples) feature
> of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?
>
> I am similarly performing a huge number of inserts, followed by a huge
> number of updates to fill in a few null fields.  The data is
> indexed by
> insert time.  My problem is, selects using that index degrade over
> time
> as updates are performed, presumably because data is no longer ordered
> sequentially across pages after updates are performed.  I was hoping
> that HOT would help here and am actually installing PG8.3 now in order
> to perform some testing...
>
>
> Mike

Some, what HOT does is keeps index rows from being updated when
updates are made to column values that aren't indexed.  The same
insert/delete still happens in the table data.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: reserving space in a rec for future update

От
Alvaro Herrera
Дата:
Erik Jones wrote:
>
> On Nov 14, 2007, at 10:44 AM, Mike Charnoky wrote:
>
>> In this usage scenario, doesn't the new HOT (heap only tuples) feature
>> of PG8.3 help, in terms of the DB requiring less VACUUM maintenance?
>>
>> I am similarly performing a huge number of inserts, followed by a huge
>> number of updates to fill in a few null fields.  The data is indexed by
>> insert time.  My problem is, selects using that index degrade over time
>> as updates are performed, presumably because data is no longer ordered
>> sequentially across pages after updates are performed.  I was hoping
>> that HOT would help here and am actually installing PG8.3 now in order
>> to perform some testing...
>
> Some, what HOT does is keeps index rows from being updated when updates are
> made to column values that aren't indexed.  The same insert/delete still
> happens in the table data.

But another thing HOT does is make it possible to vacuum the *page* that
the update is taking place on.  So if there are dead tuples that nobody
needs, they can be removed and the new tuple can be placed there.

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Use it up, wear it out, make it do, or do without"