Thread: Adding more space, and a vacuum question.


Adding more space, and a vacuum question.

From:
"Herouth Maoz"
Date:

Hello. We have two problems (which may actually be related...)

1. We are running at over 90% disk capacity on one of the servers - a report/data warehouse system. We have run out of disk space several times. Now we need to make some file-archived data available in the database to support our legal team, which means adding two huge tables to the database. The only solution I see is to add more space by means of another tablespace. The two tables are static - after loading them and creating indexes, they will not be changed.

The machine has no additional room for internal disks. It is a recent purchase and not likely to be replaced any time soon. Now, my position is that the best solution would be to add an external hard disk, via USB/FireWire, and use it for the archive tables. My sysadmin, on the other hand, wants to mount a storage machine remotely and use it for the extra tablespace, as the storage machine is more reliable hardware. I think that remotely mounted volumes are not a proper device for a database, as the network is subject to load, and I've run into frozen mounts in both NFS and SMB in the past. Never mind being slower.

Which solution would you advise and which one of us is right?

2. That database has a few really huge tables. I think they are not being automatically vacuumed properly. In the past few days I've noticed a vacuum process on one of them which has been running since January 14th. Unfortunately, it never finished, because we were informed of a scheduled power down in our building yesterday, and had to shut down the machine. The questions are:

a. Is it normal for vacuum processes to take two weeks?
b. What happens if the vacuum process is stopped? Are the tuples partially recovered, or are they only recovered if the process completes properly?
c. Is there anything I can do to make vacuums shorter?
d. After restarting the server, all the data in pg_stat_user_tables seem to have been reset. What does this mean and how does this affect vacuum scheduling?

Thank you in advance,
Herouth

Re: Adding more space, and a vacuum question.

From:
Craig Ringer
Date:
On 01/29/2011 05:12 AM, Herouth Maoz wrote:

> The machine has no additional room for internal disks. It is a recent
> purchase and not likely to be replaced any time soon.

Newly acquired or not, it sounds like it isn't sized correctly for the
load and needs an upgrade if it can't be shifted into a more suitable
role and replaced.

> Now, my position
> is that the best solution would be to add an external hard disk, via
> USB/firewire

eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing
eSATA.

FireWire is usable for a database. USB is too ... kind of. Performance
will be poor because of the high latency, CPU-heavy non-DMA access done
by the USB stack.

For something read-only, that might be OK.

> and use it for the archive tables. My sysadmin, on the
> other hand, wants to mount a storage machine remotely and use it for the
> extra tablespace, as the storage machine is more reliable hardware.

If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, that
might be a good idea. I'd personally avoid NFS or SMB.

That said, again if it's read-only you might be fine.

> I think that remotely mounted volumes are not a proper device for a
> database, as the network is subject to load, and I've run into frozen
> mounts in both NFS and SMB in the past. Never mind being slower.

Remote *file* *level* shares aren't great for databases, IMO.

> a. Is it normal for vacuum processes to take two weeks?

Define "really huge" and describe the hardware; without numbers it's
hard to know. What version of Pg are you using?

Was it a standalone VACUUM or was it an autovacuum worker?

> b. What happens if the vacuum process is stopped? Are the tuples
> partially recovered, or are they only recovered if the process completes
> properly?

I *think* tuples become available progressively, but I'm not certain of
that.

> c. Is there anything I can do to make vacuums shorter?

Do it much more often.

Use Pg 8.4 or later, with visibility map.
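A sketch of what "do it much more often" can look like in postgresql.conf, using 8.3-era parameter names. The values are illustrative starting points only, not numbers tuned for this workload:

```
# postgresql.conf sketch (8.3 parameter names; values are illustrative)
autovacuum = on
autovacuum_vacuum_scale_factor = 0.05   # default 0.2: vacuum at ~5% dead tuples, not 20%
autovacuum_vacuum_cost_delay = 10ms     # default 20ms: throttle vacuum I/O less
vacuum_cost_limit = 500                 # default 200: more work per cost cycle
maintenance_work_mem = 512MB            # more memory -> fewer index-cleaning passes per vacuum
```

Raising maintenance_work_mem matters for huge tables: each time vacuum's dead-tuple list fills, it must rescan every index, so more memory directly cuts the number of index passes.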

--
Craig Ringer

Re: Adding more space, and a vacuum question.

From:
Herouth Maoz
Date:
On 29/01/11 13:57, Craig Ringer wrote:
> On 01/29/2011 05:12 AM, Herouth Maoz wrote:
>
>> The machine has no additional room for internal disks. It is a recent
>> purchase and not likely to be replaced any time soon.
>
> Newly acquired or not, it sounds like it isn't sized correctly for the
> load and needs an upgrade if it can't be shifted into a more suitable
> role and replaced.
Sigh. Budget considerations, you know.
>
>> Now, my position
>> is that the best solution would be to add an external hard disk, via
>> USB/firewire
>
> eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing
> eSATA.
Oh, yes, I forgot about eSATA. I basically meant a real local connection
rather than a network one.

>
> FireWire is usable for a database. USB is too ... kind of. Performance
> will be poor because of the high latency, CPU-heavy non-DMA access
> done by the USB stack.
>
> For something read-only, that might be OK.
>
>> and use it for the archive tables. My sysadmin, on the
>> other hand, wants to mount a storage machine remotely and use it for the
>> extra tablespace, as the storage machine is more reliable hardware.
>
> If you have iSCSI or ATA-over-Ethernet disk volumes you can mount,
> that might be a good idea. I'd personally avoid NFS or SMB.

OK.
>
> That said, again if it's read-only you might be fine.
The question is: if the read-only tablespace gets stuck/frozen, what
happens to the read-write part of the database, which absolutely must be
kept in good, responsive working order?
>> a. Is it normal for vacuum processes to take two weeks?
>
> Define "really huge" and describe the hardware; without numbers it's
> hard to know. What version of Pg are you using?
Pg 8.3.11. The tables have more than 200,000,000 records. About the
hardware, I'm not entirely in the loop, but it has two dual-core Intel
Xeon 5130 CPUs, 4G of memory, and its system disk (111G) is separate
from the database disk (825G). The disks are hardware RAID, but I'm not
sure which level, and I think they are 10,000 RPM but I could be wrong.
>
> Was it a standalone VACUUM or was it an autovacuum worker?
Autovacuum worker.

TIA,
Herouth

Re: Adding more space, and a vacuum question.

From:
Herouth Maoz
Date:

On 30/01/2011, at 13:03, Alban Hertroys wrote:

> On 28 Jan 2011, at 22:12, Herouth Maoz wrote:

>> 2. That database has a few really huge tables. I think they are not being automatically vacuumed properly. In the past few days I've noticed a vacuum process on one of them which has been running since January 14th. Unfortunately, it never finished, because we were informed of a scheduled power down in our building yesterday, and had to shut down the machine. The questions are:

>> a. Is it normal for vacuum processes to take two weeks?

> For a 200M record table that's definitely on the long side. It was probably waiting on a lock held by another transaction. In most cases that means some transaction was kept open for that duration.
> If that transaction came into existence by accident, then vacuum should be fine now that the server has restarted - that transaction is gone now. You may want to keep an eye out for long-running transactions, though; that's usually a programming error - it's sometimes done deliberately, but it's still a bad idea from the database's point of view.

Unless my eyes were deceiving me, this was not the case. Sure, there have been heavy transactions during that time (e.g. the daily backup of the database, and the daily inserts into other tables, which take a long time, and a few selects which I haven't been able to find an optimal index for). But this is the query I use to see these processes (run from a superuser):

SELECT usename, procpid, query_start, client_addr, client_port, current_query, waiting
FROM pg_stat_activity
WHERE query_start < now() - interval '3 seconds'
  AND xact_start IS NOT NULL
ORDER BY xact_start;

Any long transactions should be caught by it, but most of the time, all I see are vacuum workers.
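One gap worth noting, as a hedged sketch: in 8.3, a session that is idle inside an open transaction shows current_query = '<IDLE> in transaction', and filtering on query_start can miss it because query_start reflects the last statement, not the transaction start. A variant keyed on xact_start catches those too:

```sql
-- Sketch: find transactions (not just statements) open longer than a
-- threshold. In 8.3, idle-in-transaction sessions report the placeholder
-- '<IDLE> in transaction' in current_query.
SELECT usename, procpid, xact_start,
       now() - xact_start AS xact_age,
       client_addr, current_query, waiting
FROM pg_stat_activity
WHERE xact_start < now() - interval '5 minutes'
ORDER BY xact_start;
```

The '5 minutes' threshold is arbitrary; pick whatever counts as "long" for this workload.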

By the way, the autovacuum on that table has started again - but only after more records were deleted from it. It has now been running since yesterday at 17:00. Here is the pg_stat_user_tables record for this table (which was also updated after the deletes):

relid            | 17806
schemaname       | sms
relname          | billing__archive
seq_scan         | 9
seq_tup_read     | 2053780855
idx_scan         | 2553
idx_tup_fetch    | 8052678
n_tup_ins        | 11437874
n_tup_upd        | 0
n_tup_del        | 7987450
n_tup_hot_upd    | 0
n_live_tup       | 218890768
n_dead_tup       | 33710378
last_vacuum      | 
last_autovacuum  | 
last_analyze     | 2011-01-29 15:29:37.059176+02
last_autoanalyze | 
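Those numbers are consistent with autovacuum only firing after more deletes. A sketch of the 8.3 trigger rule, assuming the default autovacuum settings (not confirmed from this server's postgresql.conf):

```python
# 8.3 autovacuum trigger rule, applied to the pg_stat_user_tables numbers
# above. The two parameter values are the 8.3 defaults (an assumption).

autovacuum_vacuum_threshold = 50       # default in 8.3
autovacuum_vacuum_scale_factor = 0.2   # default in 8.3

n_live_tup = 218_890_768
n_dead_tup = 33_710_378

# autovacuum queues a table once dead tuples exceed this:
trigger = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup

print(f"trigger point: {trigger:,.0f} dead tuples")  # ~43.8 million
print(f"current dead:  {n_dead_tup:,} -> fires: {n_dead_tup > trigger}")  # False
```

With ~33.7M dead tuples against a ~43.8M trigger point, the table sat just under the default threshold, which is why another round of deletes was needed before a worker picked it up.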


> In older PG versions autovacuum could get stuck like that on large tables. It keeps starting over trying to vacuum that same table, but never reaches the end of it. Since it's only a single worker process (in those versions), it also will never vacuum any tables beyond the table it got stuck on.

How old? Mine is 8.3.11.


> If you don't delete or update tuples a lot, then the tables are probably just that big. If you do delete/update them regularly, try whether a normal vacuum will shrink them enough (probably not) and if not, schedule a VACUUM FULL and a REINDEX at some time the database isn't too busy. Both are quite heavy operations that take exclusive locks on things (tables, indices).

Yes, I do delete many tuples from that table. My mode of usage is like this: I have a small table called billing which receives new data every night. I want to keep that table small so that those nightly updates don't take an overly long time, because all data (several such tables) has to be ready in the database by the next morning. Therefore, once a week on the weekend, I move a week's worth of data to billing__archive (the table we are discussing), and delete a week's worth from its end. Now, the indexes on that table would make this impossible to do within the weekend, so what I do is drop all the indexes before I do the inserts, and then recreate them, and then do the deletes.

What you are saying is that in this mode of operation, there's basically no hope that autovacuum will ever salvage the deleted records? Does removing and recreating the indexes have any effect on the vacuuming process?

If a vacuum takes me several days (let alone over a week!), then a VACUUM FULL is out of the question. VACUUM FULL locks the table completely, and that table is essential to our customer care. If push comes to shove, I think I'd rather dump that table, drop it, and restore it over the weekend, which I believe will be faster than a VACUUM FULL.

One other important question: can the space of a tuple marked by VACUUM as reusable (as opposed to VACUUM FULL, which returns it to the operating system) ever be used by another table, or can it only be used for new inserts into the same table?


>> d. After restarting the server, all the data in pg_stat_user_tables seem to have been reset. What does this mean and how does this affect vacuum scheduling?

> I recall reading somewhere that that's normal; probably on this list. It probably contained an explanation too.

Thank you.

Herouth

Re: Adding more space, and a vacuum question.

From:
Andrew Sullivan
Date:
On Sun, Jan 30, 2011 at 04:56:29PM +0200, Herouth Maoz wrote:
>
> Unless my eyes were deceiving me, this was not the case. Sure, there have
> been heavy transactions during that time (e.g. the daily backup of the
> database, and the daily inserts into other tables, which take a long time,
> and a few selects which I haven't been able to find an optimal index for).
> But this is the query I use to see these processes (run from a superuser):
>
> SELECT usename, procpid, query_start, client_addr, client_port, current_query, waiting
> FROM pg_stat_activity
> WHERE query_start < now() - interval '3 seconds'
>   AND xact_start IS NOT NULL
> ORDER BY xact_start;
>
> Any long transactions should be caught by it, but most of the time, all I see are vacuum workers.

Well, what's your I/O on the disk?  Have you tuned vacuum?  Maybe
you're just saturating the ability of the table to be vacuumed, or
else vacuum is being told to back off?

> Yes, I do delete many tuples from that table. My mode of usage is
> like this: I have a small table called billing which receives new
> data every night. I want to keep that table small so that those
> nightly updates don't take an overly long time, because all data
> (several such tables) has to be ready in the database by the next
> morning. Therefore, once a week on the weekend, I move a week's
> worth of data to billing__archive (the table we are discussing), and
> delete a week's worth from its end. Now, the indexes on that table
> would make this impossible to do within the weekend, so what I do is
> drop all the indexes before I do the inserts, and then recreate
> them, and then do the deletes.

Without looking at the details of your database, I have to say that
the above sounds to me like more work than letting the system handle
this itself.  I have a suspicion that what you really want to do is
trickle out the changes rather than trying to do things in big batches
this way.
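One concrete way to get out of the big-batch-plus-vacuum cycle (my suggestion, not something spelled out above) is 8.3-style inheritance partitioning, where retiring a week of data is a DROP TABLE rather than a huge DELETE that vacuum must later clean up. A sketch, with hypothetical table and column names (billed_at is invented for illustration):

```sql
-- Sketch: each week lives in its own child table of billing__archive.
CREATE TABLE billing__archive_2011w05 (
    CHECK (billed_at >= DATE '2011-01-31' AND billed_at < DATE '2011-02-07')
) INHERITS (billing__archive);

-- The weekly load goes straight into the new child:
-- INSERT INTO billing__archive_2011w05 SELECT ... FROM billing WHERE ...;

-- Expiring the oldest week is instantaneous and returns its space
-- to the operating system, with no vacuum involved:
DROP TABLE billing__archive_2010w52;
```

With constraint_exclusion enabled, queries on billing__archive that filter on the CHECK column skip the irrelevant children. Indexes are per-child, so they only ever need to be built on the small, freshly loaded week.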

> If a vacuum takes me several days (let alone over a week!), then a
> VACUUM FULL is out of the question. VACUUM FULL locks the table
> completely and that table is essential to our customer care. If push
> comes to shove, I think I'd rather dump that table, drop it, and
> restore it over the weekend, which I believe will be faster than a
> VACUUM FULL.

Yes, I think so too.  And given the current state of affairs, that's a
good bet.  Whatever the situation, I suspect things are too far gone to
be worth trying to get through a vacuum.

> One other important question: a tuple marked by VACUUM as reusable
> (not VACUUM FULL which restores it to the operating system) - can
> its space ever be used by another table, or can it only be used for
> new inserts into the same table?

It's managed by postgres, but given your churn rate on these tables
I'd be tempted to set a fillfactor with a lot of room, and let the
tables be "big" (i.e. with a lot of empty space) so that their long
term storage footprint is stable.
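The fillfactor suggestion above, as a sketch; fillfactor is a table storage parameter in 8.3, and the value here is a guess rather than a tuned number:

```sql
-- Sketch: leave ~30% of each newly written page empty (70 is a guess).
-- Later inserts can then reuse in-page free space instead of
-- extending the table, keeping its on-disk footprint stable.
ALTER TABLE billing__archive SET (fillfactor = 70);

-- Note: this only affects pages written from now on; existing pages
-- keep their old layout until the table is rewritten.
```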

A

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: Adding more space, and a vacuum question.

From:
Herouth Maoz
Date:

On 30/01/2011, at 12:27, Craig Ringer wrote:

> OK, so you're pre-8.4, which means you have the max_fsm settings to play with. Have you seen any messages in the logs about the free space map (fsm)? If your install didn't have a big enough fsm to keep track of deleted tuples, you'd face massive table bloat that a regular vacuum couldn't fix.

Ouch. You're absolutely right. There are messages about max_fsm_pages in the postgres log. It's currently set to 153600. According to the documentation, I can increase it up to 200000. Will that even help? How do I find out how many I need to set it to?


> You also don't have the visibility map, which means that (auto)vacuum can't skip bits of the tables it knows don't need vacuuming. Your vacuums will be slower.
>
> Autovacuum improved significantly in both 8.4 and 9.0; consider an upgrade.

I will consider it. Thank you.

Herouth

Re: Adding more space, and a vacuum question.

From:
Craig Ringer
Date:
On 01/31/2011 12:14 AM, Herouth Maoz wrote:
>
> On 30/01/2011, at 12:27, Craig Ringer wrote:
>>
>> OK, so you're pre-8.4 , which means you have the max_fsm settings to
>> play with. Have you seen any messages in the logs about the free space
>> map (fsm)? If your install didn't have a big enough fsm to keep track
>> of deleted tuples, you'd face massive table bloat that a regular
>> vacuum couldn't fix.
>
> Ouch. You're absolutely right. There are messages about max_fsm_pages in
> the postgres log. It's currently set to 153600. According to the
> documentation, I can increase it up to 200000. Will that even help? How
> do I find out how many I need to set it to?

I think the logs suggest what to set. I haven't used 8.3 in ages and
don't remember well.
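If memory serves, the 8.3 log line reads something like "number of page slots needed (N) exceeds max_fsm_pages (153600)", and that N is the value to exceed. Either way, a back-of-the-envelope sketch shows how little the current setting can track, assuming the standard 8 kB page size (each FSM slot tracks one table page):

```python
# How much table space max_fsm_pages can track free space for,
# assuming PostgreSQL's default 8 kB page size.

PAGE_SIZE = 8192  # bytes

def fsm_coverage_gb(max_fsm_pages: int) -> float:
    """Table space (in GB) whose free space the FSM can track."""
    return max_fsm_pages * PAGE_SIZE / 1024**3

print(f"{fsm_coverage_gb(153_600):.2f} GB")  # current setting: 1.17 GB
print(f"{fsm_coverage_gb(200_000):.2f} GB")  # proposed value:  1.53 GB

# For scale: a hypothetical 40 GB table (illustrative size, not measured)
# occupies this many pages, dwarfing either setting:
table_gb = 40
pages = table_gb * 1024**3 // PAGE_SIZE
print(f"~{pages:,} pages in a {table_gb} GB table")
```

Each FSM page slot costs only a few bytes of shared memory, so on an 825 GB database volume there is little reason to stop at 200,000 if the log asks for more.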

Increasing it won't help after the fact. You almost certainly have badly
bloated tables. Fixing that will be interesting in your current
low-disk-space situation. VACUUM FULL would work - but will exclusively
lock the table being vacuumed for *ages*, so nothing else can do any
work, not even reads. CLUSTER will do the same, and while it's much
faster, to work it requires enough free disk space to store a complete
copy of the still-valid parts of the table while the bloated original is
still on disk. You may have to look into some of the lockless fake
vacuum full approaches.

I think table bloat identification and management is one of the worst
problems PostgreSQL has remaining. It's too hard, out of the box, to
discover bloat developing, and it's too disruptive to fix it if and when
it does happen. The automatic free space map management in 8.4, and the
ongoing autovacuum improvements, help reduce the chances of bloat
happening, but it's still a pain to monitor for and a pain to fix when
it does happen.

For approaches to possibly fixing your problem, see:

http://www.depesz.com/index.php/2010/10/17/reduce-bloat-of-table-without-longexclusive-locks/

http://blog.endpoint.com/2010/09/reducing-bloat-without-locking.html

--
Craig Ringer

Re: Adding more space, and a vacuum question.

From:
Alban Hertroys
Date:
On 30 Jan 2011, at 17:14, Herouth Maoz wrote:

> On 30/01/2011, at 12:27, Craig Ringer wrote:
>>
>> OK, so you're pre-8.4, which means you have the max_fsm settings to play
>> with. Have you seen any messages in the logs about the free space map
>> (fsm)? If your install didn't have a big enough fsm to keep track of
>> deleted tuples, you'd face massive table bloat that a regular vacuum
>> couldn't fix.
>
> Ouch. You're absolutely right. There are messages about max_fsm_pages in
> the postgres log. It's currently set to 153600. According to the
> documentation, I can increase it up to 200000. Will that even help? How do
> I find out how many I need to set it to?
>
>>
>> You also don't have the visibility map, which means that (auto)vacuum
>> can't skip bits of the tables it knows don't need vacuuming. Your
>> vacuums will be slower.
>>
>> Autovacuum improved significantly in both 8.4 and 9.0; consider an upgrade.
>
> I will consider it. Thank you.

I'm not trying to push an upgrade on you, but if restoring a dump is one of
your better options to get rid of table bloat anyway, now seems a good time.

What I'd do is:
- Find a test system with enough disk space
- Install PG 8.4 on it
- Create a dump of your database, using this 8.4 version of pg_dump
- Restore it to the test version and check everything's alright
- If it is, pull the plug on that 8.3 database - you'll probably want to
create a fresh dump here if the system is still in production
- Install an 8.4 and use the previously made dump to restore it
- Don't forget to run analyse right after if it's going to be used right away

You will want to check the release notes. One of the things that sometimes
bites people is that 8.4 has tighter type-conversion restrictions. Some
implicit type-casts that used to work don't anymore, unless made explicit.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





Re: Adding more space, and a vacuum question.

From:
Herouth Maoz
Date:


I'm not quite sure what this will do for me. How will PostgreSQL know that there is free space in low-numbered pages, if the bloat has not been vacuumed off and is not in the FSM?
--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717, ext. 742