Обсуждение: VMWare file system / database corruption
Hi All, We're having numerous problems with a PostgreSQL 8.3.7 database running on a virtual Linux server w/VMWare ESX. This is not by choice and I have been asking the operator of this equipment for details about the disk setup and here's what I got: "We have a SAN that is presenting an NFS share. VMWare sees that share and reads the VMDK file that make up the virtual file system." Does anyone with a better understanding of PostgreSQL and VMWare know if this is an unreliable setup for PostgreSQL? I see things like "NFS" and "VMWare" and start to get worried. Tom -- Tom Duffey <tduffey@techbydesign.com> Technology by Design :: http://techbydesign.com/ p: 414.431.0800
On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey <tduffey@techbydesign.com> wrote: > Hi All, > > We're having numerous problems with a PostgreSQL 8.3.7 database running on a > virtual Linux server w/VMWare ESX. This is not by choice and I have been > asking the operator of this equipment for details about the disk setup and > here's what I got: > > "We have a SAN that is presenting an NFS share. VMWare sees that share and > reads the VMDK file that make up the virtual file system." > > Does anyone with a better understanding of PostgreSQL and VMWare know if > this is an unreliable setup for PostgreSQL? I see things like "NFS" and > "VMWare" and start to get worried. I see VMWare and thing performance issues, I see NFS and thing dear god help us all. Even if properly setup NFS is a problem waiting to happen, and it's not reliable storage for a database in my opinion. That said, lots of folks do it. Ask for the NFS mount options from the sysadmin.
On Mon, Sep 21, 2009 at 1:40 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: > Even if properly setup NFS is a problem waiting to > happen, and it's not reliable storage for a database in my opinion. It reads to me that the VMware image is pulled from NFS, so as far as Pg is concerned, it is speaking to a local disk. Whether that disk is properly backed to disk is an open question. I don't know what kind of durability guarantee postgres can give you if the VM itself is read/written over a known not-100% reliable file system. That is, does the sync() call inside the VM not return until the underlying VM image data is on actual hard platters? I doubt it.
On Mon, Sep 21, 2009 at 11:53 AM, Vick Khera <vivek@khera.org> wrote: > On Mon, Sep 21, 2009 at 1:40 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote: >> Even if properly setup NFS is a problem waiting to >> happen, and it's not reliable storage for a database in my opinion. > > > It reads to me that the VMware image is pulled from NFS, so as far as > Pg is concerned, it is speaking to a local disk. Whether that disk is > properly backed to disk is an open question. I don't know what kind > of durability guarantee postgres can give you if the VM itself is > read/written over a known not-100% reliable file system. Oh, I completely read that wrong then. > That is, does the sync() call inside the VM not return until the > underlying VM image data is on actual hard platters? I doubt it. I'm pretty sure fsync is properly supported in VMWare by default, but there are settings to turn it off. Could be the other way around. Either way, this isn't really the best place probably to get VMWare support.
I'd be careful using VMWARE as a database server for anything other than production. I've had problems with SQL Server andmySQL databases and I am sure that Postgres will experience problems as well. One thing to look at is whether snapshots are set to be taken while the database is active. If they are, ask the systemadmin to find another way to make backups of the VM. My experience has been that snapshots cause many problems withdatabases. Alex ----- Original Message ----- From: "Scott Marlowe" <scott.marlowe@gmail.com> To: "Tom Duffey" <tduffey@techbydesign.com> Cc: pgsql-general@postgresql.org Sent: Monday, September 21, 2009 1:40:33 PM GMT -05:00 US/Canada Eastern Subject: Re: [GENERAL] VMWare file system / database corruption On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey <tduffey@techbydesign.com> wrote: > Hi All, > > We're having numerous problems with a PostgreSQL 8.3.7 database running on a > virtual Linux server w/VMWare ESX. This is not by choice and I have been > asking the operator of this equipment for details about the disk setup and > here's what I got: > > "We have a SAN that is presenting an NFS share. VMWare sees that share and > reads the VMDK file that make up the virtual file system." > > Does anyone with a better understanding of PostgreSQL and VMWare know if > this is an unreliable setup for PostgreSQL? I see things like "NFS" and > "VMWare" and start to get worried. I see VMWare and thing performance issues, I see NFS and thing dear god help us all. Even if properly setup NFS is a problem waiting to happen, and it's not reliable storage for a database in my opinion. That said, lots of folks do it. Ask for the NFS mount options from the sysadmin. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
On Sep 21, 2009, at 12:40 PM, Scott Marlowe wrote: > On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey > <tduffey@techbydesign.com> wrote: >> Hi All, >> >> We're having numerous problems with a PostgreSQL 8.3.7 database >> running on a >> virtual Linux server w/VMWare ESX. This is not by choice and I >> have been >> asking the operator of this equipment for details about the disk >> setup and >> here's what I got: >> >> "We have a SAN that is presenting an NFS share. VMWare sees that >> share and >> reads the VMDK file that make up the virtual file system." >> >> Does anyone with a better understanding of PostgreSQL and VMWare >> know if >> this is an unreliable setup for PostgreSQL? I see things like >> "NFS" and >> "VMWare" and start to get worried. > > I see VMWare and thing performance issues, I see NFS and thing dear > god help us all. Even if properly setup NFS is a problem waiting to > happen, and it's not reliable storage for a database in my opinion. > That said, lots of folks do it. Ask for the NFS mount options from > the sysadmin. Thanks to everyone so far for the insight. I'm trying to get more details about the hardware setup but am not making much progress. Here are some of the errors we're getting. I searched through archives and they all seem to point at hardware trouble but is there anything else I should be looking at? ERROR: invalid page header in block 2 of relation "pg_toast_19466_index" ERROR: invalid memory alloc request size 1667592311 STATEMENT: COPY public.version_bundle (node_id_hi, node_id_lo, bundle_data) TO stdout; ERROR: unexpected chunk number 1632 (expected 1629) for toast value 19711 in pg_toast_19184 STATEMENT: COPY public.data_binval (binval_id, binval_data) TO stdout; ERROR: invalid page header in block 414 of relation "pg_toast_19460_index" ERROR: could not open segment 1 of relation 1663/16386/16535 (target block 3966127611): No such file or directory I dealt with some of the above by reindexing or finding and deleting bad rows. I can now successfully dump the database but of course have missing data so the application is toast. What I'm really wondering now is how to prevent this from happening again and if that means moving the database to new hardware. Best Regards, Tom
On Mon, Sep 21, 2009 at 12:46 PM, Tom Duffey <tduffey@techbydesign.com> wrote: > > On Sep 21, 2009, at 12:40 PM, Scott Marlowe wrote: > >> On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey <tduffey@techbydesign.com> >> wrote: >>> >>> Hi All, >>> >>> We're having numerous problems with a PostgreSQL 8.3.7 database running >>> on a >>> virtual Linux server w/VMWare ESX. This is not by choice and I have been >>> asking the operator of this equipment for details about the disk setup >>> and >>> here's what I got: >>> >>> "We have a SAN that is presenting an NFS share. VMWare sees that share >>> and >>> reads the VMDK file that make up the virtual file system." >>> >>> Does anyone with a better understanding of PostgreSQL and VMWare know if >>> this is an unreliable setup for PostgreSQL? I see things like "NFS" and >>> "VMWare" and start to get worried. >> >> I see VMWare and thing performance issues, I see NFS and thing dear >> god help us all. Even if properly setup NFS is a problem waiting to >> happen, and it's not reliable storage for a database in my opinion. >> That said, lots of folks do it. Ask for the NFS mount options from >> the sysadmin. > > Thanks to everyone so far for the insight. I'm trying to get more details > about the hardware setup but am not making much progress. > > Here are some of the errors we're getting. I searched through archives and > they all seem to point at hardware trouble but is there anything else I > should be looking at? > > ERROR: invalid page header in block 2 of relation "pg_toast_19466_index" > > ERROR: invalid memory alloc request size 1667592311 > STATEMENT: COPY public.version_bundle (node_id_hi, node_id_lo, bundle_data) > TO stdout; > > ERROR: unexpected chunk number 1632 (expected 1629) for toast value 19711 > in pg_toast_19184 > STATEMENT: COPY public.data_binval (binval_id, binval_data) TO stdout; > > ERROR: invalid page header in block 414 of relation "pg_toast_19460_index" > > ERROR: could not open segment 1 of relation 1663/16386/16535 (target block > 3966127611): No such file or directory > > I dealt with some of the above by reindexing or finding and deleting bad > rows. I can now successfully dump the database but of course have missing > data so the application is toast. What I'm really wondering now is how to > prevent this from happening again and if that means moving the database to > new hardware. Definitely sounds like file system corruption to me. And who knows what's gotten hammered that hasn't caused an error, eh? Time to move to a standalone db server or get a sysadmin who knows how to setup vmware to make pgsql happy.
I have a question about using Reference. I have several tables that are defined such as below CREATE TABLE iss.accessor ( loaddtlid UUID NOT NULL , seqno SMALLINT NOT NULL , billable VARCHAR(1) DEFAULT 'N' CHECK(billable IN ('N','Y')) NOT NULL , payind VARCHAR(1) DEFAULT 'P' CHECK(payind IN ('P','F')) NOT NULL , code UUID REFERENCE accessorcodes (code) , description CITEXT , ref CITEXT , tractororcarrierflag VARCHAR(1) DEFAULT 'T' CHECK(tractororcarrierflag IN ('T','C')) NOT NULL , tractororcarriernoid UUID , tractorpct DECIMAL(6,4) DEFAULT 0 CHECK(tractorpct BETWEEN 0 AND 1) NOT NULL , charge DECIMAL(7,2) DEFAULT 0 , type VARCHAR(1) DEFAULT 'N' CHECK(type IN ('N','V','D','R','A','S')) NOT NULL , checkdate DATE , checkno CITEXT , processed VARCHAR(1) DEFAULT 'N' CHECK(processed IN ('N','Y')) NOT NULL , itemflag VARCHAR(1) DEFAULT 'N' CHECK(itemflag IN ('N','Y')) NOT NULL , tractorterminalid UUID REFERENCES terminal (terminalid) , cost DECIMAL(7,2) DEFAULT 0 , createdatetime TIMESTAMP , createuser CITEXT , editdatetime TIMESTAMP DEFAULT CURRENT_TIMESTAMP , edituser CITEXT DEFAULT CURRENT_USER ) As you can see there are a few columns which reference back to another table. What I need to know is how does Postgres work with these columns. Can I insert or update a row if those columns are null or are they required to have a non-null value in order for the row to be inserted or saved? Best Regards Michael Gould
On the contrary, we've been running PG in production for years now under VMWare. Same with MSSQL. We've never had any problems. Less so than an actual physical machine actually since we can move the server to different physical hardware ondemand. Also makes disaster recovery MUCH easier. However, VMWare does have its places. A high usage database is not one of them, IMHO. A moderately or less used one, dependingon requirements and the hardware backing it, is often a good fit. And I agree with Scott about the snapshots. They do tend to cause temporary communication issues with a running virtual machine occasionally, regardless of OS or DBtype. (The benefits outweigh the risks 99% of the time though, with backups being that 1%.) In my experience the levelof interference from snapshotting a virtual machine also depends on the type and speed of your physical disks backingthe VMWare host and the size of the virtual machine and any existing snapshot. I've been told that in VSPhere (VMWare4.0) this will be significantly improved. My .02 cents worth as we are a heavy VMWare user. Thanks, Scot Kreienkamp skreien@la-z-boy.com -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alex Gadea Sent: Monday, September 21, 2009 2:11 PM To: Scott Marlowe Cc: pgsql-general@postgresql.org; Tom Duffey Subject: Re: [GENERAL] VMWare file system / database corruption I'd be careful using VMWARE as a database server for anything other than production. I've had problems with SQL Server andmySQL databases and I am sure that Postgres will experience problems as well. One thing to look at is whether snapshots are set to be taken while the database is active. If they are, ask the systemadmin to find another way to make backups of the VM. My experience has been that snapshots cause many problems withdatabases. Alex ----- Original Message ----- From: "Scott Marlowe" <scott.marlowe@gmail.com> To: "Tom Duffey" <tduffey@techbydesign.com> Cc: pgsql-general@postgresql.org Sent: Monday, September 21, 2009 1:40:33 PM GMT -05:00 US/Canada Eastern Subject: Re: [GENERAL] VMWare file system / database corruption On Mon, Sep 21, 2009 at 11:09 AM, Tom Duffey <tduffey@techbydesign.com> wrote: > Hi All, > > We're having numerous problems with a PostgreSQL 8.3.7 database running on a > virtual Linux server w/VMWare ESX. This is not by choice and I have been > asking the operator of this equipment for details about the disk setup and > here's what I got: > > "We have a SAN that is presenting an NFS share. VMWare sees that share and > reads the VMDK file that make up the virtual file system." > > Does anyone with a better understanding of PostgreSQL and VMWare know if > this is an unreliable setup for PostgreSQL? I see things like "NFS" and > "VMWare" and start to get worried. I see VMWare and thing performance issues, I see NFS and thing dear god help us all. Even if properly setup NFS is a problem waiting to happen, and it's not reliable storage for a database in my opinion. That said, lots of folks do it. Ask for the NFS mount options from the sysadmin. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Michael Gould wrote: > As you can see there are a few columns which reference back to another > table. What I need to know is how does Postgres work with these columns. > Can I insert or update a row if those columns are null or are they required > to have a non-null value in order for the row to be inserted or saved? If they are NULL, they are not checked against the other table. If you want the value to be always checked, add a NOT NULL constraint to those columns. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Scot Kreienkamp wrote: > On the contrary, we've been running PG in production for years now under VMWare. Same with MSSQL. We've never had anyproblems. Less so than an actual physical machine actually since we can move the server to different physical hardwareon demand. Also makes disaster recovery MUCH easier. > > However, VMWare does have its places. A high usage database is not one of them, IMHO. A moderately or less used one,depending on requirements and the hardware backing it, is often a good fit. And I agree with Scott about the snapshots. They do tend to cause temporary communication issues with a running virtual machine occasionally, regardless ofOS or DB type. (The benefits outweigh the risks 99% of the time though, with backups being that 1%.) In my experiencethe level of interference from snapshotting a virtual machine also depends on the type and speed of your physicaldisks backing the VMWare host and the size of the virtual machine and any existing snapshot. I've been told thatin VSPhere (VMWare 4.0) this will be significantly improved. > does your VMWARE server use NFS to communicate with the disks? It was my understanding most folks used SAN logical units for the virtual disks with VMware ESX, and not NFS/NAS
Sent from my iPod
Scot Kreienkamp wrote:
> On the contrary, we've been running PG in production for years now under VMWare. Same with MSSQL. We've never had any problems. Less so than an actual physical machine actually since we can move the server to different physical hardware on demand. Also makes disaster recovery MUCH easier.
>
> However, VMWare does have its places. A high usage database is not one of them, IMHO. A moderately or less used one, depending on requirements and the hardware backing it, is often a good fit. And I agree with Scott about the snapshots. They do tend to cause temporary communication issues with a running virtual machine occasionally, regardless of OS or DB type. (The benefits outweigh the risks 99% of the time though, with backups being that 1%.) In my experience the level of interference from snapshotting a virtual machine also depends on the type and speed of your physical disks backing the VMWare host and the size of the virtual machine and any existing snapshot. I've been told that in VSPhere (VMWare 4.0) this will be significantly improved.
>
does your VMWARE server use NFS to communicate with the disks? It was
my understanding most folks used SAN logical units for the virtual disks
with VMware ESX, and not NFS/NAS
On Mon, Sep 21, 2009 at 4:03 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote: > On the contrary, we've been running PG in production for years now under VMWare. Same with MSSQL. We've never had anyproblems. Less so than an actual physical machine actually since we can move the server to different physical hardwareon demand. Also makes disaster recovery MUCH easier. > > However, VMWare does have its places. A high usage database is not one of them, IMHO. A moderately or less used one,depending on requirements and the hardware backing it, is often a good fit. And I agree with Scott about the snapshots. They do tend to cause temporary communication issues with a running virtual machine occasionally, regardless ofOS or DB type. (The benefits outweigh the risks 99% of the time though, with backups being that 1%.) In my experiencethe level of interference from snapshotting a virtual machine also depends on the type and speed of your physicaldisks backing the VMWare host and the size of the virtual machine and any existing snapshot. I've been told thatin VSPhere (VMWare 4.0) this will be significantly improved. I agree with pretty much everything you've said. I would never put a high load system on vmware, but testing, workstation, development, legacy etc is all good. I've never had any type of filesystem corruption. I'm guessing the OP's issues are either coming from NFS or hardware problems on the SAN (IMO not iikely). I would however check all software versions, etc. and make sure it's all up to date. Personally, I avoid NFS like the plague for anything other than basic file serving. Running a database through a NAS gateway to a SAN is crazy...even if it works performance is going to suck. If it was me and this was a critical database, I'd dedicate a LUN on the san and run a fiber cable direct to the vmware box, and mount the storage directly. merlin
In response to Merlin Moncure <mmoncure@gmail.com>: > On Mon, Sep 21, 2009 at 4:03 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote: > > On the contrary, we've been running PG in production for years now under VMWare. Same with MSSQL. We've never had anyproblems. Less so than an actual physical machine actually since we can move the server to different physical hardwareon demand. Also makes disaster recovery MUCH easier. > > > > However, VMWare does have its places. A high usage database is not one of them, IMHO. A moderately or less used one,depending on requirements and the hardware backing it, is often a good fit. And I agree with Scott about the snapshots. They do tend to cause temporary communication issues with a running virtual machine occasionally, regardless ofOS or DB type. (The benefits outweigh the risks 99% of the time though, with backups being that 1%.) In my experiencethe level of interference from snapshotting a virtual machine also depends on the type and speed of your physicaldisks backing the VMWare host and the size of the virtual machine and any existing snapshot. I've been told thatin VSPhere (VMWare 4.0) this will be significantly improved. > > I agree with pretty much everything you've said. I would never put a > high load system on vmware, but testing, workstation, development, > legacy etc is all good. I've never had any type of filesystem > corruption. There was a bug in the disk driver for VMWare ESXi not long ago that would neglect to flush the disk buffers on certain hardware on shutdown. This, naturally, would lead to horrendous filesystem corruption if you rebooted. This bit us pretty hard (we use ESXi for our testing/development systems). The good news is that VMWare has since issued a patch that fixes the problem. The point being that if you're experiencing filesystem corruption on VMWare, check your version against their updates and see if that's the cause. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
On Mon, 21 Sep 2009, Tom Duffey wrote: > Does anyone with a better understanding of PostgreSQL and VMWare know if this > is an unreliable setup for PostgreSQL? I see things like "NFS" and "VMWare" > and start to get worried. PostgreSQL requires one simple guarantee: that when the database writes something and then calls the OS fsync call, that call will not return success until that write is on physical disk. In your case, this requires: 1) VMWare recognizes fsync and passes that request to the network storage device 2) The NFS software passes fsync to the SAN 3) The SAN waits until the physical disk write is complete (and not just in the hard drive's write caches) before returning from the fsync that the operation is complete. In theory, there's no reason this can't be made reliable (albeit slow). But when you have so many layers of stuff in the middle it's hard to prove that things are working correctly or find the problem part that's causing corruption. You'll need to audit everything from your VM down to the SAN configuration to make sure there are no non-battery backed write-back caches being used (and, no, a UPS doesn't count), and that none of the software involved has turned off fsync support as a performance optimization. There's a bunch of additional trivia in this area at http://www.postgresql.org/docs/current/static/wal-reliability.html and my article at http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD