Discussion: Why database is corrupted after re-booting

Why database is corrupted after re-booting

From
"Andrus"
Date:
Yesterday the computer running Postgres rebooted suddenly. After that,

select * from firma1.klient

returns

ERROR:  invalid page header in block 739 of relation "klient"

I have a Quantum Fireball IDE drive; write caching is turned OFF.
I have Windows XP with a FAT32 file system.
I'm using PostgreSQL 8.0.2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
3.4.2 (mingw-special)

Why does the corruption occur? How can I avoid data corruption?
Will the NTFS file system prevent all corruption? If yes, how do I convert
FAT32 to NTFS without losing the data on the drive?

Andrus.



Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
> To change partition types you need to re-format (resetting partitions
> will lose data structure - reformat required).

Troy,

My whole IDE drive is a 20 GB FAT32 C: drive that boots XP.
I have a lot of data on this drive, so it is not possible to re-format. Also,
I don't want to create two logical disks on a single drive.

If this prevents data corruption for Postgres, is there some utility which
can convert the C: drive to NTFS?
Can Partition Magic help?

Andrus



Re: Why database is corrupted after re-booting

From
Scott Marlowe
Date:
On Wed, 2005-10-26 at 10:27, Andrus wrote:
> Yesterday  computer running Postgres re-boots suddenly. After that,
>
> select * from firma1.klient
>
> returns
>
> ERROR:  invalid page header in block 739 of relation "klient"
>
> I have Quantum Fireball IDE drive, write caching is turned OFF.
> I have Windows XP with FAT32  file system.
> I'm using PostgreSQL 8.0.2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
> 3.4.2 (mingw-special)
>
> Why the corruption occurs ?  How to avoid data corruption?
> Will NTFS file system prevent all corruptions ? If yes, how to convert FAT32
> to NTFS without losing data in drive ?

If your machine crashes, FAT makes no promises that it will come back
up, uncorrupted or otherwise.

NTFS has journaling, and should provide more safety.

Turning off the write cache is the right thing to do.  Putting your db
on FAT is the (very very) wrong thing to do.

I would run the ntfs converter if I were you, but you'll likely need a
backup to get your database back on its feet again.  Don't forget the
backups.

Re: Why database is corrupted after re-booting

From
Gregory Youngblood
Date:
Talking with various people who ran postgres at different times, the one thing they always come back with on why mysql is so much better is this: postgresql corrupts too easily and you lose your data.

Personally, I've not seen corruption in postgres since 5.x or 6.x versions from several years ago. And, I've seen corruption on mysql (though I could not isolate between a reiserfs or mysql problem - both with supposedly stable releases installed as part of a distro).

Is corruption a problem? I don't think so - but I want to make sure I haven't had my head in the sand for a while. :) I realize this instance appears to be on Windows, where postgres is relatively new as a native program. I'm really after the answer on more mature platforms (including Linux).

Thanks,
Greg

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: Why database is corrupted after re-booting

From
"Joshua D. Drake"
Date:
On Wed, 2005-10-26 at 18:27 +0300, Andrus wrote:
> Yesterday  computer running Postgres re-boots suddenly. After that,
>
> select * from firma1.klient
>
> returns
>
> ERROR:  invalid page header in block 739 of relation "klient"
>
> I have Quantum Fireball IDE drive, write caching is turned OFF.
> I have Windows XP with FAT32  file system.
> I'm using PostgreSQL 8.0.2 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC)
> 3.4.2 (mingw-special)
>
> Why the corruption occurs ?

Most likely because the IDE was caching the information. IDE drives
sometimes lie about having caching turned on or off.

>   How to avoid data corruption?

You could also have a bad drive.

> Will NTFS file system prevent all corruptions ?

No.


Sincerely,

Joshua D. Drake


> If yes, how to convert FAT32
> to NTFS without losing data in drive ?
--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/


Re: Why database is corrupted after re-booting

From
"Welty, Richard"
Date:
Gregory Youngblood  wrote:
>Is corruption a problem? I don't think so - but I want to make sure I haven't had my
>head in the sand for a while. :) I realize this instance appears to be on Windows,
>which is relatively new as a native Windows program. I'm really after the answer on
>more mature platforms (including Linux).

crappy disk drives and bad windows file systems, nothing more. postgresql is
rather corruption free when the surrounding hardware and software environments
are well chosen.

if you do have to use cheap disk drives/controllers, then a battery backup
unit that shuts the server down automagically is a really really good idea.
getting that IDE cache flushed is pretty high on the priority list.

richard

Re: Why database is corrupted after re-booting

From
Tom Lane
Date:
Gregory Youngblood <pgcluster@netio.org> writes:
> Is corruption a problem? I don't think so - but I want to make sure I
> haven't had my head in the sand for a while. :) I realize this instance
> appears to be on Windows, which is relatively new as a native Windows
> program. I'm really after the answer on more mature platforms (including
> Linux).

It's been quite some time since I've seen an instance of data corruption
that appeared to be due to a Postgres bug.  (At least, not corruption in
tables ... we've had some index bugs, but those you can always fix with
REINDEX.)  I have seen lots of cases that seemed to be due to hardware
or OS misfeasance, eg, disk sectors filled with data that didn't come
from Postgres at all.

You can reduce your exposure by making sure things are correctly
configured (eg, disable write caching, or better yet don't use
consumer-grade drives at all).  In the end there's no substitute
for a good backup policy ;-)

AFAICS mysql will have exactly the same problems.  So will oracle or
any other DB.  Oracle may have a better looking track record, but
that's probably because people don't try to run it on cheap junk PCs.

            regards, tom lane

Re: Why database is corrupted after re-booting

From
"Joshua D. Drake"
Date:
On Wed, 2005-10-26 at 19:14 +0300, Andrus wrote:
> > To change partition types you need to re-format (resetting partitions
> > will lose data structure - reformat required).
>
> Troy,
>
> Whole my IDE drive is 20 GB FAT32 C: drive booting XP
> I have a lot of data in this drive so it is not possible to re-format. Also
> I do'nt want to create two logical disks in single drive.
>
> Is this prevents data corruption for Postgres, is there some utility which
> can convert C: drive to NTFS ?
> Can Partition Magic help ?

XP at least on install I believe has the ability to convert to NTFS.

Have you tried just right clicking on your C: selecting properties
and then seeing if there is a convert option?



Re: Why database is corrupted after re-booting

From
"Joshua D. Drake"
Date:
>
> AFAICS mysql will have exactly the same problems.  So will oracle or
> any other DB.  Oracle may have a better looking track record, but
> that's probably because people don't try to run it on cheap junk PCs.

Can I quote this?



Re: Why database is corrupted after re-booting

From
"Wes Williams"
Date:
Type the following at the Windows command prompt (start, run, "cmd"):

convert c: /fs:ntfs /v

It will complain about locked files and perform the convert at the next
reboot, which you should do immediately.



Re: Why database is corrupted after re-booting

From
Shelby Cain
Date:
Additionally, you should also take the opportunity to defrag the
filesystem after the conversion as the change in cluster size (I'm
guessing from 64k to 4k) will leave your shiny new NTFS file system
highly fragmented.

--- Wes Williams <wes_williams@fcbonline.net> wrote:

> Type the following at the Windows command prompt (start, run, "cmd"):
>
> convert c: /fs:ntfs /v
>
> It will complain about locked files and perform the convert at the
> next
> reboot, which you should do immediately.

Re: Why database is corrupted after re-booting

From
Scott Marlowe
Date:
On Wed, 2005-10-26 at 11:14, Gregory Youngblood wrote:
> Talking with various people that ran postgres at different times, one
> thing they always come back with in why mysql is so much better:
> postgresql corrupts too easily and you lose your data.
>
> Personally, I've not seen corruption in postgres since 5.x or 6.x
> versions from several years ago. And, I've seen corruption on mysql
> (though I could not isolate between a reiserfs or mysql problem - both
> with supposedly stable releases installed as part of a distro).
>
> Is corruption a problem? I don't think so - but I want to make sure I
> haven't had my head in the sand for a while. :) I realize this
> instance appears to be on Windows, which is relatively new as a native
> Windows program. I'm really after the answer on more mature platforms
> (including Linux).

I have been using PostgreSQL since version 6.5.2.  There are many people
on this list that have been using it longer than that.  In all that
time, I've had exactly zero problems with data corruption.  Of course,
every server I've run PostgreSQL on has been burnt in for at least a
week of heavy testing, and they've all had SCSI drives, and if they had
RAID controllers they all had battery backed cache.

Every machine was tested by running pgbench for many days, about 100
clients wide, while doing other, more general work at the same time.  A
part of the testing was to switch the machine off many times while it
was committing to the database, often forcing a flush before pulling the
plug.

I found quickly that IDE drives are not reliable with the cache turned
on, and are too slow for most production purposes without the cache.
So, SCSI was (and apparently still is) the only way to go.

Now, I'm willing to bet that PostgreSQL is more likely to notice
corruption and report it than MySQL.  I wonder if MySQL can detect most
simple single bit errors or not?  I'd have to do some testing on it to
see if it can detect such errors easily.

I'd much rather have a database that simply stops and reports a data
corruption error than one that doesn't notice, wouldn't you?
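
A minimal sketch of why a per-block checksum catches the simple single bit errors wondered about above. It illustrates the general idea only: it does not claim to show how PostgreSQL 8.0 or MySQL store or verify their pages. The names `seal`, `is_intact` and `flip_bit`, and the layout of a CRC-32 appended to the block, are assumptions made up for the example.

```python
import zlib

BLOCK_SIZE = 8192  # assumption: sized like PostgreSQL's default 8 kB block


def seal(payload: bytes) -> bytes:
    """Return the payload with a 4-byte CRC-32 appended (hypothetical layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_intact(block: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it with the stored one."""
    payload, stored = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit error at the given bit position."""
    damaged = bytearray(block)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


# Build one block of made-up row data, then damage a single bit in it.
block = seal(b"klient row data".ljust(BLOCK_SIZE - 4, b"\x00"))
damaged = flip_bit(block, 12345)
```

Here `is_intact(block)` holds and `is_intact(damaged)` does not, so a reader of the damaged block can stop and report an error instead of returning bad data.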

Re: Why database is corrupted after re-booting

From
"Wes Williams"
Date:
Even with a primary UPS on the *entire PostgreSQL server*, does one still
need, or would you even still recommend, a battery-backed cache on the RAID
controller card?  [ref SCSI 320, of course]

If so, I'd be interested in knowing briefly why.

Thanks.

-----Original Message-----
===snip===

...
every server I've run PostgreSQL on has been burnt in for at least a
week of heavy testing, and they've all had SCSI drives, and if they had
RAID controllers they all had battery backed cache.


Re: Why database is corrupted after re-booting

From
snacktime
Date:

I remember a few months back when someone hit the emergency power switch to the whole floor where we host at Internap.  Subsequently the backup power system had a cascading failure.  Livejournal, who also hosts there, was up all night and into the next day restoring their mysql databases after a bunch of them were corrupted.  I believe they had write cache turned on.

Of course our postgresql servers on scsi drives came right back up.  If it wasn't for a couple of servers that won't reboot automatically if the power goes out I wouldn't have even had to go down to the data center.

Chris

Re: Why database is corrupted after re-booting

From
Douglas McNaught
Date:
"Wes Williams" <wes_williams@fcbonline.net> writes:

> Even with a primary UPS on the *entire PostgreSQL server* does one still
> need, or even still recommend, a battery-backed cache on the RAID controller
> card?  [ref SCSI 320, of course]
>
> If so, I'd be interest in knowing briefly why.

UPSs can fail just like any other piece of hardware.

-Doug

Re: Why database is corrupted after re-booting

From
Scott Marlowe
Date:
On Wed, 2005-10-26 at 13:38, Wes Williams wrote:
> Even with a primary UPS on the *entire PostgreSQL server* does one still
> need, or even still recommend, a battery-backed cache on the RAID controller
> card?  [ref SCSI 320, of course]
>
> If so, I'd be interest in knowing briefly why.

I'll tell you a quick little story.

Got a new server, aged out the old one.  The new server was a dual P-IV 2800
with 2 gigs ram and a pair of 36 gig U320 drives in a RAID-1 mirror
under a battery backed cache.  This machine also had four 120 gig IDE
drives for file storage.  But the database was on the dual SCSIs under
the RAID controller.

I tested it with the power off test, etc... And it passed with flying
colors.  Put it into production.  Many other servers, including our
Oracle servers, were not tested in this way.

This machine had dual redundant power supplies with separate power
cables running into two separate rails, each running off of a different
UPS.  The UPSes were fed by power conditioners, and there was a switch
on the other side of that to switch us over to diesel generators should
the power go out.  The UPSes were quite large, and even with a hundred
or so computers in the hosting center, there was about 3 hours of
battery time before the diesel generator HAD to be up or we'd lose
power.

Seems pretty solid, right?  We're talking a multi million dollar hosting
center, the kind with an ops center that looks like the deck of the
Enterprise.  Raised floors, everything.

Fast forward six months.  An electrician working on the wiring in the
ceiling above one of the power conditioners clips off a tiny piece of
wire.  Said tiny piece of wire drops into the power conditioner.  Said
power conditioner overloads, and trips the other two power conditioners
in the hosting center.  This also blew out the master controller on the
UPS setup, so it didn't come up.  The switch for the Diesel generator
would have switched over, but it was fried too.  The UPSes, luckily,
were the constant on variety, so they took the hit for the computers on
the other side of them, about half the UPSes were destroyed.

After about 3 hours, we had enough of the power jury rigged to bring the
systems back up.  In a company with dozens and dozens, ranging from
MySQL to Oracle to PostgreSQL to Ingres to MSSQL to interbase to foxpro,
exactly one of our database servers came up without any errors.  You
already know which one it was, or I wouldn't be writing this letter.

Power supplies fail, UPSes fail, hard drives fail, and raid controllers
and battery backed caches fail.  You can't remove every possibility of
failure, but you can limit the number of things that can harm you should
they fail.

I do know that after that outage, I never once got shit for using
postgresql ever again from anybody.  The sad thing is, if any of those
other machines had had battery backed raid controllers with local
storage (many were running on NFS or SMB mounts) they would have been
fine too.  But many of the DBAs for those other databases had the same
attitude: "who needs to worry about sudden power off when we have UPSes and
power conditioners?"  You can guess what optional feature suddenly seemed like
a good idea for every new database server after that.

Re: Why database is corrupted after re-booting

From
Bricklen Anderson
Date:
snacktime wrote:
>
> I remember a few months back when someone hit the emergency power switch
> to the whole floor where we host at Internap.  Subsequently the backup
> power system had a cascading failure.  Livejournal, who also hosts
> there, was up all night and into the next day restoring their mysql
> databases after a bunch of them were corrupted.  I believe they had
> write cache turned on.
>
> Of course our postgresql servers on scsi drives came right back up.  If
> it wasn't for a couple of servers that won't reboot automatically if the
> power goes out I wouldn't have even had to go down to the data center.
>
> Chris

I remember reading a detailed account on Livejournal about the hoops they had to
jump through to get up and running again after that incident. Bit of a nightmare
for them.


Re: Why database is corrupted after re-booting

From
"Welty, Richard"
Date:
Wes Williams writes:
>Even with a primary UPS on the *entire PostgreSQL server* does one still
>need, or even still recommend, a battery-backed cache on the RAID controller
>card?  [ref SCSI 320, of course]

>If so, I'd be interest in knowing briefly why.

it can be a lot faster.

if the raid controller knows it has a battery backup, then it'll be free
to do whatever it sees fit in terms of write order.

some controllers (the ibm serveraid 4 units that i have a couple of, for
example) won't do this unless they know the battery is there, they have no
option for overriding that setting.

richard

Re: Why database is corrupted after re-booting

From
"Keith C. Perry"
Date:
Just to add another story...

I've been running PostgreSQL on Linux since the 6.x days, and back then I was
almost always on IDE drives with an EXT2 filesystem.  To date, the worst class of
experiences I've had was going through the fs recovery steps for EXT2.  In
those cases I never lost data in the database even when I might have lost files.
Once XFS became an in-kernel option for Linux, I moved almost all my servers to
that filesystem whether they are IDE or SCSI.  In a recent experience where I
was forced to hard reset a server with XFS and IDE drives, the box came right
back up with no data loss.

There is only one case of a major "problem" I've had in the last 8 years or so;
I posted to this list, and with Tom's help I was able to get the box online.
That wasn't a filesystem problem though.  It's off topic but (for those
interested) that thread, "Database Recovery Procedures", was from September 16,
2003.  It had to deal with padding out one of the pg_clog files in a 7.3.x system.

--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com


Re: Why database is corrupted after re-booting

From
David Garamond
Date:
Andrus wrote:
> Will NTFS file system prevent all corruptions ? If yes, how to convert FAT32
> to NTFS without losing data in drive ?

iirc (i'm not on windows currently, google for the exact syntax),

at the dos prompt, type:

 convert C: /fs:ntfs

and it will schedule a conversion after the next reboot. you *should*
backup all important data to another drive/computer though (imagine what
will happen if your computer dies again in the middle of conversion).

--
dave

Re: Why database is corrupted after re-booting

From
Alex Stapleton
Date:
On 26 Oct 2005, at 19:43, snacktime wrote:

>
> I remember a few months back when someone hit the emergency power
> switch to the whole floor where we host at Internap.  Subsequently
> the backup power system had a cascading failure.  Livejournal, who
> also hosts there, was up all night and into the next day restoring
> their mysql databases after a bunch of them were corrupted.  I
> believe they had write cache turned on.
>
> Of course our postgresql servers on scsi drives came right back
> up.  If it wasn't for a couple of servers that won't reboot
> automatically if the power goes out I wouldn't have even had to go
> down to the data center.
>
> Chris

I don't know about this, you know. Power failures can cause seriously
random failures on most PC hardware. A few weeks ago we had a RAID 1
(fsync on, caching off, battery backed raid controller etc) system
get its RAID partitions totally fried by a power failure. My
suspicion is that if the power failure isn't a particularly fast one
(e.g. you overloaded a fuse somewhere; fuses are insanely slow to
fail compared to alternatives like MCBs), then your RAID card's RAM
will get corrupted as the voltage drops, or the system memory will,
resulting in bad data getting copied to the RAID controller. RAM
seems to be pretty sensitive to voltage variations in experiments
I've done on my insanely tweak-able desktop at home. I would have
thought ECC probably helps, but it can only correct so much.

Of course I'm not an electrical engineer (although my friend is a
member of IEEE and he seemed to agree it was a possibility), but
doesn't the possibility of this kinda make it a bit more complicated
and/or expensive to maintain data integrity during a power failure?

Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
>> Why the corruption occurs ?
>
> Most likely because the IDE was caching the information. IDE drives
> sometimes lie about having caching turned on or off.
>
>> Will NTFS file system prevent all corruptions ?
>
> No.

Joshua,

thank you.  Please re-confirm. In the configuration

1. Windows XP
2. QUANTUM FIREBALLP LM20.5  (IDE drive)
3. Write caching is off in XP device manager
4. fsync is ON in Postgres 8
5. NTFS file system

the following may occur:

a. A power failure (or its simulation by pressing the RESET button) causes
the Postgres database to be corrupted.
b. No automatic repair/rollback is performed.
c. The only way to bring the database back online is to restore from backup.

My problem: Sometimes I also need to run desktop applications (server and
client on the same desktop computer) with Postgres.
Desktop computers have this config. It is not possible to force users to buy
SCSI drives or UPSes for each desktop computer.
Can Firebird or SQLite automatically recover from power failure?

Andrus.



Re: Why database is corrupted after re-booting

From
Richard Huxton
Date:
Andrus wrote:
> My problem: Sometimes I need also to run desktop (server and client in same
> desktop computer)  applications with Postgres.
> Desktop computer have this config. It is not  possible to force users to buy
> SCSI drives nor upses for each desktop computer.
> Can Firebird or SQLLite automatically recover from power failure?

If data on your disk gets corrupted then NOTHING can guarantee to
recover your database - not PG, not Firebird, not Oracle.

PostgreSQL writes all transactions to a log (WAL) before reporting them
as committed. If your system tells the truth about when data is actually
written to disk, then it can use this WAL to find out what happened when
the system stopped and make sure the database is in a consistent state.

Now, if your WAL gets corrupted then obviously there's not much PG can
do about it - that's why it's vital to make sure that write caching is
off, so PG can guarantee that something written to disk is actually there.
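
The argument above can be sketched as a toy model. This is an illustration under stated assumptions, not PostgreSQL's actual WAL implementation: the `Disk` class, its `honest` flag, and the `commit`, `recover` and `crash_test` helpers are all hypothetical names invented for the example. An honest disk really persists data when asked to flush; a dishonest one acknowledges the flush while the data is still sitting in its volatile cache.

```python
class Disk:
    """Toy disk: writes sit in a volatile cache until flushed to the platter."""

    def __init__(self, honest: bool):
        self.honest = honest  # a dishonest disk acknowledges flushes it never performed
        self.cache = []       # lost on power failure
        self.platter = []     # survives power failure

    def write(self, record):
        self.cache.append(record)

    def flush(self):
        """Stand-in for fsync: only an honest disk really persists the cache."""
        if self.honest:
            self.platter.extend(self.cache)
            self.cache.clear()

    def power_failure(self):
        self.cache.clear()


def commit(wal: Disk, key, value) -> bool:
    """Log the change and flush the log before reporting the commit."""
    wal.write((key, value))
    wal.flush()
    return True  # the client is told "committed" at this point


def recover(wal: Disk) -> dict:
    """Rebuild the table contents by replaying whatever log records survived."""
    table = {}
    for key, value in wal.platter:
        table[key] = value
    return table


def crash_test(honest: bool) -> dict:
    """Commit one row, cut the power, and return what recovery can rebuild."""
    wal = Disk(honest)
    commit(wal, "klient:1", "Andrus")
    wal.power_failure()
    return recover(wal)
```

With an honest disk, `crash_test(True)` recovers the committed row; with a lying one, `crash_test(False)` comes back empty even though the commit was reported as successful. Real drives can fail in messier ways than this model (partial or reordered writes), which is where damaged pages rather than merely missing rows come from.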

Now, since you're not going to control your clients' hardware, and
probably can't guarantee their settings either, you'll have to accept a
greater risk of data loss than with good quality hardware you specify
yourself. There are steps you can take to protect their data though -
running on NTFS, telling them to switch write caching off and, I would
suggest, looking into running a PITR setup on the same machine.

--
   Richard Huxton
   Archonet Ltd

Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
> If data on your disk gets corrupted then NOTHING can guarantee to recover
> your database - not PG, not Firebird, not Oracle.

Richard,

thank you for the reply. Let me ask my question more precisely:

I have the configuration from my previous message. The hardware (IDE drive,
computer) and software (Windows XP) work according to vendor
specifications.

If I turn the power off by breaking the power cord while the Postgres server
is busy, is it possible that
SELECT * FROM anytable does not work after that?

Andrus.



Re: Why database is corrupted after re-booting

From
Martijn van Oosterhout
Date:
On Thu, Oct 27, 2005 at 02:54:50PM +0300, Andrus wrote:
> I have configuration like in my previous message. Hardware (IDE drive,
> computer) and software (Windows XP) works according to vendor
> specifications.
>
> If I turn power off by breaking power cord when Postgres server is busy, is
> it possible that
> after that SELECT * FROM anytable does not work ?

Let's put it another way:

1. If you are only doing SELECTs the chance anything will go wrong is
very small, because you're not actually writing anything.

2. If you are changing data and your disk faithfully and correctly
writes that data in the order it's told, then PostgreSQL can use the
WAL to recover, everything will work fine.

3. If your disk lies about writing data in the right order, 99% of
the time you will be fine, but that one time your uber-important data
is there, Murphy's law will kick in and trash it for you.

I've run PostgreSQL on all sorts of hardware, some of it not very good
and I've never lost any data or not had PostgreSQL come up properly
afterwards. But I just consider myself lucky. I've been on this list
long enough to see that bad things *do* happen with dodgy hardware. It
doesn't go wrong often. Even then, it's usually a single block
corrupted or an index that needs to be reindexed.

Note: I've always run on Linux systems, which provide POSIX type
semantics for these things. I have *no* idea how much of this applies
to Windows.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: Why database is corrupted after re-booting

From
Richard Huxton
Date:
Andrus wrote:
>>If data on your disk gets corrupted then NOTHING can guarantee to recover
>>your database - not PG, not Firebird, not Oracle.
>
>
> Richard,
>
> thank you for the reply. Let me ask my question more precisely:
>
> I have configuration like in my previous message. Hardware (IDE drive,
> computer) and software (Windows XP) works according to vendor
> specifications.
>
> If I turn power off by breaking power cord when Postgres server is busy, is
> it possible that
> after that SELECT * FROM anytable does not work ?

It is always *possible*, but if your system isn't caching writes then it
is *very very* unlikely. The tricky bit is that a lot of IDE drives
don't really disable the write-cache.

You should really test properly, but a quick way to know is to run a
series of single inserts, each in their own transaction. Each commit has
to wait for the platter to come round, so if you get more transactions
per minute than the rotational speed (rpm) of the disk then you know it
*must* be caching.
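
As a rough sketch of that quick test in Python: the helper names are invented for this example, commit_once stands for whatever code runs one INSERT and commits it through your own database driver, and the 7200 rpm figure is only an assumed value, so check the drive's data sheet.

```python
import time


def max_commits_per_second(rpm):
    """Upper bound for an uncached disk: at most one commit per rotation."""
    return rpm / 60.0


def measure_commit_rate(commit_once, count):
    """Call commit_once() count times and return commits per second."""
    start = time.perf_counter()
    for _ in range(count):
        commit_once()
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def must_be_caching(commits_per_second, rpm):
    """True when the measured rate is faster than the platter can rotate."""
    return commits_per_second > max_commits_per_second(rpm)


# Assumed example figures: a 7200 rpm drive allows at most 120 commits/second.
print(max_commits_per_second(7200))   # 120.0
print(must_be_caching(900.0, 7200))   # True
print(must_be_caching(100.0, 7200))   # False
```

To run it for real, pass measure_commit_rate a function that performs one single-row INSERT and COMMIT, and feed the result to must_be_caching together with the drive's rpm.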

--
   Richard Huxton
   Archonet Ltd

Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
>> If I turn power off by breaking power cord when Postgres server is busy,
>> is it possible that
>> after that SELECT * FROM anytable does not work ?
>
> It is always *possible*, but if your system isn't caching writes then it
> is *very very* unlikely. The tricky bit is that a lot of IDE drives don't
> really disable the write-cache.
>
> You should really test properly, but a quick way to know is to run a
> series of single inserts, each in their own transaction. If you get more
> transactions than the speed (rpm) of the disk then you know it *must* be
> caching.

Richard,

thank you.

QUANTUM FIREBALLP LM20.5 is a widely used ATA IDE drive.

Where can I find information on whether it implements write caching properly?

Is there an IDE drive compatibility list for Postgres?

If this information is not available, is there a standard utility which can
determine this drive's compatibility with Postgres under Windows?

Is it possible to write a utility which converts a corrupted database to a readable
state so that SELECT * FROM anytable will always work?
This utility may remove all constraints and just create a database which contains
as much data as possible.
Then I can import this data into an empty, correct database and discard all rows
which violate database rules.

Andrus.



Re: Why database is corrupted after re-booting

From
Tom Lane
Date:
Alex Stapleton <alexs@advfn.com> writes:
> suspicion is that if the power failure isn't a particularly fast one,
> (e.g. you overloaded a fuse somewhere, fuses are insanely slow to
> fail compared to alternatives like MCBs) then your RAID card's RAM
> will get corrupted as the voltage drops or the system memory will
> resulting in bad data getting copied to the RAID controller as RAM
> seems to be pretty sensitive to voltage variations in experiments
> i've done on my insanely tweak-able desktop at home. I would of
> though ECC probably helps, but it can only correct so much.

Any competently designed battery-backup scheme has no problem with this.

What can seriously fry your equipment is a spike (ie, too much voltage
not too little).  Most UPS-type equipment includes surge suppression
hardware that offers a pretty good defense against this, but if you get
a lightning strike directly where the power comes into your building,
you're going to be having a chat with your insurance agent.  There is
nothing made that will withstand a point-blank strike.

            regards, tom lane

Re: Why database is corrupted after re-booting

From
Alex Stapleton
Date:
On 27 Oct 2005, at 14:57, Tom Lane wrote:

> Alex Stapleton <alexs@advfn.com> writes:
>
>> suspicion is that if the power failure isn't a particularly fast one,
>> (e.g. you overloaded a fuse somewhere, fuses are insanely slow to
>> fail compared to alternatives like MCBs) then your RAID card's RAM
>> will get corrupted as the voltage drops or the system memory will
>> resulting in bad data getting copied to the RAID controller as RAM
>> seems to be pretty sensitive to voltage variations in experiments
>> i've done on my insanely tweak-able desktop at home. I would of
>> though ECC probably helps, but it can only correct so much.
>>
>
> Any competently designed battery-backup scheme has no problem with
> this.
>
> What can seriously fry your equipment is a spike (ie, too much voltage
> not too little).  Most UPS-type equipment includes surge suppression
> hardware that offers a pretty good defense against this, but if you
> get
> a lightning strike directly where the power comes into your building,
> you're going to be having a chat with your insurance agent.  There is
> nothing made that will withstand a point-blank strike.
>

The system RAM won't usually be supported by any batteries though, so
it will go crazy, copy corrupt data to the DIMMs on the RAID
controller, which then will refuse to write it to the disk until the
power comes up, and then write the bad data to the drive surely?

Re: Why database is corrupted after re-booting

From
"Troy"
Date:
Unless I missed something, I think you can select on a fresh install
but not after. I doubt even an image could be switched but I could be
wrong, I am too often.

Troy


Re: Why database is corrupted after re-booting

From
ellis@spinics.net (Rick Ellis)
Date:
In article <A209FE4DA934614CAF3F5BD8E5E14290B0DEC1@ex2k.bankofamerica.com>,
Welty, Richard <richard.welty@bankofamerica.com> wrote:

>crappy disk drives and bad windows file systems, nothing more.

Could even be crappy memory.

--
http://yosemitecampsites.com/

Re: Why database is corrupted after re-booting

From
"Troy"
Date:
I couldn't load it on a FAT32 partition on an XP HOME pc. So I loaded
it on the NTFS partition of the same drive.

I don't know why it did & now doesn't work but it could be that you
need to defrag and clear some space.

To change partition types you need to re-format (resetting partitions
will lose data structure - reformat required).

You could just pop in an additional harddrive (slave) and have it
formatted NTFS - then install it on that drive D:/postgres/

Not the answer you'd want but good luck.
Troy


Re: Why database is corrupted after re-booting

From
"Troy"
Date:
Cheaper solution is to get a second hard drive and put it in your
computer as a slave....

yes you could xcopy your drive to some backup device then repartition
and plop it back - that would take a lot of work, involves
DiskCopy/Ghost-like software and carries great risk. (Run Defrag first -
plus you may still need to dual-partition the drive to put your boot files
back in place.) Back up everything first!

I don't know how much access you have, but another hard drive (100GB
from bestbuy.com is about $50) is cheaper than software.  You could install
a used, smaller hard drive and you'd never know the difference. Put
just Postgres on the second hard drive (FORMAT IT NTFS FIRST).

hope it helps
Troy H


Re: Why database is corrupted after re-booting

From
Tom Lane
Date:
Alex Stapleton <alexs@advfn.com> writes:
> The system RAM won't usually be supported by any batteries though, so
> it will go crazy, copy corrupt data to the DIMMs on the RAID
> controller, which then will refuse to write it to the disk until the
> power comes up, and then write the bad data to the drive surely?

Not in competently designed hardware.  The system should shut down
completely the instant the power supply's outputs go out of spec,
which will be before the logic components actually start to malfunction.

This is not to say that cheap consumer-grade PCs are competently
designed ;-) but the issue was a solved problem when I was a
practicing EE, and that was a long time ago.

            regards, tom lane

Re: Why database is corrupted after re-booting

From
Alex Stapleton
Date:
On 27 Oct 2005, at 16:07, Tom Lane wrote:

> Alex Stapleton <alexs@advfn.com> writes:
>
>> The system RAM won't usually be supported by any batteries though, so
>> it will go crazy, copy corrupt data to the DIMMs on the RAID
>> controller, which then will refuse to write it to the disk until the
>> power comes up, and then write the bad data to the drive surely?
>>
>
> Not in competently designed hardware.  The system should shut down
> completely the instant the power supply's outputs go out of spec,
> which will be before the logic components actually start to
> malfunction.
>
> This is not to say that cheap consumer-grade PCs are competently
> designed ;-) but the issue was a solved problem when I was a
> practicing EE, and that was a long time ago.

lol, iirc it was a middle-aged piece of random Dell equipment. They
seem to be getting progressively less awful these days so maybe it
was just that particular model. I may have to do some evil tests
using glass fuses and hammers (and rubber gloves).......

Re: Why database is corrupted after re-booting

From
Richard Huxton
Date:
Andrus wrote:
>
> QUANTUM FIREBALLP LM20.5 is a widely used ATA IDE drive.
>
> Where do find information does it implement write caching properly or not ?

I don't think the manufacturers bother to make this sort of information
available.

> Is there IDE drive compatibility list for Postgres ?

No - for the reason above (amongst others).

> If this information is not available is there a standard utility which can
> determine this drive compatibility with Postgres under Windows ?

Try the test I described earlier.

> Is it possible write utility which converts corrupted database to readable
> state so that SELECT * FROM anytable will work always ?
> This utility may remove all constraints, just create database which contains
> as much data as possible.
> Then I can import this data to empty correct database and discard all rows
> which violate database rules.

There's nothing I know of, and I don't think we see enough problems to
build anything very sophisticated. There is a file-dump utility from Red
Hat (pg_filedump):
   http://sources.redhat.com/rhdb/

Far better is to always have a known-good version on the machine. Have a
look in the manuals for Point-in-time recovery (PITR). That might suit
your needs. It also would let you re-run changes to any point in the day
- useful for clients who delete things they shouldn't!

--
   Richard Huxton
   Archonet Ltd

Re: Why database is corrupted after re-booting

From
"Keith C. Perry"
Date:
Actually, because I lost several thousand dollars of equipment a couple of
years ago, I recommended these "brickwall" products to a company.

http://brickwall.com/index.htm

We actually never deployed these units (grounding the communications lines ended
up being a much cheaper solution) but I did talk to an engineer at the company and
apparently they have some hospitals as clients that use the units.  I won't get
into the technology of how they work since you can read that yourself but I
remember having a warm and fuzzy feeling after my conversation.

I will pull one quote from their web site though...

"Unlike MOV’s, TRANS-ZORBS and similar shunt based surge protectors that use
elements weighing less than 1/4 ounce, Brick Wall surge protectors can easily
absorb any surge repeatedly with absolutely no degradation."

The important phrase here is "...absorb any surge repeatedly with absolutely no
degradation."

Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> Alex Stapleton <alexs@advfn.com> writes:
> > suspicion is that if the power failure isn't a particularly fast one,
> > (e.g. you overloaded a fuse somewhere, fuses are insanely slow to
> > fail compared to alternatives like MCBs) then your RAID card's RAM
> > will get corrupted as the voltage drops or the system memory will
> > resulting in bad data getting copied to the RAID controller as RAM
> > seems to be pretty sensitive to voltage variations in experiments
> > i've done on my insanely tweak-able desktop at home. I would of
> > though ECC probably helps, but it can only correct so much.
>
> Any competently designed battery-backup scheme has no problem with this.
>
> What can seriously fry your equipment is a spike (ie, too much voltage
> not too little).  Most UPS-type equipment includes surge suppression
> hardware that offers a pretty good defense against this, but if you get
> a lightning strike directly where the power comes into your building,
> you're going to be having a chat with your insurance agent.  There is
> nothing made that will withstand a point-blank strike.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly
>


--
Keith C. Perry, MS E.E.
Director of Networks & Applications
VCSN, Inc.
http://vcsn.com

____________________________________
This email account is being host by:
VCSN, Inc : http://vcsn.com

Re: Why database is corrupted after re-booting

From
Scott Marlowe
Date:
On Thu, 2005-10-27 at 15:14, Keith C. Perry wrote:
> Actually, because I lost several thousand dollars of equipment a couple of
> years ago, I recommended these "brickwall" products to a company.
>
> http://brickwall.com/index.htm
>
> We actually never deployed these units (grounding the communications lines ended
> up being a much cheaper solution) but I did talk to an engineer at the company and
> apparently they have some hospitals as clients that use the units.  I won't get
> into the technology of how they work since you can read that yourself but I
> remember having a warm and fuzzy feeling after my conversation.
>
> I will pull one quote from their web site though...
>
> "Unlike MOV’s, TRANS-ZORBS and similar shunt based surge protectors that use
> elements weighing less than 1/4 ounce, Brick Wall surge protectors can easily
> absorb any surge repeatedly with absolutely no degradation."
>
> The important phrase here is "...absorb any surge repeatedly with absolutely no
> degradation."

Having worked on stuff with some massive surge protectors, I'd say that
surge protectors in a Radio Shack (or any other store) are like having
an umbrella compared to a regular rain storm.

The higher end stuff, up through this brick wall, are kind of like
variously well built buildings and storm cellars against increasingly
nasty storms.

And lastly, there's the direct lightning strike.  Which fries
everything within a certain radius.  It's equivalent to a tornado
touching down exactly against your storm cellar, and maybe even dropping
a locomotive right through the entrance as well.

And if that's not enough, there's always a meteor strike to ruin your
day.

Don't get me wrong, I'm all for protection, I've just come to realize
that everything is in shades of grey.

But I do agree that those MOV based surge protectors are pretty much
worthless, like bows and arrows against the lightning (it's a cloudy,
stormy day here in Chicago, what can I say...)

Re: Why database is corrupted after re-booting

From
Ron Mayer
Date:
w_tom wrote:
>   Series mode protector will ignore or avoid THE one and essential
> component of an effective protection system - single point earth
> ground.

Indeed.   And yes, a high end data center should survive
a lightning strike (as well as hospital's power systems, etc).


Here's a nice article where Suncoast Schools Federal Credit
Union's data center survived a direct lightning strike to
their 480-V service entrance cable.   The article spends
a lot of the time talking about the grounding system.

http://www.ecpzone.com/article/article.jsp?siteSection=12&id=41
"Starting from the ground up, the main elements of the
[lightning protection] system...include:

(1) Three 20-ft x 5/8-in (6-m x 16-mm) copper-clad-steel
grounding electrodes [...] The grounding system's resistance
to earth as measured by fall-of-potential testing is 4.3 ohms.

(2) Another 4/0 copper grounding conductor connects the
ground-neutral bus in the service entrance panel to the
ground bus in a 480-V distribution panel ...

(3) Multiple uninterruptible power supplies (UPSs)....

(4) Up to seven layers of voltage surge protection....

High Quality Grounding.... "even the most expensive
TVSS you can buy is absolutely useless unless it sees
a high-quality, low-resistance ground. "
"

Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
>> QUANTUM FIREBALLP LM20.5 is a widely used ATA IDE drive.
>>
>> Where do find information does it implement write caching properly or not
>> ?
>
> I don't think the manufacturers bother to make this sort of information
> available.
>
>> Is there IDE drive compatibility list for Postgres ?
>
> No - for the reason above (amongst others).

Richard, thank you.
Classification of IDE drives into good and bad ones requires knowing *at
least one* good and one bad model.

Can you name one good and one bad IDE drive model, please?

Knowing those models before buying is a huge step forward for preventing
database corruption in desktop computers.

Andrus.




Re: Why database is corrupted after re-booting

From
Bruce Momjian
Date:
Andrus wrote:
> >> QUANTUM FIREBALLP LM20.5 is a widely used ATA IDE drive.
> >>
> >> Where do find information does it implement write caching properly or not
> >> ?
> >
> > I don't think the manufacturers bother to make this sort of information
> > available.
> >
> >> Is there IDE drive compatibility list for Postgres ?
> >
> > No - for the reason above (amongst others).
>
> Richard, thank you.
> Classification of IDE drives into good and bad ones requires knowing *at
> least one* good and bad model.
>
> Can you write one good and bad IDE drive models, please?
>
> Knowing those models before buying is huge step forward for preventing
> database corruption in desktop computers.

The bottom line is that IDE drives are desktop drives, not designed for high
concurrency.  Read this for details:

    http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Why database is corrupted after re-booting

From
Alex Turner
Date:
Of course, not counting the Western Digital Raptor SATA drives, which
are priced more like SCSI drives and have many of the features
of a SCSI drive, including NCQ

Alex

On 10/28/05, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> The bottom line is that IDE are desktop drives, not designed for high
> concurrency.  Read this for details:
>
>         http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf

Re: Why database is corrupted after re-booting

From
Bruce Momjian
Date:
Alex Turner wrote:
> Of course not counting the Western Digital Raptor SATA drive, which
> are priced more like SCSI drives also, and have many of the features
> of a SCSI drive including NCQ
>

Well, the PDF talks about several aspects of server drives, including
concurrency, performance, and reliability.  Not cutting corners to save
money in these areas is a feature of server drives.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Why database is corrupted after re-booting

From
Alex Turner
Date:
I have read it before - it's a _fantastic_ resource, and I will
probably make every junior tech I ever hire read it too.

On 10/28/05, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> Alex Turner wrote:
> > Of course not counting the Western Digital Raptor SATA drive, which
> > are priced more like SCSI drives also, and have many of the features
> > of a SCSI drive including NCQ
> >
>
> Well, the PDF talks about several aspects of server drives, including
> concurrency, performance, and reliability.  Not cutting corners to save
> money in these areas are features for server drives.

Re: Why database is corrupted after re-booting

From
"Troy"
Date:
huh never heard of that - I'll hold out testing it for now but that's
good info. (how does it know which partition - if there's 2?)

Troy H


Re: Why database is corrupted after re-booting

From
"w_tom"
Date:
  Destructive surges seek earth ground.  Do you think a protector is
going to stop what 3 miles of non-conductive sky could not?  And yet
that is exactly what some protector manufacturers hope you will
assume.

  Effective protectors don't stop, block, or absorb typically
destructive transients.  Joules does not mean protection is about
absorbing surge energy.   Protectors shunt as even Ben Franklin
demonstrated in 1752.  They shunt as protectors do to protect every
telephone switching station and every commercial radio station.  As was
demonstrated in 1930s GE and Westinghouse papers.  Destructive
transients seek earth ground - either before entering the building
('whole house' protection) or via electronics (surge damage).

  What defines effective protection?  The protector is only as good as
the protection it connects to.  Ineffective protectors hope you never
learn why earth ground is THE one essential component of every
protection system.  Protectors are temporary wires to protection - the
single point earth ground.  Protectors are only effective when they
create a short (typically 'less than 10 foot') connection to earth.

  Series mode protectors (Brickwall, Surgex, Zerosurge) are good
supplemental protection.  But again, without that short connection to
earth ground (the shunt mode protector or hardwire connection), then
even series mode protectors are bypassed or overwhelmed. See that
safety ground wire?  It bypasses the series mode protector.  Meanwhile,
what the series mode protector is doing should already be inside the
electronics.

  Series mode protector will ignore or avoid THE one and essential
component of an effective protection system - single point earth
ground.  Protection is a building wide system.  Its most essential
component is a single point earth ground.  All connections to that
earthing must be short - typically 'less than 10 feet'.  All incoming
utilities must connect to that protection - either via a protector (ie
AC electric, telephone, communication wires) or via a direct hardwire
(ie cable TV, satellite dish).

  Meanwhile, we are only discussing secondary protection.  Primary
protection is provided by the utility:
   http://www.tvtower.com/fpl.html

  To learn about serious protection, maybe start with a benchmark in
this industry - Polyphaser - whose app notes are considered legendary:
  http://www.polyphaser.com/ppc_ptd_home.aspx
What does Polyphaser discuss?  Their products?   No.  Polyphaser app
notes discuss the most critical component on a protection system -
single point earth ground.

  BTW notice a repeated reference to less than 10 feet.  One of many
requirements to reduce wire impedance.  Not resistance - impedance.
That means sharp bends, splices, inside metal conduit, etc all can
diminish effective earthing.

  What do plug-in UPSes avoid discussing?  Earth ground.  No earth
ground means no effective protection.  So they claim protection -
forgetting to mention they don't protect from the typically destructive
transient.  Hoping you will never learn about the most essential
component of a protection system.  That would also explain why
ineffective protectors also have too few joules. They claim protection
- forgetting to mention the protection is not effective.  Protectors
are only as effective as their earth ground.

"Keith C. Perry" wrote:
> Actually, because I lost several thousand dollars of equipment a couple of
> years ago, I recommended these "brickwall" products to a company.
>
> http://brickwall.com/index.htm
>
> We actually never deployed these units (grounding the communications lines ended
> up being a much cheaper solution) but I did talk to an engineer at the company and
> apparently they have some hospitals as clients that use the units.  I won't get
> into the technology of how they work since you can read that yourself but I
> remember having a warm and fuzzy feeling after my conversation.
>
> I will pull one quote from their web site though...
>
> "Unlike MOV's, TRANS-ZORBS and similar shunt based surge protectors that use
> elements weighing less than 1/4 ounce, Brick Wall surge protectors can easily
> absorb any surge repeatedly with absolutely no degradation."
>
> The important phrase here is "...absorb any surge repeatedly with absolutely no
> degradation."


Re: Why database is corrupted after re-booting

From
"w_tom"
Date:
  One of the many problems with FAT32 was that files on the drive can
be deleted if power is lost.  This is why FAT was obsoleted by HPFS
which in turn was obsoleted by NTFS.

  Power loss should not cause data loss which is why we stopped using
FAT even before Windows 95 was released.

  The program to convert a volume to NTFS is called convert (for example,
convert C: /FS:NTFS).  For details, use the Windows HELP command
(help convert).


Re: Why database is corrupted after re-booting

From
"w_tom"
Date:
  The transistor has existed in homes now for 30 years.  That means new
homes should be built to withstand direct lightning strikes without
damage.  Such earthing is not difficult.  But it requires the builder
to plan for the lightning protection 'system' before the footings are
poured.  It is an old and well proven technology - called Ufer grounds.
They are installed in the footing using materials already inside
footings. IOW significant and effective protection systems for
residential environments need not be expensive.  It simply requires
planning by the builders - who currently don't consider such 'systems'
until much later - when the electrician arrives.

  Also essential is that all utilities enter at the same location - the
service entrance - so that all make a 'less than 10 foot' connection to
that single point and most superior earth ground.

  Ufer grounds, halo grounds, and other simple techniques cost so
little when installed during construction.  And yet we still build new
homes as if the transistor did not exist.

Ron Mayer wrote:
> Indeed.   And yes, a high end data center should survive
> a lightning strike (as well as hospital's power systems, etc).
>
>
> Here's a nice article where Suncoast Schools Federal Credit
> Union's data center survived a direct lightning strike to
> their 480-V service entrance cable.   The article spends
> a lot of the time talking about the grounding system.
>
> http://www.ecpzone.com/article/article.jsp?siteSection=12&id=41
> "Starting from the ground up, the main elements of the
> [lightning protection] system...include:
>
> (1) Three 20-ft x 5/8-in (6-m x 16-mm) copper-clad-steel
> grounding electrodes [...] The grounding system's resistance
> to earth as measured by fall-of-potential testing is 4.3 ohms.
>
> (2) Another 4/0 copper grounding conductor connects the
> ground-neutral bus in the service entrance panel to the
> ground bus in a 480-V distribution panel ...
>
> (3) Multiple uninterruptible power supplies (UPSs)....
>
> (4) Up to seven layers of voltage surge protection....
>
> High Quality Grounding.... "even the most expensive
> TVSS you can buy is absolutely useless unless it sees
> a high-quality, low-resistance ground. "
> "


Re: Why database is corrupted after re-booting

From
"Magnus Hagander"
Date:
> 1. Windows XP
> 2. QUANTUM FIREBALLP LM20.5  (IDE drive)
> 3. Write caching is off in XP device manager
> 4. fsync is ON in Postgres 8

Coming late into the discussion, there is one more note I'd add to this
- if you're on windows and want to be extra secure, also set
wal_sync_method=fsync_writethrough. This will get it through most (I
would say all if I was sure, but I'm not) IDE disks that lie about write
completion.
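
To make that concrete, here is a small Python sketch that checks a postgresql.conf excerpt for the two settings mentioned in this thread. The sample excerpt and the helper names are assumptions made for the example, and the parser is deliberately simplified (it only understands "name = value" lines and # comments).

```python
# Illustrative excerpt only; a real postgresql.conf has many more settings.
SAMPLE_CONF = """
# postgresql.conf excerpt (illustrative)
fsync = on
wal_sync_method = fsync_writethrough
"""


def parse_conf(text):
    """Parse 'name = value' lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = value.strip().strip("'").lower()
    return settings


def durability_warnings(settings):
    """List the ways the settings differ from the advice in this thread."""
    warnings = []
    # fsync defaults to on when it is not set explicitly.
    if settings.get("fsync", "on") not in ("on", "true", "yes", "1"):
        warnings.append("fsync is off")
    if settings.get("wal_sync_method") != "fsync_writethrough":
        warnings.append("wal_sync_method is not fsync_writethrough")
    return warnings


print(durability_warnings(parse_conf(SAMPLE_CONF)))      # []
print(durability_warnings(parse_conf("fsync = off")))
```

An empty list means both settings match; it says nothing about whether the drive itself honours the flush request.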

//Magnus

Re: Why database is corrupted after re-booting

From
"Andrus"
Date:
> Coming late into the discussion, there is one more note I'd add to this
> - if you're on windows and want to be extra secure, also set
> wal_sync_method=fsync_writethrough. This will get it through most (I
> would say all if I was sure, but I'm not) IDE disks that lie about write
> completion.

Magnus,

thank you.

Is it reasonable to turn IDE write caching OFF  for speed if

wal_sync_method=fsync_writethrough     ?

Andrus.