We had wondered... VM error or hardware error are reasonable
theories, and given the lack of reproduction or other insights it's
probably what we'll attribute it to. I lean more towards VM error
than hardware... As for ECC, the particular box is a test box and
does not have ECC. Our production equipment all has ECC memory.
The packages are actually consistent: they are CentOS-built packages
running on CentOS, but CentOS itself is running in a VM. It is the
VM that runs on Ubuntu.
Thanks for taking a look!
On Wed, 28 Dec 2011 14:21:30 -0500 Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>lsq@nym.hush.com writes:
>> I got a simultaneous autovacuum and postmaster crash, then another
>> postmaster crash a few seconds later on restart. After a third
>> restart the system was stable. I haven't found anything obviously
>> matching this defect on-line.
>
>Those stack traces are pretty fascinating, but after studying them
>I have to think that you had some sort of hardware or VM glitch.
>"Bus error" is an uncommon event on x86_64 platforms, and for three
>different processes to all get that, in three unrelated places, at
>nearly the same time pushes the bounds of credulity. What's more,
>all three of those places are frequently-executed code paths, so if
>there were some kind of PG bug there I'm sure we'd have seen it
>before.
>
>I'm not too sure about the wisdom of running packages built for Centos
>on Ubuntu, but as for this particular event, I'd write it off to cosmic
>rays or something. (Speaking of which, does that box have ECC memory?)
>
> regards, tom lane