Robert,
> > What would such a test look like? It's not obvious to me that
> > there's any rapid way for a user to detect this situation, without
> > checking each server individually.
>
> Change something on the master and observe that none of the supposed
> standbys notice?
That doesn't sound like an infallible test, or a 60-second one.
My point is that in a complex situation (imagine a shop with 9 replicated servers in 3 different cascaded groups,
immediately after a failover of the original master), it would be easy for a sysadmin, responding to a middle-of-the-night
page, to accidentally fat-finger an IP address and create a cycle instead of a new master. And once he's done that, it's a
longish troubleshooting process to figure out what's wrong and why writes aren't working, especially if he goes to bed
and some other sysadmin picks up the "Writes failing to PostgreSQL" ticket.
*If* it's relatively easy for us to detect cycles (that's a big if; I'm not sure how we'd do it), then it would help a
lot for us to at least emit a WARNING. That would short-cut a lot of troubleshooting.
--Josh Berkus