PostgreSQL shutdown modes
From | Robert Haas |
---|---|
Subject | PostgreSQL shutdown modes |
Date | |
Msg-id | CA+TgmoYxs1dzDN5jc5rVJz236M0uOd6QA2JiY+1yb=BVYg8MgA@mail.gmail.com |
Responses | Re: PostgreSQL shutdown modes |
List | pgsql-hackers |
Hi,

I think it's pretty evident that the names we've chosen for the various PostgreSQL shutdown modes are pretty terrible, and maybe we should try to do something about that. There is nothing "smart" about a smart shutdown. The usual result of attempting a smart shutdown is that the server never shuts down at all, because typically there are going to be some applications using connections that are kept open more or less permanently. What ends up happening when you attempt a "smart" shutdown is that you've basically put the server into a mode where you're irreversibly committed to accepting no new connections, but because you have a connection pooler or something that keeps connections open forever, you never shut down either. It is in effect a denial-of-service attack on the database you're supposed to be administering.

Similarly, "fast" shutdowns are not in any way fast. It is pretty common for a fast shutdown to take many minutes or even tens of minutes to complete. This doesn't require some kind of extreme workload to hit; I've run into it during casual benchmarking runs. It's very easy to have enough dirty data in shared buffers, or enough dirty data in the operating system cache that will have to be fsync'd in order to complete the shutdown checkpoint, to make things take an extremely long time. In some ways, this is an even more effective denial-of-service attack than a smart shutdown. True, the database will at some point actually finish shutting down, but in the meantime not only will we not accept new connections but we'll evict all of the existing ones. Good luck maintaining five nines of availability if waiting for a clean shutdown to complete is any part of the process. It might be smarter to initiate a regular (non-shutdown) checkpoint first, without cutting off connections, and then when that finishes, proceed as we do now. The second checkpoint will complete a lot faster, so while the overall operation still won't be fast, at least we'd be refusing connections for a shorter period of time before the system is actually shut down and you can do whatever maintenance you need to do.

"immediate" shutdowns aren't as bad as the other two, but they're still bad. One of the big problems that I encounter in this area is that Oracle uses the name "immediate" shutdown to mean a normal shutdown with a checkpoint allowing for a clean restart. Users coming from Oracle are sometimes extremely surprised to discover that an immediate shutdown is actually a server crash that will require recovery. Even if you don't come from Oracle, there's really nothing about the name of this shutdown mode that intrinsically makes you understand that it's something you should do only as a last resort. Who doesn't like things that are immediate? The problem with this theory is that you make the shutdown quicker at the price of startup becoming much, much slower, because the crash recovery is very likely going to take a whole lot longer than the shutdown checkpoint would have done.

I attach herewith a modest patch to rename these shutdown modes to more accurately correspond to their actual characteristics.

--
Robert Haas
EDB: http://www.enterprisedb.com
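
The "checkpoint first, then shut down" idea from the message can be approximated today from outside the server. What follows is only a rough sketch under stated assumptions, not anything from the patch: the helper names (`checkpoint_command`, `stop_command`, `checkpoint_then_stop`) are hypothetical, `psql` and `pg_ctl` are assumed to be on the PATH, and the connecting role is assumed to have the privileges needed to run `CHECKPOINT`. The signal names in the table are the ones the PostgreSQL documentation associates with each mode.

```python
import subprocess
from typing import Callable, List

# Signal the postmaster receives for each shutdown mode (per the PostgreSQL docs).
SHUTDOWN_SIGNALS = {
    "smart": "SIGTERM",      # waits for all sessions to disconnect on their own
    "fast": "SIGINT",        # terminates sessions, then runs a shutdown checkpoint
    "immediate": "SIGQUIT",  # no checkpoint; crash recovery runs at next startup
}


def checkpoint_command(dbname: str = "postgres") -> List[str]:
    """Build a psql call that issues an ordinary (non-shutdown) CHECKPOINT."""
    return ["psql", "-X", "-d", dbname, "-c", "CHECKPOINT"]


def stop_command(data_dir: str, mode: str = "fast") -> List[str]:
    """Build a pg_ctl call that stops the server in the given mode."""
    if mode not in SHUTDOWN_SIGNALS:
        raise ValueError(f"unknown shutdown mode: {mode!r}")
    return ["pg_ctl", "stop", "-D", data_dir, "-m", mode]


def checkpoint_then_stop(
    data_dir: str,
    dbname: str = "postgres",
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Flush dirty data while connections are still allowed, then stop fast.

    The first checkpoint does the bulk of the writing, so the shutdown
    checkpoint that follows should have much less left to do.
    """
    commands = [checkpoint_command(dbname), stop_command(data_dir, "fast")]
    for cmd in commands:
        run(cmd, check=True)  # with subprocess.run, raises if a step fails
    return commands
```

Calling `checkpoint_then_stop("/path/to/data")` would run the two commands in order; passing a different `run` callable makes it possible to inspect the commands without touching a real server. Data dirtied between the two steps still has to be written by the shutdown checkpoint, so this narrows the window of refused connections rather than eliminating it.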