Discussion: Reducing bgwriter wakeups
Recent changes for power reduction mean that we now issue a wakeup call to the bgwriter every time we set a hint bit.

However cheap that is, it's still overkill.

My proposal is that we wake up the bgwriter whenever a backend is forced to write a dirty buffer, a job the bgwriter should have been doing.

This significantly reduces the number of wakeup calls and allows the bgwriter to stay asleep even when very light traffic happens, which is good because the bgwriter is often the last process to sleep.

Seems useful to have an explicit discussion on this point, especially in view of recent performance results.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
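To make the difference between the two policies concrete, here is a toy Python model that only counts wakeup calls. It is a sketch under stated assumptions, not PostgreSQL code: the event names `dirty` and `backend_write`, the policy names, and both workloads are hypothetical, and the model ignores how the real bgwriter sleeps.

```python
# Toy model only: event names, policy names and workloads are hypothetical,
# invented for illustration. None of this is a PostgreSQL API.
from typing import Iterable

# Which event triggers a wakeup call under each policy.
POLICY_TRIGGER = {
    # Behaviour described in the message: signal whenever a buffer is
    # dirtied, for example when a hint bit is set.
    "every_dirty": "dirty",
    # Proposal: signal only when a backend had to write a dirty buffer itself.
    "backend_write": "backend_write",
}


def count_wakeup_calls(events: Iterable[str], policy: str) -> int:
    """Count how many wakeup calls the given policy would issue."""
    trigger = POLICY_TRIGGER[policy]
    return sum(1 for event in events if event == trigger)


# Made-up workloads: light traffic never forces a backend to write.
light_traffic = ["dirty"] * 50
heavy_traffic = ["dirty"] * 50 + ["backend_write"] * 3

for name, events in [("light", light_traffic), ("heavy", heavy_traffic)]:
    for policy in POLICY_TRIGGER:
        print(name, policy, count_wakeup_calls(events, policy))
```

In this model the light workload produces no wakeup calls at all under the proposed policy, which is the effect the proposal is after. The model says nothing about whether waking that late is acceptable.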
On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Recent changes for power reduction mean that we now issue a wakeup
> call to the bgwriter every time we set a hint bit.
>
> However cheap that is, it's still overkill.
>
> My proposal is that we wake up the bgwriter whenever a backend is
> forced to write a dirty buffer, a job the bgwriter should have been
> doing.
>
> This significantly reduces the number of wakeup calls and allows the
> bgwriter to stay asleep even when very light traffic happens, which is
> good because the bgwriter is often the last process to sleep.
>
> Seems useful to have an explicit discussion on this point, especially
> in view of recent performance results.

I don't see what this has to do with recent performance results, so please elaborate. Off-hand, I don't see any point in getting cheap. It seems far more important to me that the background writer become active when needed than that we save some trivial amount of power by waiting longer before activating it.

If we're concerned about saving power, then IMHO what we should be worried about is that the wal writer is still waking up 5x/s.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Recent changes for power reduction mean that we now issue a wakeup
>> call to the bgwriter every time we set a hint bit.
>>
>> However cheap that is, it's still overkill.
>>
>> My proposal is that we wake up the bgwriter whenever a backend is
>> forced to write a dirty buffer, a job the bgwriter should have been
>> doing.
>>
>> This significantly reduces the number of wakeup calls and allows the
>> bgwriter to stay asleep even when very light traffic happens, which is
>> good because the bgwriter is often the last process to sleep.
>>
>> Seems useful to have an explicit discussion on this point, especially
>> in view of recent performance results.
>
> I don't see what this has to do with recent performance results, so
> please elaborate. Off-hand, I don't see any point in getting cheap.
> It seems far more important to me that the background writer become
> active when needed than that we save some trivial amount of power by
> waiting longer before activating it.

Then you misunderstand, since I am advocating waking it when needed.

--
Simon Riggs
On Sun, Feb 19, 2012 at 4:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Recent changes for power reduction mean that we now issue a wakeup
>>> call to the bgwriter every time we set a hint bit.
>>>
>>> However cheap that is, it's still overkill.
>>>
>>> My proposal is that we wake up the bgwriter whenever a backend is
>>> forced to write a dirty buffer, a job the bgwriter should have been
>>> doing.
>>>
>>> This significantly reduces the number of wakeup calls and allows the
>>> bgwriter to stay asleep even when very light traffic happens, which is
>>> good because the bgwriter is often the last process to sleep.
>>>
>>> Seems useful to have an explicit discussion on this point, especially
>>> in view of recent performance results.
>>
>> I don't see what this has to do with recent performance results, so
>> please elaborate. Off-hand, I don't see any point in getting cheap.
>> It seems far more important to me that the background writer become
>> active when needed than that we save some trivial amount of power by
>> waiting longer before activating it.
>
> Then you misunderstand, since I am advocating waking it when needed.

Well, I guess that depends on when it's actually needed. You haven't presented any evidence one way or the other.

I mean, let's suppose that a sudden spike of activity hits a previously-idle system. If we wait until all of shared_buffers is dirty before waking up the background writer, it seems possible that the background writer is going to have a hard time catching up. If we wake it immediately, we don't have that problem.

Also, in general, I think that it's not a good idea to let dirty data sit in shared_buffers forever. I'm unhappy about the change this release cycle to skip checkpoints if we've written less than a full WAL segment, and this seems like another step in that direction. It's exposing us to needless risk of data loss. In 9.1, if you process a transaction and, an hour later, the disk where pg_xlog is written melts into a heap of molten slag, your transaction will be there, even if you end up having to run pg_resetxlog. In 9.2, it may well be that xlog contains the only record of that transaction, and you're hosed. The more work we do to postpone writing the data until the absolutely last possible moment, the more likely it is that it won't be on disk when we need it.

--
Robert Haas
On Sun, Feb 19, 2012 at 2:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Also, in general, I think that it's not a good idea to let dirty data
> sit in shared_buffers forever. I'm unhappy about the change this
> release cycle to skip checkpoints if we've written less than a full
> WAL segment, and this seems like another step in that direction. It's
> exposing us to needless risk of data loss. In 9.1, if you process a
> transaction and, an hour later, the disk where pg_xlog is written
> melts into a heap of molten slag, your transaction will be there, even
> if you end up having to run pg_resetxlog.

Would the log really have been archived in 9.1? I don't think checkpoint_timeout caused a log switch, just a checkpoint which could happily be in the same file as the previous checkpoint.

> In 9.2, it may well be that
> xlog contains the only record of that transaction, and you're hosed.
> The more work we do to postpone writing the data until the absolutely
> last possible moment, the more likely it is that it won't be on disk
> when we need it.

Isn't that what archive_timeout is for?

Should archive_timeout default to something like 5 min, rather than 0?

Cheers,

Jeff
On Sun, Feb 19, 2012 at 5:56 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Would the log really have been archived in 9.1? I don't think
> checkpoint_timeout caused a log switch, just a checkpoint which could
> happily be in the same file as the previous checkpoint.

The log segment doesn't need to get archived - it's sufficient that the dirty buffers get written to disk.

>> In 9.2, it may well be that
>> xlog contains the only record of that transaction, and you're hosed.
>> The more work we do to postpone writing the data until the absolutely
>> last possible moment, the more likely it is that it won't be on disk
>> when we need it.
>
> Isn't that what archive_timeout is for?
>
> Should archive_timeout default to something like 5 min, rather than 0?

I dunno. I think people who are doing replication are probably mostly using streaming replication these days, in which case archive_timeout won't matter one way or the other. But if you're not doing replication, your only hope of recovering from a trashed pg_xlog is that PostgreSQL wrote the buffers and (in the case of an OS crash) the OS wrote them to disk.

--
Robert Haas
On 20.02.2012 00:18, Robert Haas wrote:
> On Sun, Feb 19, 2012 at 4:11 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Sun, Feb 19, 2012 at 8:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Sun, Feb 19, 2012 at 1:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>>> Recent changes for power reduction mean that we now issue a wakeup
>>>> call to the bgwriter every time we set a hint bit.
>>>>
>>>> However cheap that is, it's still overkill.
>>>>
>>>> My proposal is that we wake up the bgwriter whenever a backend is
>>>> forced to write a dirty buffer, a job the bgwriter should have been
>>>> doing.
>>>>
>>>> This significantly reduces the number of wakeup calls and allows the
>>>> bgwriter to stay asleep even when very light traffic happens, which is
>>>> good because the bgwriter is often the last process to sleep.

That seems like swinging the pendulum too much in the other direction, as others have noted. A simple thing you could do, however, is to only wake up bgwriter every 10 dirtied pages in the backend or something like that. That would reduce the wakeups by a factor of 10. Would that be useful? It's not actually clear to me what the problem you're trying to solve is.

>>>> Seems useful to have an explicit discussion on this point, especially
>>>> in view of recent performance results.
>>>
>>> I don't see what this has to do with recent performance results, so
>>> please elaborate. Off-hand, I don't see any point in getting cheap.
>>> It seems far more important to me that the background writer become
>>> active when needed than that we save some trivial amount of power by
>>> waiting longer before activating it.
>>
>> Then you misunderstand, since I am advocating waking it when needed.
>
> Well, I guess that depends on when it's actually needed. You haven't
> presented any evidence one way or the other.
>
> I mean, let's suppose that a sudden spike of activity hits a
> previously-idle system. If we wait until all of shared_buffers is
> dirty before waking up the background writer, it seems possible that
> the background writer is going to have a hard time catching up. If we
> wake it immediately, we don't have that problem.

Well, as long as the OS has some clean buffers, as it presumably does if the system has been idle for a while, bgwriter will catch up very quickly by simply dumping a large number of dirty pages to the OS. Also, as the code stands, bgwriter still wakes up every 10 seconds even when no-one signals it, which makes this much less likely to happen. Nevertheless, I also feel that it would be better for bgwriter to be a bit more proactive than that.

> Also, in general, I think that it's not a good idea to let dirty data
> sit in shared_buffers forever. I'm unhappy about the change this
> release cycle to skip checkpoints if we've written less than a full
> WAL segment, and this seems like another step in that direction. It's
> exposing us to needless risk of data loss. In 9.1, if you process a
> transaction and, an hour later, the disk where pg_xlog is written
> melts into a heap of molten slag, your transaction will be there, even
> if you end up having to run pg_resetxlog. In 9.2, it may well be that
> xlog contains the only record of that transaction, and you're hosed.
> The more work we do to postpone writing the data until the absolutely
> last possible moment, the more likely it is that it won't be on disk
> when we need it.

True. (but as noted above, bgwriter still wakes up every 10 seconds so this isn't really an issue at the moment)

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
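The "every 10 dirtied pages" idea from the message above can be sketched as a simple per-backend counter. This is a minimal Python illustration, not PostgreSQL code: the class `ThrottledWaker`, its attribute and method names, and the `every_n` parameter are all hypothetical, and the default of 10 is taken only from the wording of the message.

```python
# Hypothetical sketch: every name here is invented for illustration
# and does not correspond to any PostgreSQL function or variable.
class ThrottledWaker:
    """Issue one wakeup call per `every_n` dirtied pages."""

    def __init__(self, every_n: int = 10) -> None:
        if every_n < 1:
            raise ValueError("every_n must be at least 1")
        self.every_n = every_n
        self.dirtied_since_signal = 0  # pages dirtied since the last call
        self.wakeup_calls = 0          # total wakeup calls issued

    def page_dirtied(self) -> bool:
        """Record one dirtied page; return True if a wakeup call was issued."""
        self.dirtied_since_signal += 1
        if self.dirtied_since_signal < self.every_n:
            return False
        # Threshold reached: reset the counter and signal the writer.
        self.dirtied_since_signal = 0
        self.wakeup_calls += 1
        return True


waker = ThrottledWaker(every_n=10)
for _ in range(100):
    waker.page_dirtied()
print(waker.wakeup_calls)  # 100 dirtied pages at one call per 10 pages
```

With fewer than `every_n` dirtied pages no call is issued at all, so a scheme like this would rely on the periodic 10-second wakeup mentioned in the message to pick up the remainder.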