Thread: cost_sort() may need to be updated
For external sorts, cost_sort() assumes the following:

 * The disk traffic is assumed to be 3/4ths sequential and 1/4th random
 * accesses (XXX can't we refine that guess?)

...

    /* Assume 3/4ths of accesses are sequential, 1/4th are not */
    startup_cost += npageaccesses *
        (seq_page_cost * 0.75 + random_page_cost * 0.25);

I think that we *can* refine this guess, and should, because random I/O is really quite unlikely to be a large cost these days (I/O in general often isn't a large cost, actually). More fundamentally, I think it's a problem that cost_sort() thinks that external sorts are far more expensive than internal sorts in general. There is good reason to think that that does not reflect the reality. I think we can expect external sorts to be *faster* than internal sorts with increasing regularity in Postgres 10.

In one sense, things got worse here in 9.6 when replacement selection was all but killed, since runs became on average half their size, which cost_sort() was taught about at the time. But cost_sort() didn't care about cases where only one pass is predicted (the great majority of external sorts), so that didn't really matter.

cost_sort() doesn't seem to care about the size of the runs to be merged at all, which seems limiting. It also seems limiting that cost_sort() doesn't differentiate between internal and external sort *startup* costs at all: external sorts can often start returning tuples sooner, due to the final on-the-fly merge mechanism. It's pretty common to see cases where merging 4 runs takes only slightly less time than an internal sort, and yet is costed at more than twice as much. I realize that costs are always a bit fudged, but I am fairly suspicious of how big the cost differences between external and internal sorts regularly are.

I suggest that this be revisited soon.

--
Peter Geoghegan
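For reference, the arithmetic around that snippet can be sketched as a standalone function roughly like the following (a simplified rendering for illustration only: the function name is made up, the 8 kB block size is hard-coded, and the cost GUCs and merge order are taken as plain parameters rather than looked up):

    #include <math.h>

    /*
     * Rough sketch of cost_sort()'s disk-cost estimate for an external sort.
     * Each merge pass reads and writes every page once, the number of passes
     * is about ceil(log_M(r)) for r initial runs and merge order M, and the
     * resulting page accesses are charged as 3/4 sequential, 1/4 random.
     */
    static double
    external_sort_disk_cost(double input_bytes, double sort_mem_bytes,
                            double merge_order,
                            double seq_page_cost, double random_page_cost)
    {
        double      npages = ceil(input_bytes / 8192.0);    /* assume 8 kB blocks */
        double      nruns = input_bytes / sort_mem_bytes;   /* initial runs */
        double      log_runs;
        double      npageaccesses;

        /* Compute logM(r) as log(r) / log(M), with at least one pass */
        if (nruns > merge_order)
            log_runs = ceil(log(nruns) / log(merge_order));
        else
            log_runs = 1.0;

        /* Each pass reads and writes every page once */
        npageaccesses = 2.0 * npages * log_runs;

        /* Assume 3/4ths of accesses are sequential, 1/4th are not */
        return npageaccesses *
            (seq_page_cost * 0.75 + random_page_cost * 0.25);
    }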
Peter Geoghegan <pg@heroku.com> writes:
> I think that we *can* refine this guess, and should, because random
> I/O is really quite unlikely to be a large cost these days (I/O in
> general often isn't a large cost, actually). More fundamentally, I
> think it's a problem that cost_sort() thinks that external sorts are
> far more expensive than internal sorts in general. There is good
> reason to think that that does not reflect the reality. I think we can
> expect external sorts to be *faster* than internal sorts with
> increasing regularity in Postgres 10.

TBH, if that's true, haven't you broken something?

			regards, tom lane
On Sun, Sep 11, 2016 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I think that we *can* refine this guess, and should, because random
>> I/O is really quite unlikely to be a large cost these days (I/O in
>> general often isn't a large cost, actually). More fundamentally, I
>> think it's a problem that cost_sort() thinks that external sorts are
>> far more expensive than internal sorts in general. There is good
>> reason to think that that does not reflect the reality. I think we can
>> expect external sorts to be *faster* than internal sorts with
>> increasing regularity in Postgres 10.
>
> TBH, if that's true, haven't you broken something?

It's possible for external sorts to be faster some of the time because the memory access patterns can be more cache efficient: smaller runs are better when accessing tuples in sorted order, scattered across memory. More importantly, the sort can start returning tuples earlier in the common case where a final on-the-fly merge can be performed. In principle, you could adapt internal sorts to have the same advantages, but that hasn't happened and probably won't. Finally, the external sort I/O costs grow linearly, whereas the CPU costs grow in a linearithmic fashion, which will eventually come to dominate. We can hide the latency of those costs pretty well, too, with asynchronous I/O.

I'm not arguing that cost_sort() should think that external sorts are cheaper under any circumstances, since all of this is very hard to model. I only mention this because it illustrates nicely that cost_sort() has the wrong idea.

--
Peter Geoghegan
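To put rough numbers on that last point about growth rates, here is a toy calculation (the tuples-per-page figure is an arbitrary assumption, and the counts are abstract units of work, not costs) showing how the N log2 N comparison count pulls away from the linear page-access count as the input grows:

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        double  tuples_per_page = 100.0;    /* assumed, purely for illustration */
        double  n;

        for (n = 1e6; n <= 1e9; n *= 10)
        {
            double  comparisons = n * log2(n);                  /* CPU work: ~N log2 N */
            double  pageaccesses = 2.0 * (n / tuples_per_page); /* one write + one read
                                                                 * per page, single pass */

            printf("N=%.0e  comparisons=%.3g  page accesses=%.3g  ratio=%.0f\n",
                   n, comparisons, pageaccesses, comparisons / pageaccesses);
        }
        return 0;
    }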
On Sun, Sep 11, 2016 at 12:13 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Sun, Sep 11, 2016 at 9:01 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Peter Geoghegan <pg@heroku.com> writes:
>>> I think that we *can* refine this guess, and should, because random
>>> I/O is really quite unlikely to be a large cost these days (I/O in
>>> general often isn't a large cost, actually). More fundamentally, I
>>> think it's a problem that cost_sort() thinks that external sorts are
>>> far more expensive than internal sorts in general. There is good
>>> reason to think that that does not reflect the reality. I think we can
>>> expect external sorts to be *faster* than internal sorts with
>>> increasing regularity in Postgres 10.
>>
>> TBH, if that's true, haven't you broken something?
>
> It's possible for external sorts to be faster some of the time because
> the memory access patterns can be more cache efficient: smaller runs
> are better when accessing tuples in sorted order, scattered across
> memory. More importantly, the sort can start returning tuples earlier
> in the common case where a final on-the-fly merge can be performed. In
> principle, you could adapt internal sorts to have the same advantages,
> but that hasn't happened and probably won't. Finally, the external sort
> I/O costs grow linearly, whereas the CPU costs grow in a linearithmic
> fashion, which will eventually come to dominate. We can hide the
> latency of those costs pretty well, too, with asynchronous I/O.
>
> I'm not arguing that cost_sort() should think that external sorts are
> cheaper under any circumstances, since all of this is very hard to
> model. I only mention this because it illustrates nicely that
> cost_sort() has the wrong idea.

I think that the only real way to judge whether cost_sort() is any good is to see whether it causes the wrong plan to be chosen in some cases. For example, it could cause merge joins to be picked too frequently or too infrequently, or it could cause a wrong decision between hash and group aggregates. Now, there is bound to be an argument about whether any test cases that we might pick are representative, and the results will further differ between hardware platforms, but I think changing the formula based on an abstract idea about what's happening is probably not a good idea.

Because it's necessary to set work_mem so conservatively to avoid running out of memory, most supposedly-external sorts aren't really doing any I/O at all, and therefore arguing about whether we should be charging seq_page_cost or random_page_cost for I/O is kinda missing the point. Those numbers may just be reflecting other work that the sort is doing that isn't being properly accounted for elsewhere, or it's possible that the current costing of sorts is fairly far away from reality. How would we know? The only thing that makes me think they probably aren't *too* bad is that in a somewhat-systematic study of query performance problems (https://sites.google.com/site/robertmhaas/query-performance) done several years ago I found no evidence of a problem with sort costing, and my experience outside of that study bears that out. But that's pretty back of the envelope.

Of course, the issue of supposed I/O actually being reads of cached data is not unique to sorting. Sequential scans have the same problem.
We've previously discussed the idea of adding a cached_page_cost, but this has fallen down over the issue that you also need to know what percentage of the table is likely to be read from cache, and you don't want that estimate to be too unstable or your plans will bounce all over the place. One idea I had for dealing with that is to just base it on the size of the table: assume small tables are probably cached, and large tables probably aren't. The latter is certainly true at least in the case where "large" means "bigger than physical memory", and the former isn't a bad guess either. So basically where this would go is to assume that scanning a small table is cheaper per page than scanning a big one, which I suspect is probably better than the uniform estimate we're using today. However, I haven't got a single test case to prove it, and in fact I'm not even sure how one could go about figuring out whether this makes things better or worse in general.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
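A hypothetical sketch of that size-based heuristic (nothing like this exists in the planner: cached_page_cost is not a real GUC, the function name is made up, and using a cache-size figure as the dividing line between "small" and "large" is just one way the guess could be drawn):

    /*
     * Hypothetical per-page cost for scanning a relation, interpolating
     * between a (hypothetical) cached_page_cost and seq_page_cost based on
     * how the relation's size compares to the cache size we think we have.
     * Small tables are charged close to the cached rate; tables much bigger
     * than the cache are charged close to the uncached sequential rate.
     */
    static double
    size_adjusted_page_cost(double rel_pages, double cache_pages,
                            double cached_page_cost, double seq_page_cost)
    {
        double      cached_fraction;

        if (rel_pages <= 0.0)
            return cached_page_cost;

        /* crude guess: fraction of the table likely to be in cache */
        cached_fraction = cache_pages / rel_pages;
        if (cached_fraction > 1.0)
            cached_fraction = 1.0;

        return cached_fraction * cached_page_cost +
            (1.0 - cached_fraction) * seq_page_cost;
    }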
On Tue, Sep 13, 2016 at 5:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think that the only real way to judge whether cost_sort() is any
> good is to see whether it causes the wrong plan to be chosen in some
> cases. For example, it could cause merge joins to be picked too
> frequently or too infrequently, or it could cause a wrong decision
> between hash and group aggregates. Now, there is bound to be an
> argument about whether any test cases that we might pick are
> representative, and the results will further differ between hardware
> platforms, but I think changing the formula based on an abstract idea
> about what's happening is probably not a good idea.

I don't disagree with any of this. And I don't pretend to have the best understanding of how real-world observations about the quality of plans can make their way back into the query planner, and be integrated, without regressing too many other things. The basis of my complaint is that cost_sort() typically costs external sorts at twice or more the cost of internal sorts, and that does not seem at all consistent with my *recent* observations, even without bringing caching effects into it. That seems like a safe enough general concern for someone who is relatively unfamiliar with the optimizer to raise, because external sorts and internal sorts really aren't so different these days.

> Because it's necessary to set work_mem so conservatively to avoid
> running out of memory, most supposedly-external sorts aren't really
> doing any I/O at all, and therefore arguing about whether we should be
> charging seq_page_cost or random_page_cost for I/O is kinda missing
> the point.

I don't take issue with how internal sorts are costed at all. I think that the factors that direct the tuning of work_mem might prominently include the anti-scaling effect of replacement selection. To the DBA, that can just look like simple memory pressure, perhaps. Time will tell how true this is.

> Those numbers may just be reflecting other work that the
> sort is doing that isn't being properly accounted for elsewhere, or
> it's possible that the current costing of sorts is fairly far away
> from reality. How would we know? The only thing that makes me think
> they probably aren't *too* bad is that in a somewhat-systematic study
> of query performance problems
> (https://sites.google.com/site/robertmhaas/query-performance) done
> several years ago I found no evidence of a problem with sort costing,
> and my experience outside of that study bears that out. But that's
> pretty back of the envelope.

It's also out of date. As I said, we don't have any real field experience with the 9.6 external sort changes yet, but we'll have some soon enough. I'm happy to leave it at that, for now -- maybe someone will recall this thread at a later date, when there is a complaint about a suboptimal query plan from a user.

--
Peter Geoghegan