Re: index prefetching
From | Tomas Vondra |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | f0fda1a3-23ef-4cbe-b6d5-48f0bd9432cb@enterprisedb.com |
In response to | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-hackers |
On 2/15/24 17:42, Peter Geoghegan wrote:
> On Thu, Feb 15, 2024 at 9:36 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> On 2/15/24 00:06, Peter Geoghegan wrote:
>>> I suppose that it might be much more important than I imagine it is
>>> right now, but it'd be nice to have something a bit more concrete to
>>> go on.
>>>
>>
>> This probably depends on which corner cases are considered important.
>>
>> The page-at-a-time approach essentially means index items at the
>> beginning of the page won't get prefetched (or vice versa, prefetch
>> distance drops to 0 when we get to end of index page).
>
> I don't think that's true. At least not for nbtree scans.
>
> As I went into last year, you'd get the benefit of the work I've done
> on "boundary cases" (most recently in commit c9c0589f from just a
> couple of months back), which helps us get the most out of suffix
> truncation. This maximizes the chances of only having to scan a single
> index leaf page in many important cases. So I can see no reason why
> index items at the beginning of the page are at any particular
> disadvantage (compared to those from the middle or the end of the
> page).
>

I may be missing something, but it seems fairly self-evident to me an
entry at the beginning of an index page won't get prefetched (assuming
the page-at-a-time thing).

If I understand your point about boundary cases / suffix truncation,
that helps us by (a) picking the split in a way to minimize a single key
spanning multiple pages, if possible and (b) increasing the number of
entries that fit onto a single index page.

That's certainly true / helpful, and it makes the "first entry" issue
much less common. But the issue is still there. Of course, this says
nothing about the importance of the issue - the impact may easily be so
small it's not worth worrying about.
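The "first entry" effect can be shown with a toy model. The sketch below is not PostgreSQL code: `simulate_leads` and its parameters are hypothetical names, and it assumes the prefetcher may only look at items on the leaf page currently being processed. For every item it reports how many steps before the read its prefetch was issued (0 means the item was never prefetched).

```python
def simulate_leads(page_sizes, distance):
    """Toy model of page-at-a-time prefetching (hypothetical, not PostgreSQL code).

    page_sizes: number of index items on each leaf page, in scan order.
    distance:   target prefetch distance, in items.
    Returns one "lead" per item: how many steps ahead of the read the
    prefetch was issued, or 0 if the item was never prefetched.
    """
    leads = []
    for n_items in page_sizes:
        issued_at = {}  # position on this page -> position at which it was prefetched
        for pos in range(n_items):
            # Prefetch up to `distance` items ahead, but never past the page end.
            last = min(pos + distance, n_items - 1)
            for ahead in range(pos + 1, last + 1):
                issued_at.setdefault(ahead, pos)
            # Now "read" the current item and record how early it was prefetched.
            leads.append(pos - issued_at[pos] if pos in issued_at else 0)
    return leads


if __name__ == "__main__":
    print(simulate_leads([4, 4], distance=2))  # [0, 1, 2, 2, 0, 1, 2, 2]
```

With two leaf pages of four items and a target distance of 2, the first item on each page gets no prefetch and the target distance is reached only from the third item on. The more items fit on a page, the smaller the affected fraction, which is why the practical impact may be minor.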
> Where you might have a problem is cases where it's just inherently
> necessary to visit more than a single leaf page, despite the best
> efforts of the nbtsplitloc.c logic -- cases where the scan just
> inherently needs to return tuples that "straddle the boundary between
> two neighboring pages". That isn't a particularly natural restriction,
> but it's also not obvious that it's all that much of a disadvantage in
> practice.
>

One case I've been thinking about is sorting using an index, where we
often read a large part of the index.

>> It certainly was a great improvement, no doubt about that. I dislike the
>> restriction, but that's partially for aesthetic reasons - it just seems
>> it'd be nice to not have this.
>>
>> That being said, I'd be OK with having this restriction if it makes v1
>> feasible. For me, the big question is whether it'd mean we're stuck with
>> this restriction forever, or whether there's a viable way to improve
>> this in v2.
>
> I think that there is no question that this will need to not
> completely disable kill_prior_tuple -- I'd be surprised if one single
> person disagreed with me on this point. There is also a more nuanced
> way of describing this same restriction, but we don't necessarily need
> to agree on what exactly that is right now.
>

Even for the page-at-a-time approach? Or are you talking about the v2?

>> And I don't have answer to that :-( I got completely lost in the ongoing
>> discussion about the locking implications (which I happily ignored while
>> working on the PoC patch), layering tensions and questions which part
>> should be "in control".
>
> Honestly, I always thought that it made sense to do things on the
> index AM side. When you went the other way I was surprised. Perhaps I
> should have said more about that, sooner, but I'd already said quite a
> bit at that point, so...
>
> Anyway, I think that it's pretty clear that "naive desynchronization"
> is just not acceptable, because that'll disable kill_prior_tuple
> altogether. So you're going to have to do this in a way that more or
> less preserves something like the current kill_prior_tuple behavior.
> It's going to have some downsides, but those can be managed. They can
> be managed from within the index AM itself, a bit like the
> _bt_killitems() no-pin stuff does things already.
>
> Obviously this interpretation suggests that doing things at the index
> AM level is indeed the right way to go, layering-wise. Does it make
> sense to you, though?
>

Yeah. The basic idea was that by moving this above the index AM it would
work for all indexes automatically - but given the current discussion
about kill_prior_tuple, locking etc. I'm not sure that's really
feasible.

The index AM clearly needs to have more control over this.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company