Обсуждение: index paths and enable_indexscan
Hi, Maybe I am missing something obvious, but is it intentional that enable_indexscan is checked by cost_index(), that is, *after* creating an index path? I was expecting that if enable_indexscan is off, then no index paths would be generated to begin with, because I thought they are optional. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote:
Hi,
Maybe I am missing something obvious, but is it intentional that
enable_indexscan is checked by cost_index(), that is, *after* creating
an index path? I was expecting that if enable_indexscan is off, then
no index paths would be generated to begin with, because I thought
they are optional.
that setting enable_xxx to off only adds a penalty factor (disable_cost)
to the path's cost. The path would be still generated and compete with
other paths in add_path().
Thanks
Richard
On Tue, Apr 14, 2020 at 4:13 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote: >> Maybe I am missing something obvious, but is it intentional that >> enable_indexscan is checked by cost_index(), that is, *after* creating >> an index path? I was expecting that if enable_indexscan is off, then >> no index paths would be generated to begin with, because I thought >> they are optional. > > > I think the cost estimate of index paths is the same as other paths on > that setting enable_xxx to off only adds a penalty factor (disable_cost) > to the path's cost. The path would be still generated and compete with > other paths in add_path(). Yeah, but I am asking why build the path to begin with, as there will always be seq scan path for base rels. Turning enable_hashjoin off, for example, means that no hash join paths will be built at all. Looking into the archives, I see that the idea of "not generating disabled paths to begin with" was discussed quite recently: https://www.postgresql.org/message-id/29821.1572706653%40sss.pgh.pa.us -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 3:40 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Apr 14, 2020 at 4:13 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> Maybe I am missing something obvious, but is it intentional that
>> enable_indexscan is checked by cost_index(), that is, *after* creating
>> an index path? I was expecting that if enable_indexscan is off, then
>> no index paths would be generated to begin with, because I thought
>> they are optional.
>
>
> I think the cost estimate of index paths is the same as other paths on
> that setting enable_xxx to off only adds a penalty factor (disable_cost)
> to the path's cost. The path would be still generated and compete with
> other paths in add_path().
Yeah, but I am asking why build the path to begin with, as there will
always be seq scan path for base rels.
I guess that is because user may disable seqscan as well. If so, we
still need formula to decide with one to use, which requires index path
has to be calculated. but since disabling the two at the same time is rare,
we can ignore the index path build if user allow seqscan
Turning enable_hashjoin off,
for example, means that no hash join paths will be built at all.
As for join, the difference is even user allows a join method by setting,
but the planner may still not able to use it. so the disabled path still need
to be used. Consider query "select * from t1, t2 where f(t1.a, t2.a) = 3",
and user setting is enable_nestloop = off, enable_hashjoin = on.
But I think it is still possible to ignore the path generating after
some extra checking.
Looking into the archives, I see that the idea of "not generating
disabled paths to begin with" was discussed quite recently:
https://www.postgresql.org/message-id/29821.1572706653%40sss.pgh.pa.us
--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 5:29 PM Andy Fan <zhihui.fan1213@gmail.com> wrote: > On Tue, Apr 14, 2020 at 3:40 PM Amit Langote <amitlangote09@gmail.com> wrote: >> On Tue, Apr 14, 2020 at 4:13 PM Richard Guo <guofenglinux@gmail.com> wrote: >> > On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote: >> >> Maybe I am missing something obvious, but is it intentional that >> >> enable_indexscan is checked by cost_index(), that is, *after* creating >> >> an index path? I was expecting that if enable_indexscan is off, then >> >> no index paths would be generated to begin with, because I thought >> >> they are optional. >> > >> > I think the cost estimate of index paths is the same as other paths on >> > that setting enable_xxx to off only adds a penalty factor (disable_cost) >> > to the path's cost. The path would be still generated and compete with >> > other paths in add_path(). >> >> Yeah, but I am asking why build the path to begin with, as there will >> always be seq scan path for base rels. > > I guess that is because user may disable seqscan as well. If so, we > still need formula to decide with one to use, which requires index path > has to be calculated. but since disabling the two at the same time is rare, > we can ignore the index path build if user allow seqscan I am saying that instead of building index path with disabled cost, just don't build it at all. A base rel will always have a sequetial path, even though with disabled cost if enable_seqscan = off. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 4:58 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Tue, Apr 14, 2020 at 5:29 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> On Tue, Apr 14, 2020 at 3:40 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> On Tue, Apr 14, 2020 at 4:13 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> > On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> >> Maybe I am missing something obvious, but is it intentional that
>> >> enable_indexscan is checked by cost_index(), that is, *after* creating
>> >> an index path? I was expecting that if enable_indexscan is off, then
>> >> no index paths would be generated to begin with, because I thought
>> >> they are optional.
>> >
>> > I think the cost estimate of index paths is the same as other paths on
>> > that setting enable_xxx to off only adds a penalty factor (disable_cost)
>> > to the path's cost. The path would be still generated and compete with
>> > other paths in add_path().
>>
>> Yeah, but I am asking why build the path to begin with, as there will
>> always be seq scan path for base rels.
>
> I guess that is because user may disable seqscan as well. If so, we
> still need formula to decide with one to use, which requires index path
> has to be calculated. but since disabling the two at the same time is rare,
> we can ignore the index path build if user allow seqscan
I am saying that instead of building index path with disabled cost,
just don't build it at all. A base rel will always have a sequetial
path, even though with disabled cost if enable_seqscan = off.
Let's say user set enable_seqscan=off and set enable_indexscan=off;
will you expect user to get seqscan at last? If so, why is seqscan
(rather than index scan) since both are disabled by user equally?
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 5:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Apr 14, 2020 at 4:58 PM Amit Langote <amitlangote09@gmail.com> wrote:On Tue, Apr 14, 2020 at 5:29 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> On Tue, Apr 14, 2020 at 3:40 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> On Tue, Apr 14, 2020 at 4:13 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> > On Tue, Apr 14, 2020 at 2:44 PM Amit Langote <amitlangote09@gmail.com> wrote:
>> >> Maybe I am missing something obvious, but is it intentional that
>> >> enable_indexscan is checked by cost_index(), that is, *after* creating
>> >> an index path? I was expecting that if enable_indexscan is off, then
>> >> no index paths would be generated to begin with, because I thought
>> >> they are optional.
>> >
>> > I think the cost estimate of index paths is the same as other paths on
>> > that setting enable_xxx to off only adds a penalty factor (disable_cost)
>> > to the path's cost. The path would be still generated and compete with
>> > other paths in add_path().
>>
>> Yeah, but I am asking why build the path to begin with, as there will
>> always be seq scan path for base rels.
>
> I guess that is because user may disable seqscan as well. If so, we
> still need formula to decide with one to use, which requires index path
> has to be calculated. but since disabling the two at the same time is rare,
> we can ignore the index path build if user allow seqscan
I am saying that instead of building index path with disabled cost,
just don't build it at all. A base rel will always have a sequetial
path, even though with disabled cost if enable_seqscan = off.Let's say user set enable_seqscan=off and set enable_indexscan=off;will you expect user to get seqscan at last? If so, why is seqscan(rather than index scan) since both are disabled by user equally?
The following test should demonstrate what I think.
demo=# create table t(a int);
CREATE TABLE
demo=# insert into t select generate_series(1, 10000000);
INSERT 0 10000000
demo=# create index t_a on t(a);
CREATE INDEX
demo=# analyze t;
ANALYZE
demo=# set enable_seqscan to off;
SET
demo=# set enable_indexscan to off;
SET
demo=# set enable_bitmapscan to off;
SET
demo=# set enable_indexonlyscan to off;
SET
demo=# explain select * from t where a = 1;
QUERY PLAN
---------------------------------------------------------------------------------
Index Scan using t_a on t (cost=10000000000.43..10000000008.45 rows=1 width=4)
Index Cond: (a = 1)
(2 rows)
CREATE TABLE
demo=# insert into t select generate_series(1, 10000000);
INSERT 0 10000000
demo=# create index t_a on t(a);
CREATE INDEX
demo=# analyze t;
ANALYZE
demo=# set enable_seqscan to off;
SET
demo=# set enable_indexscan to off;
SET
demo=# set enable_bitmapscan to off;
SET
demo=# set enable_indexonlyscan to off;
SET
demo=# explain select * from t where a = 1;
QUERY PLAN
---------------------------------------------------------------------------------
Index Scan using t_a on t (cost=10000000000.43..10000000008.45 rows=1 width=4)
Index Cond: (a = 1)
(2 rows)
If we just disable index path, we will get seqscan at last.
Regards
Andy Fan
Amit Langote
EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 14, 2020 at 6:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote: > On Tue, Apr 14, 2020 at 4:58 PM Amit Langote <amitlangote09@gmail.com> wrote: >> I am saying that instead of building index path with disabled cost, >> just don't build it at all. A base rel will always have a sequetial >> path, even though with disabled cost if enable_seqscan = off. > > Let's say user set enable_seqscan=off and set enable_indexscan=off; > will you expect user to get seqscan at last? If so, why is seqscan > (rather than index scan) since both are disabled by user equally? I was really thinking of this in terms of planner effort, which for creating an index path is more than creating sequential path, although sure the payoff can be great. That is, I want the planner to avoid creating index paths *to save cycles*, but see no way of making that happen. I was thinking disabling enable_indexscan would do the trick. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com
Amit Langote <amitlangote09@gmail.com> writes: > I am saying that instead of building index path with disabled cost, > just don't build it at all. A base rel will always have a sequetial > path, even though with disabled cost if enable_seqscan = off. Awhile back I'd looked into getting rid of disable_cost altogether by dint of not generating disabled paths. It's harder than it sounds. We could perhaps change this particular case, but it's not clear that there's any real benefit of making this one change in isolation. Note that you can't just put a big OFF switch at the start of indxpath.c, because enable_indexscan and enable_bitmapscan are distinct switches, but the code to generate those path types is inextricably intertwined. Skipping individual paths further down on the basis of the appropriate switch would be fairly subtle and perhaps bug-prone. The existing implementation of those switches has the advantages of being trivially simple and clearly correct (for some value of "correct"). regards, tom lane
Amit Langote <amitlangote09@gmail.com> writes: > I was really thinking of this in terms of planner effort, which for > creating an index path is more than creating sequential path, although > sure the payoff can be great. That is, I want the planner to avoid > creating index paths *to save cycles*, but see no way of making that > happen. I was thinking disabling enable_indexscan would do the trick. I think that's completely misguided, because in point of fact nobody is going to care about the planner's performance with enable_indexscan turned off. It's not an interesting production case. All of these enable_xxx switches exist just for debug purposes, and so the right way to think about them is "what's the simplest, least bug-prone, lowest-maintenance way to get the effect?". Likewise, I don't actually much care what results you get if you turn off *all* of them. It's not a useful case to spend our effort on. regards, tom lane
On Tue, Apr 14, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Awhile back I'd looked into getting rid of disable_cost altogether > by dint of not generating disabled paths. It's harder than it > sounds. We could perhaps change this particular case, but it's > not clear that there's any real benefit of making this one change > in isolation. I like the idea and have had the same thought before. I wondered whether we could arrange to generate paths for a rel and then if we end up with no paths, do it again ignoring the disable flags. It didn't seem entirely easy to rearrange things to work that way, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company