Обсуждение: how to troubleshoot a wrong query plan

Поиск
Список
Период
Сортировка

how to troubleshoot a wrong query plan

От
jian xu
Дата:

Hello,

          We found a slow query(it took minutes), which is due to the bad execution plan.  The plan uses hash join with 2 tables, the problem is that it uses seq scan on a very large table(the join only returns 1k rows). Actually there is a index on the join column, If we disable hash join and merge join, and force it to run with nestloop join, the plan can use the index scan on the table, which is super fast(completes in a sec).

By comparing the bad plan(seq scan with hash join) and good plan(index scan next loop join), the estimation value of the good query plan is much smaller than the bad plan, my understanding is pg execution plan is cost based. But why it will pick a plan with a high estimation cost?

I also run vacuum and analyze, even run alter column set statistics to 10,000 , but it always picks the seq scan with hash join which has a high estimation cost. Is there any way to figure out why pg decides to use seq scan + hash join, is there any way to look inside how the plan is generated?

 

Thanks,

 

James

 

RE: how to troubleshoot a wrong query plan

От
jian xu
Дата:

Sorry, we figured out that the PostgreSQL.conf has the nestloop disabled….that’s why It picked merge join by default

But it might be still very helpful if there is a way to show how the plan is generated , or some debug information about the plan generating, for example in this case, if we can debug the plan and see nestloop is disable, it will be easy for us figure out the problem.

Thanks

James

 

Sent from Mail for Windows

 

From: jian xu
Sent: Wednesday, December 27, 2023 3:45 PM
To: pgsql-admin@lists.postgresql.org
Subject: how to troubleshoot a wrong query plan

 

Hello,

          We found a slow query(it took minutes), which is due to the bad execution plan.  The plan uses hash join with 2 tables, the problem is that it uses seq scan on a very large table(the join only returns 1k rows). Actually there is a index on the join column, If we disable hash join and merge join, and force it to run with nestloop join, the plan can use the index scan on the table, which is super fast(completes in a sec).

By comparing the bad plan(seq scan with hash join) and good plan(index scan next loop join), the estimation value of the good query plan is much smaller than the bad plan, my understanding is pg execution plan is cost based. But why it will pick a plan with a high estimation cost?

I also run vacuum and analyze, even run alter column set statistics to 10,000 , but it always picks the seq scan with hash join which has a high estimation cost. Is there any way to figure out why pg decides to use seq scan + hash join, is there any way to look inside how the plan is generated?

 

Thanks,

 

James

 

 

Re: how to troubleshoot a wrong query plan

От
Tom Lane
Дата:
jian xu <jamesxu@outlook.com> writes:
> Sorry, we figured out that the PostgreSQL.conf has the nestloop disabled….that’s why It picked merge join by default
> But it might be still very helpful if there is a way to show how the plan is generated , or some debug information
aboutthe plan generating, for example in this case, if we can debug the plan and see nestloop is disable, it will be
easyfor us figure out the problem. 

In supported releases of PG, there's the EXPLAIN (SETTINGS) option, which
will print the values of any non-default settings that affect planning.
Of course, you have to remember to use that, but it does respond to
your point.

            regards, tom lane



Re: how to troubleshoot a wrong query plan

От
jian xu
Дата:
Thanks Tom, except for setting parameters, is there any other ways to show debug information about plan generating? For example, can pg show all the candidate plans it searched, and the cost of each candidate plan?


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Wednesday, December 27, 2023 4:21:51 PM
To: jian xu <jamesxu@outlook.com>
Cc: pgsql-admin@lists.postgresql.org <pgsql-admin@lists.postgresql.org>
Subject: Re: how to troubleshoot a wrong query plan
 
jian xu <jamesxu@outlook.com> writes:
> Sorry, we figured out that the PostgreSQL.conf has the nestloop disabled….that’s why It picked merge join by default
> But it might be still very helpful if there is a way to show how the plan is generated , or some debug information about the plan generating, for example in this case, if we can debug the plan and see nestloop is disable, it will be easy for us figure out the problem.

In supported releases of PG, there's the EXPLAIN (SETTINGS) option, which
will print the values of any non-default settings that affect planning.
Of course, you have to remember to use that, but it does respond to
your point.

                        regards, tom lane

Re: how to troubleshoot a wrong query plan

От
Tom Lane
Дата:
jian xu <jamesxu@outlook.com> writes:
> Thanks Tom, except for setting parameters, is there any other ways to show debug information about plan generating?
Forexample, can pg show all the candidate plans it searched, and the cost of each candidate plan? 

No, because rejected alternatives are not carried through to the point
of having a displayable plan.  (Trying to do so would be enormously
expensive.)  People have experimented with adding trace facilities
that would show some details about rejected plan alternatives, but
none of that has led to a facility that I'd care to use.

            regards, tom lane