Re: [EXTERNAL] Re: performance expectations for table(s) with 2B recs

Поиск
Список
Период
Сортировка
От Peter J. Holzer
Тема Re: [EXTERNAL] Re: performance expectations for table(s) with 2B recs
Дата
Msg-id 20211210214312.ycfynzxiqmbm35o6@hjp.at
обсуждение исходный текст
Ответ на RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs  ("Godfrin, Philippe E" <Philippe.Godfrin@nov.com>)
Ответы RE: [EXTERNAL] Re: performance expectations for table(s) with 2B recs  ("Godfrin, Philippe E" <Philippe.Godfrin@nov.com>)
Список pgsql-general
On 2021-12-10 18:04:07 +0000, Godfrin, Philippe E wrote:
> >But in my experience the biggest problem with large tables are unstable
> >execution plans - for most of the parameters the optimizer will choose
> >to use an index, but for some it will erroneously think that a full
> >table scan is faster. That can lead to a situation where a query
> >normally takes less than a second, but sometimes (seemingly at random)
> >it takes several minutes
[...]
> For Peter I have a question. What exactly causes ‘unstable execution plans’ ??
>
> Besides not using bind variables, bad statistics, would you elaborate
> in what would contribute to that instability?

Not using bind variables and bad statistics are certainly big factors:

On one hand not using bind variables gives a lot more information to the
optimizer, so it can choose a better plan at run time. On the other hand
that makes hard to predict what plan it will choose.

Bad statistics come in many flavours: They might just be wrong, that's
usually easy to fix. More problematic are statistics which just don't
describe reality very well - they may not show a correlation, causing
the optimizer to assume that two distributions are independent when they
really aren't (since PostgreSQL 10 you can create statistics on multiple
columns which helps in many but not all cases) or not show some other
peculiarity of the data. Or they may be just so close to a flipping
point that a small change causes the optimizer to choose a wildly
different plan.

Another source is dynamically generated SQL. Your application may just
put together SQL from fragments or it might use something like
SQLalchemy or an ORM. In any of these cases what looks like one query
from a user's perspective may really be a whole family of related
queries - and PostgreSQL will try to find the optimal plan for each of
them. Which is generally a good thing, but it adds opportunities to mess
up.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Postgresql + containerization possible use case
Следующее
От: Richard Michael
Дата:
Сообщение: Re: CTE Materialization