Обсуждение: multi-tenant queries select wrong index

Поиск
Список
Период
Сортировка

multi-tenant queries select wrong index

От
Kirill
Дата:
Hello,
 
As modern software is typically multi-tenant aware it is critical for DB to effectively filter
database records based on tenant ID context. Yet, we constantly hit the situations when Postgres 13.4 performs poorly.
If community is interested I can report such trivial and obvious cases for optimisation. Or even sponsor development a bit.
 
1. Here is an example when tasks are selected for 1 tenant and everything is fine and index on (tenant_id, id) is used:
 
SELECT * FROM "tasks" WHERE
(tenant_id IN ('45AQ7HARTXQG1P6QNEDDA8A5V0'))
ORDER BY id desc LIMIT 100
Limit  (cost=0.69..426.01 rows=100 width=1679) (actual time=0.023..0.209 rows=100 loops=1)
  ->  Index Scan Backward using task_tenant_id_status_idx on tasks  (cost=0.69..25770.78 rows=6059 width=1679) (actual time=0.023..0.200 rows=100 loops=1)
        Index Cond: (tenant_id = '45AQ7HARTXQG1P6QNEDDA8A5V0'::text)
Planning Time: 0.125 ms
Execution Time: 0.231 ms
 
2. Now when I add 2 additional tenant IDs to the query everything gets 100x worse, despite the fact that those 2 tenants do NOT have any records at all.
The reason is the wrong index on (tenant_id, status) is used:
 
SELECT * FROM "tasks" WHERE
(tenant_id IN ('222P0TQT0FAR86BR30BB50TZZX','1X2W2J9B2VVJFSXGWZYR3XEHJO','45AQ7HARTXQG1P6QNEDDA8A5V0'))
ORDER BY id desc LIMIT 100
Limit  (cost=65506.24..65506.49 rows=100 width=1679) (actual time=93.972..93.989 rows=100 loops=1)
  ->  Sort  (cost=65506.24..65551.68 rows=18178 width=1679) (actual time=93.970..93.979 rows=100 loops=1)
        Sort Key: id DESC
        Sort Method: top-N heapsort  Memory: 97kB
        ->  Bitmap Heap Scan on tasks  (cost=322.56..64811.49 rows=18178 width=1679) (actual time=10.546..65.559 rows=29159 loops=1)
              Recheck Cond: (tenant_id = ANY ('{222P0TQT0FAR86BR30BB50TZZX,1X2W2J9B2VVJFSXGWZYR3XEHJO,45AQ7HARTXQG1P6QNEDDA8A5V0}'::text[]))
              Heap Blocks: exact=27594
              ->  Bitmap Index Scan on task_tenant_status_idx  (cost=0.00..318.01 rows=18178 width=0) (actual time=4.268..4.268 rows=29236 loops=1)
                    Index Cond: (tenant_id = ANY ('{222P0TQT0FAR86BR30BB50TZZX,1X2W2J9B2VVJFSXGWZYR3XEHJO,45AQ7HARTXQG1P6QNEDDA8A5V0}'::text[]))
Planning Time: 0.212 ms
Execution Time: 94.051 ms
 
is it possible somehow to force PG to use the correct index?
 
Regards,
Kirill

Re: multi-tenant queries select wrong index

От
Alexey M Boltenkov
Дата:
On 09/20/21 15:33, Kirill wrote:
> Hello,
> As modern software is typically multi-tenant aware it is critical for 
> DB to effectively filter
> database records based on tenant ID context. Yet, we constantly hit 
> the situations when Postgres 13.4 performs poorly.
> If community is interested I can report such trivial and obvious cases 
> for optimisation. Or even sponsor development a bit.
> 1. Here is an example when tasks are selected for 1 tenant and 
> everything is fine and index on (tenant_id, id) is used:
> SELECT * FROM "tasks" WHERE
> (tenant_id IN ('45AQ7HARTXQG1P6QNEDDA8A5V0'))
> ORDER BY id desc LIMIT 100
> Limit  (cost=0.69..426.01 rows=100 width=1679) (actual 
> time=0.023..0.209 rows=100 loops=1)
>   ->  Index Scan Backward using task_tenant_id_status_idx on tasks 
>  (cost=0.69..25770.78 rows=6059 width=1679) (actual time=0.023..0.200 
> rows=100 loops=1)
>         Index Cond: (tenant_id = '45AQ7HARTXQG1P6QNEDDA8A5V0'::text)
> Planning Time: 0.125 ms
> Execution Time: 0.231 ms
> 2. Now when I add 2 additional tenant IDs to the query everything gets 
> 100x worse, despite the fact that those 2 tenants do NOT have any 
> records at all.
> The reason is the wrong index on (tenant_id, status) is used:
> SELECT * FROM "tasks" WHERE
> (tenant_id IN 
> ('222P0TQT0FAR86BR30BB50TZZX','1X2W2J9B2VVJFSXGWZYR3XEHJO','45AQ7HARTXQG1P6QNEDDA8A5V0'))
> ORDER BY id desc LIMIT 100
> Limit  (cost=65506.24..65506.49 rows=100 width=1679) (actual 
> time=93.972..93.989 rows=100 loops=1)
>   ->  Sort  (cost=65506.24..65551.68 rows=18178 width=1679) (actual 
> time=93.970..93.979 rows=100 loops=1)
>         Sort Key: id DESC
>         Sort Method: top-N heapsort  Memory: 97kB
>         ->  Bitmap Heap Scan on tasks  (cost=322.56..64811.49 
> rows=18178 width=1679) (actual time=10.546..65.559 rows=29159 loops=1)
>               Recheck Cond: (tenant_id = ANY 
> ('{222P0TQT0FAR86BR30BB50TZZX,1X2W2J9B2VVJFSXGWZYR3XEHJO,45AQ7HARTXQG1P6QNEDDA8A5V0}'::text[]))
>               Heap Blocks: exact=27594
>               ->  Bitmap Index Scan on task_tenant_status_idx 
>  (cost=0.00..318.01 rows=18178 width=0) (actual time=4.268..4.268 
> rows=29236 loops=1)
>                     Index Cond: (tenant_id = ANY 
> ('{222P0TQT0FAR86BR30BB50TZZX,1X2W2J9B2VVJFSXGWZYR3XEHJO,45AQ7HARTXQG1P6QNEDDA8A5V0}'::text[]))
> Planning Time: 0.212 ms
> Execution Time: 94.051 ms
> is it possible somehow to force PG to use the correct index?
Try "set enable_bitmapscan to off;", but it is not a solution.
Have you try to analyze table, vacuum table, create statistics [...] on 
... from ... ?
> Regards,
> Kirill