Обсуждение: Query time related to limit clause

Поиск
Список
Период
Сортировка

Query time related to limit clause

От
Shubham Mittal
Дата:
Hi Team,

I have shared execution times of two queries below:

I need to find only the first row matching the criteria , but limit 1 is taking more time than limit 15 or more.. If any one can tell an explanation for this and how I can achieve the same in less time. 

 explain analyze SELECT * from abc where organisation_process_path = cast('org' as ltree) and  abc_type='secondary' and common_details->'commonDetails'->'nchBundle'->>'subGridName'='905811-22_MISCN_data'
            and status <>
               'CLOSED' AND sub_product_type = 'Prepaid'
            AND created_date >= cast('2021-03-23 00:00:00.000' AS TIMESTAMP)
        AND created_date <= cast('2021-09-23 00:00:00.000' AS TIMESTAMP)
order by created_date asc LIMIT 1

"Limit  (cost=1.31..941.32 rows=1 width=6947) (actual time=5117.039..5117.042 rows=1 loops=1)"
"  ->  Merge Append  (cost=1.31..4476296.09 rows=4762 width=6947) (actual time=5117.036..5117.038 rows=1 loops=1)"
"        Sort Key: abc_serv_nch_q1_2021.created_date"
"        ->  Index Scan using abc_serv_nch_q1_2021_created_date_idx on abc_serv_nch_q1_2021  (cost=0.43..378412.39 rows=1005 width=7025) (actual time=742.277..742.277 rows=0 loops=1)"
"              Index Cond: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone))"
"              Filter: (((status)::text <> 'CLOSED'::text) AND (organisation_process_path = 'org'::ltree) AND ((abc_type)::text = 'secondary'::text) AND ((sub_product_type)::text = 'Prepaid'::text) AND ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text))"
"              Rows Removed by Filter: 558116"
"        ->  Index Scan using abc_serv_nch_q2_2021_created_date_idx on abc_serv_nch_q2_2021  (cost=0.43..2674454.09 rows=3756 width=6928) (actual time=2074.950..2074.950 rows=1 loops=1)"
"              Index Cond: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone))"
"              Filter: (((status)::text <> 'CLOSED'::text) AND (organisation_process_path = 'org'::ltree) AND ((abc_type)::text = 'secondary'::text) AND ((sub_product_type)::text = 'Prepaid'::text) AND ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text))"
"              Rows Removed by Filter: 1743539"
"        ->  Index Scan using abc_serv_nch_q3_2021_created_date_idx on abc_serv_nch_q3_2021  (cost=0.43..1423368.04 rows=1 width=6548) (actual time=2299.805..2299.805 rows=0 loops=1)"
"              Index Cond: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone))"
"              Filter: (((status)::text <> 'CLOSED'::text) AND (organisation_process_path = 'org'::ltree) AND ((abc_type)::text = 'secondary'::text) AND ((sub_product_type)::text = 'Prepaid'::text) AND ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text))"
"              Rows Removed by Filter: 1320434"


"Planning Time: 18.563 ms"
"Execution Time: 5117.157 ms"

***********************WHEN LIMIT IS GIVEN MORE THAN EQUAL TO 15************************

explain analyze SELECT * from abc where organisation_process_path = cast('org' as ltree) and  abc_type='secondary' and common_details->'commonDetails'->'nchBundle'->>'subGridName'='905811-22_MISCN_data'
            and status <>
               'CLOSED' AND sub_product_type = 'Prepaid'
            AND created_date >= cast('2021-03-23 00:00:00.000' AS TIMESTAMP)
        AND created_date <= cast('2021-09-23 00:00:00.000' AS TIMESTAMP)
order by created_date asc LIMIT 15

       "Limit  (cost=12708.06..12708.09 rows=15 width=6947) (actual time=0.428..0.431 rows=15 loops=1)"
"  ->  Sort  (cost=12708.06..12719.96 rows=4762 width=6947) (actual time=0.426..0.428 rows=15 loops=1)"
"        Sort Key: abc_serv_nch_q1_2021.created_date"
"        Sort Method: top-N heapsort  Memory: 40kB"
"        ->  Append  (cost=7201.82..12591.22 rows=4762 width=6947) (actual time=0.081..0.366 rows=299 loops=1)"
"              ->  Bitmap Heap Scan on abc_serv_nch_q1_2021  (cost=7201.82..8338.60 rows=1005 width=7025) (actual time=0.038..0.038 rows=0 loops=1)"
"                    Recheck Cond: (((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text) AND ((sub_product_type)::text = 'Prepaid'::text) AND ((abc_type)::text = 'secondary'::text) AND ((status)::text <> 'CLOSED'::text) AND (created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone))"
"                    Filter: (organisation_process_path = 'org'::ltree)"
"                    ->  BitmapAnd  (cost=7201.82..7201.82 rows=1005 width=0) (actual time=0.036..0.037 rows=0 loops=1)"
"                          ->  Bitmap Index Scan on abc_serv_nch_q1_2021_expr_idx3  (cost=0.00..195.83 rows=14010 width=0) (actual time=0.036..0.036 rows=0 loops=1)"
"                                Index Cond: ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text)"
"                          ->  Bitmap Index Scan on abc_serv_nch_q1_2021_created_date_idx  (cost=0.00..7005.23 rows=533170 width=0) (never executed)"
"                                Index Cond: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone))"
"              ->  Index Scan using abc_serv_nch_q2_2021_expr_idx3 on abc_serv_nch_q2_2021  (cost=0.43..4226.46 rows=3756 width=6928) (actual time=0.042..0.305 rows=299 loops=1)"
"                    Index Cond: ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text)"
"                    Filter: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone) AND (organisation_process_path = 'org'::ltree))"
"              ->  Index Scan using abc_serv_nch_q3_2021_expr_idx3 on abc_serv_nch_q3_2021  (cost=0.12..2.35 rows=1 width=6548) (actual time=0.003..0.003 rows=0 loops=1)"
"                    Index Cond: ((((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text)"
"                    Filter: ((created_date >= '2021-03-23 00:00:00'::timestamp without time zone) AND (created_date <= '2021-09-23 00:00:00'::timestamp without time zone) AND (organisation_process_path = 'org'::ltree))"


"Planning Time: 21.959 ms"
"Execution Time: 0.551 ms"


Thanks ,
Shubham

Re: Query time related to limit clause

От
Michael Lewis
Дата:
What's the definition for abc_serv_nch_q1_2021_expr_idx3? That is a jsonb field I assume? Statistics aren't great on jsonb data, so you may benefit greatly from pulling keys out to be stored as a standard column. I would be curious for more "pure" estimates on each quarterly partition directly for only the condition below (explain analyze, just looking at estimated vs actual row counts) since they seem to be rather severe overestimates but I'm not sure how much the date condition is obscuring that.

(((common_details -> 'commonDetails'::text) -> 'nchBundle'::text) ->> 'subGridName'::text) = '905811-22_MISCN_data'::text)"

The planner is choosing to use ONLY the index on the created timestamp when limit is 1 and finding they nearly all match (I hope all on the q2 partition) and needs to filter almost all of those out (all from q1 partition I think, and nearly all from the others). I believe that the planner thinks the other criteria in the query are not nearly as selective as they are, and so it thinks it will find 1 match very quickly and be done. That isn't the case.

When you want more rows, the planner decides that using both indexes is less costly and it is correct.