Обсуждение: Buglet in "Sort Method" explain output in degenerate case

Поиск
Список
Период
Сортировка

Buglet in "Sort Method" explain output in degenerate case

От
Gregory Stark
Дата:
I noticed a small bug in the "Sort Method" code:

postgres=# explain analyze select * from test order by random() limit 1;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Limit  (cost=21.50..21.50 rows=1 width=4) (actual time=3.649..3.651 rows=1 loops=1)
   ->  Sort  (cost=21.50..24.00 rows=1000 width=4) (actual time=3.646..3.646 rows=1 loops=1)
         Sort Key: (random())
         Sort Method:  quicksort  Memory: 17kB
         ->  Seq Scan on test  (cost=0.00..16.50 rows=1000 width=4) (actual time=0.021..1.707 rows=1000 loops=1)
 Total runtime: 3.704 ms
(6 rows)

It's printing "quicksort" even though it used a heap. This happens because we
don't bother deheapifying a singleton heap so the boundUsed flag never gets
set. The patch below just moves setting that flag to when the heap is made
instead of when it's deheapified.

One could make the argument that we should distinguish the noop sort from
quicksort or the half-hearted singleton heapsort from the full heapsort but
that seems like gilding. But distinguishing between heapsort and quicksort is
important since it really could be either depending on how many inputs there
were.


Index: src/backend/utils/sort/tuplesort.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/utils/sort/tuplesort.c,v
retrieving revision 1.77
diff -u -r1.77 tuplesort.c
--- src/backend/utils/sort/tuplesort.c    7 Jun 2007 19:19:57 -0000    1.77
+++ src/backend/utils/sort/tuplesort.c    1 Sep 2007 17:17:25 -0000
@@ -2247,6 +2247,7 @@
     }

     Assert(state->memtupcount == state->bound);
+    state->boundUsed = true;
     state->status = TSS_BOUNDED;
 }

@@ -2284,7 +2285,6 @@
     REVERSEDIRECTION(state);

     state->status = TSS_SORTEDINMEM;
-    state->boundUsed = true;
 }

 /*


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Buglet in "Sort Method" explain output in degenerate case

От
Tom Lane
Дата:
Gregory Stark <stark@enterprisedb.com> writes:
> It's printing "quicksort" even though it used a heap. This happens because we
> don't bother deheapifying a singleton heap so the boundUsed flag never gets
> set. The patch below just moves setting that flag to when the heap is made
> instead of when it's deheapified.

Hmm.  Actually, given that sort_bounded_heap() is only conditionally
invoked, *both* of the state updates it makes are bogus.  But I think
they should both be done at the call site in tuplesort_performsort.
(The state->status update already is, which is why it works at all.)
Setting it at conclusion is correct, I think, since if we ever changed
the code to abandon TSS_BOUNDED state in the face of unexpected memory
growth, it would be wrong to have set it in make_bounded_sort.

            regards, tom lane

Re: Buglet in "Sort Method" explain output in degenerate case

От
Tom Lane
Дата:
I wrote:
> Hmm.  Actually, given that sort_bounded_heap() is only conditionally
> invoked, *both* of the state updates it makes are bogus.

Er, make that three state updates: its REVERSEDIRECTION() operation is
being skipped as well.  That's not critical now, but might be someday.

Rather than moving all that up to tuplesort_performsort, it seems better
to leave it where it is, and instead remove the premature optimization
of trying to skip sort_bounded_heap.  The number of cycles saved that
way is tiny anyway...

            regards, tom lane

Re: Buglet in "Sort Method" explain output in degenerate case

От
Gregory Stark
Дата:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Setting it at conclusion is correct, I think, since if we ever changed
> the code to abandon TSS_BOUNDED state in the face of unexpected memory
> growth, it would be wrong to have set it in make_bounded_sort.

Actually that would be pretty easy to do and strikes me as worth doing. It
isn't hard to contrive an example that over-runs work_mem though I'm not sure
how easy it would be to actually run out of RAM.

One thing though, we would have to let up on switching to bounded sort if we
run out of memory accumulating tuples. We wouldn't want to switch to bounded
sort and then spill to disk right afterward. Here I just added 10% assuming
usually the remaining tuples won't be 10% larger than the first batch.
Introducing floating point math in here kind of bothers me though.




--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Вложения