Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > I'm watching this query I'm working on now drive the cpu to 100% for 6+ hours
> > with virtually no I/O. And I think it's the best I can do. All the times is
> > being spent moving bits around from one place to another.
>
> It would be interesting to see a gprof profile of that.
Ok, I have a copy of postgres built with -pg now. I'm running grpof on the
gmon.out file but all I'm seeing is this:
% cumulative self self total
time seconds seconds calls ms/call ms/call name
100.00 0.01 0.01 1 10.00 10.00 InitFreeSpaceMap
0.00 0.01 0.00 12296 0.00 0.00 LWLockAssign
0.00 0.01 0.00 323 0.00 0.00 GUC_yylex
0.00 0.01 0.00 123 0.00 0.00 ShmemAlloc
0.00 0.01 0.00 123 0.00 0.00 tas
0.00 0.01 0.00 79 0.00 0.00 AllocSetFreeIndex
...
I assume this is only the postmaster initialization as I've run a smaller
version of the job which took 8 minutes. Is there an easy trick to getting
gmon.out output from a specific backend process?
> > It occurs to me that it's possible postgres is doing this already at a lower
> > level of abstraction.
>
> Yes, tuplesort.c has heard of pushing pointers around rather than
> copying records. I wonder though whether data is being pushed out to
> kernel disk buffers and then back in again --- do you have sort_mem
> set large enough?
Yes, this is the job I asked about sort_mem for. I have it up to 64M and no
pgsql_tmp files are being created.
> Another likely theory is that the interface layers needed to access
> datatype-specific comparison routines are chewing the cycles. Need facts not
> speculation to know where the bottleneck is...
--
greg