Обсуждение: pgstatindex

Поиск
Список
Период
Сортировка

pgstatindex

От
Tatsuo Ishii
Дата:
Here is a new contrib function called "pgstatindex", similar to
pgstattuple but different in that it returns the percentage of the
dead tuples of an index. I am posting this for review purpose.

Installation of pgstatindex is pretty easy:

unpack the tar package in contrib directory.
cd into pgstatindex directory.
make
make install
psql -f /usr/local/pgsql/share/contrib/pgstatindex.sql your_database

Note:

(1) I think I have adopted to the recent Tom's changes to index access   routines, but if you find anything is wrong,
pleselet me know.
 

(2) pgstatindex probably does not work with rtree and gist indexes.
--
Tatsuo Ishii

Re: pgstatindex

От
Tom Lane
Дата:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Here is a new contrib function called "pgstatindex", similar to
> pgstattuple but different in that it returns the percentage of the
> dead tuples of an index. I am posting this for review purpose.

Um ... what's the point?  Isn't this always the same as the percentage
for the underlying table?
        regards, tom lane


Re: pgstatindex

От
Tatsuo Ishii
Дата:
> Um ... what's the point?  Isn't this always the same as the percentage
> for the underlying table?

Sure. In my understanding, unlike tables "free/reusable space" is
actually not reused in index. pgstatindex would be usefull to judge if
REINDEX is needed by showing the growth of physical length and
"free/reusable space".

Maybe "free/reusable space" is not appropriate wording, "dead space"
is better?
--
Tatsuo Ishii


Re: pgstatindex

От
Tom Lane
Дата:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Sure. In my understanding, unlike tables "free/reusable space" is
> actually not reused in index. pgstatindex would be usefull to judge if
> REINDEX is needed by showing the growth of physical length and
> "free/reusable space".

Oh.  Hmm, if that's what you want then I do not think an indexscan is
the way to go about it.  The indexscan will only visit leaf pages
(and not, for example, internal nodes of a btree).  Also the
free-space-counting code you're using seems pretty unworkable since the
indexscan is unlikely to visit leaf pages in anything like sequential
order.

I think the only reasonable way to get useful statistics would be to
read the index directly --- page by page, no indexscan, distinguishing
leaf pages, internal pages, and overhead pages for yourself.  This would
require index-AM-specific knowledge about how to tell which type each
page is, but I believe all the index AMs make that possible.

Also, I'd suggest that visiting the heap is just useless overhead.  A
person who wants to know whether the heap needs to be vacuumed can get
that data from pgstattuple.  Reading the heap to check tuple state will
make this function orders of magnitude slower, while not producing much
useful info that I can see.

Something else to think about is how to present the results.  As soon
as you release this we will have people bleating about how come their
btrees always show at least 1/3rd free space :-( unless we can think
of a way to highlight the fact that that's the expected loading factor
for a btree...
        regards, tom lane


Re: pgstatindex

От
Tatsuo Ishii
Дата:
> Oh.  Hmm, if that's what you want then I do not think an indexscan is
> the way to go about it.  The indexscan will only visit leaf pages
> (and not, for example, internal nodes of a btree).  Also the
> free-space-counting code you're using seems pretty unworkable since the
> indexscan is unlikely to visit leaf pages in anything like sequential
> order.

Oh I was not aware of this.

> I think the only reasonable way to get useful statistics would be to
> read the index directly --- page by page, no indexscan, distinguishing
> leaf pages, internal pages, and overhead pages for yourself.  This would
> require index-AM-specific knowledge about how to tell which type each
> page is, but I believe all the index AMs make that possible.

That's what I'm afraid of. 

> Also, I'd suggest that visiting the heap is just useless overhead.  A
> person who wants to know whether the heap needs to be vacuumed can get
> that data from pgstattuple.  Reading the heap to check tuple state will
> make this function orders of magnitude slower, while not producing much
> useful info that I can see.

Ok let me think about this. Thank you for the suggestion!
--
Tatsuo Ishii