Обсуждение: Primary Key Index Bloat?
Hi, I have a table with the following usage characteristic: INSERT bulk data using INSERT INTO ... SELECT ... FROM -- this table uses a varchar(50) for the PK -- the PK is rarely (effectively never unless a mistake was made) altered -- always appending to the existing table; some bulk deletions in the past -- no toasting of values UPDATE these records many, many times (but never the PK) At the moment the table itself (and thus all columns) is showing as size of just under 5GB The corresponding index for the PK (a single column, also present on the table...) is between 8-10GB [1] I have never REINDEXed this table that I can recall. I am guessing that it is the need for the index to point to new versions of the physical record that the index is churning so much and causing this kind of bloat? [1] I actually inadvertently have two indexes, one unique and one not, over the same field. The sizes of these two indexes are the rough endpoints of the 8-10GB range I provided. It appears that both indexes are being chosen for use by the planner. I am preparing to REINDEX the unique index and DROP the non-unique one over the same field - probably Tuesday evening. Does everything I am saying here sound kosher or would someone like me to provide additional information? The deletions were fairly voluminous when they occurred and so I would have expected that enough empty pages would have been freed to avoid any significant index bloat. I have mostly left VACUUMing to auto-vacuum though the occasional manual one has been performed. Never performed VACUUM FULL. Been running 9.0 on Ubuntu 10.04 since day one with a pg_restore load. A while ago and no comparative metrics available. I greatly appreciate any sanity checks you can provide. Thanks! David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Primary-Key-Index-Bloat-tp5778840.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 11/17/2013 11:48 PM, David Johnston wrote: > I am guessing that it is the need for the index to point to new versions of > the physical record that the index is churning so much and causing this kind > of bloat? Bingo. > I am preparing to REINDEX the unique index and DROP the non-unique one over > the same field - probably Tuesday evening. Does everything I am saying here > sound kosher or would someone like me to provide additional information? ... > I have mostly left VACUUMing to auto-vacuum though the occasional manual one > has been performed. Never performed VACUUM FULL. REINDEX will only rebuild the indexes (obviously) and VACUUM FULL would clean up any bloat in the indexes and the table itself. Either would require an exclusive lock on the table. If the exclusive lock is a problem, you might look in to pg_repack as an option as well. It essentially recreates the table in parallel then swaps in the newly built one. It only requires an exclusive lock for a moment. - Chris
On Sun, Nov 17, 2013 at 10:48 PM, David Johnston <polobo@yahoo.com> wrote: > I am preparing to REINDEX the unique index and DROP the non-unique one over > the same field - probably Tuesday evening. Does everything I am saying here > sound kosher or would someone like me to provide additional information? [...] > I have mostly left VACUUMing to auto-vacuum though the occasional manual one > has been performed. Never performed VACUUM FULL. You can try pgcompact from PgToolkit [1] pgcompact -d dbname -t tablename -r -v info that will concurrently reindex those indexes that are bloated and will remove bloat from the table without recreating it in the case if it is bloated only. It might require pgstattuple extension [2] to be installed [3] . [1] https://github.com/grayhemp/pgtoolkit [2] http://www.postgresql.org/docs/9.0/static/pgstattuple.html [3] http://www.postgresql.org/docs/9.0/static/contrib.html -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com