Discussion: GIN index always doing Re-check condition, postgres 9.1
I have a table with roughly 300 000 rows, each row containing two large strings - one is news article HTML, the other is news article plaintext. The table has a bigint primary key.
A GIN index is used to do fulltext search on the plaintext part. All I want to retrieve when I do a fulltext search is the values of the primary key column.
With a popular word, the number of results from a fulltext search query can be pretty high - a query can return 23 000 rows, some can return more, and they will return more as the database continues to grow.
The problem I have is that postgres always does a Re-check condition step for my request. A query returning 23k rows takes 20 seconds to execute, and EXPLAIN shows that almost all of that time is spent re-checking the condition.
What this means is that every time a user searches for a word no one has searched for before, they have to wait a very long time, which is unacceptable for us.
CREATE TABLE kard_md.fulldata
(
id_iu bigint NOT NULL,
url character varying NOT NULL,
original text,
edited text,
plaintext text,
date timestamp without time zone,
CONSTRAINT fulldata_pkey PRIMARY KEY (id_iu)
);
CREATE INDEX fulldata_plaintext_idx
ON kard_md.fulldata
USING gin
(to_tsvector('russian'::regconfig, plaintext));
EXPLAIN (ANALYZE, BUFFERS) select id_iu from kard_md.fulldata where to_tsvector('russian',fulldata.plaintext) @@ plainto_tsquery('russian','москва');
1st run:
Bitmap Heap Scan on fulldata (cost=266.79..39162.57 rows=23069 width=8) (actual time=135.727..19499.667 rows=23132 loops=1)
Recheck Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=115 read=13000
-> Bitmap Index Scan on fulldata_plaintext_idx (cost=0.00..261.02 rows=23069 width=0) (actual time=104.834..104.834 rows=23132 loops=1)
Index Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=3 read=21
Total runtime: 19512.479 ms
2nd run:
Bitmap Heap Scan on fulldata (cost=266.79..39162.57 rows=23069 width=8) (actual time=25.423..48.649 rows=23132 loops=1)
Recheck Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=13115
-> Bitmap Index Scan on fulldata_plaintext_idx (cost=0.00..261.02 rows=23069 width=0) (actual time=18.057..18.057 rows=23132 loops=1)
Index Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=24
Total runtime: 49.612 ms
select version()
'PostgreSQL 9.1.15, compiled by Visual C++ build 1500, 64-bit'
On Sun, Nov 1, 2015 at 10:52 PM, Andrey Osenenko <andrey.osenenko@gmail.com> wrote:
> I have a table with roughly 300 000 rows, each row containing two large
> strings - one is news article HTML, the other is news article plaintext.
> The table has a bigint primary key.
>
> A GIN index is used to do fulltext search on the plaintext part. All I want
> to retrieve when I do fulltext search is values of primary key column.
>
> With a popular word, the amount of results from fulltext search query can
> be pretty high - a query can return 23 000 rows, some can return more, and
> will return more as the database continues to grow.
>
> The problem I have is that postgres always does Re-check condition step for
> my request. That query with 23k rows takes 20 seconds to execute, and
> EXPLAIN shows that almost all of that time is spent re-checking the
> condition.

EXPLAIN does not address the issue of how much time was spent doing rechecks. You are misinterpreting something.

> The second time I run the same query, I get results immediately.

That suggests the time is spent reading data from disk the first time, not spent doing rechecks. Rechecks do not get faster with repeated execution, except indirectly, to the extent that the data has already been pulled into memory. But other things get faster due to that as well.

Now those are not mutually exclusive: doing a recheck might lead to reading TOAST tables that don't need to be read at all in the absence of rechecks. So rechecks can lead to IO problems. But there is no evidence that that is the case for you.

> 1st run:
> Bitmap Heap Scan on fulldata (cost=266.79..39162.57 rows=23069 width=8)
> (actual time=135.727..19499.667 rows=23132 loops=1)
>   Recheck Cond: (to_tsvector('russian'::regconfig, plaintext) @@
> '''москв'''::tsquery)

This tells you what condition will be applied during the recheck, in case a recheck is needed due to bitmap memory overflow. It doesn't tell how many times, if any, that was actually done, or how much time was spent doing it. As far as I know, there is no way to distinguish a "lossy index" recheck from a "lossy bitmap" recheck in version 9.1.

> Buffers: shared hit=115 read=13000

That you needed only 13115 blocks to deliver 23069 rows tells me that there is little, if any, recheck-driven TOAST table access going on. That the second execution was very fast tells me that there is little rechecking at all going on, because actual rechecking is CPU intensive.

I don't think your problem has anything to do with rechecking. You simply have too much data that is not in memory. You need more memory, or some way to keep your memory pinned with what you need. If you are on a RAID, you could also increase effective_io_concurrency, which lets the bitmap scan prefetch table blocks.

Cheers,
Jeff
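Jeff's last suggestion can be tried per-session before touching postgresql.conf. A minimal sketch, assuming the schema from this thread; the value 8 is purely illustrative, and the setting only pays off on storage that benefits from prefetching (e.g. a RAID array):

```sql
-- Illustrative session-level experiment, not a recommendation.
-- effective_io_concurrency lets the bitmap heap scan issue
-- prefetch requests for heap pages before it reads them.
SET effective_io_concurrency = 8;

EXPLAIN (ANALYZE, BUFFERS)
SELECT id_iu
FROM kard_md.fulldata
WHERE to_tsvector('russian', plaintext) @@ plainto_tsquery('russian', 'москва');
```

Comparing the "Buffers: shared read=..." and timing figures with and without the setting shows whether prefetching helps on the hardware in question.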
Thank you.
That's really sad news. It means that even though there is an index that lets you find the rows you want almost immediately, you still have to do a lot of disk IO just to retrieve primary keys.
I created a new table that contains only the primary key and the tsvector value, and (at least that's how I'm interpreting it) since there is less data to read per row, it returns the same results about twice as quickly (I restarted the computer to make sure nothing was in memory).
Original table:
Bitmap Heap Scan on fulldata (cost=266.79..39162.57 rows=23069 width=8) (actual time=113.472..18368.769 rows=23132 loops=1)
Recheck Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=1 read=13114
-> Bitmap Index Scan on fulldata_plaintext_idx (cost=0.00..261.02 rows=23069 width=0) (actual time=90.859..90.859 rows=23132 loops=1)
Index Cond: (to_tsvector('russian'::regconfig, plaintext) @@ '''москв'''::tsquery)
Buffers: shared hit=1 read=23
Total runtime: 18425.265 ms
Table with only key and vector:
Bitmap Heap Scan on fts (cost=273.67..27903.61 rows=23441 width=8) (actual time=219.896..10095.159 rows=23132 loops=1)
Recheck Cond: (vector @@ '''москв'''::tsquery)
Buffers: shared hit=1 read=10877
-> Bitmap Index Scan on fts_vector_idx (cost=0.00..267.81 rows=23441 width=0) (actual time=204.631..204.631 rows=23132 loops=1)
Index Cond: (vector @@ '''москв'''::tsquery)
Buffers: shared hit=1 read=23
Total runtime: 10103.858 ms
It also looks like if there were a way to create a table with just the primary key and add to it an index that indexes data from another table, it would work much, much faster, since there would be very little to read from disk after the index lookup. But it looks like there isn't.
So am I correct in the assumption that as the number of rows grows, query times for rows that are not in memory (and considering how many of them there are, most won't be) will grow linearly?
On Mon, Nov 2, 2015 at 11:14 AM, Andrey Osenenko <andrey.osenenko@gmail.com> wrote the above.
On 11/2/15 2:19 AM, Andrey Osenenko wrote:
> It also looks like if there was a way to create a table with just
> primary key and add an index to it that indexes data from another table,
> it would work much, much faster since there would be very little to read
> from disk after index lookup. But looks like there isn't.

That probably wouldn't help as much as you'd hope, because heap tuples in Postgres have a minimum 24-byte overhead. Add in 8 bytes for the bigint and that's 32 bytes extra per row.

I think what might gain you more is moving to 9.2 and getting index-only scans. Though, if you're getting lossy results, I don't think that'll help [1].

> So am I correct in assumption that as the amount of rows grows, query
> times for rows that are not in memory (and considering how many of them
> there are, most won't be) will grow linearly?

Maybe, maybe not. Query times for data that has to come from disk can vary wildly based on what other activity is happening on the IO system. Ultimately, your IO system can only do so many IOs per second.

[1] https://wiki.postgresql.org/wiki/Index-only_scans#Index-only_scans_and_index-access_methods
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On Mon, Nov 2, 2015 at 12:19 AM, Andrey Osenenko <andrey.osenenko@gmail.com> wrote:
> It also looks like if there was a way to create a table with just primary
> key and add an index to it that indexes data from another table, it would
> work much, much faster since there would be very little to read from disk
> after index lookup. But looks like there isn't.

There is a way to do this, but it is pretty gross. You can define a function which takes the primary key as input and returns the data to index. Mark the function as immutable, even though it isn't. Something like:

CREATE OR REPLACE FUNCTION public.filler_by_aid(integer)
 RETURNS text
 LANGUAGE sql
 IMMUTABLE
AS $function$
 select filler::text from pgbench_accounts where aid=$1
$function$;

Then create a table which has just the primary key, and create a functional index on that table:

create table foobar as select aid from pgbench_accounts;
create index on foobar (filler_by_aid(aid));

Now you can query the skinny table by reference to the data in the wide table:

explain analyze select count(*) from foobar where filler_by_aid(aid)='zebra';

Since you fibbed to PostgreSQL about the function's immutability, it is pretty easy to get a corrupt index here. Every time the parent is updated, you have to be sure to delete and reinsert the primary key in the corresponding skinny table, otherwise the index will not reflect the updated value. What you gain with the skinny table you could very well lose to the triggers needed to maintain it. Not to mention the fragility.

It would be simpler if you could just force the wide data to always be toasted, even if it is not wide enough to trigger the default toast threshold. You could get a skinnier table (although not quite as skinny as one with only a single column) without having to do unsupported hacks. (I am assuming here, without real evidence other than intuition, that most of your news articles are in fact coming in under the toast threshold.)
> So am I correct in assumption that as the amount of rows grows, query times
> for rows that are not in memory (and considering how many of them there are,
> most won't be) will grow linearly?

Yep. What you really need are index-only scans. But those are not supported by GIN indexes, and given the GIN index structure it seems unlikely they ever will be, at least not in a way that would help you.

What are you going to do with these 23,000 primary keys once you get them, anyway? Perhaps you can push that analysis into the database and gain some efficiencies there.

Or you could change your data structure. If all you are doing is searching for one tsvector token at a time, you could unnest the tsvector and store it in a table like (ts_token text, id_iu bigint). Then build a regular btree index on (ts_token, id_iu) and get index-only scans (once you upgrade from 9.1 to something newer).

Cheers,
Jeff
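Jeff's token-table layout could be sketched roughly as follows. This is a hedged sketch, not a tested recipe: the table and index names are illustrative, and tsvector_to_array() only exists in PostgreSQL 9.6 and later, so it assumes the upgrade Jeff mentions has happened:

```sql
-- Illustrative (ts_token, id_iu) layout; names are made up for this sketch.
CREATE TABLE fts_token (
    ts_token text   NOT NULL,  -- one lexeme from the document's tsvector
    id_iu    bigint NOT NULL   -- primary key of the source row
);

-- Explode each document's tsvector into one row per lexeme.
-- tsvector_to_array() requires PostgreSQL 9.6+.
INSERT INTO fts_token (ts_token, id_iu)
SELECT unnest(tsvector_to_array(to_tsvector('russian', plaintext))), id_iu
FROM kard_md.fulldata;

-- A btree on (ts_token, id_iu) covers the whole query below,
-- so it can be answered with an index-only scan.
CREATE INDEX fts_token_idx ON fts_token (ts_token, id_iu);

-- Look up one stemmed token ('москв' is the stem seen in the plans above).
SELECT id_iu FROM fts_token WHERE ts_token = 'москв';
```

The trade-off is that this only supports single-token equality lookups; phrase or multi-token queries would need to be recombined with joins or intersections in the application or in SQL.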
Jeff Janes:
That is such a beautiful little trick. I made a table with just the ids, and a query against it reads almost 10 times fewer buffers (as reported by EXPLAIN (ANALYZE, BUFFERS)), and sure enough, after another reboot, the query executes about 10 times faster.
I'm not doing anything special with those results. I have a main table "core" with various information about entries. Some entries have plaintexts attached, and those are stored in the additional table "fulldata". fulldata's primary key refers to core's primary key, and to do a fulltext search that filters results using core's other fields, I have to retrieve primary keys from fulldata.
We tried many different ways to join rows from fulldata and core for that query, and ended up with something along the lines of:
where core.id_iu in (with ids as(select id_iu from fulldata where <fulltext condition here>) select * from ids) and <other core conditions here>
It was just as fast (or slow) as table joins and subqueries, but it always used the fulltext index no matter what the planner had in mind.
I'll be sure to play around with the fake immutable function, and I think it might even be worth it to add the index to core instead of a thin table.
Thanks!
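Applied to the schema from this thread, the fake-immutable-function trick might look roughly like this. This is only a sketch under the same caveats Jeff gave (the function lies about immutability, so the index silently goes stale on updates); the function and table names here are invented for illustration:

```sql
-- Illustrative adaptation of Jeff's trick; vector_by_id and fulldata_ids
-- are hypothetical names. The function is NOT truly immutable, so any
-- update to fulldata.plaintext must delete and reinsert the matching
-- id_iu row in fulldata_ids, or the index becomes corrupt.
CREATE OR REPLACE FUNCTION kard_md.vector_by_id(bigint)
 RETURNS tsvector
 LANGUAGE sql
 IMMUTABLE
AS $function$
 select to_tsvector('russian', plaintext)
 from kard_md.fulldata where id_iu = $1
$function$;

-- Skinny table holding only the primary keys.
CREATE TABLE kard_md.fulldata_ids AS SELECT id_iu FROM kard_md.fulldata;

-- GIN index whose entries are computed from the wide table.
CREATE INDEX fulldata_ids_fts_idx
  ON kard_md.fulldata_ids USING gin (vector_by_id(id_iu));

-- The search touches only the skinny table's heap after the index lookup.
SELECT id_iu FROM kard_md.fulldata_ids
WHERE vector_by_id(id_iu) @@ plainto_tsquery('russian', 'москва');
```

The WHERE clause has to use exactly the indexed expression vector_by_id(id_iu) for the planner to consider the index.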
Jim Nasby:
Well, it's 32 bytes per row versus an unknown, much larger number of bytes per row, since every row also contains a tsvector value (although, since practice shows about 10 times fewer buffers read, it's probably somewhere around 320 bytes on average?).