Обсуждение: Minor inaccuracy in jsonb_path_ops documentation
The jsonb documentation says of the jsonb_path_ops GIN opclass: """ A disadvantage of the jsonb_path_ops approach is that it produces no index entries for JSON structures not containing any values, such as {"a": {}}. If a search for documents containing such a structure is requested, it will require a full-index scan, which is quite slow. jsonb_path_ops is therefore ill-suited for applications that often perform such searches. """ The reference to a full index scan seems questionable. This text should indicate that a sequential scan can be expected. Even without any statistics, in the event of not being able to extract any hash values as GIN keys, the optimizer prefers a sequential scan. Example with query where one jsonb_path_ops GIN hash value is generated: postgres=# explain analyze select count(*) from test where j @> '{"tags":[{"term":"postgres"}]}'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------Aggregate (cost=4732.72..4732.73 rows=1 width=0) (actual time=0.513..0.513 rows=1 loops=1) -> Bitmap Heap Scan on test (cost=33.71..4729.59 rows=1253 width=0) (actual time=0.107..0.496 rows=100 loops=1) Recheck Cond: (j @> '{"tags": [{"term": "postgres"}]}'::jsonb) Heap Blocks: exact=100 -> Bitmap Index Scan on ttt (cost=0.00..33.40 rows=1253 width=0) (actual time=0.076..0.076 rows=100 loops=1) Index Cond: (j @> '{"tags": [{"term": "postgres"}]}'::jsonb)Planningtime: 0.083 msExecution time: 0.560 ms (8 rows) Example of empty query hazard with no such hash values: postgres=# explain select count(*) from test where j @> '{"tags":[]}'; QUERY PLAN ------------------------------------------------------------------Aggregate (cost=191519.46..191519.47 rows=1 width=0) -> Seq Scan on test (cost=0.00..191516.33 rows=1253 width=0) Filter: (j @> '{"tags": []}'::jsonb)Planning time:0.073 ms (4 rows) gincostestimate() does at least have the ability to anticipate that a full index scan will be required, and that actually makes the optimizer do the right thing here. Maybe the text quoted above is intended to indicate that if there was an index scan, it would have to be a full index scan, but that doesn't seem appropriate for user facing documentation like this. -- Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes: > The jsonb documentation says of the jsonb_path_ops GIN opclass: > """ > A disadvantage of the jsonb_path_ops approach is that it produces no > index entries for JSON structures not containing any values, such as > {"a": {}}. If a search for documents containing such a structure is > requested, it will require a full-index scan, which is quite slow. > jsonb_path_ops is therefore ill-suited for applications that often > perform such searches. > """ > The reference to a full index scan seems questionable. Well, if you get an indexscan, it will be a full index scan, so I find this warning appropriate. The fact that the planner can avoid that in the presence of a simple constant comparison value doesn't make it a good idea to use jsonb_path_ops indexes if you do this type of query a lot, because in some usages the planner won't see a constant comparison value, and if so it'll likely choose an indexscan by default. Getting into exactly when the planner will or won't save your bacon is the kind of detail I'd rather not put in user-facing docs. regards, tom lane
On Thu, Jul 24, 2014 at 3:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The fact that the planner can avoid that in the presence of a > simple constant comparison value doesn't make it a good idea to > use jsonb_path_ops indexes if you do this type of query a lot, > because in some usages the planner won't see a constant comparison > value, and if so it'll likely choose an indexscan by default. Well, with idiomatic usage a constant value generally will be used. But I do see your point. -- Peter Geoghegan