Обсуждение: Is it possible to create an index without keeping the indexed data in a column?

Поиск
Список
Период
Сортировка

Is it possible to create an index without keeping the indexed data in a column?

От
Larry White
Дата:
Hi,

I would like to create a GIN index on a set of JSON documents. Right now I'm storing the data in a JSONB column. The current index looks like this:

CREATE INDEX document_payload_idx
  ON document
  USING gin
  (payload jsonb_path_ops);

The index is pretty small, but the actual data takes up a lot of space. Is there a way to get Postgres to index the table as if the JSON were there, but not actually put the data in the table? I could either store the docs elsewhere and keep a reference, or compress them and put them in the table in compressed form as a blob.

Thanks much for your help.

Larry 


Re: Is it possible to create an index without keeping the indexed data in a column?

От
Michael Paquier
Дата:
On Fri, Aug 1, 2014 at 4:47 AM, Larry White <ljw1001@gmail.com> wrote:
> Is there a way to get Postgres to index the table as if the JSON were there,
> but not actually put the data in the table?
> I could either store the docs
> elsewhere and keep a reference, or compress them and put them in the table
> in compressed form as a blob.
No. This is equivalent to the creation of an index on a foreign table.
Regards,
--
Michael


Re: Is it possible to create an index without keeping the indexed data in a column?

От
Amit Langote
Дата:
On Fri, Aug 1, 2014 at 10:48 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Aug 1, 2014 at 4:47 AM, Larry White <ljw1001@gmail.com> wrote:
>> Is there a way to get Postgres to index the table as if the JSON were there,
>> but not actually put the data in the table?
>> I could either store the docs
>> elsewhere and keep a reference, or compress them and put them in the table
>> in compressed form as a blob.
> No. This is equivalent to the creation of an index on a foreign table.

Not sure exactly if it applies here; but I seem to recall reading
somewhere that you can index "generated" columns. Something like
following (this example is similar what I recall seeing there)

postgres=# CREATE TABLE test(a, b) AS SELECT md5(g::text)::char(10),
md5(g::text)::char(5) FROM generate_series(1, 100000) g;
SELECT 100000

postgres=# CREATE OR REPLACE FUNCTION ab(rec test) RETURNS text AS $$
SELECT rec.a || rec.b; $$ STABLE LANGUAGE SQL;
CREATE FUNCTION

postgres=# CREATE EXTENSION pg_trgm;
CREATE EXTENSION

postgres=# CREATE INDEX test_idx ON test USING GIN (ab(test) gin_trgm_ops);
CREATE INDEX

postgres=# EXPLAIN SELECT * FROM test WHERE ab(test) LIKE '%c4c%';
                               QUERY PLAN
-------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=16.09..52.53 rows=10 width=17)
   Recheck Cond: (((a)::text || (b)::text) ~~ '%c4c%'::text)
   ->  Bitmap Index Scan on test_idx  (cost=0.00..16.08 rows=10 width=0)
         Index Cond: (((a)::text || (b)::text) ~~ '%c4c%'::text)
 Planning time: 0.361 ms
(5 rows)

--
Amit


Re: Is it possible to create an index without keeping the indexed data in a column?

От
Amit Langote
Дата:
On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>
> Not sure exactly if it applies here;

Re-reading the OP again, perhaps it doesn't. Sorry about the noise

--
Amit


Re: Is it possible to create an index without keeping the indexed data in a column?

От
David G Johnston
Дата:
Amit Langote wrote
> On Fri, Aug 1, 2014 at 2:50 PM, Amit Langote <

> amitlangote09@

> > wrote:
>>
>> Not sure exactly if it applies here;
>
> Re-reading the OP again, perhaps it doesn't. Sorry about the noise

This is a functional index which lets you store derived data in the index
without having to also store it in the table.  Mostly useful for stuff that
is only relevant in the context of searching and not something you would
ever return to the user.

The restriction here is that the raw data still needs to be stored in the
table.

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813499.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Is it possible to create an index without keeping the indexed data in a column?

От
David G Johnston
Дата:
larrry wrote
> Hi,
>
> I would like to create a GIN index on a set of JSON documents. Right now
> I'm storing the data in a JSONB column. The current index looks like this:
>
> CREATE INDEX document_payload_idx
>   ON document
>   USING gin
>   (payload jsonb_path_ops);
>
> The index is pretty small, but the actual data takes up a *lot* of space.
> Is there a way to get Postgres to index the table *as if* the JSON were
> there, but not actually put the data in the table? I could either store
> the
> docs elsewhere and keep a reference, or compress them and put them in the
> table in compressed form as a blob.
>
> Thanks much for your help.
>
> Larry

No idea if this works but maybe you can store the compressed data and then
write the index expression like:

USING gin (unzip(payload) jsonb_path_ops)

The unzip function would need to be custom I think...

David J.




--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813500.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Re: Is it possible to create an index without keeping the indexed data in a column?

От
Larry White
Дата:
Thank you David and Amit. This is more or less what I was looking for. 

I _think_ I might be able to store the data as TEXT, which is highly compressed by Toast, and then perhaps write the function in terms of a TEXT to JSONB conversion.  I will give it a try. It might perform terribly, but will be an interesting experiment.:)


On Fri, Aug 1, 2014 at 3:14 AM, David G Johnston <david.g.johnston@gmail.com> wrote:
larrry wrote
> Hi,
>
> I would like to create a GIN index on a set of JSON documents. Right now
> I'm storing the data in a JSONB column. The current index looks like this:
>
> CREATE INDEX document_payload_idx
>   ON document
>   USING gin
>   (payload jsonb_path_ops);
>
> The index is pretty small, but the actual data takes up a *lot* of space.
> Is there a way to get Postgres to index the table *as if* the JSON were
> there, but not actually put the data in the table? I could either store
> the
> docs elsewhere and keep a reference, or compress them and put them in the
> table in compressed form as a blob.
>
> Thanks much for your help.
>
> Larry

No idea if this works but maybe you can store the compressed data and then
write the index expression like:

USING gin (unzip(payload) jsonb_path_ops)

The unzip function would need to be custom I think...

David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/Is-it-possible-to-create-an-index-without-keeping-the-indexed-data-in-a-column-tp5813461p5813500.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general