Обсуждение: jsonb search
Hi
In my quest of JSONB querying and searching without having to actually cast into a text, I found JSQuery
I do admit my JSONB knowledge shortcoming and I am not a developer but a DBA. As such some examples would be greatly appreciated since I tend to understand better
I compiled and installed the extension
1 - Exact matching without knowing the hierarchy, just the key and element, I built a set like
col1 | col2
------+--------------------------------------------------
1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}
JSQuqery is super
SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';
Now I can do a performance boost using
CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);
I see this yield
from
testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual time=0.016..160.777 rows=1 loops=1)
Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Rows Removed by Filter: 499999
Planning time: 0.042 ms
Execution time: 160.799 ms
(5 rows)
to
testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual time=0.018..0.019 rows=1 loops=1)
Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Planning time: 0.039 ms
Execution time: 0.038 ms
(7 rows)
A whooping 4000 times improvement
But I also noticed a vodka index
testdb=# CREATE INDEX idx2 ON
testdb-# test1 USING vodka (col2);
ERROR: access method "vodka" does not exist
What am I missing ?
2 - Is there anyway I can accomplish a pattern and/or case insensitive search using JSQuery similar to
select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress') ilike '%3%YAH%';
select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like '%3%yah%';
If so what indexing strategy can be used to have similar gains as above ?
I compiled and installed the extension
1 - Exact matching without knowing the hierarchy, just the key and element, I built a set like
col1 | col2
------+--------------------------------------------------
1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}
JSQuqery is super
SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';
Now I can do a performance boost using
CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);
I see this yield
from
testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual time=0.016..160.777 rows=1 loops=1)
Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Rows Removed by Filter: 499999
Planning time: 0.042 ms
Execution time: 160.799 ms
(5 rows)
to
testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "10@yahoo.com"';
Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual time=0.018..0.019 rows=1 loops=1)
Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
Planning time: 0.039 ms
Execution time: 0.038 ms
(7 rows)
A whooping 4000 times improvement
But I also noticed a vodka index
testdb=# CREATE INDEX idx2 ON
testdb-# test1 USING vodka (col2);
ERROR: access method "vodka" does not exist
What am I missing ?
2 - Is there anyway I can accomplish a pattern and/or case insensitive search using JSQuery similar to
select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress') ilike '%3%YAH%';
select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like '%3%yah%';
If so what indexing strategy can be used to have similar gains as above ?
Many thanks for any help
Armand
On Tue, Jun 28, 2016 at 5:15 PM, Armand Pirvu (home) <armand.pirvu@gmail.com> wrote: > Hi > > In my quest of JSONB querying and searching without having to actually cast > into a text, I found JSQuery > > I do admit my JSONB knowledge shortcoming and I am not a developer but a > DBA. As such some examples would be greatly appreciated since I tend to > understand better > > I compiled and installed the extension > > 1 - Exact matching without knowing the hierarchy, just the key and element, > I built a set like > > col1 | col2 > ------+-------------------------------------------------- > 1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}} > 2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}} > 3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}} > > > JSQuqery is super > > SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"'; > > Now I can do a performance boost using > > CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops); > > I see this yield > > from > > testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress > = "10@yahoo.com"'; > Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual > time=0.016..160.777 rows=1 loops=1) > Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery) > Rows Removed by Filter: 499999 > Planning time: 0.042 ms > Execution time: 160.799 ms > (5 rows) > > > to > > testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = > "10@yahoo.com"'; > Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual > time=0.018..0.019 rows=1 loops=1) > Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery) > Heap Blocks: exact=1 > -> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual > time=0.011..0.011 rows=1 loops=1) > Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery) > Planning time: 0.039 ms > Execution time: 0.038 ms > (7 rows) > > A whooping 4000 times improvement > > > > > But I also noticed a vodka index > > > testdb=# CREATE INDEX idx2 ON > testdb-# test1 USING vodka (col2); > ERROR: access method "vodka" does not exist > > What am I missing ? > > 2 - Is there anyway I can accomplish a pattern and/or case insensitive > search using JSQuery similar to > > > select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress') > ilike '%3%YAH%'; > > select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like > '%3%yah%'; > > > If so what indexing strategy can be used to have similar gains as above ? > > > Many thanks for any help Vodka is our experimental prototype of access method of next generation and it doesn't exists in production-ready form. You can check our presentation http://www.sai.msu.su/~megera/postgres/talks/highload-2014-vodka.pdf to understand jsquery limitation and why we stop its development. Also, 2 years ago I wrote (in russian) http://obartunov.livejournal.com/179422.html about jsonb query language and our plans. Google translate might helps https://translate.google.com/translate?sl=auto&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fobartunov.livejournal.com%2F179422.html&edit-text=&act=url > > > Armand >
On Tue, Jun 28, 2016 at 5:09 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
--
Arthur Silva
Vodka is our experimental prototype of access method of nextOn Tue, Jun 28, 2016 at 5:15 PM, Armand Pirvu (home)
<armand.pirvu@gmail.com> wrote:
> Hi
>
> In my quest of JSONB querying and searching without having to actually cast
> into a text, I found JSQuery
>
> I do admit my JSONB knowledge shortcoming and I am not a developer but a
> DBA. As such some examples would be greatly appreciated since I tend to
> understand better
>
> I compiled and installed the extension
>
> 1 - Exact matching without knowing the hierarchy, just the key and element,
> I built a set like
>
> col1 | col2
> ------+--------------------------------------------------
> 1 | {"Home Email": {"EmailAddress": "1@yahoo.com"}}
> 2 | {"Home Email": {"EmailAddress": "2@yahoo.com"}}
> 3 | {"Home Email": {"EmailAddress": "3@yahoo.com"}}
>
>
> JSQuqery is super
>
> SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress = "1@yahoo.com"';
>
> Now I can do a performance boost using
>
> CREATE INDEX idx1 ON test1 USING GIN (col2 jsonb_value_path_ops);
>
> I see this yield
>
> from
>
> testdb=# explain analyze^JSELECT * FROM test1 WHERE col2 @@ '*.EmailAddress
> = "10@yahoo.com"';
> Seq Scan on test1 (cost=0.00..12423.00 rows=500 width=68) (actual
> time=0.016..160.777 rows=1 loops=1)
> Filter: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
> Rows Removed by Filter: 499999
> Planning time: 0.042 ms
> Execution time: 160.799 ms
> (5 rows)
>
>
> to
>
> testdb-# SELECT * FROM test1 WHERE col2 @@ '*.EmailAddress =
> "10@yahoo.com"';
> Bitmap Heap Scan on test1 (cost=31.88..1559.32 rows=500 width=68) (actual
> time=0.018..0.019 rows=1 loops=1)
> Recheck Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
> Heap Blocks: exact=1
> -> Bitmap Index Scan on idx1 (cost=0.00..31.75 rows=500 width=0) (actual
> time=0.011..0.011 rows=1 loops=1)
> Index Cond: (col2 @@ '*."EmailAddress" = "10@yahoo.com"'::jsquery)
> Planning time: 0.039 ms
> Execution time: 0.038 ms
> (7 rows)
>
> A whooping 4000 times improvement
>
>
>
>
> But I also noticed a vodka index
>
>
> testdb=# CREATE INDEX idx2 ON
> testdb-# test1 USING vodka (col2);
> ERROR: access method "vodka" does not exist
>
> What am I missing ?
>
> 2 - Is there anyway I can accomplish a pattern and/or case insensitive
> search using JSQuery similar to
>
>
> select * from test2 where upper((col2 -> 'Home Email') ->> 'EmailAddress')
> ilike '%3%YAH%';
>
> select * from test2 where (col2 -> 'Home Email') ->> 'EmailAddress' like
> '%3%yah%';
>
>
> If so what indexing strategy can be used to have similar gains as above ?
>
>
> Many thanks for any help
generation and it doesn't exists in production-ready form. You can
check our presentation
http://www.sai.msu.su/~megera/postgres/talks/highload-2014-vodka.pdf
to understand jsquery limitation and why we stop its development.
Also, 2 years ago I wrote (in russian)
http://obartunov.livejournal.com/179422.html about jsonb query
language and our plans. Google translate might helps
https://translate.google.com/translate?sl=auto&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=http%3A%2F%2Fobartunov.livejournal.com%2F179422.html&edit-text=&act=url
>
>
> Armand
>
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Could you share your future plans for it (or it's reincarnation), if any?
Even in the limited form, vodka is very impressive.
--
Arthur Silva