Re: querying with index on jsonb slower than standard column. Why?

Поиск
Список
Период
Сортировка
От Tim Dudgeon
Тема Re: querying with index on jsonb slower than standard column. Why?
Дата
Msg-id 5484F963.40009@gmail.com
обсуждение исходный текст
Ответ на Re: querying with index on jsonb slower than standard column. Why?  (Adrian Klaver <adrian.klaver@aklaver.com>)
Ответы Re: querying with index on jsonb slower than standard column. Why?
Список pgsql-sql
On 07/12/2014 21:53, Adrian Klaver wrote:
> On 12/07/2014 04:43 PM, Tim Dudgeon wrote:
>>
>> On 07/12/2014 21:19, Adrian Klaver wrote:
>>> On 12/07/2014 02:59 PM, Tim Dudgeon wrote:
>>>> I was doing some performance profiling regarding querying against 
>>>> jsonb
>>>> columns and found something I can't explain.
>>>> I created json version and standard column versions of some data, and
>>>> indexed the json 'fields' and the normal columns and executed 
>>>> equivalent
>>>> queries against both.
>>>> I find that the json version is quite a bit (approx 3x) slower which I
>>>> can't explain as both should (and are according to plans are) working
>>>> against what I would expect are equivalent indexes.
>>>>
>>>> Can anyone explain this?
>>>
>>> The docs can:
>>>
>>> http://www.postgresql.org/docs/9.4/interactive/datatype-json.html#JSON-INDEXING 
>>>
>>>
>>
>> If so them I'm missing it.
>> The index created is not a gin index. Its a standard btree index on the
>> data extracted from the json. So the indexes on the standard columns and
>> the ones on the 'fields' extracted from the json seem to be equivalent.
>> But perform differently.
>
> Down into the section there is this:
>
> "jsonb also supports btree and hash indexes. These are usually useful 
> only if it's important to check equality of complete JSON documents. 
> The btree ordering for jsonb datums is seldom of great interest, but 
> for completeness it is:
>
> Object > Array > Boolean > Number > String > Null
>
> Object with n pairs > object with n - 1 pairs
>
> Array with n elements > array with n - 1 elements
>
> Objects with equal numbers of pairs are compared in the order:
>
> key-1, value-1, key-2 ...
>
> Note that object keys are compared in their storage order; in 
> particular, since shorter keys are stored before longer keys, this can 
> lead to results that might be unintuitive, such as:
>
> { "aa": 1, "c": 1} > {"b": 1, "d": 1}
>
> Similarly, arrays with equal numbers of elements are compared in the 
> order:
>
> element-1, element-2 ...
>
> Primitive JSON values are compared using the same comparison rules as 
> for the underlying PostgreSQL data type. Strings are compared using 
> the default database collation.
> "
>
> As I understand it to get useful indexing into the jsonb 
> datum(document) you need to use the GIN indexes.

Yes, but if my understanding is correct I'm not indexing the JSON, I'm 
indexing the PostgreSQL float type extracted from a field of the JSON, 
and indexing using a btree index:

create index idx_data_json_assay2_ic50 on json_test (((data ->>
'assay2_ic50')::float));

The data ->> 'assay2_ic50' bit extracts the value from the JSON as text, 
the ::float bit casts to a float, and the index is built on the 
resulting float type.

And the index is being used, and is reasonably fast, just not as fast as 
the equivalent index on the 'normal' float column.

Tim
>
>>
>> Tim
>>>
>>>>
>>>> Example code is here:
>>>>
>>>>
>>>> create table json_test (
>>>> id SERIAL,
>>>> assay1_ic50 FLOAT,
>>>> assay2_ic50 FLOAT,
>>>> data JSONB
>>>> );
>>>>
>>>> DO
>>>> $do$
>>>> DECLARE
>>>> val1 FLOAT;
>>>> val2 FLOAT;
>>>> BEGIN
>>>> for i in 1..10000000 LOOP
>>>> val1 = random() * 100;
>>>> val2 = random() * 100;
>>>> INSERT INTO json_test (assay1_ic50, assay2_ic50, data) VALUES
>>>>      (val1, val2, ('{"assay1_ic50": ' || val1 || ', "assay2_ic50": 
>>>> ' ||
>>>> val2 || ', "mod": "="}')::jsonb);
>>>> end LOOP;
>>>> END
>>>> $do$
>>>>
>>>> create index idx_data_json_assay1_ic50 on json_test (((data ->>
>>>> 'assay1_ic50')::float));
>>>> create index idx_data_json_assay2_ic50 on json_test (((data ->>
>>>> 'assay2_ic50')::float));
>>>>
>>>> create index idx_data_col_assay1_ic50 on json_test (assay1_ic50);
>>>> create index idx_data_col_assay2_ic50 on json_test (assay2_ic50);
>>>>
>>>> select count(*) from json_test;
>>>> select * from json_test limit 10;
>>>>
>>>> select count(*) from json_test where (data->>'assay1_ic50')::float 
>>>> > 90
>>>> and (data->>'assay2_ic50')::float < 10;
>>>> select count(*) from json_test where assay1_ic50 > 90 and assay2_ic50
>>>> < 10;
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Tim
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>




В списке pgsql-sql по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: querying with index on jsonb slower than standard column. Why?
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: querying with index on jsonb slower than standard column. Why?