Thread: creating index on changed field type

creating index on changed field type

From
David Smith
Date:
Hello, the subject is obscure, so I will try to explain. I would like to
develop an index based on a text field (or a tsvector borrowed from
tsearch2), but storing a different type (for example cstring, varchar,
etc.) in order to tokenize the original field. I would like to use the
PostgreSQL btree implementation, but AFAICS I cannot do it. Example:

CREATE TABLE test (id int, mytext text);
CREATE INDEX myindex ON test USING myindex (mytext);
INSERT INTO test VALUES (1, 'this is my first text');

In the index I do not want to keep the whole phrase, but the words
derived from it ('this', 'is', 'my', 'first', 'text').

My idea was to create functions mybtgettuple, mybtinsert, mybtbeginscan,
mybtrescan and so on, and in every case ignore the original IndexTuple,
create a set of new IndexTuples (one for every term), and invoke the
original functions.
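
A minimal sketch of the "one entry per term" idea, in plain C and
independent of the PostgreSQL sources. All names here (for_each_term,
term_callback, term_list, collect_term) are hypothetical and are not part
of any PostgreSQL API; the callback only stands in for "build one index
entry for this term and hand it to the original insert routine". It
assumes terms are separated by whitespace and truncates overly long ones.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TERM_LEN 63   /* assumption: longer terms are truncated */
#define MAX_TERMS    32   /* assumption: capacity of the demo collector */

/* Hypothetical callback: called once for every term found in the text. */
typedef void (*term_callback)(const char *term, void *arg);

/*
 * Split text on whitespace and call cb once per term.
 * Returns the number of terms found.
 */
static int
for_each_term(const char *text, term_callback cb, void *arg)
{
    const char *p = text;
    int         count = 0;

    assert(text != NULL && cb != NULL);

    while (*p != '\0')
    {
        char   term[MAX_TERM_LEN + 1];
        size_t len = 0;

        /* skip whitespace between terms */
        while (*p != '\0' && isspace((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;

        /* copy one term, dropping characters beyond MAX_TERM_LEN */
        while (*p != '\0' && !isspace((unsigned char) *p))
        {
            if (len < MAX_TERM_LEN)
                term[len++] = *p;
            p++;
        }
        term[len] = '\0';

        cb(term, arg);
        count++;
    }
    return count;
}

/* Demo collector: remembers the terms instead of building index entries. */
struct term_list
{
    int  nterms;
    char terms[MAX_TERMS][MAX_TERM_LEN + 1];
};

static void
collect_term(const char *term, void *arg)
{
    struct term_list *list = arg;

    if (list->nterms < MAX_TERMS)
    {
        /* safe: for_each_term never passes more than MAX_TERM_LEN chars */
        strcpy(list->terms[list->nterms], term);
        list->nterms++;
    }
}
```

Calling for_each_term("this is my first text", collect_term, &list) on a
zero-initialized struct term_list visits the five words in order. In a
real access method the callback would have to build an index tuple per
term, which is exactly where the question about the stored type arises.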

The problem is that index_create() in catalog/index.c records everything
in the system tables, in particular the type of the index field.

Should I forget about btrees and move to GIST, or is there any hack
that could solve my problem? Please help me.

Thanks in advance,
David

P.S. Maybe I should create the index on the TEXT field and store the
terms (words from the original field) also as TEXT? Will it work?




Re: creating index on changed field type

From
Tom Lane
Date:
David Smith <gegez-pgh@instytut.com.pl> writes:
> Should I forget about btrees and move to GIST,

Yes.  There's no provision in the btree code for an index storage type
different from the column datatype.
        regards, tom lane


Re: creating index on changed field type

From
David Smith
Date:
Tom Lane wrote:

> David Smith <gegez-pgh@instytut.com.pl> writes:
> 
>>Should I forget about btrees and move to GIST,
> 
> 
> Yes.  There's no provision in the btree code for an index storage type
> different from the column datatype.
> 
>             regards, tom lane
> 
> 
Thank you for the reply.
Let us suppose that we retain the type of the field (column), but
instead of storing the original value (key), we store tokens. Instead of
'this is my first text', we would keep 5 tokens (5 different BTItems):
'this', 'is', 'my', 'first', 'text'. Will it work, or is there any other
catch I cannot see?

My performance tests indicated that GIST would be slower than the
original btree index. Maybe I am mistaken somehow...

Best regards,
David