Discussion: building tsquery directly in memory (avoid makepol)
I know the structure of the whole tsquery in advance: it has already been reduced and the lexemes have already been computed. I'd like to write it directly in memory without having to go through pushValue/makepol.

However, I'm not quite sure what the layout of a tsquery in memory is, and I still haven't been able to find the macros that could help me [1].

Before doing it the trial-and-error way, can somebody give me an example? I'm not sure about my interpretation of the comments and the documentation.

This is how I'd write X:AB | YY:C | ZZZ:D

TSQuery
  vl_len_ (total number of bytes of the whole following structure: QueryItems * size + total lexeme length)
  size (number of QueryItems in the query)
  QueryItem
    type QI_OPR
    oper OP_OR
    left -> distance from QueryItem X:AB
  QueryItem
    type QI_OPR
    oper OP_OR
    left -> distance from QueryItem ZZZ:D
  QueryItem (X)
    type QI_VAL
    weight 1100
    valcrc ???
    length 1
    distance
  QueryItem (YY)
    type QI_VAL
    weight 0010
    valcrc ???
    length 2
    distance
  QueryItem (ZZZ)
    type QI_VAL
    weight 0001
    valcrc ???
    length 3
    distance
  X
  YY
  ZZZ

[1] the equivalent of POSTDATALEN, WEP_GETWEIGHT, a macro to compute the size of the various parts of a TSQuery, etc.

I couldn't see any place in the code where a TSQuery is built in "one shot" instead of using pushValue.

Another thing I'd like to know: which of the following is going to be preferred during a scan?

  'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B');

vs.

  'java:1A,2B '::tsvector @@ to_tsquery('java:AB')

They look equivalent. Are they?

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
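As a small illustration of the weight fields in the layout above (1100 for AB, 0010 for C, 0001 for D), here is a minimal sketch of how a weight string could be turned into that 4-bit mask. The function name weights_to_mask is hypothetical and is not part of PostgreSQL; the bit assignment (A is the highest bit, D the lowest) is taken from the values in the message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a PostgreSQL function: convert a weight string
 * such as "AB" into the 4-bit mask used in the layout above
 * (A = 1000, B = 0100, C = 0010, D = 0001).
 * Returns -1 if the string contains anything other than A-D (any case).
 */
int weights_to_mask(const char *weights)
{
    int mask = 0;

    assert(weights != NULL);
    for (const char *p = weights; *p != '\0'; p++) {
        switch (*p) {
        case 'A': case 'a': mask |= 1 << 3; break;
        case 'B': case 'b': mask |= 1 << 2; break;
        case 'C': case 'c': mask |= 1 << 1; break;
        case 'D': case 'd': mask |= 1 << 0; break;
        default:
            return -1;   /* unknown weight letter */
        }
    }
    return mask;
}
```

With this sketch, "AB" maps to binary 1100, "C" to 0010 and "D" to 0001, matching the three QI_VAL entries listed above.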
> Before doing it the trial-and-error way, can somebody give me an
> example?
> I'm not sure about my interpretation of the comments and the
> documentation.
> TSQuery
[skipped]

Right, valcrc is computed in pushValue.

> I couldn't see any place in the code where a TSQuery is built in "one
> shot" instead of using pushValue.

That's because in all those places we may have to parse a rather complex structure. A simple OR-ed query could be hardcoded as:

  pushValue('X')
  pushValue('YY')
  pushOperator(OP_OR);
  pushValue('ZZZ')
  pushOperator(OP_OR);

You need to call pushValue/pushOperator in the order of reverse Polish (postfix) notation: operands first, then the operator that joins them. Note that you can also use another order:

  pushValue('X')
  pushValue('YY')
  pushValue('ZZZ')
  pushOperator(OP_OR);
  pushOperator(OP_OR);

So the first example will produce ( X | YY ) | ZZZ, and the second one X | ( YY | ZZZ ).

> Another thing I'd like to know: which of the following is going to be
> preferred during a scan?
> 'java:1A,2B '::tsvector @@ to_tsquery('java:A | java:B');
> vs.
> 'java:1A,2B '::tsvector @@ to_tsquery('java:AB')
> They look equivalent. Are they?

Yes, but the second one should be more efficient.

--
Teodor Sigaev
E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
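To make the effect of the two call orders concrete, here is a self-contained toy model in C. It is not the PostgreSQL parser: ToyState, toy_push_value and toy_push_or are hypothetical names that only mimic the postfix discipline described above by keeping a small stack of strings and combining the top two entries whenever an OR is pushed. The real pushValue and pushOperator take a parser state and further arguments that are omitted here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 16    /* toy limit on pending operands */
#define MAX_TEXT  128   /* toy limit on the rendered expression */

/* Toy stand-in for a parser state: a stack of rendered sub-expressions. */
typedef struct {
    char items[MAX_DEPTH][MAX_TEXT];
    int  depth;
} ToyState;

void toy_init(ToyState *st)
{
    st->depth = 0;
}

/* Push an operand (a lexeme) onto the stack. */
void toy_push_value(ToyState *st, const char *lexeme)
{
    assert(st->depth < MAX_DEPTH);
    snprintf(st->items[st->depth], MAX_TEXT, "%s", lexeme);
    st->depth++;
}

/* Push an OR: pop the two topmost entries and replace them with "(left | right)". */
void toy_push_or(ToyState *st)
{
    char combined[MAX_TEXT];

    assert(st->depth >= 2);
    const char *right = st->items[st->depth - 1];
    const char *left  = st->items[st->depth - 2];

    snprintf(combined, MAX_TEXT, "(%s | %s)", left, right);
    st->depth -= 1;   /* two entries collapse into one */
    snprintf(st->items[st->depth - 1], MAX_TEXT, "%s", combined);
}

/* Return the finished expression; exactly one entry must remain. */
const char *toy_result(const ToyState *st)
{
    assert(st->depth == 1);
    return st->items[0];
}
```

Pushing X, YY, OR, ZZZ, OR in this model yields "((X | YY) | ZZZ)", while pushing X, YY, ZZZ, OR, OR yields "(X | (YY | ZZZ))", which is the same grouping as in the two examples above.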
On Thu, 04 Feb 2010 22:13:02 +0300 Teodor Sigaev <teodor@sigaev.ru> wrote:

> > Before doing it the trial-and-error way, can somebody give
> > me an example?
> > I'm not sure about my interpretation of the comments and
> > the documentation.
> > TSQuery
> [skipped]
> Right, valcrc is computed in pushValue

Anyway, the structure I posted is correct, isn't it?

Is there any macro equivalent to POSTDATALEN and WEP_GETWEIGHT, and a macro to know the memory size of a TSQuery? I think I've seen a macro that could help me determine the size of a TSQuery, but I haven't noticed anything like POSTDATALEN, which would come in very handy for traversing a TSQuery.

I was thinking of skipping pushValue and building the TSQuery directly in memory, since my queries have a very simple structure and are easy to reduce. Still, it is not straightforward to know the memory size in advance. For OR queries it is easy, but for AND queries I'll have to loop over a tsvector, filter the weights according to a passed parameter and see how many times I have to duplicate a lexeme, once for each weight.

E.g.

  tsvector_to_tsquery(
    'pizza:1A,2B risotto:2C,4D barolo:5A,6C',
    '&',
    'ACD'
  );

should be turned into

  pizza:A & risotto:C & risotto:D & barolo:A & barolo:C

I noticed that you actually loop over the tsvector in tsvectorout to allocate the memory for the string buffer, and I was wondering whether it is really worth doing the same in my case.

Any good recipe in Moscow? ;)

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
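The sizing pass described above can be sketched without touching any PostgreSQL structure. The following toy code counts how many operands and operators the resulting AND query would need and how many lexeme bytes they carry. ToyEntry, ToyPlan and toy_plan_and_query are hypothetical names, and a tsvector entry is modelled simply as a lexeme plus one weight letter per position. The byte total counts only the lexeme characters, so any per-operand terminator or header that the real format requires would have to be added on top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of one tsvector entry: "AB" stands for positions 1A,2B. */
typedef struct {
    const char *lexeme;
    const char *weights;   /* one weight letter per position */
} ToyEntry;

/* What a first pass would need to know before allocating the query. */
typedef struct {
    size_t operands;       /* number of lexeme:weight operands */
    size_t operators;      /* number of '&' operators joining them */
    size_t lexeme_bytes;   /* sum of lexeme lengths, no terminators */
} ToyPlan;

/* Map a weight letter to its bit (A = 1000 ... D = 0001); 0 if unknown. */
static int weight_bit(char w)
{
    switch (w) {
    case 'A': return 1 << 3;
    case 'B': return 1 << 2;
    case 'C': return 1 << 1;
    case 'D': return 1 << 0;
    default:  return 0;
    }
}

ToyPlan toy_plan_and_query(const ToyEntry *entries, size_t n, const char *filter)
{
    ToyPlan plan = {0, 0, 0};
    int filter_mask = 0;

    for (const char *p = filter; *p != '\0'; p++)
        filter_mask |= weight_bit(*p);

    for (size_t i = 0; i < n; i++) {
        int present = 0;

        /* Collect the distinct weights this lexeme occurs with. */
        for (const char *p = entries[i].weights; *p != '\0'; p++)
            present |= weight_bit(*p);

        /* One operand per distinct weight that passes the filter. */
        int kept = present & filter_mask;
        for (int bit = 3; bit >= 0; bit--) {
            if (kept & (1 << bit)) {
                plan.operands++;
                plan.lexeme_bytes += strlen(entries[i].lexeme);
            }
        }
    }

    /* k operands chained with a binary operator need k - 1 operators. */
    if (plan.operands > 0)
        plan.operators = plan.operands - 1;
    return plan;
}
```

For the example in the message (pizza with weights A and B, risotto with C and D, barolo with A and C, filter ACD) this counts 5 operands and 4 operators, which matches pizza:A & risotto:C & risotto:D & barolo:A & barolo:C. Whether a separate counting pass is cheaper than growing the buffer on demand is exactly the open question raised above, and this sketch does not settle it.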