Обсуждение: Connection Broken with Custom Dicts for TSearch2

Поиск
Список
Период
Сортировка

Connection Broken with Custom Dicts for TSearch2

От
"Rodrigo Hjort"
Дата:
Sorry, but I thought it that was the most appropriate list for the issue.<br /><br />I was following these
instructions:<br/><a
href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html">http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html
</a><br/><br />And what happens is that the function works just once. Perhaps a malloc/free issue?<br /><br /><br />$
psqlfuzzy<br /><br />fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog 100');<br
/>                           to_tsvector <br />--------------------------------------------------------------------<br
/> 'dog':9'fox':4 'jump':5 'lazi':8 'brown':3 'quick':2 'hundred':10<br />(1 registro)<br /><br />fuzzy=# select
to_tsvector('thequick brown fox jumped over the lazy dog 100'); <br />server closed the connection unexpectedly<br
/>       This probably means the server terminated abnormally<br />        before or while processing the request.<br
/>Aconexão com servidor foi perdida. Tentando reiniciar: Falhou. <br />!> \q<br clear="all" /><br /><br
/>Regards,<br/><br />Rodrigo Hjort<br /><a href="http://icewall.org/~hjort">http://icewall.org/~hjort</a><br /><br /> 

Re: Connection Broken with Custom Dicts for TSearch2

От
Oleg Bartunov
Дата:
Rodrigo,

you gave us too little information. Did you use your own dictionary ?
What's your configuration, version, etc.

Oleg
On Fri, 2 Jun 2006, Rodrigo Hjort wrote:

> Sorry, but I thought it that was the most appropriate list for the issue.
>
> I was following these instructions:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html
>
> And what happens is that the function works just once. Perhaps a malloc/free
> issue?
>
>
> $ psql fuzzy
>
> fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog
> 100');
>                           to_tsvector
> --------------------------------------------------------------------
> 'dog':9 'fox':4 'jump':5 'lazi':8 'brown':3 'quick':2 'hundred':10
> (1 registro)
>
> fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog
> 100');
> server closed the connection unexpectedly
>       This probably means the server terminated abnormally
>       before or while processing the request.
> A conex?o com servidor foi perdida. Tentando reiniciar: Falhou.
> !> \q
>
>
> Regards,
>
> Rodrigo Hjort
> http://icewall.org/~hjort
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: Connection Broken with Custom Dicts for TSearch2

От
"Rodrigo Hjort"
Дата:
Oleg,

Actually I got PG 8.1.4 compiled from source on a Debian GNU/Linux 2.6.16-k7-2.
My locale is pt_BR, but I configured TSearch2 to use rules from the 'simple'.
Then I just followed the instructions from the link. The fact is that it only works at the first time.

Regards,

Rodrigo Hjort
http://icewall.org/~hjort


2006/6/2, Oleg Bartunov <oleg@sai.msu.su>:
Rodrigo,

you gave us too little information. Did you use your own dictionary ?
What's your configuration, version, etc.

Oleg

Re: Connection Broken with Custom Dicts for TSearch2

От
Oleg Bartunov
Дата:
On Fri, 2 Jun 2006, Rodrigo Hjort wrote:

> Oleg,
>
> Actually I got PG 8.1.4 compiled from source on a Debian GNU/Linux
> 2.6.16-k7-2.
> My locale is pt_BR, but I configured TSearch2 to use rules from the
> 'simple'.
> Then I just followed the instructions from the link. The fact is that it
> only works at the first time.

Rodrigo, it's not enough. If you built your dictionary, please, post it,
as well as tsearch2 configuration.


>
> Regards,
>
> Rodrigo Hjort
> http://icewall.org/~hjort
>
>
> 2006/6/2, Oleg Bartunov <oleg@sai.msu.su>:
>> 
>> Rodrigo,
>> 
>> you gave us too little information. Did you use your own dictionary ?
>> What's your configuration, version, etc.
>> 
>> Oleg
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: Connection Broken with Custom Dicts for TSearch2

От
Oleg Bartunov
Дата:
Aha,

I got the same problem on 8.2dev.

    Oleg
On Fri, 2 Jun 2006, Rodrigo Hjort wrote:

> Oleg,
>
> Actually I got PG 8.1.4 compiled from source on a Debian GNU/Linux
> 2.6.16-k7-2.
> My locale is pt_BR, but I configured TSearch2 to use rules from the
> 'simple'.
> Then I just followed the instructions from the link. The fact is that it
> only works at the first time.
>
> Regards,
>
> Rodrigo Hjort
> http://icewall.org/~hjort
>
>
> 2006/6/2, Oleg Bartunov <oleg@sai.msu.su>:
>> 
>> Rodrigo,
>> 
>> you gave us too little information. Did you use your own dictionary ?
>> What's your configuration, version, etc.
>> 
>> Oleg
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: Connection Broken with Custom Dicts for TSearch2

От
Teodor Sigaev
Дата:
Sorry, it isn't mentioned on page, but this example of code working only with 
before 8.1 versions. In 8.1 interface to dictionary was changed.

More precisely, in 8.1, lexize function (in num2english dlexize_num2english()) 
should return pointer to TSLexeme array instead of char**.

Rodrigo Hjort wrote:
> Sorry, but I thought it that was the most appropriate list for the issue.
> 
> I was following these instructions:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html 
> <http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/custom-dict.html>
> 
> And what happens is that the function works just once. Perhaps a 
> malloc/free issue?
> 
> 
> $ psql fuzzy
> 
> fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog 
> 100');
>                             to_tsvector
> --------------------------------------------------------------------
>  'dog':9 'fox':4 'jump':5 'lazi':8 'brown':3 'quick':2 'hundred':10
> (1 registro)
> 
> fuzzy=# select to_tsvector('the quick brown fox jumped over the lazy dog 
> 100');
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> A conexão com servidor foi perdida. Tentando reiniciar: Falhou.
> !> \q
> 
> 
> Regards,
> 
> Rodrigo Hjort
> http://icewall.org/~hjort
> 

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Connection Broken with Custom Dicts for TSearch2

От
Teodor Sigaev
Дата:

Teodor Sigaev wrote:
> Sorry, it isn't mentioned on page, but this example of code working only
> with before 8.1 versions. In 8.1 interface to dictionary was changed.

Try attached dict_tmpl.c

2Oleg: place file on site, pls

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
/*
 * num2english dictionary by Ben Chobot <bench@silentmedia.com>, based on
 * example of dictionary
 * Teodor Sigaev <teodor@sigaev.ru>
 *
 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#include "postgres.h"

#include "dict.h"
#include "common.h"

#include "subinclude.h"

/* special names for values */
struct nx {
    char name[20];
    int value;
};

static struct nx num2english_numarr[] =
{
{ "zero", 0 },
{ "one", 1 },
{ "two", 2 },
{ "three", 3 },
{ "four", 4 },
{ "five", 5 },
{ "six", 6 },
{ "seven", 7 },
{ "eight", 8 },
{ "nine", 9 },
{ "ten", 10 },
{ "eleven", 11 },
{ "twelve", 12 },
{ "thirteen", 13 },
{ "fourteen", 14 },
{ "fifteen", 15 },
{ "sixteen", 16 },
{ "seventeen", 17 },
{ "eighteen", 18 },
{ "nineteen", 19 },
{ "twenty", 20 },
{ "thirty", 30 },
{ "forty", 40 },
{ "fifty", 50 },
{ "sixty", 60 },
{ "seventy", 70 },
{ "eighty", 80 },
{ "ninety", 90 },
{ "", 999 }
};

static char *num2english_denom[]=
{
"",
"thousand",
"million",
"billion",
"trillion",
"quadrillion",
"quintillion",
"sextillion",
"septillion",
"octillion",
"nonillion",
"decillion",
"undecillion",
"duodecillion",
"tredecillion",
"quattuordecillion",
"sexdecillion",
"septendecillion",
"octodecillion",
"novemdecillion",
"vigintillion"
};


static char *cvt2(int);
static char *cvt3(int);
static char *itowords(long long);


 PG_FUNCTION_INFO_V1(dinit_num2english);
 Datum dinit_num2english(PG_FUNCTION_ARGS);

 Datum
 dinit_num2english(PG_FUNCTION_ARGS) {
    /* nothing to init */

     PG_RETURN_POINTER(NULL);
 }

PG_FUNCTION_INFO_V1(dlexize_num2english);
Datum dlexize_num2english(PG_FUNCTION_ARGS);
Datum
dlexize_num2english(PG_FUNCTION_ARGS) {
     void* dummy = PG_GETARG_POINTER(0);
    char       *in = (char*)PG_GETARG_POINTER(1);
    char *txt = pnstrdup(in, PG_GETARG_INT32(2));
    TSLexeme    *res=0;

    char    *phrase;
    char    *cursor;
    char    *last;
    int    lexes = 1;
    int    thisLex = 0;

     if ( *txt=='\0' ) {
        res = palloc0(sizeof(TSLexeme));
         pfree(txt);
     }
    else
    {
        phrase = itowords(atoll(txt));
        if((cursor = strchr(txt,'.')) && *(cursor+1))
        {
            char    *phrase2;
            char    *ptemp = phrase;

            phrase2 = itowords(atoll(cursor+1));
            phrase = palloc(strlen(phrase2) + strlen(ptemp) + strlen(" . ") + 1);
            sprintf(phrase,"%s . %s",ptemp,phrase2);
            pfree(ptemp);
            pfree(phrase2);
        }
        pfree(txt);

        for(cursor=phrase; *cursor; cursor++) if(*cursor == ' ') lexes++;

        res = palloc0(sizeof(TSLexeme)*(lexes +1));
        for(last=cursor=phrase; *cursor; cursor++)
        {
            if(*cursor == ' ')
            {
                res[thisLex].lexeme = palloc((cursor-last+1));
                memcpy(res[thisLex].lexeme,last,(cursor-last));
                res[thisLex++].lexeme[cursor-last] = '\0';
                /* done with this lex. */
                if(*(cursor+1) == ' ') // if the next space is *also* whitespace....
                {
                    /* We don't want it.
                       Fortunately we know we'll never get more than 2 spaces in a row. */
                    cursor++;
                }
                last=cursor+1;
            }
        }

        /* finish up this last lex */
        res[thisLex].lexeme = palloc((cursor-last+1));
        memcpy(res[thisLex].lexeme,last,(cursor-last));
        res[thisLex++].lexeme[cursor-last] = 0;

        pfree(phrase);
        res[thisLex].lexeme = NULL;
    }

    PG_RETURN_POINTER(res);
}

/* The code below was taken from
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3556,00.html 
 and modified slightly to fit in the postgres stored proc framework. It appears to be without copywrite. */

/* take a two-digit number and cvt to words. */
static char *cvt2(int val)
{
    int i=0;
    char word[80];
    char *ret = 0;

    while(num2english_numarr[++i].value <= val)
        /* nothing */;
    strcpy(word,num2english_numarr[i-1].name);
    val -= num2english_numarr[i-1].value;
    if (val > 0)
    {
        strcat(word," ");
        strcat(word,num2english_numarr[val].name);
    }

    ret = palloc(strlen(word)+1);
    memcpy(ret,word,strlen(word)+1);
    return (ret);
}



/* take a 3 digit number and cvt it to words */
static char *cvt3(int val)
{
    int rem, mod;
    char word[80];
    char *ret = 0;

    word[0] = '\0';
    mod = val % 100;
    rem = val / 100;

    if ( rem > 0 )
    {
        strcat(word,num2english_numarr[rem].name);
        strcat(word," hundred");
        if (mod > 0)
            strcat(word," ");
    }
    if ( mod > 0 )
    {
        char *sub = cvt2(mod);
        strcat(word, sub);
        pfree(sub);
    }

    ret = palloc(strlen(word)+1);
    memcpy(ret,word,strlen(word)+1);
    return(ret);
}

/* here's the routine that does the rest */
static char *itowords(long long val)
{
    long long tri;    /* last three digits */
    long long place = 0;    /* which power of 10 we are on */
    int neg=0;    /* sign holder */
    char temp[255];    /* temporary string space */

    char word[255];
    char phrase[100];
    char *ret = 0;

    word[0] = '\0';

    /* check for negative int */
    if (val < 0 )
    {
        neg = 1;
        val = -val;
    }

    if ( val == 0 )
    {
        ret = palloc(5);
        sprintf(ret,"zero");
        return(ret);
    }

    /* what we do now is break it up into sets of three, and add the */
    /* appropriate denominations to each. */
    while (val > 0 )
    {
        phrase[0] = '\0';
        tri = val % 1000; /* last three digits */
        val = val / 1000; /* base 10 shift by 3 */
        if (tri > 0 )
        {
            char *sub = cvt3(tri);
            strcat(phrase,sub);
            pfree(sub);
            strcat(phrase," ");
        }
        if ((place > 0 ) && (tri > 0))
        {
            strcat(phrase,num2english_denom[place]);
            strcat(phrase," ");
        }
        place++;

        /* got the phrase, now put it in the string */
        strcpy(temp,word);
        if ((val > 0) && (tri > 0))
        {
            strcpy(word," ");
            strcat(word,phrase);
        }
        else
            strcpy(word,phrase);

        strcat(word,temp);
    }

    /* remember that minus sign ? */
    if (neg)
    {
        strcpy(temp,word);
        strcpy(word,"negative ");
        strcat(word,temp);
    }

    /* chop off the last space */
    word[strlen(word)-1] = 0;

    ret = palloc(strlen(word)+1);
    memcpy(ret,word,strlen(word)+1);
    return(ret);
}

Re: Connection Broken with Custom Dicts for TSearch2

От
Oleg Bartunov
Дата:
On Mon, 5 Jun 2006, Teodor Sigaev wrote:

>
>
> Teodor Sigaev wrote:
>> Sorry, it isn't mentioned on page, but this example of code working only 
>> with before 8.1 versions. In 8.1 interface to dictionary was changed.
>
> Try attached dict_tmpl.c
>
> 2Oleg: place file on site, pls

done

>
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


Re: Connection Broken with Custom Dicts for TSearch2

От
"John Jawed"
Дата:
Since we are on the topic, is there a timeline/plans for openfts being
brought into core? If not, I'll continue my work on bringing it into
Gentoo Portage.

John

On 6/5/06, Oleg Bartunov <oleg@sai.msu.su> wrote:
> On Mon, 5 Jun 2006, Teodor Sigaev wrote:
>
> >
> >
> > Teodor Sigaev wrote:
> >> Sorry, it isn't mentioned on page, but this example of code working only
> >> with before 8.1 versions. In 8.1 interface to dictionary was changed.
> >
> > Try attached dict_tmpl.c
> >
> > 2Oleg: place file on site, pls
>
> done
>
> >
> >
>
>         Regards,
>                 Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>


Re: Connection Broken with Custom Dicts for TSearch2

От
Teodor Sigaev
Дата:
> Since we are on the topic, is there a timeline/plans for openfts being
> brought into core? If not, I'll continue my work on bringing it into
> Gentoo Portage.

OpenFTS never, but tsearch2 is possible. But it requires enough work to do, so I   have doubt that it will be done in
8.2...


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Connection Broken with Custom Dicts for TSearch2

От
"Rodrigo Hjort"
Дата:
How about those "pg_ts*" tables, which are specific for a database? Will they serve to the whole cluster?

2006/6/7, Teodor Sigaev <teodor@sigaev.ru >:
OpenFTS never, but tsearch2 is possible. But it requires enough work to do, so I
   have doubt that it will be done in 8.2...

--
Rodrigo Hjort
http://icewall.org/~hjort

Re: Connection Broken with Custom Dicts for TSearch2

От
Teodor Sigaev
Дата:

Rodrigo Hjort wrote:
> How about those "pg_ts*" tables, which are specific for a database? Will 
> they serve to the whole cluster?

No, it plans per database only.

If you need in all database, you can install tsearch2 into template1, so all 
newly created database will have the same tsearch2 configuration.


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/