Обсуждение: bug: fuzzystrmatch levenshtein is wrong
The current behavior of levenshtein(text,text,int,int,int) is wrong. Consider: leki_dev=# select levenshtein('','a',2,4,5); levenshtein ------------- 1 (1 row) leki_dev=# select levenshtein('a','',2,4,5); levenshtein ------------- 1 (1 row) leki_dev=# select levenshtein('aa','a',2,4,5); levenshtein ------------- 1 (1 row) leki_dev=# select levenshtein('a','aa',2,4,5); levenshtein ------------- 1 (1 row) versus (after patch) postgres=# select levenshtein('','a',2,4,5); levenshtein ------------- 2 (1 row) postgres=# select levenshtein('a','',2,4,5); levenshtein ------------- 4 (1 row) postgres=# select levenshtein('aa','a',2,4,5); levenshtein ------------- 4 (1 row) postgres=# select levenshtein('a','aa',2,4,5); levenshtein ------------- 2 (1 row) patch attached. Greetings Marcin Mańk
Вложения
also there is integer overflow: postgres=# select levenshtein('aaaaaaaaaaaaaaaa','',1,1000000000,1);levenshtein --------------1179869184 (1 row) should we reject arguments greater than,say, 10000 ? maximum input length is 255 currently, so the maximum numbers involved would be about 10000*255*2 This would leave some breathing room if we wanted to increase the maximum input string length. Greetings Marcin
On Mon, Dec 7, 2009 at 8:33 AM, marcin mank <marcin.mank@gmail.com> wrote: > The current behavior of levenshtein(text,text,int,int,int) is wrong. Consider: Is this the same problem as bug #5098? ...Robert
On Tue, Dec 8, 2009 at 4:10 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> The current behavior of levenshtein(text,text,int,int,int) is wrong. Consider: > > Is this the same problem as bug #5098? Yes. Exact same change, plus the shortcut evaluation (for when one of the inputs is empty) was also wrong. I fixed that too. Greetings Maricn Mańk
On Mon, Dec 7, 2009 at 8:33 AM, marcin mank <marcin.mank@gmail.com> wrote: > The current behavior of levenshtein(text,text,int,int,int) is wrong. Consider: > > leki_dev=# select levenshtein('','a',2,4,5); > levenshtein > ------------- > 1 > (1 row) > > > leki_dev=# select levenshtein('a','',2,4,5); > levenshtein > ------------- > 1 > (1 row) > > > leki_dev=# select levenshtein('aa','a',2,4,5); > levenshtein > ------------- > 1 > (1 row) > > > leki_dev=# select levenshtein('a','aa',2,4,5); > levenshtein > ------------- > 1 > (1 row) > > versus (after patch) > > postgres=# select levenshtein('','a',2,4,5); > levenshtein > ------------- > 2 > (1 row) > > postgres=# select levenshtein('a','',2,4,5); > levenshtein > ------------- > 4 > (1 row) > > postgres=# select levenshtein('aa','a',2,4,5); > levenshtein > ------------- > 4 > (1 row) > > postgres=# select levenshtein('a','aa',2,4,5); > levenshtein > ------------- > 2 > (1 row) > > patch attached. I cannot get this patch to apply for anything. All 4 hunks fail, both on HEAD and on the the pre-8.4-pgindent version of fuzzystrmatch.c. Either I'm doing something wrong here, or there's something wrong with this patch file. ...Robert
Robert Haas wrote: > > patch attached. > > I cannot get this patch to apply for anything. All 4 hunks fail, both > on HEAD and on the the pre-8.4-pgindent version of fuzzystrmatch.c. > Either I'm doing something wrong here, or there's something wrong with > this patch file. The author converted tabs to spaces --- there is not a single tab in the diff file. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Tue, Dec 8, 2009 at 9:08 PM, Bruce Momjian <bruce@momjian.us> wrote: > Robert Haas wrote: >> > patch attached. >> >> I cannot get this patch to apply for anything. All 4 hunks fail, both >> on HEAD and on the the pre-8.4-pgindent version of fuzzystrmatch.c. >> Either I'm doing something wrong here, or there's something wrong with >> this patch file. > > The author converted tabs to spaces --- there is not a single tab in the > diff file. Ah. I knew there had to be a reason. I'm attaching my version of this patch. Barring objections, I am going to apply this to HEAD and backpatch to 8.4, where this feature (and the associated bug) were introduced. ...Robert
Вложения
On Tue, Dec 8, 2009 at 9:45 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Dec 8, 2009 at 9:08 PM, Bruce Momjian <bruce@momjian.us> wrote: >> Robert Haas wrote: >>> > patch attached. >>> >>> I cannot get this patch to apply for anything. All 4 hunks fail, both >>> on HEAD and on the the pre-8.4-pgindent version of fuzzystrmatch.c. >>> Either I'm doing something wrong here, or there's something wrong with >>> this patch file. >> >> The author converted tabs to spaces --- there is not a single tab in the >> diff file. > > Ah. I knew there had to be a reason. > > I'm attaching my version of this patch. Barring objections, I am > going to apply this to HEAD and backpatch to 8.4, where this feature > (and the associated bug) were introduced. Done. Yeah, my first commit! However, it appears that someone needs to tell the pgsql-committers list that rhaas@postgresql.org is allowed to post there. ...Robert
On Wed, Dec 9, 2009 at 9:00 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Dec 8, 2009 at 9:45 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Dec 8, 2009 at 9:08 PM, Bruce Momjian <bruce@momjian.us> wrote: >>> Robert Haas wrote: >>>> > patch attached. >>>> >>>> I cannot get this patch to apply for anything. All 4 hunks fail, both >>>> on HEAD and on the the pre-8.4-pgindent version of fuzzystrmatch.c. >>>> Either I'm doing something wrong here, or there's something wrong with >>>> this patch file. >>> >>> The author converted tabs to spaces --- there is not a single tab in the >>> diff file. >> >> Ah. I knew there had to be a reason. >> >> I'm attaching my version of this patch. Barring objections, I am >> going to apply this to HEAD and backpatch to 8.4, where this feature >> (and the associated bug) were introduced. > > Done. Yeah, my first commit! However, it appears that someone needs > to tell the pgsql-committers list that rhaas@postgresql.org is allowed > to post there. Uh-oh. Is it bad that I did this between the time Tom updated the release notes and the time Marc stamped 8.4.2? *ducks and runs for cover* ...Robert
On Wed, 2009-12-09 at 21:00 -0500, Robert Haas wrote: > Done. Yeah, my first commit! :) I think you forgot to update release notes for 8.4.2 :( -- Devrim GÜNDÜZ, RHCE Command Prompt - http://www.CommandPrompt.com devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
On Thu, Dec 10, 2009 at 3:00 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Done. Yeah, my first commit! Great! Also, thanks for getting this in 8.4.2. Otherwise I would have to compile this on Windows myself, which is no fun. About the tabs vs spaces problem - I`ve decided that copying the patch from a remote machine is best done by selecting it in the terminal and pasting into a text file. Don`t do that :) Greetings Marcin Mańk