Re: How to find double entries

From Tom Lane
Subject Re: How to find double entries
Date
Msg-id 21481.1208316212@sss.pgh.pa.us
In reply to How to find double entries  (Andreas <maps.on@gmx.net>)
Responses Re: How to find double entries  (Vivek Khera <vivek@khera.org>)
List pgsql-sql
Andreas <maps.on@gmx.net> writes:
> I'd like to identify and then merge records of e.g.   'google', 'gogle', 
> 'guugle' 

> Then I want to match abbreviations like  'A-Company Ltd.', 'a company 
> ltd.', 'A-Company Limited'

> Is there a way to do this?
> It would be OK just to list candidates to be checked manually afterwards.

There are some functions in contrib/fuzzystrmatch that seem like they'd
help you find candidate duplicates.  contrib/pg_trgm and text search
might also offer promising tools.
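
To give a feel for what these tools measure, here is a minimal standalone
sketch in Python, not the contrib code itself. It assumes a plain edit
distance (the kind of measure fuzzystrmatch's levenshtein() provides) and a
trigram-overlap score loosely modeled on how pg_trgm is documented to work.
The function names, the padding rule, and both thresholds are illustrative
assumptions, and the output is only a list of candidate pairs for manual
review.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance: each insertion, deletion, or substitution costs 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def trigrams(s):
    """Set of 3-character substrings of each lowercased word.

    Assumption: non-alphanumerics separate words, and each word is padded
    with two leading blanks and one trailing blank before slicing.
    """
    words = "".join(c.lower() if c.isalnum() else " " for c in s).split()
    result = set()
    for w in words:
        padded = "  " + w + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(names, max_distance=2, min_similarity=0.5):
    """List pairs that look alike under either measure (hypothetical cutoffs)."""
    pairs = []
    for a, b in combinations(names, 2):
        d = levenshtein(a.lower(), b.lower())
        s = similarity(a, b)
        if d <= max_distance or s >= min_similarity:
            pairs.append((a, b, d, s))
    return pairs


if __name__ == "__main__":
    sample = ["google", "gogle", "guugle",
              "A-Company Ltd.", "a company ltd.", "A-Company Limited"]
    for a, b, d, s in candidate_pairs(sample):
        print(f"{a!r} ~ {b!r}  distance={d}  similarity={s:.2f}")
```

Edit distance tends to catch the misspelling cases, while the trigram score
is what brings 'A-Company Ltd.' and 'A-Company Limited' together; the cutoffs
would need tuning against real data. Comparing every pair is quadratic in the
number of rows, so this is only practical for modest tables.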

What's really a duplicate sounds like a judgment call here, so you
probably shouldn't even think of automating it completely.
        regards, tom lane

