Re: extracting location info from string

From: Craig Ringer
Subject: Re: extracting location info from string
Msg-id: 4DD9BB87.7070000@postnewspapers.com.au
In reply to: Re: extracting location info from string  (Andrej <andrej.groups@gmail.com>)
Responses: Re: extracting location info from string  (Lew <noone@lewscanon.com>)
List: pgsql-sql
On 23/05/2011 9:11 AM, Andrej wrote:
> On 23 May 2011 10:00, Tarlika Elisabeth Schmitz
> <postgresql3@numerixtechnology.de>  wrote:
>> On Sun, 22 May 2011 21:05:26 +0100
>> Tarlika Elisabeth Schmitz<postgresql3@numerixtechnology.de>  wrote:
>>
>>> A column contains location information, which may contain any of the
>>> following:
>>>
>>> 1) null
>>> 2) country name (e.g. "France")
>>> 3) city name, region name (e.g. "Bonn, Nordrhein-Westfalen")
>>> 4) city name, Rg. region name (e.g. "Frankfurt, Rg. Hessen")
>>> 5) city name, Rg region name (e.g. "Frankfurt, Rg Hessen")
>>
>>
>> I also need to cope with variations of COUNTRY.NAME and REGION.NAME.

This is a hard problem. You're dealing with free-form data that humans
may understand easily, but only by relying on contextual information
and background knowledge, which makes it really hard for computers to
interpret.

If you want to do a good job of this, your best bet is to plug in
third-party address analysis software that is dedicated to this task.
Most (all?) such packages are commercial, proprietary affairs. They
exist because it's really, really hard to do this right.
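
For the five forms quoted above, taken literally, a naive split is easy
enough. Here is a minimal sketch in Python; the function name
split_location is hypothetical, and it assumes that any value without a
comma is a country. It does nothing about variations in country or
region names, which is where the real difficulty lies.

```python
import re

# Optional "Rg" or "Rg." prefix in front of the region name (forms 4 and 5).
_REGION_PREFIX = re.compile(r"^Rg\.?\s+")


def split_location(value):
    """Return a (city, region, country) tuple; unknown parts are None."""
    if value is None or not value.strip():
        return (None, None, None)            # form 1: null
    if "," not in value:
        # Form 2. ASSUMPTION: a value without a comma is a country name.
        return (None, None, value.strip())
    # Forms 3 to 5: "city, region", with an optional "Rg"/"Rg." prefix.
    city, region = value.split(",", 1)
    region = _REGION_PREFIX.sub("", region.strip())
    return (city.strip(), region, None)


for sample in (None, "France", "Bonn, Nordrhein-Westfalen",
               "Frankfurt, Rg. Hessen", "Frankfurt, Rg Hessen"):
    print(repr(sample), "->", split_location(sample))
```

Anything outside those five forms (misspellings, abbreviations, a city
on its own) falls straight through this kind of pattern matching.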

> Another thing of great import is whether the city can occur in the
> data column all by itself; if yes, it's next to impossible to distinguish
> it from a country.

Not least because some places are both, e.g.:
  Luxembourg, the Vatican, Singapore

(The Grand Duchy of Luxembourg has other cities, but still serves as an 
example).
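
To make the ambiguity concrete, here is a small Python sketch that
looks up a lone token in two reference lists. The lists and the
function name classify are made up for illustration; in practice the
names would come from your own country and city tables.

```python
# Hypothetical, deliberately tiny reference lists for illustration only.
COUNTRIES = {"France", "Luxembourg", "Singapore"}
CITIES = {"Bonn", "Frankfurt", "Luxembourg", "Singapore"}


def classify(token):
    """Return the set of kinds ("country", "city") the token could be."""
    kinds = set()
    if token in COUNTRIES:
        kinds.add("country")
    if token in CITIES:
        kinds.add("city")
    return kinds


for name in ("France", "Bonn", "Luxembourg", "Atlantis"):
    print(name, "->", sorted(classify(name)))
```

A token that comes back with both kinds cannot be resolved from the
string alone; a token that comes back with none is either a misspelling
or missing from the reference data.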

-- 
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

