Re: Replacing Apache Solr with Postgre Full Text Search?

From Mike Rylander
Subject Re: Replacing Apache Solr with Postgre Full Text Search?
Date
Msg-id CAO8ar=nBCqt+aTrSrNNvqp5FbKzX+kJiZe1jtDWgga=TDau7Ag@mail.gmail.com
In reply to Replacing Apache Solr with Postgre Full Text Search?  (J2eeInside J2eeInside <j2eeinside@gmail.com>)
Responses Re: Replacing Apache Solr with Postgre Full Text Search?  (J2eeInside J2eeInside <j2eeinside@gmail.com>)
List pgsql-general
On Wed, Mar 25, 2020 at 8:37 AM J2eeInside J2eeInside
<j2eeinside@gmail.com> wrote:
>
> Hi all,
>
> I hope someone can help/suggest:
> I'm currently maintaining a project that uses Apache Solr/Lucene. To be honest, I would like to replace Solr with
> Postgre Full Text Search. However, there is a huge amount of documents involved - around 200GB. Wondering, can Postgre
> handle this efficiently?
> Does anyone have specific experience, and what should the infrastructure look like?
>
> P.S. Not to be confused, Solr works just fine, I just wanted to eliminate one component from the whole system (if
> Full Text Search can replace Solr at all)

I'm one of the core developers (and the primary developer of the
search subsystem) for the Evergreen ILS [1] (integrated library system
-- think book library, not software library).  We've been using PG's
full-text indexing infrastructure since day one, and I can say it is
definitely capable of handling pretty much anything you can throw at
it.

Our indexing requirements are very complex and need to be very
configurable, and need to include a lot more than just "search and
rank a text column," so we've had to build a ton of infrastructure
around record (document) ingest, searching/filtering, linking, and
display.  If your indexing and search requirements are stable,
specific, and well-understood, it should be straightforward,
especially if you don't have to take into account non-document
attributes like physical location, availability, and arbitrary
real-time visibility rules like Evergreen does.
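
For the simple "search and rank a text column" case mentioned above, a
minimal sketch might look like the following. This is not Evergreen's
schema: the table `docs`, its columns, the index name, and the `search`
helper are all hypothetical, and the sketch assumes PostgreSQL 12 or
later (for stored generated columns) plus any DB-API driver that uses
`%s` placeholders, such as psycopg.

```python
# Hypothetical schema: table, column, and index names are assumptions
# made for illustration only.
SCHEMA_STATEMENTS = [
    """
    CREATE TABLE IF NOT EXISTS docs (
        id    bigserial PRIMARY KEY,
        title text,
        body  text,
        -- Stored generated column (PostgreSQL 12+) keeps the tsvector
        -- in sync with title and body on every insert or update.
        tsv   tsvector GENERATED ALWAYS AS (
                  to_tsvector('english',
                              coalesce(title, '') || ' ' || coalesce(body, ''))
              ) STORED
    )
    """,
    # A GIN index lets the @@ match operator avoid scanning every row.
    "CREATE INDEX IF NOT EXISTS docs_tsv_idx ON docs USING gin (tsv)",
]

# websearch_to_tsquery (PostgreSQL 11+) parses search-engine-style input;
# ts_rank orders the matching rows by relevance.
SEARCH_SQL = """
SELECT id, title, ts_rank(tsv, q) AS rank
FROM docs, websearch_to_tsquery('english', %s) AS q
WHERE tsv @@ q
ORDER BY rank DESC
LIMIT %s
"""


def create_schema(conn):
    """Create the hypothetical table and index, one statement at a time."""
    cur = conn.cursor()
    try:
        for statement in SCHEMA_STATEMENTS:
            cur.execute(statement)
        conn.commit()
    finally:
        cur.close()


def search(conn, text, limit=10):
    """Run a ranked full-text query on a DB-API connection."""
    cur = conn.cursor()
    try:
        cur.execute(SEARCH_SQL, (text, limit))
        return cur.fetchall()
    finally:
        cur.close()
```

With a live connection, `create_schema(conn)` followed by
`search(conn, "solr replacement")` would return `(id, title, rank)` rows.
Filters on non-document attributes would be extra `WHERE` conditions on
top of this, which is where the added complexity described above comes in.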

As for scale, it's more about document count than total size.  There
are Evergreen libraries with several million records to search, and
with proper hardware and tuning everything works well.  Our main
performance issue has to do with all of the stuff outside the records
(documents) themselves that has to be taken into account during
search.  The core full-text search part of our queries is extremely
performant, and has only gotten better over the years.

[1] http://evergreen-ils.org

HTH,
--
Mike Rylander
 | Executive Director
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker@equinoxinitiative.org
 | web:  http://equinoxinitiative.org


