Re: vector search support

From: Jonathan S. Katz
Subject: Re: vector search support
Message-ID: e083ced8-83a0-9b73-156b-da968b83ac9c@postgresql.org
In reply to: Re: vector search support (Oliver Rice <oliver@oliverrice.com>)
List: pgsql-hackers

On 5/25/23 1:48 PM, Oliver Rice wrote:

> A nice side effect of using the float8[] to represent vectors is that it 
> allows for vectors of different sizes to coexist in the same column.
> 
> We most frequently see (pgvector) vector columns being used for storing 
> ML embeddings. Given that different models produce embeddings with a 
> different number of dimensions, the need to specify a vector’s size in 
> DDL tightly couples the schema to a single model. Support for variable 
> length vectors would be a great way to decouple those concepts. It would 
> also be a differentiating feature from existing vector stores.

I hadn't thought of that, given that most of what I've seen (or at least my 
personal bias in designing systems) is keeping vectors of a single 
dimensionality in a column. But this sounds like a case where native 
support via a variable-length array would help.

> One drawback is that variable length vectors complicate indexing for 
> similarity search because similarity measures require vectors of 
> consistent length. Partial indexes are a possible solution to that challenge.

Yeah, that presents a challenge. This may also be an argument for a 
vector data type, since that would eliminate the need to check for 
consistent dimensionality at indexing time.

Jonathan
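
As a rough illustration of the point discussed above (similarity measures
need vectors of equal length, and partial indexes could scope each index to
one dimensionality), here is a minimal Python sketch. It is not PostgreSQL
or pgvector code: the helper names (cosine_similarity, group_by_dimension,
nearest) and the sample rows are hypothetical, and the per-dimension groups
only stand in for what one partial index per vector length might cover.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity; only defined for vectors of equal length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def group_by_dimension(rows):
    """Bucket (row_id, vector) pairs by vector length.

    Each bucket plays the role of one hypothetical partial index
    restricted to vectors of a single dimensionality.
    """
    groups = defaultdict(list)
    for row_id, vec in rows:
        groups[len(vec)].append((row_id, vec))
    return groups


def nearest(groups, query, k=1):
    """Return ids of the k most similar rows among same-length vectors."""
    candidates = groups.get(len(query), [])
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(item[1], query),
        reverse=True,
    )
    return [row_id for row_id, _ in ranked[:k]]


# Hypothetical column holding embeddings from two models (3-d and 2-d).
rows = [
    ("a", [1.0, 0.0, 0.0]),
    ("b", [0.0, 1.0, 0.0]),
    ("c", [1.0, 0.0]),
    ("d", [0.6, 0.8]),
]
groups = group_by_dimension(rows)
```

A query is only ever compared against rows of its own length, so
nearest(groups, [0.9, 0.1, 0.0]) considers rows "a" and "b" only, and a
query whose length matches no stored vector gets an empty result rather
than an error.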
