Обсуждение: graphs in PostgreSQL
Hello, I'm trying to organize storage and processing of a graph (pretty spare, 100,000 vertices and 5,000,000 edges) with PostgreSQL. I have two main problems: - standart problem of finding all shortest paths between two given vertices; - search thru vertices' properties with ordering by path lengths from given vertix. So, basically, I need to decide what additional data (some preprocessed data about a graph or indexes) I need to store, how to store it, and how maintain it when graph changes. It seems that the second problem (ordering by path length) requires to store all path lengths between all vertices pairs (roadmap), that is very expensive to maintain. I would appreciate any suggestions... -- Sincerely, Ivan Zolotukhin
On 10/13/05 8:27 AM, "Ivan Yu. Zolotukhin" <iz@itpeople.ru> wrote: > Hello, > > I'm trying to organize storage and processing of a graph (pretty spare, > 100,000 vertices and 5,000,000 edges) with PostgreSQL. > > I have two main problems: > - standart problem of finding all shortest paths between two given vertices; > - search thru vertices' properties with ordering by path lengths from > given vertix. > > So, basically, I need to decide what additional data (some preprocessed > data about a graph or indexes) I need to store, how to store it, and how > maintain it when graph changes. > > It seems that the second problem (ordering by path length) requires to > store all path lengths between all vertices pairs (roadmap), that is > very expensive to maintain. > > I would appreciate any suggestions... Try googling for "transitive closure SQL" for some hits on the subject. The following website describes how some folks have stored a DAG for a biological system: http://www.godatabase.org/dev/sql/doc/godb-sql-doc.html I don't know if any folks from the CHADO (http://www.gmod.org/schema/index.shtml) project can comment on computing the transitive closure using stored procedures, but I think they do that for their project. Sean
On Thu, 13 Oct 2005, Ivan Yu. Zolotukhin wrote: > Hello, > > I'm trying to organize storage and processing of a graph (pretty spare, > 100,000 vertices and 5,000,000 edges) with PostgreSQL. > > I have two main problems: > - standart problem of finding all shortest paths between two given vertices; > - search thru vertices' properties with ordering by path lengths from > given vertix. > > So, basically, I need to decide what additional data (some preprocessed > data about a graph or indexes) I need to store, how to store it, and how > maintain it when graph changes. > > It seems that the second problem (ordering by path length) requires to > store all path lengths between all vertices pairs (roadmap), that is > very expensive to maintain. > You might look at PostGIS, as it seems you essentially want to store spatial data and this is a better tool for this purpose than the native Postgresql spatial datatypes. Brent Wood