Keeping a huge graph to calculate the shortest path

I am trying to store an unweighted directed graph larger than 5GB in a MySQL database in an efficient way for finding shortest paths. It is currently stored in a single table with column source and column targets (comma separated), but I feel this is not the way to go, so I plan on converting it to a vertex table and an edge table.

I have two questions:

  • What's the best way to store a graph?
  • What's the shortest path algorithm I should use?
+3


source to share


2 answers


You should have two tables. One for knots and one for ribs. In the edge table, you must have source_node_id and dest_node_id. Thus, you can easily create queries on the edge table to get all outgoing nodes that are used by Dijkstra's algorithm.



For a simple description of Diiksha's algorithm see this: http://www.sce.carleton.ca/faculty/chinneck/po/Chapter8.pdf

+2


source


Another very efficient way to store dense graphs (sparse graphs are not that efficient) is to use an adjacency matrix. Here is a link that explains it -

Saving Graphs Using Adjacency Matrix

Now, to store the matrix in MySQL database, you have to use rowid as the vertex id for the rows (if you think your vertices are 1,2, ...). Columns can just be valid vertex names or vertex identifiers. You can save a table that maps vertex names to IDs.



One of the problems you will run into is the maximum number of columns. If your matrix is ​​too large, you might have to split the columns into multiple tables. If you have an indexing / hashing scheme to tell you the table name from a node right away, your query should be relatively fast.

And for the shortest path, as mentioned by others, Dijkstra's algorithm is the best shortest path finding algorithm.

+1


source







All Articles