Loading a csv file into graph-tool in Python

I am loading a directed weighted graph from a csv file into graph-tool in Python. The input csv file is organized as follows:

1,2,300
2,4,432
3,89,1.24
...

Here the first two numbers on each line identify the source and target of the edge, and the third number is the weight of the edge.

I am currently using:

import graph_tool.all as gt

g = gt.Graph()
e_weight = g.new_edge_property("float")
with open(in_file_directory + '/' + network_input, 'r') as csv_network:
    for line in csv_network:
        edge = line.strip().split(delimiter)
        # convert the vertex indices to integers explicitly
        e = g.add_edge(int(edge[0]), int(edge[1]))
        e_weight[e] = float(edge[2])

      

However, loading the data takes quite a long time (I have a network of 10 million nodes and it takes about 45 minutes). I tried to make it faster using g.add_edge_list, but this only works for unweighted graphs. Any suggestions on how to make it faster?

+4




2 answers


This is answered on the graph-tool mailing list:

http://lists.skewed.de/pipermail/graph-tool/2015-June/002043.html

In short, you have to use the g.add_edge_list() function as you said and set the weights separately via the array interface for property maps:



e_weight.a = weight_list

      

The list of weights must be in the same order as the edges you passed to g.add_edge_list().
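Putting the two steps together, a minimal sketch could look like the following. The helper names read_edges and build_graph are made up for illustration, and the sketch assumes a freshly created graph, so that edge indices follow the order in which the edges were added. The graph_tool import sits inside build_graph only so that the parsing part can be run on its own.

```python
import csv


def read_edges(path, delimiter=","):
    """Parse 'source,target,weight' rows into an edge list and a weight list."""
    edges, weights = [], []
    with open(path, "r", newline="") as f:
        for row in csv.reader(f, delimiter=delimiter):
            if len(row) != 3:  # skip blank or malformed rows
                continue
            edges.append((int(row[0]), int(row[1])))
            weights.append(float(row[2]))
    return edges, weights


def build_graph(edges, weights):
    """Add all edges in one call, then assign the weights in the same order."""
    # Imported here so that read_edges can be used without graph-tool installed.
    import graph_tool.all as gt

    g = gt.Graph()  # directed by default
    g.add_edge_list(edges)
    e_weight = g.new_edge_property("float")
    # Array interface: on a fresh graph, the i-th added edge gets weights[i].
    e_weight.a = weights
    return g, e_weight
```

Usage would be something like edges, weights = read_edges(in_file_directory + '/' + network_input) followed by g, e_weight = build_graph(edges, weights). For very large files, reading the columns into NumPy arrays instead of Python lists may save further time, but that is not shown here.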

+4




I suggest you try to gain some performance by using the csv library. This example produces, for each edge, a list of 3 parameters.

import csv

with open(in_file_directory + '/' + network_input, 'r', newline='') as f:
    reader = csv.reader(f, delimiter=",")
    for edge in reader:
        # skip blank or malformed rows
        if len(edge) == 3:
            edge_float = [float(param) for param in edge]

      



So for the first row you get the following to work with:

edge_float = [1.0, 2.0, 300.0]

      

0

