Julia: How to import graph from text file (csv with unequal number of "columns")?

I was wondering if anyone knows of a clean way to import a graph into Julia from a text file that is formatted like the following example (which I will call "graph.csv"):

1,6,7,3
2,5
3,9,8

      

Thus, the rows have a non-fixed number of records (> 1). If I just use readdlm() naively, I end up with a matrix whose missing entries are filled with empty strings:

readdlm("graph.csv", ',', '\n')

# 3x4 Array{Any,2}:
#  1.0  6.0  7.0  3.0
#  2.0  5.0   ""   ""
#  3.0  9.0  8.0   ""

      

I have two problems with this. First, I don't like using more memory than I need to. Second, because of the empty fields, I cannot have the entries parsed as integers, i.e. readdlm("graph.csv", ',', Int, '\n') does not work.

The way I currently import my graph uses two steps. First, I import each line as a string, and then I parse each line for integers:

graph_strings = readdlm("graph.csv", '\n')

graph = map(line -> map(parseint, split(line,',')), graph_strings)

# 3x1 Array{Array{Int64,1},2}:
#  [1,6,7,3]
#  [2,5]    
#  [3,9,8]

      

An alternative, more "Matlabby" approach uses an array of Any:

graph_strings = readdlm("graph.csv",'\n')

graph = {map(parseint, split(graph_strings[i],',')) for i=1:length(graph_strings)}

# 3-element Array{Any,1}:
#  [1,6,7,3]
#  [2,5]    
#  [3,9,8]

      

My question is twofold:

1. Is there a better way to do this?

2. If not, which of the two import methods described above would be preferable for a large graph?

Thanks!


2 answers


How about this:

graph = map(
  line -> map(int, split(chomp(line), ",")),
  open("graph.csv") |> eachline
)

      



It gives you an Array{Any, 1}. If you need an Array{Array{Int, 1}, 1}:

graph = Base.mapfoldl(
  line -> map(int, split(chomp(line), ",")),
  push!,
  Vector{Int}[],
  open("graph.csv") |> eachline
)
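For readers on Julia 1.x, where int and parseint are no longer available, the same idea could be sketched roughly as follows. The helper name read_adjacency and the temporary demo file are assumptions made for illustration, not part of the original answer; blank lines are skipped because parse(Int, "") would throw.

```julia
# Sketch for Julia 1.x; `read_adjacency` is a hypothetical helper name.
function read_adjacency(path::AbstractString)
    # eachline(path) opens the file and strips the trailing newline from
    # each line by default, so chomp is not needed here.
    # The typed comprehension produces a Vector{Vector{Int}} directly.
    return Vector{Int}[parse.(Int, split(line, ','))
                       for line in eachline(path)
                       if !isempty(strip(line))]
end

# Small demonstration: write the sample data to a temporary file.
path, io = mktemp()
write(io, "1,6,7,3\n2,5\n3,9,8\n")
close(io)

graph = read_adjacency(path)
println(graph)
```

The typed comprehension plays the role of the Vector{Int}[] seed in the mapfoldl version: it fixes the element type up front instead of letting it fall back to Any.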

      



I don't think that readdlm fits this data; it's just not intrinsically tabular.

My approach, which I believe is fairly minimal in memory usage, would be

f = open("graph.csv","r")
adjlist = Array{Int,1}[]  # empty list that will hold one integer vector per line
while !eof(f)
  push!(adjlist, map(int, split(chomp(readline(f)),",")))
end
close(f)

      



which produces

julia> adjlist
3-element Array{Array{Int64,1},1}:
 [1,6,7,3]
 [2,5]
 [3,9,8]
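A variant of the same loop for Julia 1.x might look like the sketch below. It assumes parse(Int, ...) in place of the old int function and uses the do-block form of open, which closes the file even if parsing throws. The function name read_adjlist and the temporary demo file are hypothetical.

```julia
# Sketch for Julia 1.x; `read_adjlist` is a hypothetical helper name.
function read_adjlist(path::AbstractString)
    adjlist = Vector{Int}[]            # one integer vector per input line
    open(path, "r") do f               # the file is closed when the block exits
        for line in eachline(f)        # reads one line at a time, newline stripped
            isempty(strip(line)) && continue   # skip blank lines
            push!(adjlist, [parse(Int, s) for s in split(line, ',')])
        end
    end
    return adjlist
end

# Small demonstration: write the sample data to a temporary file.
path, io = mktemp()
write(io, "1,6,7,3\n2,5\n\n3,9,8\n")
close(io)

adjlist = read_adjlist(path)
println(length(adjlist))
```

Only one line is held as a string at any time, so memory use beyond the resulting vectors stays small, which matches the goal stated above.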

      
