Julia: How to import graph from text file (csv with unequal number of "columns")?
I was wondering if anyone knows of a clean way to import a graph into Julia from a text file that is formatted like the following example (which I will call "graph.csv"):
1,6,7,3
2,5
3,9,8
Thus, the rows have a variable number of entries (always more than one). If I naively use readdlm(), I end up with a matrix padded with empty strings:
readdlm("graph.csv", ',', '\n')
# 3x4 Array{Any,2}:
# 1.0 6.0 7.0 3.0
# 2.0 5.0 "" ""
# 3.0 9.0 8.0 ""
I have two problems with this. First, I don't like using more memory than I need to. Second, because of the empty fields, I cannot parse the entries as integers, i.e. readdlm("graph.csv", ',', Int, '\n') does not work.
My current way of importing the graph uses two steps. First, I read each line as a string, and then I parse each line into integers:
graph_strings = readdlm("graph.csv", '\n')
graph = map(line -> map(parseint, split(line,',')), graph_strings)
# 3x1 Array{Array{Int64,1},2}:
# [1,6,7,3]
# [2,5]
# [3,9,8]
An alternative, more "Matlabby" approach uses an Any array:
graph_strings = readdlm("graph.csv",'\n')
graph = {map(parseint, split(graph_strings[i],',')) for i=1:length(graph_strings)}
# 3-element Array{Any,1}:
# [1,6,7,3]
# [2,5]
# [3,9,8]
My question is twofold:
1. Is there a better way to do this?
2. If not, which of the two import methods described above would be preferable for a large graph?
Thanks!
How about this:
graph = map(
    line -> map(int, split(chomp(line), ",")),
    open("graph.csv") |> eachline
)
This gives you an Array{Any, 1}. If you need an Array{Array{Int, 1}, 1}:
graph = Base.mapfoldl(
    line -> map(int, split(chomp(line), ",")),
    push!,
    Vector{Int}[],
    open("graph.csv") |> eachline
)
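For readers on Julia 1.x, where `int` and `parseint` are no longer available, the same idea can be sketched with `parse(Int, s)`. This is only an illustration, not part of the original answer: the helper names `parse_line` and `read_adjlist` are made up here, and the temporary file simply recreates the example data from the question.

```julia
# Sketch for Julia 1.x (assumption: the original snippets target an older Julia).

# Hypothetical helper: turn one line such as "2,5" into [2, 5].
parse_line(line) = parse.(Int, split(line, ','))

# Hypothetical helper: read the whole file into a vector of integer vectors.
# `eachline(path)` strips the trailing newline, so `chomp` is not needed.
# The typed comprehension fixes the element type even for an empty file.
read_adjlist(path) = Vector{Int}[parse_line(line) for line in eachline(path)]

# Recreate the example file from the question in a temporary location.
path = tempname()
write(path, "1,6,7,3\n2,5\n3,9,8\n")

graph = read_adjlist(path)
```

The typed comprehension plays the role of the `Vector{Int}[]` seed in the `mapfoldl` version: it pins the result to a vector of `Vector{Int}` instead of an `Any` array.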
I don't think readdlm fits this data; it's just not intrinsically tabular.
My approach, which I believe is fairly minimal in memory usage, would be
f = open("graph.csv","r")
adjlist = Array{Int,1}[]   # one integer vector per line
while !eof(f)
    push!(adjlist, map(int, split(chomp(readline(f)),",")))
end
close(f)
which produces
julia> adjlist
3-element Array{Array{Int64,1},1}:
[1,6,7,3]
[2,5]
[3,9,8]
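The same line-by-line loop can be sketched for Julia 1.x, where `int` no longer exists. This is a hedged adaptation rather than the answer's original code: the function name `read_adjlist_streaming` is hypothetical, and skipping blank lines is an added assumption about the input.

```julia
# Hypothetical helper for Julia 1.x: build the adjacency list one line at a time,
# so only the current line and the parsed result are held in memory.
function read_adjlist_streaming(path::AbstractString)
    adjlist = Vector{Int}[]                      # one Vector{Int} per line
    open(path, "r") do f                         # file is closed when the block exits
        for line in eachline(f)                  # trailing newline already stripped
            isempty(strip(line)) && continue     # assumption: blank lines are skipped
            push!(adjlist, [parse(Int, s) for s in split(line, ',')])
        end
    end
    return adjlist
end

# Usage with the example data from the question, written to a temporary file.
path = tempname()
write(path, "1,6,7,3\n2,5\n\n3,9,8\n")
adjlist = read_adjlist_streaming(path)
```

Using the `open(...) do f ... end` form means the file is closed even if parsing a line throws, which the explicit `close(f)` version does not guarantee.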