CSV to JSON format with JavaScript
My task is to take a dataset defined in CSV format and display it as a Sankey diagram using D3.
Data format: (I cannot change this)
Uses,Types,Feedback
Use1,Type1,Feedback1
Use2,Type1,Feedback1
Use2,Type2,Feedback1
...
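For context, the transformation code further down expects the CSV to be already parsed into an array of row objects keyed by the header names. A minimal sketch of that step, assuming simple CSV with no quoted fields or embedded commas (`parseSimpleCsv` and `sampleRows` are hypothetical names, not part of my app):

```javascript
// Hypothetical helper: parses simple CSV text into an array of row objects.
// Assumes no quoted fields and no commas inside values.
function parseSimpleCsv(text) {
  // Split into lines and drop blank ones (e.g. a trailing newline).
  var lines = text.split(/\r?\n/).filter(function (line) {
    return line.trim() !== '';
  });
  var headers = lines[0].split(',');
  return lines.slice(1).map(function (line) {
    var cells = line.split(',');
    var row = {};
    headers.forEach(function (header, i) {
      row[header] = cells[i];
    });
    return row;
  });
}

var sampleRows = parseSimpleCsv(
  'Uses,Types,Feedback\n' +
  'Use1,Type1,Feedback1\n' +
  'Use2,Type1,Feedback1\n' +
  'Use2,Type2,Feedback1\n'
);
// sampleRows[0] is { Uses: 'Use1', Types: 'Type1', Feedback: 'Feedback1' }
```

A real CSV parser would be the safer choice for data with quoting; this only illustrates the row shape.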
Required format for D3 Sankey plugin:
{ "nodes": [
{"name": "Use1"},
{"name": "Use2"},
{"name": "Type1"},
...
], "links": [
{"source":"Use1", "target":"Type1", "value":1},
...
]
}
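The links above refer to nodes by name, while my code further down resolves each name to the node object itself. Another common representation is the node's index in the `nodes` array. A small sketch of that conversion, assuming the plugin version in use accepts numeric indices (`linksToIndices` is a hypothetical helper):

```javascript
// Hypothetical helper: rewrites name-based links as index-based links.
// Assumes every link's source and target name appears in graph.nodes.
function linksToIndices(graph) {
  var indexByName = Object.create(null); // no inherited keys to collide with
  graph.nodes.forEach(function (node, i) {
    indexByName[node.name] = i;
  });
  return graph.links.map(function (link) {
    return {
      source: indexByName[link.source],
      target: indexByName[link.target],
      value: link.value
    };
  });
}

var graph = {
  nodes: [{ name: 'Use1' }, { name: 'Use2' }, { name: 'Type1' }],
  links: [{ source: 'Use1', target: 'Type1', value: 1 }]
};
var indexedLinks = linksToIndices(graph);
// indexedLinks is [{ source: 0, target: 2, value: 1 }]
```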
My problem: I need to convert the CSV data into the JSON structure required for the Sankey chart. I cannot change the original data provided to me, so I have to build the JSON dynamically.
My research led me here, but the only example of massaging CSV data (which didn't include values yet, only sources and targets) was via MySQL. Since I don't have access to the database in my project, I resorted to using Underscore.js to help with the transformation (in a Backbone.js app).
Here's what I have so far; it works as intended.
// buildJSON is a method of a Backbone View that oversees the creation of the diagram
buildJSON: function( csv ) {
    var json = {
        nodes: [], // unique nodes found in data
        links: [] // links between nodes
    };
    // get unique nodes!
    var uniqueNodes = _.chain(csv).map(_.values).flatten().unique().value().sort();
    uniqueNodes.forEach(function( node ) {
        json.nodes.push({ name: node });
    });
    // map colors to nodes
    this.color.domain(uniqueNodes);
    // map links
    var links = [];
    var rMap = {};
    var keys = _.keys(csv[0]);
    for ( var i = 0; i < csv.length; i++ ) {
        for ( var j = 0; j < keys.length - 1; j++ ) {
            var relationship = csv[i][keys[j]] + '-' + csv[i][keys[j + 1]];
            rMap[relationship] = ++rMap[relationship] || 1;
        }
    }
    // create links from the linkmap
    for ( var r in rMap ) {
        if ( rMap.hasOwnProperty(r) ) {
            var rel = r.split('-');
            links.push({
                source: rel[0],
                target: rel[1],
                value: rMap[r]
            });
        }
    }
    var nodeMap = {};
    json.nodes.forEach(function( node ) { nodeMap[node.name] = node; });
    json.links = links.map(function( link ) {
        return {
            source: nodeMap[link.source],
            target: nodeMap[link.target],
            value: link.value
        };
    });
    return json;
}
This won't be a problem for a small dataset, but the data can contain thousands of rows and possibly up to ~10 columns.
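For illustration, here is a dependency-free sketch of the same transformation that collects nodes and counts links in a single pass over the rows. It keys the counts by nested Maps instead of a joined `'source-target'` string, so names that themselves contain `-` would not be split incorrectly. The name `buildSankeyData` is hypothetical, the sketch assumes ES2015 `Map` support, it leaves out the color-domain step, and it has not been benchmarked against the version above:

```javascript
// Hypothetical single-pass variant of buildJSON (assumes ES2015 Map support).
// rows: array of objects keyed by column name, in column order.
function buildSankeyData(rows) {
  var keys = rows.length ? Object.keys(rows[0]) : [];
  var nodeMap = new Map();    // name -> node object
  var linkCounts = new Map(); // source name -> Map(target name -> count)

  rows.forEach(function (row) {
    keys.forEach(function (key, j) {
      var name = row[key];
      if (!nodeMap.has(name)) {
        nodeMap.set(name, { name: name });
      }
      if (j === 0) { return; } // first column has no predecessor
      var source = row[keys[j - 1]];
      var targets = linkCounts.get(source);
      if (!targets) {
        targets = new Map();
        linkCounts.set(source, targets);
      }
      targets.set(name, (targets.get(name) || 0) + 1);
    });
  });

  // Sort nodes by name, matching the default string sort used above.
  var nodes = Array.from(nodeMap.values()).sort(function (a, b) {
    return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
  });

  // Emit one link per distinct (source, target) pair, pointing at node objects.
  var links = [];
  linkCounts.forEach(function (targets, source) {
    targets.forEach(function (count, target) {
      links.push({
        source: nodeMap.get(source),
        target: nodeMap.get(target),
        value: count
      });
    });
  });
  return { nodes: nodes, links: links };
}
```

The work is still proportional to rows × columns, the same as the original loops; the difference is only that the nodes and link counts are gathered in one traversal without string joining and splitting.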
So, long story short, my question has two parts:
- Is there an obvious performance improvement I can achieve, and
- Is there a better (more efficient) way of massaging data for a Sankey chart in D3?
I realize this is a particularly narrow issue, so I appreciate any help on this!