CSV to JSON format with JavaScript

My task is to take a dataset defined in CSV format and display it as a Sankey diagram using D3.

Data format: (I cannot change this)

Uses,Types,Feedback
Use1,Type1,Feedback1
Use2,Type1,Feedback1
Use2,Type2,Feedback1
...
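For reference, the code further down treats `csv` as an array of row objects keyed by the header names (it calls `_.keys(csv[0])`), which is also the shape `d3.csv` produces. A minimal sketch of getting from the raw text to that shape, assuming no quoted fields or embedded commas; `parseRows` is a hypothetical helper name, not part of D3 or Underscore:

```javascript
// Hypothetical helper: parse simple CSV text (no quoted fields, no embedded
// commas) into an array of row objects keyed by the header names.
function parseRows(text) {
  const lines = text.trim().split(/\r?\n/);
  const headers = lines[0].split(',');
  return lines.slice(1).map(function (line) {
    const cells = line.split(',');
    const row = {};
    headers.forEach(function (header, i) { row[header] = cells[i]; });
    return row;
  });
}

// The three sample rows shown above.
const rows = parseRows(
  'Uses,Types,Feedback\n' +
  'Use1,Type1,Feedback1\n' +
  'Use2,Type1,Feedback1\n' +
  'Use2,Type2,Feedback1'
);
```

Real-world CSV with quoting would need a proper parser; this only illustrates the row shape.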

      

Required format for D3 Sankey plugin:

{ "nodes": [
  {"name": "Use1"},
  {"name": "Use2"},
  {"name": "Type1"},
  ...
], "links": [
  {"source":"Use1", "target":"Type1", "value":1},
  ...
] }

      

My problem: converting the CSV data into the JSON required for the Sankey chart. I cannot change the original data provided to me, so I have to build the JSON dynamically.

My research led me here, but the only example of massaging CSV data (which did not yet include values, only sources and targets) was via MySQL. Since I don't have access to a database in my project, I resorted to using Underscore.js to help with the transformation (in a Backbone.js app).

Here's what I have so far that works as intended.

// buildJSON is a method of a Backbone View that oversees the creation of the diagram
buildJSON: function( csv ) {
    var json = {
        nodes: [], // unique nodes found in data
        links: []  // links between nodes
    };

    // get unique nodes!
    var uniqueNodes = _.chain(csv).map(_.values).flatten().unique().value().sort();
    uniqueNodes.forEach(function( node ) {
        json.nodes.push({ name: node });
    });

    // map colors to nodes
    this.color.domain(uniqueNodes);

    // map links: count how often each adjacent-column pair occurs
    var links = [];
    var rMap = {};
    var keys = _.keys(csv[0]);
    for ( var i = 0; i < csv.length; i++ ) {
        for ( var j = 0; j < keys.length - 1; j++ ) {
            // JSON.stringify gives an unambiguous key even if a name contains '-'
            var relationship = JSON.stringify([ csv[i][keys[j]], csv[i][keys[j + 1]] ]);
            rMap[relationship] = (rMap[relationship] || 0) + 1;
        }
    }

    // create links from the link map
    for ( var r in rMap ) {
        if ( rMap.hasOwnProperty(r) ) {
            var rel = JSON.parse(r);
            links.push({
                source: rel[0],
                target: rel[1],
                value: rMap[r]
            });
        }
    }

    // resolve each link's source and target names to the node objects
    var nodeMap = {};
    json.nodes.forEach(function( node ) { nodeMap[node.name] = node; });
    json.links = links.map(function( link ) {
        return {
            source: nodeMap[link.source],
            target: nodeMap[link.target],
            value: link.value
        };
    });

    return json;
}
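For comparison, the same transformation can be sketched as a single pass over the rows. This is only an illustration under assumptions, not a measured improvement: `buildSankeyData` is a hypothetical standalone function, it uses ES2015 `Map` and `Set` instead of Underscore, and it emits node names in the links (as in the required format above) rather than node objects.

```javascript
// Hypothetical standalone variant of buildJSON: one pass over the rows,
// collecting unique node names and counting adjacent-column pairs.
function buildSankeyData(rows) {
  if (rows.length === 0) { return { nodes: [], links: [] }; }
  const keys = Object.keys(rows[0]);
  const names = new Set();
  const counts = new Map(); // source name -> Map(target name -> count)

  rows.forEach(function (row) {
    keys.forEach(function (key, j) {
      const source = row[key];
      names.add(source);
      if (j === keys.length - 1) { return; } // last column has no target
      const target = row[keys[j + 1]];
      if (!counts.has(source)) { counts.set(source, new Map()); }
      const targets = counts.get(source);
      targets.set(target, (targets.get(target) || 0) + 1);
    });
  });

  // Sorted unique nodes, matching the sort in the original code.
  const nodes = Array.from(names).sort().map(function (name) {
    return { name: name };
  });

  // Flatten the nested counts into link objects.
  const links = [];
  counts.forEach(function (targets, source) {
    targets.forEach(function (value, target) {
      links.push({ source: source, target: target, value: value });
    });
  });
  return { nodes: nodes, links: links };
}

// Usage with the three sample rows from the data format above.
const sample = [
  { Uses: 'Use1', Types: 'Type1', Feedback: 'Feedback1' },
  { Uses: 'Use2', Types: 'Type1', Feedback: 'Feedback1' },
  { Uses: 'Use2', Types: 'Type2', Feedback: 'Feedback1' }
];
const result = buildSankeyData(sample);
```

Keeping the counts in nested maps avoids building and splitting a string key per pair, so node names may contain any character.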

      

Performance won't be a problem for a small dataset, but the data can contain thousands of rows and possibly up to ~10 columns.

So, long story short, my question has two parts:

  • Is there an obvious performance improvement I can achieve, and

  • Is there a better (more efficient) way of massaging data for a Sankey chart in D3?

I realize this is a particularly narrow issue, so I appreciate any help on this!
