D3: Most densely populated areas of the scatter plot

I made a scatterplot with D3. How can I identify the most densely populated areas of the plot and surround each of them with an ellipse? For example, in the plot below there are 2 dense areas in the upper right corner. Is there a function for this? If not, I would appreciate suggestions for two things: identifying these areas, and surrounding or marking them in some way.

Scatter plot http://tetet.net/clusterLab/scatter.png

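var width = 300,
    height = 200;

var x = d3.scale.linear().range([0, width]),
    y = d3.scale.linear().range([height, 0]);

// Append an <svg> element to the page; the circles are drawn inside it.
var svg = d3.select("body").append("svg")
    .attr("width", width)
    .attr("height", height);

d3.tsv("data.tsv", function(error, data) {
    if (error) return console.warn(error);

    // d3.tsv parses every field as a string, so coerce the coordinates to numbers.
    data.forEach(function(q) {
        q.xCoord = +q.xCoord;
        q.yCoord = +q.yCoord;
    });

    x.domain(d3.extent(data, function(q) { return q.xCoord; }));
    y.domain(d3.extent(data, function(q) { return q.yCoord; }));

    svg.selectAll("circle")
        .data(data)
      .enter().append("circle")
        .attr("r", 5)
        .attr("cx", function(d) { return x(d.xCoord); })
        .attr("cy", function(d) { return y(d.yCoord); });
});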



xCoord  yCoord
0   0
5   3
2   1
4   7
7   4
5   2
9   9
3   4
1   6
5   4
8.1 6.2
8.4 6.6
8   6
8   7
7   8
6.8 8.3
6.4 8.4
6.2 8.3




2 answers

There are a number of clustering algorithms. I'll give an example using the OPTICS algorithm (I picked it at random, actually) and a way to mark points with a unique color for each cluster.

Please note that I am using the density-clustering package available on npm.

Once we load and parse the data (but before drawing anything to the screen), set up the algorithm:

var optics = new OPTICS(),

    // The algorithm requires a dataset of arrays of points,
    // so we need to create a modified copy of our original data
    // (the unary + coerces the TSV strings to numbers):
    opticsData = data.map(function (d) {
        return [+d.xCoord, +d.yCoord];
    }),

    // Algorithm configuration:
    epsilon = 2, // neighborhood radius: max distance at which two points count as neighbors
    minPts = 2, // min number of points in a neighborhood to form a cluster

    // Now compute the clusters:
    clusters = optics.run(opticsData, epsilon, minPts);
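
If you would rather not pull in a package, the idea behind density-based clustering can be sketched in plain JavaScript. Note that this is a simplified DBSCAN, not OPTICS, and not the package's implementation; the names simpleDbscan, regionQuery and euclidean are made up for this illustration. It assumes the same inputs as above (an array of [x, y] points, a neighborhood radius and a minimum neighbor count) and returns an array of clusters, each an array of point indices.

```javascript
// Simplified DBSCAN sketch (illustrative only; not the density-clustering package).
function euclidean(a, b) {
    var dx = a[0] - b[0], dy = a[1] - b[1];
    return Math.sqrt(dx * dx + dy * dy);
}

// Indices of all points within epsilon of points[i] (including i itself).
function regionQuery(points, i, epsilon) {
    var neighbors = [];
    for (var j = 0; j < points.length; j++) {
        if (euclidean(points[i], points[j]) <= epsilon) neighbors.push(j);
    }
    return neighbors;
}

function simpleDbscan(points, epsilon, minPts) {
    var UNVISITED = -2, NOISE = -1;
    var labels = points.map(function () { return UNVISITED; });
    var clusters = [];

    points.forEach(function (_, i) {
        if (labels[i] !== UNVISITED) return;

        var neighbors = regionQuery(points, i, epsilon);
        if (neighbors.length < minPts) {
            labels[i] = NOISE; // may still be claimed later as a border point
            return;
        }

        // points[i] is a core point: start a new cluster and grow it.
        var id = clusters.length;
        var members = [i];
        clusters.push(members);
        labels[i] = id;

        var queue = neighbors.slice();
        while (queue.length > 0) {
            var j = queue.shift();
            if (labels[j] === NOISE) {
                // Border point: reachable from a core point, but not dense itself.
                labels[j] = id;
                members.push(j);
            }
            if (labels[j] !== UNVISITED) continue;

            labels[j] = id;
            members.push(j);

            // Only core points extend the cluster further.
            var more = regionQuery(points, j, epsilon);
            if (more.length >= minPts) queue = queue.concat(more);
        }
    });

    return clusters;
}
```

Points that end up in no cluster are treated as noise and simply do not appear in the result. The region query here is a linear scan, so the whole thing is quadratic in the number of points, which should be fine for a scatterplot of this size.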


Now we can mark the points in our original data with the cluster they belong to. This is a very crude solution; you might think of something more elegant:

clusters.forEach(function (cluster, clusterIndex) {
    cluster.forEach(function (index) {
        // data is our original dataset:
        data[index].cluster = clusterIndex;
    });
});


Now let's create a very simple color scale and apply it to our points:

var colorScale = d3.scale.category20();

// Some code omitted for brevity:
    .style('fill', function (d) {
        return colorScale(d.cluster);
    });


You can see a demo. I had to include the library as is, so you need to scroll to the bottom of the JavaScript panel, sorry.
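
The question also asks about surrounding each cluster with an ellipse. One rough way to do that, sketched below, is to take the centroid of each cluster and size an axis-aligned ellipse so that it is guaranteed to contain every point. The helpers boundingEllipse and clusterEllipses are hypothetical names for this illustration, not part of D3 or the clustering package, and the result is not the tightest possible ellipse. It assumes the points have already been converted to pixel coordinates, since the x and y scales generally differ.

```javascript
// Hypothetical helper: an axis-aligned ellipse centered on the centroid
// that contains all given [x, y] points (in pixel coordinates).
function boundingEllipse(points, padding) {
    var cx = 0, cy = 0;
    points.forEach(function (p) { cx += p[0]; cy += p[1]; });
    cx /= points.length;
    cy /= points.length;

    var maxDx = 0, maxDy = 0;
    points.forEach(function (p) {
        maxDx = Math.max(maxDx, Math.abs(p[0] - cx));
        maxDy = Math.max(maxDy, Math.abs(p[1] - cy));
    });

    // Scaling by sqrt(2) makes the ellipse enclose the whole bounding box
    // around the centroid, so no point can fall outside it.
    return {
        cx: cx,
        cy: cy,
        rx: Math.SQRT2 * maxDx + padding,
        ry: Math.SQRT2 * maxDy + padding
    };
}

// clusters: array of arrays of indices into pixelPoints,
// the shape produced by the clustering step.
function clusterEllipses(pixelPoints, clusters, padding) {
    return clusters.map(function (cluster) {
        var members = cluster.map(function (i) { return pixelPoints[i]; });
        return boundingEllipse(members, padding);
    });
}

// Possible use with D3 (v3 API, as in the question), assuming svg, x, y,
// data and clusters already exist:
//
// var pixelPoints = data.map(function (d) { return [x(d.xCoord), y(d.yCoord)]; });
// svg.selectAll("ellipse")
//     .data(clusterEllipses(pixelPoints, clusters, 8))
//   .enter().append("ellipse")
//     .attr("cx", function (e) { return e.cx; })
//     .attr("cy", function (e) { return e.cy; })
//     .attr("rx", function (e) { return e.rx; })
//     .attr("ry", function (e) { return e.ry; })
//     .style("fill", "none")
//     .style("stroke", "black");
```

The padding (8 pixels in the commented usage) keeps the outline from cutting through the circles drawn for the outermost points; it should be at least the circle radius.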



If you only want a visual representation, and you don't need to compute a location or center or anything, then the solution can be very simple. In addition to the existing circles that represent the data points, draw a large semi-transparent circle for each data point. Where these larger circles overlap, the overlap will be darker, and the more circles overlap, the darker it will be (assuming you keep the background white). You can choose the size of the circles, their color, and their degree of transparency.
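
A minimal sketch of this idea, written in plain JavaScript that builds an SVG string so it runs without D3. The function names, the option names and the sample values are all made up for illustration; with D3 the same effect would come from appending a second set of circles with a large "r" and a low "fill-opacity" before the data-point circles.

```javascript
// Hypothetical helper: linear map from the domain [d0, d1] to the range [r0, r1].
function linearScale(d0, d1, r0, r1) {
    return function (v) {
        return d0 === d1 ? (r0 + r1) / 2 : r0 + (v - d0) * (r1 - r0) / (d1 - d0);
    };
}

// Returns SVG markup with one large translucent "halo" and one small
// opaque dot per point. Overlapping halos add up to a darker shade.
function renderDensitySvg(points, options) {
    var xs = points.map(function (p) { return p[0]; });
    var ys = points.map(function (p) { return p[1]; });
    var x = linearScale(Math.min.apply(null, xs), Math.max.apply(null, xs), 0, options.width);
    // The SVG y axis points down, so the range is flipped.
    var y = linearScale(Math.min.apply(null, ys), Math.max.apply(null, ys), options.height, 0);

    function circle(p, r, opacity) {
        return '<circle cx="' + x(p[0]) + '" cy="' + y(p[1]) + '" r="' + r +
            '" fill="' + options.color + '" fill-opacity="' + opacity + '"/>';
    }

    // Halos go first so that the opaque data points are painted on top of them.
    var halos = points.map(function (p) {
        return circle(p, options.haloRadius, options.haloOpacity);
    });
    var dots = points.map(function (p) {
        return circle(p, options.pointRadius, 1);
    });

    return '<svg xmlns="http://www.w3.org/2000/svg" width="' + options.width +
        '" height="' + options.height + '">' + halos.concat(dots).join('') + '</svg>';
}

// Sample values only: three nearby points and one isolated point.
var svgMarkup = renderDensitySvg(
    [[8, 6], [8.1, 6.2], [8.4, 6.6], [0, 0]],
    { width: 300, height: 200, pointRadius: 5, haloRadius: 25,
      haloOpacity: 0.15, color: 'steelblue' }
);
```

The halo radius and opacity need tuning by eye: too large or too opaque and the whole plot turns into one blob, too small and the halos never overlap.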


