Algorithm for placing nodes on a circle, taking their pairwise distances into account

I have the following problem. I have brain regions and correlations between them, and I also know the distances between the brain regions. We expect the correlations to be negatively related to the distance between brain regions: as the distance increases, the correlation drops towards zero, roughly as 1/D^2.

I want to visualize my correlation matrix to check for anomalies. I already have some other visualizations, such as the Tyune correlation-matrix view and a simple 2D scatter plot with a 1/D^2 curve drawn as a blue line.

Next, I want to build something based on correlation circles.

I created a Node class for my brain regions, so each brain region is a node. I model the correlations as Edges. Each Edge has a sourceNode and a destinationNode, as well as a correlation and a distance, so I can couple it to the correct nodes. The distance and correlation are needed for table lookups (back-references via regionID, regionName, etc.).
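
In code, that looks roughly like this (a minimal sketch only; the field names follow the description above):

class Node:
    def __init__(self, regionID, regionName):
        self.regionID = regionID
        self.regionName = regionName

class Edge:
    def __init__(self, sourceNode, destinationNode, correlation, distance):
        self.sourceNode = sourceNode
        self.destinationNode = destinationNode
        self.correlation = correlation  # expected to fall off roughly as 1/D^2
        self.distance = distance        # physical distance between the two regions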

Now I want to place all the nodes on a circle so that nodes that are a short distance apart in the brain end up close together on the circle, and nodes that are far apart end up far apart. That way the strong (thick) edges stay short, and a very strong edge cutting right across the circle immediately stands out and catches the eye. Of course I am only looking for a near-optimal arrangement, since a single exact solution most likely doesn't exist.

I have searched Google, but since I have no idea what to search for, I haven't found anything useful. I suspect there is a standard algorithm for this, but I don't know its name. A link to such an algorithm would be fine too.

What I have come up with so far is to arrange the nodes on the circle so that the sum of all distance mismatches is smallest. For that I need some kind of scoring system: regions that are close in the brain and end up close on the circle earn, say, many points, while regions that are close in the brain but end up far apart on the circle earn few points. Then optimize the arrangement to maximize the total score (a brute-force sketch of this idea follows below).
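
A minimal brute-force sketch of that idea: place the N nodes at equally spaced angles, try every circular ordering, and keep the one with the smallest total mismatch between circle distance and brain distance. Here dm is assumed to be the NxN brain-distance matrix, pre-scaled so its largest entry is about the circle's diameter; this is only feasible for small N, since the number of orderings grows as (N-1)!:

from itertools import permutations
import numpy as np

def best_circular_order(dm):
    N = len(dm)
    angles = 2*np.pi*np.arange(N)/N          # equally spaced positions on the circle
    best_order, best_cost = None, None
    for perm in permutations(range(1, N)):   # fix node 0 to factor out rotations
        order = (0,) + perm
        cost = 0.0
        for a in range(N):
            for b in range(a+1, N):
                chord = np.sqrt(2 - 2*np.cos(angles[a] - angles[b]))  # unit circle
                cost += abs(chord - dm[order[a], order[b]])
        if best_cost is None or cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost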

Any advice on this matter? My math is not that great ;). I am currently googling for circles, nodes, and weights.

Note

If you have any other bright ideas for matrix visualization, be sure to email me or comment here :).


3 answers


The general problem you describe has no exact solution, because you are trying to map a 2D surface onto a 1D curve while preserving all pairwise distances, which is impossible. If there is one specific region you want to compare against all the others, you can place the others around the circle so that their circle distance matches their distance to that region (but then the distances between those other regions will be distorted).

But you can certainly do better than a random arrangement at approximating the distances. Here is an approach: the first step is to generate a number of random arrangements and pick the best one. The next improvement is to optimize each of those arrangements against some cost function, moving the regions in small steps until they settle into a local minimum, and then pick the best of those local minima. The results are shown in the plots below, and the Python code follows.

(Figure: the four panels produced by the code below: initial 2D positions, cost evolution of the best run, initial vs. final costs, and the final positions on the circle.)

import pylab as px
import numpy as nx
import numpy.random as rand
rand.seed(1)
rt2 = nx.sqrt(2)

N = 10 # number of brain regions

# make random brain region locations inside a disk of radius s
regions = []
s = 2.
while len(regions)<N:
    p = 2*s*rand.rand(2)-s
    if nx.sqrt(nx.dot(p,p))<s:  # rejection-sample points inside the disk
        regions.append(p)
regions = nx.array(regions)

#px.figure()
px.subplot(2,2,1)
for i in range(len(regions)):
    px.text(regions[i,0], regions[i,1], str(i), fontsize=15)  # label each region with its index
px.xlim(-1.1*s, 1.1*s)
px.ylim(-1.1*s, 1.1*s)
px.title("inital positions")

# precalc distance matrix for future comparisons
dm = nx.zeros((N,N), dtype=float)
for i in range(N):
    for j in range(N):
        dm[i,j] = nx.sqrt(nx.sum((regions[i,:]-regions[j,:])**2))

def randomize_on_circle(n):
    """return array of n random angles"""
    return 2*nx.pi*rand.rand(n)

def cost_fcn(d_target, d_actual): # cost for distances not matching
    return abs(d_target-d_actual)

def calc_cost(angles):
    """calc cost for the given arrangement    """
    c = 0.
    for i in range(N-1):
        for j in range(i+1, N):
            # chord length between two pts on the unit circle separated by angle da:
            # sqrt(2 - 2*cos(da)) = rt2*sqrt(1 - cos(da))
            c += cost_fcn(dm[j, i], rt2*nx.sqrt(1-nx.cos(angles[i]-angles[j])))
    return c

def optimize_step(a, shift=2*nx.pi/360):
    """try shifting all points a bit cw and ccw, and return the most beneficial"""
    max_benefit, ref_cost = None, None
    best_i, best_shift = None, None
    cost0 = calc_cost(a)  # cost of the current arrangement (same for every trial move)
    for imove in range(N): # loop through the regions and try moving each one
        for da in (shift, -shift):
            a_temp = nx.array(a)
            a_temp[imove] += da
            cost = calc_cost(a_temp)
            benefit = cost0 - cost  # benefit if moving lowers the cost
            if max_benefit is None or benefit > max_benefit:
                max_benefit, best_i, best_shift, ref_cost = benefit, imove, da, cost
    return max_benefit, best_i, best_shift, ref_cost       

lowest_cost, best_angles = None, None
cost_initials, cost_plateaus = [], []
for itrial in range(30):  # loop through 30 randomized placements on the circle
    angles = randomize_on_circle(N)
    costs = []
    benefits = []
    # optimize each original arrangement by shifting placements one-by-one in small steps
    count_benefits_neg = 0
    count_total, max_total = 0, 2000
    while count_benefits_neg < 10: # better to do a variable step size
        b, ibest, da, c = optimize_step(angles)
        angles[ibest] += da
        costs.append(c)
        benefits.append(b)
        if b < 0:
            count_benefits_neg += 1
        count_total += 1
        if count_total > max_total:
            print(count_total, b, costs[-20:], benefits[-20:])
            raise RuntimeError("not finding an equilibrium")
    if lowest_cost is None or c < lowest_cost:
        lowest_cost = c
        best_angles = nx.array(angles)
        cost_graph = costs[:]
        benefit_graph = nx.array(benefits)
    cost_plateaus.append(c)
    cost_initials.append(costs[0])

px.subplot(2, 2, 2)
px.plot(cost_graph, 'o') # make sure the cost is leveling off
px.title("cost evoloution of best")
px.subplot(2, 2, 3)
px.plot(cost_initials, 'o')
px.plot(cost_plateaus, 'd')
px.title("initial and final costs")

px.subplot(2, 2, 4)
for i in range(len(best_angles)):
    px.text(nx.cos(best_angles[i]), nx.sin(best_angles[i]), str(i), fontsize=15)
px.xlim(-1.2, 1.2)
px.ylim(-1.2, 1.2)
px.title("positioned on circle")

px.show()

      



Interestingly, this seems to keep distant things distant and close things close, though with the ordering of the middle distances scrambled, so maybe it will do what you want. (It also illustrates the fundamental problem of going from 2D to 1D: on the circle, 4 wants to be further from 9, but it can't move away without getting closer to other numbers, whereas in 2D it could simply move out of the way.)

You might want to change cost_fcn, which defines the penalty for the mismatch between the distance of two points along the circle and their distance in the 2D layout. Changing it may help, for example by penalizing large errors more heavily (say, a quadratic error), or by emphasizing that the large distances come out right, say d_target*abs(d_actual-d_target), etc.
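
For example, the two variants mentioned above could look like this (hypothetical drop-in replacements for cost_fcn):

def cost_fcn_quadratic(d_target, d_actual):
    # quadratic error: large mismatches are penalized much more heavily
    return (d_target - d_actual)**2

def cost_fcn_long_range(d_target, d_actual):
    # weight the error by the target distance, emphasizing the long distances
    return d_target * abs(d_actual - d_target)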

Also, the size of the circle relative to the size of the 2D data greatly changes the appearance, and you probably want a circle somewhat smaller than the data, as I used here, since that spreads the points out around the circle more. (The circle here has R = 1, so just scale the data accordingly.) Also note that the value of the cost won't be very meaningful by itself: even the best arrangements never reach a very low cost, since some regions can never be as far apart from each other on the circle as they are in the 2D data.

The point of doing multiple random starts is that any evolving arrangement can get stuck in a local minimum. This approach seems useful: the optimization clearly improves the distance matching and lowers the cost (plot #3: blue dots = initial random arrangements, diamonds = the resulting local minima), and it helps some initial arrangements much more than others, so it's worth trying several. Also, since several of the runs plateau near the same cost, around 15, that gives some confidence that the resulting arrangement is representative.



I suggest you use an energy-minimization algorithm to place your nodes on the circle so as to minimize something like the sum of (distance on circle - distance in brain) squared. Then draw the edges as you describe and the rendering is done.
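
A minimal sketch of what that might look like with SciPy (assuming dm is the NxN brain-distance matrix, scaled so its largest entry is about 2, the diameter of the unit circle):

import numpy as np
from scipy.optimize import minimize

def energy(angles, dm):
    # sum over all pairs of squared (chord length on unit circle - brain distance)
    i, j = np.triu_indices(len(angles), k=1)
    chord = np.sqrt(2 - 2*np.cos(angles[i] - angles[j]))
    return np.sum((chord - dm[i, j])**2)

# usage:
# angles0 = 2*np.pi*np.random.rand(N)           # random starting angles
# result = minimize(energy, angles0, args=(dm,))
# best_angles = result.x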





GraphViz may have references to relevant algorithms. Alternatively, you could convert your data into a format GraphViz accepts and run it through GraphViz.
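
GraphViz's circo engine, for instance, does circular layouts. A minimal sketch (assuming the Edge objects from the question, with a regionName field on each node; penwidth maps correlation strength to edge thickness):

def write_dot(edges, path="brain.dot"):
    # write the graph in DOT format so GraphViz can lay it out
    with open(path, "w") as f:
        f.write("graph brain {\n")
        for e in edges:
            f.write('  "%s" -- "%s" [penwidth=%.2f];\n'
                    % (e.sourceNode.regionName, e.destinationNode.regionName,
                       5*abs(e.correlation)))
        f.write("}\n")

# then render with, e.g.:  circo -Tpng brain.dot -o brain.png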
