How to make columns of numpy array sum to 1

I am working on building a transition matrix to implement the PageRank algorithm. How can I use numpy to make sure each column sums to one?

For example:

1 1 1   
1 1 1  
1 1 1

      

should be normalized as

.33 .33 .33  
.33 .33 .33  
.33 .33 .33

      



2 answers


Divide the elements of each column by their column sums -

a/a.sum(axis=0,keepdims=1) # or simply : a/a.sum(0)

      

To make the rows sum to one instead, change the axis input -

a/a.sum(axis=1,keepdims=1)
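
A minimal sketch of why keepdims matters more in the row case than in the column case. The array values and the names cols, rows and broadcast_failed are made up for illustration; only the broadcasting behaviour is the point.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Column normalization: a.sum(axis=0) has shape (3,), which lines up with
# the last axis of a, so keepdims is optional here.
cols = a / a.sum(axis=0)

# Row normalization: keepdims=True keeps the sums as shape (2, 1), so each
# row is divided by its own sum.
rows = a / a.sum(axis=1, keepdims=True)

# Without keepdims the row sums have shape (2,), which cannot broadcast
# against shape (2, 3).
try:
    a / a.sum(axis=1)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
```

For a square array the version without keepdims would not raise; it would silently divide by the wrong sums, so passing keepdims for the row case is the safer habit.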

      

Example run -

In [78]: a = np.random.rand(4,5)

In [79]: a
Out[79]: 
array([[ 0.37,  0.74,  0.36,  0.41,  0.44],
       [ 0.51,  0.86,  0.91,  0.03,  0.76],
       [ 0.56,  0.46,  0.01,  0.86,  0.38],
       [ 0.72,  0.66,  0.56,  0.84,  0.69]])

In [80]: b = a/a.sum(axis=0,keepdims=1)

In [81]: b.sum(0) # Verify
Out[81]: array([ 1.,  1.,  1.,  1.,  1.])

      



To make sure it works with int arrays on Python 2.x as well, use from __future__ import division or use np.true_divide.
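
As a rough illustration of the np.true_divide route on an integer array (the sample values are made up), something like this should give float results regardless of the Python version:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 6]])  # integer dtype

# np.true_divide performs floating-point division even when both
# operands are integer arrays.
b = np.true_divide(a, a.sum(axis=0, keepdims=True))
```

Converting the input up front with a.astype(float) would be another option.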


For columns adding up to 0, assuming that we are fine with keeping them as they are, we can set the sums to 1 rather than divide by 0, like so -

sums = a.sum(axis=0,keepdims=1)
sums[sums==0] = 1
out = a/sums
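
The same idea can be wrapped in a small helper. This is only a sketch: the function name normalize_columns and the sample matrix are hypothetical, and it assumes non-negative entries, so a column sum of 0 means the whole column is zero.

```python
import numpy as np

def normalize_columns(a):
    """Scale each column to sum to 1; all-zero columns stay all zero."""
    a = np.asarray(a, dtype=float)
    sums = a.sum(axis=0, keepdims=True)
    # Dividing by 1 leaves a zero column unchanged and avoids 0/0 = nan.
    safe = np.where(sums == 0, 1.0, sums)
    return a / safe

m = np.array([[1, 0, 2],
              [3, 0, 2]])
out = normalize_columns(m)
```

Using np.where builds a separate divisor array rather than modifying the sums in place, which keeps the helper free of side effects.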

      



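# A should be a float array; with an int array the assignment below truncates
for i in range(len(A[0])):  # loop over the columns
    col_sum = A[:, i].sum()
    if col_sum != 0:
        A[:, i] = A[:, i]/col_sum  # scale the column in place so it sums to 1
    else:
        pass  # leave all-zero columns unchanged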

      



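The for loop is a bit sloppy and I'm sure there is a much more elegant way out there, but it works. Replace pass with A[:, i] = 1.0/len(A[0]) to eliminate dangling nodes and make the matrix column stochastic (a PageRank transition matrix is square, so this is one over the number of entries in the column).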
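
A vectorized take on the same dangling-node fix might look like the following. It is a sketch under assumptions: the function name column_stochastic and the links matrix are hypothetical, and entries are assumed non-negative, so a zero column sum identifies a dangling node.

```python
import numpy as np

def column_stochastic(A):
    """Return a copy of A whose columns each sum to 1.

    All-zero columns (dangling nodes) are replaced by a uniform column.
    """
    A = np.array(A, dtype=float)   # copy, so the caller's array is untouched
    n_rows = A.shape[0]
    sums = A.sum(axis=0)
    dangling = sums == 0           # boolean mask over the columns
    A[:, dangling] = 1.0 / n_rows  # uniform probability for dangling nodes
    sums[dangling] = 1.0           # those columns already sum to 1
    return A / sums

links = np.array([[0, 1, 0],
                  [1, 0, 0],
                  [1, 1, 0]])
M = column_stochastic(links)
```

Here the third column of links has no entries, so it becomes a uniform column of thirds, while the other two columns are simply rescaled.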
