Creating a matrix of arbitrary size where rows are summed to 1?
My task is to create a program that simulates a discrete-time Markov chain for an arbitrary number of events. However, for now, the part I'm struggling with is creating the correct stochastic matrix that will represent the probabilities. A correct stochastic matrix is ââa matrix that has row entries that add up to 1. And for a given size, I kind of can write a matrix that does this, however the problem is that I don't know how to do this for an arbitrary size ...
For example: here is my code for a 3x3 matrix as well as an example of the output I was getting.
However, my code doesn't work every time - there are certain times when the third entry in the line is negative because the first two are too big. And I don't know how to get around this, as far as I know, there is no function in Python that makes it so that you can generate random numbers that add up to something in particular.
Any help is appreciated.
(Note that this is not a homework problem, this is only for extra credit in my Math class, and the professor doesn't mind using outside sources.)
source to share
Generate an NxN matrix with random values.
For each row:
Find the sum of row S
S[j] = Sum(0..N-1){A[j, i]}
Then subtract (S-1) / N from each value in that line
A[j, i] = A[j, i] - (S[j] - 1) / N
If you only want non-negative values, create non-negative randoms and split each value in a line by the sum of that line
A[j, i] = A[j, i] / S[j]
source to share
Here's some code:
import random
precision = 1000000
def f(n) :
matrix = []
for l in range(n) :
lineLst = []
sum = 0
crtPrec = precision
for i in range(n-1) :
val = random.randrange(crtPrec)
sum += val
lineLst.append(float(val)/precision)
crtPrec -= val
lineLst.append(float(precision - sum)/precision)
matrix.append(lineLst)
return matrix
matrix = f(5)
print matrix
I assumed that the random numbers should be positive, the sum of the numbers in raw should be 1. I used the precision in the variable "precision", if it is 1000, this means that the random numbers will have 3 digits after the decimal point. Example y uses 6 digits, you can use more.
Output:
[[0.086015, 0.596464, 0.161664, 0.03386, 0.121997],
[0.540478, 0.040961, 0.374275, 0.003793, 0.040493],
[0.046263, 0.249761, 0.460089, 0.006739, 0.237148],
[0.594743, 0.125554, 0.142809, 0.056124, 0.08077],
[0.746161, 0.151382, 0.068062, 0.005772, 0.028623]]
source to share
The right stochastic matrix is ââa real square matrix, with each row being added by 1.
Here's a sample from which you can create a function, I leave that as homework
In [26]: import numpy as np
In [27]: N, M = 5, 5
In [28]: matrix = np.random.rand(N, M)
In [29]: matrix
Out[29]:
array([[ 0.27926909, 0.37026136, 0.35978443, 0.75216853, 0.53517512],
[ 0.93285517, 0.54825643, 0.43948394, 0.15134782, 0.31310007],
[ 0.91934362, 0.51707873, 0.3604323 , 0.78487053, 0.85757986],
[ 0.53595238, 0.80467646, 0.88001499, 0.4668259 , 0.63567632],
[ 0.83359167, 0.41603073, 0.21192656, 0.22650423, 0.95721952]])
In [30]: matrix = np.apply_along_axis(lambda x: x - (np.sum(x) - 1)/len(x), 1, matrix)
In [31]: matrix
Out[31]:
array([[ 0.01993739, 0.11092965, 0.10045272, 0.49283682, 0.27584341],
[ 0.65584649, 0.27124774, 0.16247526, -0.12566087, 0.03609139],
[ 0.43148261, 0.02921772, -0.12742871, 0.29700952, 0.36971886],
[ 0.07132317, 0.34004725, 0.41538578, 0.00219669, 0.17104711],
[ 0.50453713, 0.08697618, -0.11712798, -0.10255031, 0.62816498]])
Description
Create an N x M matrix
Then we calculate (sum - 1) / N
which is subtracted from each element line by line
We then apply it to each row of the matrix using np.apply_along_axis()
c axis=1
to apply it to each row
Check the result
Each line must sum to 1
In [37]: matrix.sum(axis=1)
Out[37]: array([ 1., 1., 1., 1., 1.])
but how do you subtract that value from each entry in the line?
In my example I used lambda
which is equivalent to this function
def subtract_value(x):
return x - (np.sum(x) - 1)/len(x)
You can pass a function apply_along_axis()
for each element on the axis, in our case these are lines
There are other ways too : numpy.vectorize () and numpy.frompyfunc
Creating a function and applying it, like any method from the above, is better than iterating over each element on each line, faster and less code, easier to read / understand the intent
source to share