Avoiding determinism with Markov chains

I just started reading about Markov chain generators today and am really intrigued by the whole process of building one. As I understand it, the future state depends statistically on the past states up to the present.

Example:

Hello World. Hello Dolly. Hello World.

"World" follows "Hello" ~ 66% of the time in this source.

If this is always the case, how do you avoid getting the same results every time? The statistics won't change for a static string, so am I right to assume that no variation will be generated unless the original data is changed in some way?

How can I get variation from a static source, given the statistical values, while still allowing some flexibility? Using my example above, how do I allow the generator to follow "Hello" with "Dolly" when "Dolly" only follows "Hello" 33% of the time?

I guess what I'm asking is: how do I base the probability of my next choice on the statistical presence of the words that follow my current choice? That way "Dolly" appears 33% of the time and "World" appears 66% of the time, or am I completely lost here?


2 answers


You use a random number generator to choose which path to take. You have to store each state (a state is really a history of the N previous items) and the probabilities of the transitions out of that state. Then you pick a random number and, based on it, decide which state to transition to next.

In your example, with an N = 1 Markov chain, you will have a chain structure that looks something like this:

<start> -> Hello : 1.0

Hello -> World. : 0.66666
Hello -> Dolly. : 0.33333

Dolly. -> Hello : 1.0

World. -> <end> : 0.5
World. -> Hello : 0.5

If your current state is Hello, then your next possible states are World. and Dolly. Generate a random number between 0 and 1: select World. if it is less than 0.66666, otherwise select Dolly.
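
Here is a minimal Python sketch of that idea, built from your example corpus (the names build_chain and generate are just illustrative):

    import random
    from collections import defaultdict

    def build_chain(words):
        # Map each word to the list of words observed to follow it.
        # Duplicates in the list act as weights, so picking uniformly
        # from it reproduces the observed transition frequencies.
        chain = defaultdict(list)
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
        return chain

    def generate(chain, start, max_words=10):
        word, output = start, [start]
        for _ in range(max_words):
            followers = chain.get(word)
            if not followers:      # no observed successor: stop
                break
            word = random.choice(followers)
            output.append(word)
        return " ".join(output)

    corpus = "Hello World. Hello Dolly. Hello World.".split()
    print(generate(build_chain(corpus), "Hello"))

Storing duplicate followers and picking uniformly from the list is equivalent to the random-number-against-probabilities scheme above: World. occupies two of the three slots for Hello, so it is chosen about 66% of the time.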

With an N = 2 Markov chain, you get almost deterministic behavior on this input:

<start> <start> -> <start> Hello : 1.0

<start> Hello -> Hello World. : 1.0

Hello World. -> World. Hello : 0.5
Hello World. -> World. <end> : 0.5

World. Hello -> Hello Dolly. : 1.0

Hello Dolly. -> Dolly. Hello : 1.0

Dolly. Hello -> Hello World. : 1.0
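
The same sketch extends to N = 2 by keying the table on the last two words instead of one (again, the naming is just illustrative):

    from collections import defaultdict

    def build_chain2(words):
        # Key on pairs of consecutive words; the value lists the words
        # observed to follow each pair.
        chain = defaultdict(list)
        for a, b, c in zip(words, words[1:], words[2:]):
            chain[(a, b)].append(c)
        return chain

    corpus = "Hello World. Hello Dolly. Hello World.".split()
    chain2 = build_chain2(corpus)
    print(chain2[("Hello", "World.")])   # ['Hello'] -- the end-of-corpus
                                         # transition simply has no entry

Every pair in this tiny corpus has exactly one observed successor, which is the near-determinism shown in the table above; only the end-of-corpus branch offers an alternative.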



Two comments:

1) To draw samples from a random process, regardless of whether one choice is very likely (> 50%) and others less so, all you need is a weighted "coin flip": generate a random real number uniformly in [0, 1), then consider the possibilities in a fixed order while keeping a running sum of their probabilities. As soon as that sum exceeds your randomly drawn number, select that choice. If your probabilities are unnormalized (do not sum to 1), first compute their sum s and either divide each probability by s, or draw your random number from [0, s).
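
A small Python sketch of that weighted coin flip (weighted_choice is an illustrative name, not a library function):

    import random

    def weighted_choice(choices):
        # choices: list of (item, weight) pairs; weights need not sum to 1.
        total = sum(weight for _, weight in choices)
        r = random.uniform(0, total)     # uniform draw from [0, total)
        running = 0.0
        for item, weight in choices:     # fixed order, running sum
            running += weight
            if r < running:
                return item
        return choices[-1][0]            # guard against float rounding

    # "World." follows "Hello" twice, "Dolly." once:
    print(weighted_choice([("World.", 2), ("Dolly.", 1)]))

Drawing from [0, total) instead of [0, 1) is what lets you skip the normalization step entirely.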



2) To prevent overfitting when estimating your model from a small amount of training data (relative to the number of parameters), use Bayesian priors on the model parameters. For a really cool example of this, where the number of model parameters (the history length) is not fixed to any finite number in advance, see the Infinite HMM. If you are not using Bayesian methods, you will want to choose a history length that matches the amount of training data you have, and/or implement some ad-hoc smoothing (such as linear interpolation between an order-2 and an order-1 model).
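
As a sketch of what that interpolation looks like, assuming p2 and p1 are the conditional probabilities from the order-2 and order-1 models and lam is a mixing weight you would tune on held-out data (all names here are hypothetical):

    def interpolated_prob(p2, p1, lam=0.7):
        # P(word | history) = lam * P_order2(word | history)
        #                   + (1 - lam) * P_order1(word | last word)
        return lam * p2 + (1 - lam) * p1

    # An order-2 context never seen in training no longer forces
    # the probability to zero:
    print(interpolated_prob(0.0, 0.25))   # 0.075

The point is that an unseen longer history falls back gracefully onto the shorter one instead of ruling the word out entirely.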
