Word decomposition algorithm
Let A be a finite alphabet and A * the set of all finite strixes that you can form from that alphabet. Then you are assigned a set F of some strings from A * (typically much less than A *), which contains the alphabet (A) itself and the string S carrying A *.
Is there a polynomial time algorithm to find the minimum decomposition of S in terms of the strings belonging to F?
As a result of the decomposition, I mean the notation S = F 1... F k concatenated strings F jthat belong to F and you are trying to minimize the number k.
source to share
We can solve this in polynomial time using dynamic programming as follows.
Let D [i] be the minimal decomposition of the first letters i of your string S.
If n is the length of S and f is the total length of all strings in F, we can compute D in O (nf) time by iterating over i.
For each value of i, we try to use all the options in F for the end of the line and choose the one that results in the smallest expansion.
Once we do that, the answer to the original question is in D [n].
Python code
F=['abcd','de','a','b','c','d']
S='abcde'
D = [ [] ]
for i in range(1,len(S)+1):
s = S[:i]
best = None
for f in F:
if s.endswith(f):
a = D[i-len(f)]
if a is None:
continue
if best is None or len(a) < len(best):
best = a + [f]
D.append(best)
print D[len(S)]
prints:
['a', 'b', 'c', 'de']
source to share