Adding values ββin various combinations
Not sure how best to explain this other than using an example ...
Imagine that a customer has 10 outstanding invoices and one day they give you a check, but they don't tell you which invoices are for them.
What would be the best way to return all possible combination of values ββthat can provide the required total?
My current thinking is a kind of brute force method that involves using a self-triggering feature that works despite all possibilities ( see current version ).
For example, with 3 numbers, there are 15 ways to add them:
- AND
- A + B
- A + B + C
- A + C
- A + C + B
- IN
- B + A
- B + A + C
- B + C
- B + C + A
- FROM
- C + A
- C + A + B
- C + B
- C + B + A
What if you remove duplicates, give you 7 unique ways to add them:
- AND
- A + B
- A + B + C
- A + C
- IN
- B + C
- FROM
However, this look falls apart after you:
- 15 numbers (32,767 possibilities / ~ 2 seconds for calculation)
- 16 numbers (65,535 possibilities / ~ 6 seconds for calculation)
- 17 numbers (131 071 possibility / ~ 9 seconds for calculation)
- 18 numbers (262 144 possibilities / ~ 20 seconds for calculation)
Where would I like this function to handle at least 100 numbers.
So, any ideas on how to improve it? (in any language)
source to share
This is a fairly common change to the subset sum problem , and it is really quite difficult. The section on Pseudo-Polynomial Dynamic Programming Software Solution on the page is related to what you're after.
source to share
This is strictly for the number of possibilities and does not consider overlap. I'm not sure what you want.
Consider states in which there can be one value at a time - it can be included or excluded. These are two different states, so the number of different states for all n elements will be 2 ^ n. However, there is one condition that is not required; this is the state when none of the numbers are included.
Thus, for any n numbers, the number of combinations is 2 ^ n-1.
def setNumbers(n): return 2**n-1
print(setNumbers(15))
These results are very closely related to combinations and permutations.
Instead, I think you can tell if a set of values ββhas given any combination of them the sum to the value of k. For this Bill the Lizard pointed you in the right direction.
Following this, and bearing in mind that I have not read the entire Wikipedia article, I suggest this algorithm in Python:
def combs(arr):
r = set()
for i in range(len(arr)):
v = arr[i]
new = set()
new.add(v)
for a in r: new.add(a+v)
r |= new
return r
def subsetSum(arr, val):
middle = len(arr)//2
seta = combs(arr[:middle])
setb = combs(arr[middle:])
for a in seta:
if (val-a) in setb:
return True
return False
print(subsetSum([2, 3, 5, 8, 9], 8))
Basically, the algorithm works like this:
- Splitting a list into 2 lists by about half the length. [O (n)]
- Finds a collection of subsets of sums. [O (2 n / 2 n)]
- Loops through the first set of values ββup to 2 floor (n / 2) -1 see if the other value in the second set is k. [O (2 n / 2 n)]
So, I think it generally runs in O (2 n / 2 n) - still pretty slow, but much better.
Sounds like a bin packaging problem . They are NP-complete, i.e. It is almost impossible to find the perfect solution for large sets of tasks. But you can very closely use a heuristic that is probably applicable to your problem, even if it is not a cart packing problem.
source to share
This is an option for a similar problem.
But you can solve this by creating a counter with n bits. Where n is the number of numbers. Then you count from 000 to 111 (n 1) and for each number a 1 is equivalent to the available number:
001 = A
010 = B
011 = A+B
100 = C
101 = A+C
110 = B+C
111 = A+B+C
(But that was not a question, well, I leave it as a target).
source to share
This is not a problem with the packaging of the basket. It is what combination of values ββcould give a different meaning.
It looks more like a change problem that has a bunch of articles detailing how to fix it. Google pointed me out here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.57.3243
source to share
I don't know how often this will work in practice, as there are many exceptions to this simplified case, but here's a thought:
In an ideal world, invoices would be paid up to a certain point. People would pay A or + B or + B + C, but not A + C - if they received invoice C, they already received invoice B. In an ideal world, the problem is not finding the combination, but to find a point along the line.
Instead of roughly billing each invoice totals combination, you can iterate over the remaining invoices in the order of the date issue and simply add each invoice amount to the total you are comparing to the target figure.
Back in the real world, this is a trivially quick check you can do before starting the heavy number crunching or chasing them. Any hits he receives are a bonus :)
source to share
Here is an optimized object-oriented version of the exact integer solution to the Subset Sums problem (Horowitz, Sahni 1974). On my laptop (which is nothing special) this vb.net class solves 1900 subsets sums up a second (for 20 items):
Option Explicit On
Public Class SubsetSum
'Class to solve exact integer Subset Sum problems'
''
' 06-sep-09 RBarryYoung Created.'
Dim Power2() As Integer = {1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32764}
Public ForceMatch As Boolean
Public watch As New Stopwatch
Public w0 As Integer, w1 As Integer, w1a As Integer, w2 As Integer, w3 As Integer, w4 As Integer
Public Function SolveMany(ByVal ItemCount As Integer, ByVal Range As Integer, ByVal Iterations As Integer) As Integer
' Solve many subset sum problems in sequence.'
''
' 06-sep-09 RBarryYoung Created.'
Dim TotalFound As Integer
Dim Items() As Integer
ReDim Items(ItemCount - 1)
'First create our list of selectable items:'
Randomize()
For item As Integer = 0 To Items.GetUpperBound(0)
Items(item) = Rnd() * Range
Next
For iteration As Integer = 1 To Iterations
Dim TargetSum As Integer
If ForceMatch Then
'Use a random value but make sure that it can be matched:'
' First, make a random bitmask to use:'
Dim bits As Integer = Rnd() * (2 ^ (Items.GetUpperBound(0) + 1) - 1)
' Now enumerate the bits and match them to the Items:'
Dim sum As Integer = 0
For b As Integer = 0 To Items.GetUpperBound(0)
'build the sum from the corresponding items:'
If b < 16 Then
If Power2(b) = (bits And Power2(b)) Then
sum = sum + Items(b)
End If
Else
If Power2(b - 15) * Power2(15) = (bits And (Power2(b - 15) * Power2(15))) Then
sum = sum + Items(b)
End If
End If
Next
TargetSum = sum
Else
'Use a completely random Target Sum (low chance of matching): (Range / 2^ItemCount)'
TargetSum = ((Rnd() * Range / 4) + Range * (3.0 / 8.0)) * ItemCount
End If
'Now see if there is a match'
If SolveOne(TargetSum, ItemCount, Range, Items) Then TotalFound += 1
Next
Return TotalFound
End Function
Public Function SolveOne(ByVal TargetSum As Integer, ByVal ItemCount As Integer _
, ByVal Range As Integer, ByRef Items() As Integer) As Boolean
' Solve a single Subset Sum problem: determine if the TargetSum can be made from'
'the integer items.'
'first split the items into two half-lists: [O(n)]'
Dim H1() As Integer, H2() As Integer
Dim hu1 As Integer, hu2 As Integer
If ItemCount Mod 2 = 0 Then
'even is easy:'
hu1 = (ItemCount / 2) - 1 : hu2 = (ItemCount / 2) - 1
ReDim H1((ItemCount / 2) - 1), H2((ItemCount / 2) - 1)
Else
'odd is a little harder, give the first half the extra item:'
hu1 = ((ItemCount + 1) / 2) - 1 : hu2 = ((ItemCount - 1) / 2) - 1
ReDim H1(hu1), H2(hu2)
End If
For i As Integer = 0 To ItemCount - 1 Step 2
H1(i / 2) = Items(i)
'make sure that H2 doesnt run over on the last item of an odd-numbered list:'
If (i + 1) <= ItemCount - 1 Then
H2(i / 2) = Items(i + 1)
End If
Next
'Now generate all of the sums for each half-list: [O( 2^(n/2) * n )] **(this is the slowest step)'
Dim S1() As Integer, S2() As Integer
Dim sum1 As Integer, sum2 As Integer
Dim su1 As Integer = 2 ^ (hu1 + 1) - 1, su2 As Integer = 2 ^ (hu2 + 1) - 1
ReDim S1(su1), S2(su2)
For i As Integer = 0 To su1
' Use the binary bitmask of our enumerator(i) to select items to use in our candidate sums:'
sum1 = 0 : sum2 = 0
For b As Integer = 0 To hu1
If 0 < (i And Power2(b)) Then
sum1 += H1(b)
If i <= su2 Then sum2 += H2(b)
End If
Next
S1(i) = sum1
If i <= su2 Then S2(i) = sum2
Next
'Sort both lists: [O( 2^(n/2) * n )] **(this is the 2nd slowest step)'
Array.Sort(S1)
Array.Sort(S2)
' Start the first half-sums from lowest to highest,'
'and the second half sums from highest to lowest.'
Dim i1 As Integer = 0, i2 As Integer = su2
' Now do a merge-match on the lists (but reversing S2) and looking to '
'match their sum to the target sum: [O( 2^(n/2) )]'
Dim sum As Integer
Do While i1 <= su1 And i2 >= 0
sum = S1(i1) + S2(i2)
If sum < TargetSum Then
'if the Sum is too low, then we need to increase the ascending side (S1):'
i1 += 1
ElseIf sum > TargetSum Then
'if the Sum is too high, then we need to decrease the descending side (S2):'
i2 -= 1
Else
'Sums match:'
Return True
End If
Loop
'if we got here, then there are no matches to the TargetSum'
Return False
End Function
End Class
Here is the form code to go along with it:
Public Class frmSubsetSum
Dim ssm As New SubsetSum
Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
Dim Total As Integer
Dim datStart As Date, datEnd As Date
Dim Iterations As Integer, Range As Integer, NumberCount As Integer
Iterations = CInt(txtIterations.Text)
Range = CInt(txtRange.Text)
NumberCount = CInt(txtNumberCount.Text)
ssm.ForceMatch = chkForceMatch.Checked
datStart = Now
Total = ssm.SolveMany(NumberCount, Range, Iterations)
datEnd = Now()
lblStart.Text = datStart.TimeOfDay.ToString
lblEnd.Text = datEnd.TimeOfDay.ToString
lblRate.Text = Format(Iterations / (datEnd - datStart).TotalMilliseconds * 1000, "####0.0")
ListBox1.Items.Insert(0, "Found " & Total.ToString & " Matches out of " & Iterations.ToString & " tries.")
ListBox1.Items.Insert(1, "Tics 0:" & ssm.w0 _
& " 1:" & Format(ssm.w1 - ssm.w0, "###,###,##0") _
& " 1a:" & Format(ssm.w1a - ssm.w1, "###,###,##0") _
& " 2:" & Format(ssm.w2 - ssm.w1a, "###,###,##0") _
& " 3:" & Format(ssm.w3 - ssm.w2, "###,###,##0") _
& " 4:" & Format(ssm.w4 - ssm.w3, "###,###,##0") _
& ", tics/sec = " & Stopwatch.Frequency)
End Sub
End Class
Let me know if you have any questions.
source to share
For the record, here is some pretty simple Java code that uses recursion to solve this problem. It's optimized for simplicity, not performance, although with 100 elements it looks pretty fast. With 1000 elements, it takes significantly longer, so if you are processing large amounts of data, you might be better off using a more complex algorithm.
public static List<Double> getMatchingAmounts(Double goal, List<Double> amounts) {
List<Double> remaining = new ArrayList<Double>(amounts);
for (final Double amount : amounts) {
if (amount > goal) {
continue;
} else if (amount.equals(goal)) {
return new ArrayList<Double>(){{ add(amount); }};
}
remaining.remove(amount);
List<Double> matchingAmounts = getMatchingAmounts(goal - amount, remaining);
if (matchingAmounts != null) {
matchingAmounts.add(amount);
return matchingAmounts;
}
}
return null;
}
source to share