Shape matching algorithm (one to many)
This is my first post on stackoverflow.
I need to report an algorithm for a financial application. Suppose we have 2 lists of such numbers (yes, these are bank transactions):
List 1 | List 2
-------------------------------------
1000 | 200
7000 | 300
3000 | 10000
16000 | 500
| 16000
| 4100
I need to match the numbers to each other, given some conditions:
-
Matches can be individual, one-to-many, or even many-to-many. So here are two 16000 matches (one to one), 1000 from list 1 correspond to 200 + 300 + 500 from list 2 (one to three), 10000 from list 2 correspond to 7000 + 3000 from list 1 (one to two) and etc.
-
The drawing can be used in multiple matches.
-
The number of digits in the two lists may or may not be equal.
-
The maximum number of digits in a one-to-many match must be set.
-
Many-to-many combinations are optional. But it would be nice if we had them too!
-
Some numbers can be left unmatched. This is normal.
What I am doing to achieve this is using two complex nested loops. It works, but as the number of numbers or the maximum number of pieces allowed in each match increases, it takes age to complete.
Is there a better algorithm for this?
source to share
I think I'm right to assert and SO will give me a kick if I'm wrong that the core of your computation is NP-hard, which means you are (very) unlikely to find a polynomial time solution. Your kernel, given one number (for example 10000
) and a list of other numbers, will find all subsets of the list that add up to one number.
This kernel is a variation of the subset sum problem .
Given that there are limits to how much better algorithm you can find, your expectations for finding a "fast" algorithm are likely to be disappointed.
To speed up your algorithm, I would suggest starting by sorting both lists. Take the first number from list 1, from list 2 take all numbers less than or equal to a number from list 1, find out the matches, repeat ... Then roll down list 2 by number ...
source to share
To do this, first create combinations of each list. For example, for list 1 the following combinations:
1000
3000
7000
16000
1000 3000
1000 7000
1000 16000
3000 7000
3000 16000
7000 16000
1000 3000 7000
1000 3000 16000
1000 7000 16000
3000 7000 16000
1000 3000 7000 16000
For each combination, you generate the sum of the elements in the combination. You now have two lists of amounts. To solve the problem, you cross two lists. Various algorithms exist for performing the intersection. One simple approach is to make the smaller of the two lists in a binary tree. Then, for each item in the larger list, you will find it in the binary tree. This algorithm has a time complexity of n * log (n).
source to share