Optimizing a recursive function for very large .NET lists
I have created an application that simulates the number of products a company can produce in different modes per month. The simulation helps find the optimal mix of modes to run over the course of the month to best match the monthly sales forecast. The app worked well until recently, when the plant was modified to operate in additional modes. It can now run in 16 modes. For a month with 22 business days, this gives 9,364,199,760 possible combinations. Previously there were 8 modes, which yielded only 1,560,780 possible combinations. The PC that runs this application is on the old side and cannot handle the amount of computation before an out-of-memory exception is thrown. In fact, the whole application cannot support more than 15 modes, because it uses integers to keep track of the number of modes and exceeds the upper limit for an integer. With this in mind, I need to do my best to reduce the application's memory usage and optimize it to run as efficiently as possible, even if it cannot achieve its stated goal of 16 modes. I was considering writing the data to disk instead of storing the list in memory, but before I take on that overhead, I would like to get people's opinion on the method to see if there is room for optimization.
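For reference, the combination counts quoted above match the "stars and bars" formula C(workDays + modes - 1, modes - 1), which counts the ways to split a number of days across a number of modes. The following is a minimal sketch that computes it with BigInteger so the count itself cannot overflow; the class and method names are hypothetical and not part of the original application.

```csharp
using System;
using System.Numerics;

static class CombinationCount
{
    // Number of ways to distribute workDays across modes:
    // C(workDays + modes - 1, modes - 1).
    public static BigInteger Count(int modes, int workDays)
    {
        int k = modes - 1;
        int n = workDays + modes - 1;
        BigInteger result = 1;
        for (int i = 1; i <= k; i++)
        {
            // After step i, result equals C(n - k + i, i), so the division is exact.
            result = result * (n - k + i) / i;
        }
        return result;
    }
}
```

Count(8, 22) gives 1,560,780 and Count(16, 22) gives 9,364,199,760, the second of which is larger than int.MaxValue (2,147,483,647).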
EDIT: Since a few people suggested considering something more academic than simply calculating all possible answers, here is a brief explanation of how the optimal run (combination of modes) is chosen. Currently the computer determines every possible way the plant can run for the number of workdays in the month. For example, 3 modes for a maximum of 2 business days results in the combinations (where the number represents the selected mode) (1,1), (1,2), (1,3), (2,2), (2,3), (3,3). In each mode, products are produced at different rates; for example, in mode 1 product x may be produced at 50 units per hour, while product y is produced at 30 units per hour and product z at 0 units per hour. Each combination is then multiplied by the hours of operation and the production rates. The run selected is the one that produces numbers most closely matching the forecast value for each product for the month. However, when the plant has not met the forecast for a product for several months, the algorithm increases the priority of that product for the next month to ensure that the product reaches its forecast value by the end of the year. Since storage space is tight, it is also important that products are not overproduced.
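The evaluation described here can be sketched roughly as follows. This is an illustration only, not the application's actual scoring code: it assumes a combination is stored as days per mode (as CalculateCombinations produces it), a fixed number of operating hours per day, and a plain sum of absolute differences as the measure of "closest". The per-product priorities are left out, and all names are hypothetical.

```csharp
using System;

static class RunScore
{
    // daysPerMode[m]    = days spent in mode m (one generated combination)
    // ratePerHour[m][p] = units of product p produced per hour in mode m
    // forecast[p]       = forecast units of product p for the month
    // Returns the total absolute deviation from the forecast (lower is better).
    public static double Error(int[] daysPerMode, double[][] ratePerHour,
                               double hoursPerDay, double[] forecast)
    {
        double[] produced = new double[forecast.Length];
        for (int m = 0; m < daysPerMode.Length; m++)
        {
            for (int p = 0; p < forecast.Length; p++)
            {
                produced[p] += daysPerMode[m] * hoursPerDay * ratePerHour[m][p];
            }
        }

        double error = 0;
        for (int p = 0; p < forecast.Length; p++)
        {
            error += Math.Abs(produced[p] - forecast[p]);
        }
        return error;
    }
}
```

Because this score only needs one combination at a time, the best run can be tracked while combinations are generated, without keeping them all in a list.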
Thanks
    private List<List<int>> _modeIterations = new List<List<int>>();

    // Recursively distributes workDays across the remaining modes.
    // combinationValues accumulates the days chosen so far as "a,b,c,".
    private void CalculateCombinations(int modes, int workDays, string combinationValues)
    {
        List<int> _tempList = new List<int>();
        if (modes == 1)
        {
            // The last mode takes all remaining days; parse and store the result.
            combinationValues += Convert.ToString(workDays);
            string[] _combinations = combinationValues.Split(',');
            foreach (string _number in _combinations)
            {
                _tempList.Add(Convert.ToInt32(_number));
            }
            _modeIterations.Add(_tempList);
        }
        else
        {
            // Try every day count from workDays down to 0 for the current mode.
            for (int i = workDays + 1; --i >= 0; )
            {
                CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",");
            }
        }
    }
This optimization problem is difficult, but very well studied. You should probably read the literature on it rather than try to reinvent the wheel. The keywords you are looking for are "operations research" and "combinatorial optimization problem".
It is well known in the study of optimization problems that finding the optimal solution is almost always computationally infeasible as the problem grows large, as you have discovered for yourself. However, it is often feasible to find a solution that is guaranteed to be within a certain percentage of the optimal one. You should probably concentrate on finding approximate solutions. (After all, your sales targets are only educated guesses, so finding the truly optimal solution is already impossible; you don't have complete information.)
What I would do is start by reading the Wikipedia page on the knapsack problem:
http://en.wikipedia.org/wiki/Knapsack_problem
This is the problem of "I have a whole bunch of items of different values and different weights, and I can carry 50 pounds in my knapsack; what is the largest total value I can carry?"
This isn't exactly your problem, but it is clearly related: you have a certain "value" to maximize and a limited number of slots to pack that value into. If you can begin to understand how people find near-optimal solutions to the knapsack problem, you can apply that to your specific problem.
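To make the analogy concrete, here is a minimal sketch of the textbook dynamic-programming solution to the 0/1 knapsack problem with integer weights. It illustrates how the knapsack problem avoids enumerating every subset; it is not a solution to the plant-scheduling problem itself, and the names are illustrative.

```csharp
using System;

static class Knapsack
{
    // best[c] holds the greatest total value achievable with capacity c
    // using the items considered so far.
    public static int MaxValue(int[] weights, int[] values, int capacity)
    {
        int[] best = new int[capacity + 1];
        for (int item = 0; item < weights.Length; item++)
        {
            // Walk the capacity downwards so each item is used at most once.
            for (int c = capacity; c >= weights[item]; c--)
            {
                best[c] = Math.Max(best[c], best[c - weights[item]] + values[item]);
            }
        }
        return best[capacity];
    }
}
```

The work done is proportional to the number of items times the capacity, rather than to the number of possible subsets.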
You can process each permutation as soon as you have generated it, instead of collecting them all in a list first:
    public delegate void Processor(List<int> args);

    private void CalculateCombinations(int modes, int workDays, string combinationValues, Processor processor)
    {
        if (modes == 1)
        {
            List<int> _tempList = new List<int>();
            combinationValues += Convert.ToString(workDays);
            string[] _combinations = combinationValues.Split(',');
            foreach (string _number in _combinations)
            {
                _tempList.Add(Convert.ToInt32(_number));
            }
            processor.Invoke(_tempList);
        }
        else
        {
            for (int i = workDays + 1; --i >= 0; )
            {
                CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",", processor);
            }
        }
    }
I'm assuming your current way of working is something along the lines of

    CalculateCombinations(initial_value_1, initial_value_2, initial_value_3);
    foreach (List<int> list in _modeIterations) {
        ... process the list ...
    }

With the direct-processing approach, this would become

    private void ProcessPermutation(List<int> args)
    {
        ... process ...
    }

... and somewhere else ...

    CalculateCombinations(initial_value_1, initial_value_2, initial_value_3, ProcessPermutation);
I also suggest that you try to prune the search tree as early as possible; if you can already tell that some combinations of arguments will never yield anything worth processing, you should catch them during generation and avoid the recursion altogether where possible.
In newer versions of C#, generating the combinations with an iterator (?) function might let you preserve the original structure of your code. I have not used this feature (yield) yet, so I cannot comment on it.
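As a rough sketch of what the iterator approach could look like (this is not code from the question or this answer, and the names are hypothetical): the generator below uses yield return to hand out one combination at a time in the same order as CalculateCombinations, and fills a single reused int[] instead of building and parsing strings.

```csharp
using System;
using System.Collections.Generic;

static class CombinationGenerator
{
    // Lazily yields every way to split workDays across modes (modes >= 1).
    // The same array instance is yielded every time, so a caller that wants
    // to keep a combination must copy it.
    public static IEnumerable<int[]> Generate(int modes, int workDays)
    {
        return Fill(new int[modes], 0, workDays);
    }

    private static IEnumerable<int[]> Fill(int[] buffer, int index, int remaining)
    {
        if (index == buffer.Length - 1)
        {
            // The last mode takes whatever days are left.
            buffer[index] = remaining;
            yield return buffer;
            yield break;
        }
        for (int days = remaining; days >= 0; days--)
        {
            buffer[index] = days;
            foreach (int[] result in Fill(buffer, index + 1, remaining - days))
            {
                yield return result;
            }
        }
    }
}
```

A caller would then write something like foreach (int[] combination in CombinationGenerator.Generate(modes, workDays)) { ... } and evaluate each combination in the loop body. Memory use stays small, but the number of iterations is unchanged, so the running time for 16 modes remains a concern.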
The problem lies more in the brute-force approach than in the code itself. It is possible that brute force is the only way to approach the problem, but I doubt it. Chess, for example, cannot be solved by brute force, yet computers play it quite well by using heuristics to discard the less promising lines and focus on the good ones. Perhaps you should take a similar approach.
On the other hand, we would need to know how each "mode" is evaluated in order to suggest any heuristics. In your code you are only computing all possible combinations, which in any case will not scale if the number of modes grows to 32, even if you store them on disk.
    if (modes == 1)
    {
        List<int> _tempList = new List<int>();
        combinationValues += Convert.ToString(workDays);
        string[] _combinations = combinationValues.Split(',');
        foreach (string _number in _combinations)
        {
            _tempList.Add(Convert.ToInt32(_number));
        }
        processor.Invoke(_tempList);
    }
Everything in this block of code is executed over and over, so no line in it should allocate memory without freeing it. The most obvious way to reduce the memory churn is to write combinationValues to disk as it is processed (i.e., use a FileStream, not a string). I think that, in general, doing string concatenation the way you are doing it here is bad, since every concatenation allocates a new string. At the very least use a StringBuilder (see Back to Basics, which discusses the same problem from a C perspective). There may be other problem spots as well, though. The easiest way to find out why you are getting an out-of-memory error may be to use a memory profiler (Download link from download.microsoft.com).
You would probably also save memory by using a single List that is Clear()ed rather than a temporary one that gets instantiated over and over.
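One way to avoid both the string concatenation and the per-combination allocation is to fill one preallocated buffer during the recursion and pass it straight to the callback. The sketch below assumes the callback processes each combination immediately and does not keep a reference to the buffer; the names are hypothetical and Action<int[]> stands in for the Processor delegate.

```csharp
using System;

static class BufferedCombinations
{
    // Calls process once for every way to split workDays across modes (modes >= 1).
    // A single int[] is allocated up front and reused for every combination.
    public static void ForEach(int modes, int workDays, Action<int[]> process)
    {
        Fill(new int[modes], 0, workDays, process);
    }

    private static void Fill(int[] buffer, int index, int remaining, Action<int[]> process)
    {
        if (index == buffer.Length - 1)
        {
            // The last mode takes whatever days are left.
            buffer[index] = remaining;
            process(buffer);
            return;
        }
        for (int days = remaining; days >= 0; days--)
        {
            buffer[index] = days;
            Fill(buffer, index + 1, remaining - days, process);
        }
    }
}
```

With this shape there are no strings to split or parse, and the only allocation is the one array created in ForEach.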
I would replace the List objects with my own class that uses preallocated arrays to store the ints. I'm not entirely sure about this right now, but I believe that every integer in a List is boxed, which would mean far more memory is used than with a plain int array.
Edit: On the other hand, it seems I was wrong: see "Which one is more efficient: List<int> or int[]".