Overflow when calculating combinations
I am trying to calculate the combination C(40, 20) in C++, but the built-in integer types cannot seem to handle this calculation correctly, even when I use long long. Below is my code:
#include <iostream>

long long fac(int x) {
    register long long i, f = 1; // Optimize with regFunction
    for (i = 1; i <= x; i++)
        f *= i;
    std::cout << f << std::endl;
    return f;
}

// C(n,r) = n!/r!(n-r)!
long long C(long long n, long long r) {
    return fac(n) / (fac(r) * fac(n - r));
}

int main(int argc, char const *argv[]) {
    std::cout << C(40, 20) << std::endl;
    return 0;
}
Any ideas on how to resolve this issue?
Calculate C right away by performing division immediately after multiplication:
long long C(long long n, long long r)
{
    long long f = 1; // invariant: f == C(n, i) at the start of each iteration
    for (auto i = 0; i < r; i++)
        f = (f * (n - i)) / (i + 1);
    return f;
}
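The result is exact: every division leaves no remainder (as long as nothing overflows), because after the step with index i, f equals C(n, i + 1), which is an integer. Equivalently, the product of i + 1 consecutive integers n (n - 1) ... (n - i) is always divisible by (i + 1)!. Note that the intermediate product f * (n - i) can overflow slightly before the final result itself does.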
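To make the "until overflow" caveat concrete, here is a minimal sketch of the same loop with an explicit overflow check. The function name checked_binomial and the bool-plus-output-parameter interface are my own choices for illustration, not part of the answer above.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical helper (name and interface are assumptions for this sketch).
// Stores C(n, r) in `out` and returns true, or returns false if an
// intermediate product would not fit in long long.
bool checked_binomial(long long n, long long r, long long& out) {
    if (r < 0 || r > n) {                  // outside Pascal's triangle
        out = 0;
        return true;
    }
    r = std::min(r, n - r);                // C(n, r) == C(n, n - r), fewer steps
    long long f = 1;
    for (long long i = 0; i < r; ++i) {
        const long long factor = n - i;    // always >= 1 here
        // f * factor would exceed the long long range: report it
        // instead of silently producing a wrong value.
        if (f > std::numeric_limits<long long>::max() / factor)
            return false;
        // Before this line f == C(n, i); after it f == C(n, i + 1),
        // so the division never leaves a remainder.
        f = f * factor / (i + 1);
    }
    out = f;
    return true;
}
```

Calling checked_binomial(40, 20, result) should succeed and leave 137846528820 in result, matching the value printed by the GMP version further down, while something like checked_binomial(100, 50, result) returns false because the true value does not fit in 64 bits.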
Your numbers grow too large, which is a common problem in calculations like this, and I'm afraid there is no easy solution. Even if you slightly reduce the number of multiplications you perform, you will eventually end up with overflow with long long.
You can try a big-integer library, for example:
https://mattmccutchen.net/bigint/
I know there are various algorithmic approaches to this problem. I remember some solutions that store integers as strings of digits. But, as @Konrad mentioned, that might be a poor approach to the matter.
The problem is that factorials grow very fast. 40! is too large to store in a long long. Fortunately, you don't actually need to compute that number here, because you can reduce the fraction in the formula for C(n, r) before evaluating it. This gives the equation (from Wikipedia):
C(n, k) = n (n - 1) (n - 2) ... (n - k + 1) / k!
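This works much better, since k! (r! in your code) is much smaller than n!. However, it will also overflow at some point: for C(40, 20) the full numerator 40 * 39 * ... * 21 is about 3.4e29, which already exceeds the range of long long, so the divisions have to be interleaved with the multiplications rather than done once at the end.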
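One way to push the breaking point further out is to cancel common factors before each multiplication, so that no intermediate value is larger than the final result. The following is only a sketch under that idea. The names gcd_ll and binomial_reduced are made up for illustration, and it still overflows once C(n, k) itself no longer fits in long long.

```cpp
// Euclid's algorithm, written out so the sketch does not rely on C++17's std::gcd.
long long gcd_ll(long long a, long long b) {
    while (b != 0) {
        const long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Hypothetical helper: evaluates C(n, k) = n (n - 1) ... (n - k + 1) / k!
// one factor at a time, reducing each fraction (n - i) / (i + 1) first.
long long binomial_reduced(long long n, long long k) {
    if (k < 0 || k > n) return 0;
    if (k > n - k) k = n - k;              // use the smaller of k and n - k
    long long f = 1;                       // f == C(n, i) at the top of the loop
    for (long long i = 0; i < k; ++i) {
        long long num = n - i;
        long long den = i + 1;
        const long long g = gcd_ll(num, den);
        num /= g;
        den /= g;
        // den divides f * num and shares no factor with num,
        // so den divides f: dividing first is exact and keeps values small.
        f = f / den * num;
    }
    return f;
}
```

As an example of the difference, C(62, 31) = 465428353255261088 fits in a signed 64-bit long long, but the plain multiply-then-divide loop would have to form C(62, 30) * 32 in its last step, which does not. The reduced version divides first and so stays in range.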
Alternatively, you can use the recurrence definition, C(n, k) = C(n - 1, k - 1) + C(n - 1, k), by implementing a recursive algorithm. However, it will be very inefficient (exponential running time) unless you memoise intermediate results.
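A memoised version of that recurrence might look roughly like this. The function name binomial_memo and the use of a static std::map as the cache are assumptions made for this sketch. Because it only ever adds, no intermediate value exceeds the final result.

```cpp
#include <map>
#include <utility>

// Hypothetical helper: C(n, k) via Pascal's rule
//   C(n, k) = C(n - 1, k - 1) + C(n - 1, k)
// with each (n, k) pair cached so it is computed only once.
long long binomial_memo(int n, int k) {
    static std::map<std::pair<int, int>, long long> cache;

    if (k < 0 || k > n) return 0;          // outside Pascal's triangle
    if (k == 0 || k == n) return 1;        // edges of the triangle

    const std::pair<int, int> key(n, k);
    const auto it = cache.find(key);
    if (it != cache.end()) return it->second;

    const long long value = binomial_memo(n - 1, k - 1) + binomial_memo(n - 1, k);
    cache[key] = value;
    return value;
}
```

With the cache in place the number of distinct subproblems is on the order of n * k, so binomial_memo(40, 20) returns quickly instead of taking exponential time.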
The lazy way out is to use a library that supports multiple precision, such as GNU GMP.
Once properly installed (available from the repositories on most Linux distributions), it boils down to:
- adding #include <gmpxx.h> to the source file
- replacing long long with mpz_class
- compiling with -lgmpxx -lgmp
Source:
#include <iostream>
#include <gmpxx.h>

mpz_class fac(mpz_class x) {
    int i;
    mpz_class f(1); // arbitrary-precision running product
    for (i = 1; i <= x; i++)
        f *= i;
    std::cout << f << std::endl; // prints each factorial (see the output below)
    return f;
}

// C(n,r) = n! / (r! (n-r)!)
mpz_class C(mpz_class n, mpz_class r) {
    return fac(n) / (fac(r) * fac(n - r));
}

int main(int argc, char const *argv[]) {
    std::cout << C(40, 20) << std::endl;
    return 0;
}
Compiling and running:
$ g++ comb.cpp -lgmpxx -lgmp -o comb
$ ./comb
2432902008176640000
2432902008176640000
815915283247897734345611269596115894272000000000
137846528820
There is much more you could do to make this robust, but this will give you the correct answers.