Strange error in an algorithm written in Python for finding the least common plural
For educational purposes, I am trying to create an efficient least common plural search algorithm. For this I already have a quadratic and slow implementation. I am trying to build a new one. My new implementation uses a mathematical property involving Greatest Common Divisor (GCD) and Least Common Plural (LCM).
Basically: for any two natural numbers a and b,
LCM(a, b) * GCD(a, b) = a * b
I am using Python 3 for this.
I have a very efficient implementation for GCD (it uses a different math property, but it's pointless to talk about it):
def euclidean_gcd(a,b):
if b == 0:
return a
else:
a_prime = a%b
return euclidean_gcd(b,a_prime)
My implementation for LCM:
def lcm_fast(a,b):
return (int((a*b)/(euclidean_gcd(a,b))))
However, when I call:
lcm_fast(1023473145,226553150)
I get as result:
46374212988031352
The correct answer would be close:
46374212988031350
I'm a newbie (2nd year in Applied Mathematics), why is this happening?
I'm not sure if I can understand the concept of integer overflow, but as per my understanding of the above little research I did, there is no integer overflow in Python.
I did stress testing and tried to find this error in a clearer case. However, the problem only seems to be happening with really big numbers. Below you can check my stress tests:
import random
#defina a fronteira máxima dos testes randômicos
print ("insira um número para ser o final do intervalo de testes aleatórios")
bound_right = int(input())
#versão lenta, ou naive
def check_elem_in_list(list_1,list_2):
for element in list_1:
if element in list_2:
return element
else:
return False
#nested loops, vai ter comportamento quadrático
def lcm_slow(num_1,num_2):
list_of_num_1_prod = []
list_of_num_2_prod = []
max_nums = max(num_1,num_2)
end_range = max_nums +1
for i in range(1, end_range):
list_of_num_1_prod.append(i*num_1)
list_of_num_2_prod.append(i*num_2)
if check_elem_in_list(list_of_num_1_prod,list_of_num_2_prod) != False:
return (check_elem_in_list(list_of_num_1_prod,list_of_num_2_prod))
def euclidean_gcd(a,b):
if b == 0:
return a
else:
a_prime = a%b
return euclidean_gcd(b,a_prime)
def lcm_fast(a,b):
return (int((a*b)/(euclidean_gcd(a,b))))
# está dando pau com os inputs 1023473145, 226553150
# vou fazer stress testing
#primeiro, fazer função para gerar testes
a_in = random.randint(1,bound_right)
b_in = random.randint(1,bound_right)
while (lcm_slow(a_in,b_in)==lcm_fast(a_in,b_in)):
a_in = random.randint(1,bound_right)
b_in = random.randint(1,bound_right)
print (a_in,b_in,"OK",lcm_fast(a_in,b_in),lcm_slow(a_in,b_in))
if (lcm_slow(a_in,b_in)!=lcm_fast(a_in,b_in)):
print (a_in, b_in,"OPS",lcm_fast(a_in,b_in),lcm_slow(a_in,b_in))
break
#
EDIT AFTER SOME COMMENTS / ANSWERS TO THE ORIGINAL PROBLEM
Within this problem comes a new problem. I am creating this for the platform. My solution is correct. Following a comment from Blender. I did this (this was my original solution):
def lcm_fast(a,b):
a = ((a*b)/(euclidean_gcd(a,b)))
return a
The problem is I get this error message in platform test cases:
Failed case #1/42: Cannot check answer. Perhaps output format is wrong.
Input: 18 35 Your output: 630.0 Correct output: 630 (Time used: 0.01/5.00, memory used: 9613312/536870912.)
This is funny. If I avoid using the approximation int()
, the code is correct for large numbers. However, without converting from float
to, int
I cannot provide the answer in the format I want.
You are converting the result of your division back to an integer c int()
, because "regular" integer division results in a float. CPython boards are of fixed precision, so your back and forth conversion will lose information for large enough numbers.
Avoid losing precision and do floor division using //
that returns an integer:
def lcm_fast(a,b):
return (a * b) // euclidean_gcd(a,b)
source to share
In python 3, standard division ( /
) operations will automatically float, and integer division ( //
) will always return int
.
So python can handle arbitrarily large int
s, at some point your data is treated as a float rather than an int, exposing it to floating point precision errors.
(I see someone else typed a similar answer as I wrote this, and feel obligated to mark it before being killed to death to copy someone else.)
source to share