Python large list error
I wrote the following program to create a list of sequential numbers. However, the calculations seem to fail once the list has more than about 70,000 items. I tried both the PyCharm IDE and the Python console, and the result is the same. I am using Python 3.4.1, 32-bit. What could be the reason, and what should I do?
from pylab import *
a = 100000 # the number of elements from my_array
my_array = [i for i in range(a)]
missing_number = randint(a)
print('Generate a Random number: ', missing_number)
my_array.remove(missing_number) # We remove the random generated number from my_array
print('The number of elements of the list is: ', len(my_array)) #Length of my_array
print('the sum of the list is :',sum(my_array)) # Sum
sum02 = (a *(a-1)/2) # The sum of consecutive numbers
print('The complete sum of the consecutive numbers:',int(sum02),'\n')
print('And the missing number is:', int(sum02) - sum(my_array))
Here is the output I get on my machine:
C:\Util\Python34\python.exe "find_missing_number_2.py"
Generate a Random number: 15019
The number of elements of the list is: 99999
the sum of the list is : 704967685
The complete sum of the consecutive numbers: 4999950000
And the missing number is: 4294982315
Process finished with exit code 0
This does not raise an error. It just computes the wrong result, as you can see by comparing missing_number with int(sum02) - sum(my_array).
from pylab import * does a from numpy import *. This brings in the numpy.sum function, whose documentation explicitly says: "Arithmetic is modular when using integer types, and no error is raised on overflow."
To avoid this, use the built-in sum function as shown by Reut Sharabani, or drop from pylab import *, which is bad practice anyway. It can replace built-in functions without you noticing. As far as I know, it replaces at least sum and all at the moment, but I am not sure what else, and you cannot be sure that it will not replace others in the future.
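The wrong figures in the question are consistent with a single 32-bit wrap-around. The sketch below uses only the numbers posted in the question and plain Python integers, and it assumes (this is not stated in the thread) that the accumulator used on that 32-bit build was 32 bits wide:

```python
# Values taken from the output posted in the question.
a = 100000
missing_number = 15019

expected_sum = a * (a - 1) // 2             # 0 + 1 + ... + (a - 1), exact
true_sum = expected_sum - missing_number    # exact sum of the list
wrapped_sum = true_sum % 2**32              # assumed 32-bit accumulator

print(true_sum)                    # 4999934981
print(wrapped_sum)                 # 704967685, the sum shown in the question
print(expected_sum - wrapped_sum)  # 4294982315, the "missing number" shown
print(expected_sum - wrapped_sum - missing_number == 2**32)  # True
```

The reported "missing number" is off by exactly 2**32, which is what modular integer arithmetic would produce.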
If your problem is with the size of the list, try using xrange (Python 2 only; in Python 3, range is already lazy):
# my_array = [i for i in range(a)]
my_array = xrange(a)
Also, my_array = [i for i in range(a)] is the same as my_array = range(a) if you are using Python 2.x.
Edit: use the built-in sum (arbitrary precision):
__builtins__.sum(my_array)
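In Python 3, the builtins module is the documented way to reach the original function when the name sum has been shadowed. The sketch below is only an illustration: the local sum function is a hypothetical stand-in for whatever a star import might bring in, and its 32-bit wrap is an assumption made to mimic the symptom in the question.

```python
import builtins


def sum(values):
    """Hypothetical stand-in for a shadowing import; wraps at 32 bits."""
    total = 0
    for v in values:
        total = (total + v) % 2**32
    return total


my_array = list(range(100000))

print(sum is builtins.sum)      # False: the name has been shadowed
print(sum(my_array))            # 704982704 (wrapped)
print(builtins.sum(my_array))   # 4999950000 (exact)
```

Checking whether sum is builtins.sum is a quick way to see if an import has replaced the name.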
How about creating your own function to sum consecutive numbers?
def consecutive_sum(first, last):
    # Arithmetic series: (first + last) * count / 2, kept in exact integer arithmetic
    return (first + last) * (last - first + 1) // 2
Then you don't need a list of numbers and you can just get one random number (like n) and ...
sum1 = consecutive_sum(1, n-1)
sum2 = consecutive_sum(n+1, max_num)
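Putting those pieces together might look like the sketch below. The value of max_num and the use of random.randint from the standard library are assumptions for illustration, and the function is written with integer division so that large totals stay exact:

```python
import random


def consecutive_sum(first, last):
    """Sum of the integers first..last inclusive, using exact integer arithmetic."""
    return (first + last) * (last - first + 1) // 2


max_num = 10000000                 # hypothetical upper bound
n = random.randint(1, max_num)     # the "missing" number

# Sum of 1..max_num with n left out, computed without building a list.
sum_without_n = consecutive_sum(1, n - 1) + consecutive_sum(n + 1, max_num)

recovered = consecutive_sum(1, max_num) - sum_without_n
print(recovered == n)  # True
```

An empty range such as consecutive_sum(1, 0) evaluates to 0, so the boundary cases n = 1 and n = max_num need no special handling.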
I found the reason. As user2313067 pointed out, I imported the whole pylab module, and some of its functions override Python built-in functions. It is bad practice to import an entire module, especially when you only use one function from it. So the solution in this case is:
from pylab import randint
and the code works even for very large lists (a = 10000000). Now my result is correct:
C:\Util\Python34\python.exe "find_missing_number_2.py"
Generate a Random number: 3632972
The number of elements of the list is: 9999999
the sum of the list is : 49999991367028
The complete sum of the consecutive numbers: 49999995000000
And the missing number is: 3632972
Process finished with exit code 0
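For comparison, here is a sketch of the same check that avoids pylab entirely and relies only on the standard library. It is not the code from the thread: random.randrange replaces randint, and the list size is an arbitrary choice large enough for the total to exceed 2**32.

```python
import random

a = 1000000                           # number of elements (arbitrary choice)
my_array = list(range(a))
missing_number = random.randrange(a)  # random integer in 0 .. a - 1
my_array.remove(missing_number)

expected_sum = a * (a - 1) // 2       # exact integer, no float conversion
recovered = expected_sum - sum(my_array)  # built-in sum, arbitrary precision

print('Removed:  ', missing_number)
print('Recovered:', recovered)
```

Because nothing shadows the built-in sum here, the two printed values agree regardless of the list size.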