Python: object creation timing in a loop / comprehension / map vs. one at a time
I am asking this in general terms, as I cannot post the actual code for various reasons. In an IPython Notebook I do the following:
I have created a class structured this way (it requires numpy):
class MyClassName(object):
    def __init__(self, filename):
        self.filename = filename
        self.read_binary_file()        # Run these on object creation
        self.calculate_parameters()
        self.check_for_errors()
        ...

    def read_binary_file(self):
        # Requires numpy. The file is 250 MB of binary data and
        # ultimately yields a 32 x 32 x 100000 element numpy array.
        ...

    def calculate_parameters(self):
        ...

    def check_for_errors(self):
        ...

    def other_function1(self):
        ...

    def other_function2(self):
        ...

    # etc.
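For illustration only, since the real code could not be posted: a minimal sketch of what such a reader might look like using numpy.fromfile, assuming (hypothetically) that the file is a flat dump of 16-bit integers; the actual dtype and layout are not stated in the question.

import numpy as np

class BinaryReaderSketch(object):
    def __init__(self, filename):
        self.filename = filename

    def read_binary_file(self):
        # Hypothetical format: a flat array of 16-bit integers;
        # the real dtype and layout are not given in the question.
        raw = np.fromfile(self.filename, dtype=np.int16)
        # Reshape to the 32 x 32 x 100000 layout described above.
        self.data = raw.reshape(32, 32, -1)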
The code works. I can do the following:
q = MyClassName('testfile.dat')   # Instantiate an object
q.other_function1()               # Invoke methods
# etc.
%timeit q = MyClassName('testfile.dat')
gives about 0.9 seconds for this creation
But, if I have a list of files and create the objects in a loop, a list comprehension, or map:

filenames = ['f1.dat', 'f2.dat', ..., 'f10.dat']

Chomp = map(MyClassName, filenames)

Chomp = [MyClassName(j) for j in filenames]

Chomp = []
for j in filenames:
    Chomp.append(MyClassName(j))
each object takes over 3.5 seconds to create, so the loop takes 3.5 seconds/file x the number of files to complete.
What I have tried: I have looked up information on list creation and list-append timings, memory management, disabling and re-enabling garbage collection after each object is created, and so on.
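For reference, the garbage-collection experiment mentioned above looked roughly like this (a sketch, not the actual code):

import gc

gc.disable()                      # one experiment: no automatic collection during the loop
Chomp = []
for j in filenames:
    Chomp.append(MyClassName(j))
    gc.collect()                  # another experiment: collect explicitly after each object
gc.enable()

Neither variant changed the ~3.5 s/file figure.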
I also ran cProfile while creating one object. All of these attempts report the same 3.5 seconds per object; cProfile says the numpy binary read took 2.5 of the 3.5 s needed to create a single object. But this same procedure is called when I create a single object outside of the loop or cProfile.
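For anyone trying to reproduce this, the profiling step above amounts to something like the following (a sketch; it assumes MyClassName and the test file are already available):

import cProfile

# Profile one instantiation; the per-call breakdown shows where
# the 3.5 s goes (here, mostly the binary read).
cProfile.run("q = MyClassName('testfile.dat')")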
Only the creation of a single object on its own is fast.
I am running on a Windows 7 machine and monitored it with Task Manager. At one point it looked like I had run out of physical memory and was swapping to the page file, so I rebooted, restarted IPython Notebook with only one kernel running, and kept few other programs open. The memory load decreased, but loop performance did not improve.
I am new to OOP in general and have been working with Python for several months now; I am interested in understanding what is going on so that I can code in a more appropriate way.
[Answer converted from question]
Solution
There was no real problem (!), just really bad observations on my part.
As noted by M. Wasowski and JonZwink in the comments, %timeit executes the statement several times, and, as they said, the subsequent runs artificially deflate the time due to caching.
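In IPython, one way to see the single-run time is to restrict %timeit to one loop and one repetition (an IPython option, not something from the original discussion):

%timeit -n 1 -r 1 q = MyClassName('testfile.dat')

Even then, the run is only "cold" if the OS has not already cached the file from a previous read.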
Of everything I had tried, I had not tried the following:
import time
tin = time.time()
q = MyClassName('testfile.dat')
print time.time() - tin
The first time I instantiate from 'testfile.dat', it takes a full 3.3-3.5 seconds. If I run this snippet again, it completes in ~0.9 seconds. So %timeit was taking the best of multiple runs, as the commenters said.
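For completeness, the standard-library timeit module can take the same single-run measurement without the manual bookkeeping (again just a sketch):

import timeit

# number=1 times exactly one instantiation instead of the best of many
print timeit.timeit(lambda: MyClassName('testfile.dat'), number=1)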
And I should have known better than to trust my casual observations of how long it took to instantiate the object manually: a single object never actually instantiated faster than one created in the loop.
Thanks everyone for the quick answers.