Python: Object Creation Timing in Loop / Comprehension / Map vs. One at a Time
I am asking this in general terms, as I cannot post the actual code for various reasons. In IPython Notebook I do the following:
I created a class structured this way (it requires numpy)
    class MyClassName(object):
        def __init__(self, filename):
            self.filename = filename
            # Run these on object creation
            self.read_binary_file()
            self.calculate_parameters()
            self.check_for_errors()

        def read_binary_file(self):
            # This requires numpy.
            # The file is a 250 MB binary and
            # ultimately yields a numpy array
            # of 32 x 32 x 100000 elements
            ...

        def calculate_parameters(self):
            ...

        def check_for_errors(self):
            ...

        def other_function1(self):
            ...

        def other_function2(self):
            ...
The code works. I can do the following:
    q = MyClassName('testfile.dat')   # Instantiate an object
    q.other_function1()               # Invoke methods
%timeit q = MyClassName('testfile.dat')
gives about 0.9 seconds for this creation
But if I have a list of files and create objects in a loop, comprehension, or map:
filenames = ['f1.dat', 'f2.dat', ..., 'f10.dat']
    Chomp = map(MyClassName, filenames)

    Chomp = [MyClassName(j) for j in filenames]

    Chomp = []
    for j in filenames:
        Chomp.append(MyClassName(j))
each object takes about 3.5 seconds to create, so the loop takes 3.5 seconds/file × the number of files to complete.
What I have tried: I have looked into list creation and list-append timings, memory management and preallocation, and disabling/re-enabling garbage collection after each object is created, etc.
I also ran cProfile on the creation of a single object.
Every approach reports ~3.5 seconds; cProfile says a numpy binary read accounts for 2.5 of the 3.5 s needed to create a single object. But that very same method runs when I create a standalone object outside the loop or cProfile, and only that one-off creation is fast.
I am running on a Windows 7 machine and monitored it with Task Manager. At one point it looked like I had run out of physical memory and was paging to disk, so I rebooted, restarted IPython Notebook, restricted it to one core, and kept few other programs open. The memory load decreased, but loop performance did not improve.
I am new to OOP in general and have been working with Python for several months now; I am interested in understanding what is going on so that I can code in a more appropriate way.
[Answer converted from question]
- There was no real problem (!) ... just really bad observations on my part.
As M. Wasowski and JonZwink noted in the comments, %timeit executes the statement several times, and the subsequent runs artificially deflate the time because the file is already cached.
Of everything listed above, the one thing I had not tried was the following:
    import time
    tin = time.time()
    q = MyClassName('testfile.dat')
    print time.time() - tin
The first time I instantiate from 'testfile.dat' it takes a full 3.3-3.5 seconds. If I run the snippet again, it completes in ~0.9 seconds. So %timeit was taking the best of multiple runs, as the commenters said.
And I should have known better than to trust my eyeball estimates of how long manual instantiation took. A single object never actually instantiated faster than the objects in the loop.
Thanks everyone for the quick answers.