Python: Something Faster Than "not in" For Large Lists?
I am doing a project with word lists. I want to concatenate two word lists, but only store unique words.
I am reading words from a file and it seems to be taking a long time to read the file and save it as a list. I intend to copy the same block of code and run it using a second (or any subsequent) text file. The slow part of the code looks like this:
while inLine != "":
    inLine = inLine.strip()
    if inLine not in inList:
        inList.append(inLine)
    inLine = inFile.readline()
Please correct me if I am wrong, but I think the slow(est) part of the program is the "not in" comparison. What are some ways I can rewrite this to make it faster?
Judging by these lines:
if inLine not in inList:
    inList.append(inLine)
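It looks like you are enforcing uniqueness in the container inList. Each "not in" test on a list scans the list element by element, so the loop as a whole takes time roughly proportional to the square of the number of words. You should consider using a more efficient data structure such as a set (say, inSet), where a membership test takes constant time on average. The explicit "not in" check can then be dropped as redundant, since the set rejects duplicates on its own.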
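A minimal sketch of the set-based approach, not a drop-in replacement for your program. The function name read_unique_words and the in-memory io.StringIO "files" are made up for illustration; in real code you would pass file objects returned by open(). Unlike the original loop, this version skips blank lines instead of storing an empty string for them, which is an assumption about what you want.

```python
import io


def read_unique_words(in_file, seen=None):
    """Add each stripped, non-empty line of in_file to a set and return it.

    `seen` is an optional set from an earlier call, so several files
    can be merged into one collection of unique words.
    """
    if seen is None:
        seen = set()
    for line in in_file:          # iterating a file yields one line at a time
        word = line.strip()
        if word:                  # assumption: blank lines are not words
            seen.add(word)        # a set silently ignores duplicates
    return seen


# Hypothetical sample data standing in for two word-list files.
first = io.StringIO("apple\nbanana\napple\n")
second = io.StringIO("banana\ncherry\n")

words = read_unique_words(first)
words = read_unique_words(second, words)
print(sorted(words))  # sorted only to get a stable display order
```

Passing the set back in for the second file avoids copying the block of code for each additional word list.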
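If the order of insertion must be preserved, you can achieve a similar result by using an OrderedDict with the words as keys and None as every value. On Python 3.7 and later, a plain dict also keeps insertion order and can be used the same way.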
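A sketch of the order-preserving variant, under the same assumptions as before: read_unique_words_ordered is a hypothetical name, the io.StringIO objects stand in for real files, and blank lines are skipped. Assigning to a key that already exists does not move it, so each word keeps the position of its first appearance.

```python
from collections import OrderedDict
import io


def read_unique_words_ordered(in_file, seen=None):
    """Record each stripped, non-empty line as a key, keeping first-seen order."""
    if seen is None:
        seen = OrderedDict()
    for line in in_file:
        word = line.strip()
        if word:                  # assumption: blank lines are not words
            seen[word] = None     # the value is a placeholder; only keys matter
    return seen


# Hypothetical sample data standing in for two word-list files.
first = io.StringIO("pear\napple\npear\n")
second = io.StringIO("apple\nfig\n")

ordered = read_unique_words_ordered(first)
ordered = read_unique_words_ordered(second, ordered)
unique_words = list(ordered)      # keys come back in insertion order
print(unique_words)
```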