MemoryError from RandomForest in scikit-learn
I am following the Python example given in For Beginners - Bag of Words. However, the following code segment raises a MemoryError. What can cause this error?
forest = forest.fit(train_data_features, train["sentiment"])
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/Project3/demo4.py", line 60, in <module>
    forest = forest.fit(train_data_features, train["sentiment"])
  File "C:\Users\AppData\Roaming\Python\Python27\site-packages\sklearn\ensemble\forest.py", line 195, in fit
    X = check_array(X, dtype=DTYPE, accept_sparse="csc")
  File "C:\Users\AppData\Roaming\Python\Python27\site-packages\sklearn\utils\validation.py", line 341, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
MemoryError
2 answers
MemoryError, as the name says, means you're out of free memory.
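To get a feel for the sizes involved, here is a rough back-of-the-envelope sketch. The shape (25,000 rows by 5,000 columns), the int64 input dtype and the float32 target dtype are assumptions, not values taken from the question. The traceback only shows that check_array calls np.array(array, dtype=dtype, ...). If that call has to convert the dtype, it builds a new array, so the original and the copy must both fit in memory for a moment.

```python
# Assumed, hypothetical sizes: 25,000 documents x 5,000 vocabulary terms.
N_SAMPLES = 25000
N_FEATURES = 5000

# Bytes per element for the two dtypes assumed here.
BYTES_INT64 = 8    # assumed dtype of the dense count matrix
BYTES_FLOAT32 = 4  # assumed dtype that check_array converts to


def dense_size_mib(n_rows, n_cols, bytes_per_element):
    """Return the size in MiB of a dense array with the given shape."""
    return n_rows * n_cols * bytes_per_element / (1024.0 ** 2)


int64_mib = dense_size_mib(N_SAMPLES, N_FEATURES, BYTES_INT64)
float32_mib = dense_size_mib(N_SAMPLES, N_FEATURES, BYTES_FLOAT32)

# While the converted copy is being built, both arrays are alive.
peak_mib = int64_mib + float32_mib

print("input array:    %.1f MiB" % int64_mib)
print("converted copy: %.1f MiB" % float32_mib)
print("both at once:   %.1f MiB" % peak_mib)
```

Under these assumptions the feature matrix alone approaches 1 GiB, which a 32-bit Python process may not be able to allocate as one contiguous block even when the machine has more RAM installed.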
If you are following the code example from here, there are a few things that can help you:
- Delete variables using del when you no longer need them (e.g. clean_train_reviews is not needed after line 62).
- After line 42, only train["sentiment"] is required; the rest of train can be discarded to free memory.
- Don't read both the training and test sets at the beginning. The test set is only needed after the forest has been built, and by then nothing that was used to prepare the training data is needed any more.
- The whole training part can be wrapped in a function that returns the forest; every reference that is no longer needed is then released when the function returns.
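The last two points can be sketched roughly as follows. This is a minimal illustration, not the tutorial's code: train_forest and the tiny review lists are made up, and the parameter values are placeholders. The sketch also passes the sparse matrix from CountVectorizer straight to the forest instead of converting it to a dense array. That goes beyond the suggestions above, but the accept_sparse="csc" argument in the traceback suggests sparse input is accepted.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer


def train_forest(clean_reviews, labels, max_features=5000, n_estimators=100):
    """Fit a vectorizer and a forest (hypothetical helper).

    The feature matrix is a local variable, so the reference to it is
    dropped as soon as the function returns.
    """
    vectorizer = CountVectorizer(analyzer="word", max_features=max_features)
    # fit_transform returns a sparse matrix; it is kept sparse here.
    features = vectorizer.fit_transform(clean_reviews)
    forest = RandomForestClassifier(n_estimators=n_estimators, random_state=0)
    forest.fit(features, labels)
    return vectorizer, forest


# Made-up stand-ins for the cleaned training reviews and their labels.
clean_train_reviews = [
    "great movie loved it",
    "terrible plot bad acting",
    "loved the acting",
    "bad movie terrible",
]
sentiment = [1, 0, 1, 0]

vectorizer, forest = train_forest(clean_train_reviews, sentiment, n_estimators=10)

# The training inputs are no longer needed once the forest exists.
del clean_train_reviews, sentiment

# Only now prepare the test data (again made up for illustration).
clean_test_reviews = ["loved it", "terrible acting"]
predictions = forest.predict(vectorizer.transform(clean_test_reviews))
print(predictions)
```

Returning the vectorizer together with the forest keeps the vocabulary available for transforming the test reviews, while the training feature matrix becomes eligible for garbage collection once train_forest returns.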