Python: ValueError: Could not convert string to float: 'D'
I am loading a train.csv file to feed into a RandomForestClassifier. Loading and processing the .csv file works fine, and I can work with the resulting DataFrame.
When I try:
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=150, min_samples_split=2, n_jobs=-1)
rf.fit(train, target)
I get this:
ValueError: could not convert string to float: 'D'
I tried:
train=train.astype(float)
Replacing every "D" with a different value.
train.convert_objects(convert_numeric=True)
But the problem still persists.
I have also tried printing all the values in my csv file, but I cannot find any reference to "D".
This is my traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-20-9d8e309c06b6> in <module>()
----> 1 rf.fit(train, target)
\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in fit(self, X, y, sample_weight)
222
223 # Convert data
--> 224 X, = check_arrays(X, dtype=DTYPE, sparse_format="dense")
225
226 # Remap output
\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_arrays(*arrays, **options)
279 array = np.ascontiguousarray(array, dtype=dtype)
280 else:
--> 281 array = np.asarray(array, dtype=dtype)
282 if not allow_nans:
283 _assert_all_finite(array)
\Anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
460
461 """
--> 462 return array(a, dtype, copy=False, order=order)
463
464 def asanyarray(a, dtype=None, order=None):
ValueError: could not convert string to float: 'D'
How should I approach this problem?
I'm not familiar with RandomForestClassifier (as far as I could find, it is not part of the Python standard library), so it's hard to tell exactly what's going on in your case. However, what really happens is that at some point the string 'D' is being converted to a float. I can reproduce your error by running:
float('D')
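Since the traceback ends in np.asarray(array, dtype=dtype), the same failure can probably be reproduced one level up with NumPy. This is only a sketch: the sample values are made up for illustration and are not taken from the asker's file.

```python
import numpy as np

# Hypothetical sample data: numeric-looking strings plus one stray 'D'.
values = ["1.0", "2.5", "D"]

try:
    # Mirrors the call in the traceback: force everything to float.
    np.asarray(values, dtype=float)
except ValueError as e:
    message = str(e)
    print(message)
```

The strings "1.0" and "2.5" would convert without complaint; it is the single non-numeric entry that makes the whole conversion fail.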
Now, to debug this issue, I recommend that you catch the exception:
try:
    rf.fit(train, target)
except ValueError as e:
    print(e)
    # do something clever with train and target, like pprint them or something.
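Then you can see what's really going on. I couldn't find out much about this random forest classifier, except for this package, which might help (note that it is a JavaScript package on npm, not the sklearn class that appears in your traceback): https://www.npmjs.com/package/random-forest-classifier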
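One way to "do something clever" with train is to ask pandas which columns hold values that cannot be parsed as numbers. The following is a minimal sketch, assuming train is a pandas DataFrame; the column names, the sample data, and the helper non_numeric_report are all hypothetical and only stand in for the real file.

```python
import pandas as pd

# Hypothetical stand-in for the asker's `train` DataFrame.
train = pd.DataFrame({
    "age": [22.0, 38.0, 26.0],
    "deck": ["C", "D", "E"],              # strings that cannot become floats
    "fare": ["7.25", "71.28", "7.92"],    # numbers stored as strings
})

def non_numeric_report(df):
    """Map each column name to the values that cannot be parsed as numbers."""
    report = {}
    for col in df.columns:
        # errors="coerce" turns unparseable entries into NaN instead of raising.
        converted = pd.to_numeric(df[col], errors="coerce")
        # Keep only entries that were present originally but failed to parse.
        bad = df.loc[converted.isna() & df[col].notna(), col]
        if not bad.empty:
            report[col] = sorted(bad.unique().tolist())
    return report

report = non_numeric_report(train)
print(report)
```

Any column that shows up in the report would need to be encoded as numbers or dropped before calling rf.fit, since the fit step converts the whole input to float.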