Python: ValueError: Could not convert string to float: 'D'

Question

I am loading a train.csv file to feed it into a RandomForestClassifier. Loading and processing the .csv file works fine, and I can work with the resulting DataFrame.

When I try:

from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=150, min_samples_split=2, n_jobs=-1)
rf.fit(train, target)

I get this:

ValueError: could not convert string to float: 'D'

I tried:

train=train.astype(float)

Replacing every "D" with a different value.

train.convert_objects(convert_numeric=True)

But the problem still persists.

I have also tried printing the values in my csv file, but I cannot find any reference to "D".

This is my traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-9d8e309c06b6> in <module>()
----> 1 rf.fit(train, target)

\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in fit(self, X, y, sample_weight)
    222 
    223         # Convert data
--> 224         X, = check_arrays(X, dtype=DTYPE, sparse_format="dense")
    225 
    226         # Remap output

\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_arrays(*arrays, **options)
    279                     array = np.ascontiguousarray(array, dtype=dtype)
    280                 else:
--> 281                     array = np.asarray(array, dtype=dtype)
    282                 if not allow_nans:
    283                     _assert_all_finite(array)

\Anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
    460 
    461     """
--> 462     return array(a, dtype, copy=False, order=order)
    463 
    464 def asanyarray(a, dtype=None, order=None):

ValueError: could not convert string to float: 'D'

How should I approach this problem?

python scikit-learn

swamoch 08 Aug 15 at 19:39

2 answers

zom-pro · Answer 1 · 2015-08-08T20:04:48+0000

Since RandomForestClassifier is not (as far as I could find) part of the Python standard library, it's hard to tell exactly what is going on in your case. What is really happening, however, is that at some point the string 'D' is being converted to a float. I can reproduce your error by running:

float('D')

Now, to debug this issue, I recommend that you catch the exception:

try:
    rf.fit(train, target)
except ValueError as e:
    print(e)
    # Inspect train and target here (e.g. pprint them) to find the offending value.

Then you can see what's really going on. I couldn't find much about this random forest classifier, except for this package, which might help: https://www.npmjs.com/package/random-forest-classifier
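One way to do the inspection step inside the try-except above is to run float() over every cell and record the ones that fail, since that is the same conversion that raises on 'D'. The following is only a minimal sketch using the standard library: the helper name find_non_numeric and the sample rows (including the "Cabin" column) are invented for illustration and are not taken from the question's data.

```python
def find_non_numeric(rows):
    """Return (row_index, column, value) for every cell that float() rejects."""
    bad_cells = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            try:
                float(value)
            except (TypeError, ValueError):
                # Same failure mode as float('D') in the traceback.
                bad_cells.append((i, column, value))
    return bad_cells


# Hypothetical sample rows; the column names are assumptions for illustration.
sample_rows = [
    {"Age": "22", "Fare": "7.25", "Cabin": "1"},
    {"Age": "38", "Fare": "71.28", "Cabin": "D"},
]
print(find_non_numeric(sample_rows))
```

If train is a pandas DataFrame, something like find_non_numeric(train.to_dict("records")) should produce the same kind of report, though it will be slow on large files.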

Claude coulombe · Answer 2 · 2015-11-13T06:36:13+0000

You should examine and clean your data. You probably have a "D" somewhere in your data that your code is trying to convert to float. Tracing inside a try-except block is a good idea.
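If the "D" turns out to be a legitimate category label rather than a typo, one possible cleaning step is to replace each distinct string with a numeric code before fitting. This is a rough sketch under that assumption: encode_categories is a hypothetical helper and the column values are made up. Note that integer codes impose an arbitrary ordering on the categories; pandas offers pd.factorize for the same idea and pd.get_dummies for one-hot encoding, which avoids that ordering.

```python
def encode_categories(values):
    """Map each distinct value to a numeric code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            # Assign the next unused integer to a value seen for the first time.
            codes[value] = len(codes)
        encoded.append(float(codes[value]))
    return encoded, codes


# Hypothetical column values, invented for illustration.
column = ["C", "D", "C", "E"]
encoded, mapping = encode_categories(column)
print(encoded)
print(mapping)
```

Keeping the returned mapping around matters, because the same codes have to be applied to any test data passed to the fitted classifier later.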
