Python: ValueError: Could not convert string to float: 'D'

I am loading the train.csv file to feed it to a RandomForestClassifier. Loading and processing the .csv file works fine, and I can work with my DataFrame.

When I try:

from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=150, min_samples_split=2, n_jobs=-1)
rf.fit(train, target)

      

I get this:

ValueError: could not convert string to float: 'D'

      

I tried:

train=train.astype(float)

      

Replacing every "D" with a different value.

train.convert_objects(convert_numeric=True)

      

But the problem still persists.

I have also tried printing all the values in my csv file, but I cannot find any reference to "D".

This is my traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-9d8e309c06b6> in <module>()
----> 1 rf.fit(train, target)

\Anaconda3\lib\site-packages\sklearn\ensemble\forest.py in fit(self, X, y, sample_weight)
    222 
    223         # Convert data
--> 224         X, = check_arrays(X, dtype=DTYPE, sparse_format="dense")
    225 
    226         # Remap output

\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_arrays(*arrays, **options)
    279                     array = np.ascontiguousarray(array, dtype=dtype)
    280                 else:
--> 281                     array = np.asarray(array, dtype=dtype)
    282                 if not allow_nans:
    283                     _assert_all_finite(array)

\Anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
    460 
    461     """
--> 462     return array(a, dtype, copy=False, order=order)
    463 
    464 def asanyarray(a, dtype=None, order=None):

ValueError: could not convert string to float: 'D'

      

How should I approach this problem?

+3




2 answers


Since RandomForestClassifier is not (as far as I could find) part of Python's standard library, it's hard to figure out exactly what's going on in your case. However, what really happens is that at some point the string 'D' is being converted to a float. I can reproduce your error by running:

float('D')

      

Now, to debug this issue, I recommend that you catch the exception:



try:
    rf.fit(train, target)
except ValueError as e:
    print(e)
    # Do something clever with train and target here, such as pprint-ing them.

      

Then you can see what's really going on. I couldn't find out much about this random forest classifier, except that this might help: https://www.npmjs.com/package/random-forest-classifier
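As a rough sketch of what "doing something clever" in the except block could look like: the traceback in the question shows the failure happening inside np.asarray(array, dtype=dtype), so the same conversion can be tried directly on the DataFrame, and on failure the non-numeric columns can be listed. The helper name non_numeric_columns_on_failure, the np.float32 target dtype, and the small sample frame are assumptions made for illustration; they are not taken from the question's data.

```python
import numpy as np
import pandas as pd


def non_numeric_columns_on_failure(frame):
    """Try the float conversion; on failure, return the non-numeric columns."""
    try:
        # Assumption: this mirrors the np.asarray(...) call seen in the traceback.
        np.asarray(frame, dtype=np.float32)
    except ValueError as e:
        print(e)
        # Columns whose dtype is not numeric are the likely culprits.
        return list(frame.select_dtypes(exclude="number").columns)
    return []


# Hypothetical stand-in for the real train.csv data.
train = pd.DataFrame({"age": [22, 38, 26], "deck": ["C", "D", "E"]})
suspects = non_numeric_columns_on_failure(train)
```

Here suspects holds the names of the columns worth inspecting by hand before calling rf.fit again.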

+2




You should examine and clean the data. You probably have a "D" somewhere in your data that your code is trying to convert to float. Tracing inside a try-except block is a good idea.
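One possible way to examine the data, sketched under the assumption that it is held in a pandas DataFrame: coerce each column with pd.to_numeric(errors="coerce") and report the values that were present but could not be parsed. The helper name find_non_numeric_values and the sample frame below are hypothetical and only illustrate the idea.

```python
import pandas as pd


def find_non_numeric_values(frame):
    """Map each column name to the distinct values that cannot be parsed as numbers."""
    offenders = {}
    for name in frame.columns:
        column = frame[name]
        coerced = pd.to_numeric(column, errors="coerce")
        # A value is an offender if it was present but became NaN after coercion.
        bad = column[coerced.isna() & column.notna()]
        if not bad.empty:
            offenders[name] = bad.unique().tolist()
    return offenders


# Hypothetical stand-in for the real train.csv data.
train = pd.DataFrame({"fare": ["7.25", "71.28", "D"], "age": [22, 38, 26]})
offenders = find_non_numeric_values(train)
```

Once the offending column and values are known, you can decide whether to drop them, replace them, or encode the column as a category before fitting the classifier.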



0








