How to specify formula in rfsrc in R
I have a data frame (train3) with 10 numeric variables and one factor.
I would like to build a random forest classifier using the rfsrc function from the randomForestSRC package.
The data looks like this:
summary(train3)
roll_belt pitch_belt yaw_belt total_accel_belt gyros_belt_x gyros_belt_y
Min. :-28.90 Min. :-55.8000 Min. :-180.00 Min. : 0.00 Min. :-1.040000 Min. :-0.64000
1st Qu.: 1.10 1st Qu.: 1.7600 1st Qu.: -88.30 1st Qu.: 3.00 1st Qu.:-0.030000 1st Qu.: 0.00000
Median :113.00 Median : 5.2800 Median : -13.00 Median :17.00 Median : 0.030000 Median : 0.02000
Mean : 64.41 Mean : 0.3053 Mean : -11.21 Mean :11.31 Mean :-0.005592 Mean : 0.03959
3rd Qu.:123.00 3rd Qu.: 14.9000 3rd Qu.: 12.90 3rd Qu.:18.00 3rd Qu.: 0.110000 3rd Qu.: 0.11000
Max. :162.00 Max. : 60.3000 Max. : 179.00 Max. :29.00 Max. : 2.220000 Max. : 0.64000
gyros_belt_z accel_belt_x accel_belt_y accel_belt_z classe
Min. :-1.4600 Min. :-120.000 Min. :-69.00 Min. :-275.00 A:5580
1st Qu.:-0.2000 1st Qu.: -21.000 1st Qu.: 3.00 1st Qu.:-162.00 B:3797
Median :-0.1000 Median : -15.000 Median : 35.00 Median :-152.00 C:3422
Mean :-0.1305 Mean : -5.595 Mean : 30.15 Mean : -72.59 D:3216
3rd Qu.:-0.0200 3rd Qu.: -5.000 3rd Qu.: 61.00 3rd Qu.: 27.00 E:3607
Max. : 1.6200 Max. : 85.000 Max. :164.00 Max. : 105.00
My rfsrc call looks like this:
fit = rfsrc (classe ~ ., data = train3)
Error in parseFormula(formula, data) :
the y-outcome must be either real or a factor.
classe seems to be a factor:
str(train3)
Classes 'tbl_df' and 'data.frame': 19622 obs. of 11 variables:
$ roll_belt : num 1.41 1.41 1.42 1.48 1.48 1.45 1.42 1.42 1.43 1.45 ...
$ pitch_belt : num 8.07 8.07 8.07 8.05 8.07 8.06 8.09 8.13 8.16 8.17 ...
$ yaw_belt : num -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 -94.4 ...
$ total_accel_belt: int 3 3 3 3 3 3 3 3 3 3 ...
$ gyros_belt_x : num 0 0.02 0 0.02 0.02 0.02 0.02 0.02 0.02 0.03 ...
$ gyros_belt_y : num 0 0 0 0 0.02 0 0 0 0 0 ...
$ gyros_belt_z : num -0.02 -0.02 -0.02 -0.03 -0.02 -0.02 -0.02 -0.02 -0.02 0 ...
$ accel_belt_x : int -21 -22 -20 -22 -21 -21 -22 -22 -20 -21 ...
$ accel_belt_y : int 4 4 5 3 2 4 3 4 2 4 ...
$ accel_belt_z : int 22 22 23 21 24 21 21 21 24 22 ...
$ classe : Factor w/ 5 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
What am I missing? The y-outcome would seem to be classe, which is a factor.
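One thing that may be worth checking: the str() output shows that train3 is a tbl_df, not a plain data.frame. Selecting a single column from a tibble with [ , "col"] returns a one-column tibble rather than a vector, so a check such as is.factor() on that result fails even though the column itself is a factor. Whether rfsrc's formula parsing trips over exactly this is an assumption here, not something confirmed from the package source. The sketch below uses the built-in iris data as a stand-in for train3 (demo_tbl and demo_df are illustrative names) and assumes the tibble and randomForestSRC packages are installed.

```r
library(tibble)            # assumed installed; provides as_tibble()
library(randomForestSRC)   # provides rfsrc()

# Stand-in for train3: a tibble with numeric predictors and a factor outcome.
demo_tbl <- as_tibble(iris)

# Single-column `[` on a tibble keeps a one-column tibble, so this is FALSE.
is.factor(demo_tbl[, "Species"])

# Converting to a plain data.frame restores the usual drop-to-vector behaviour,
# so the same check is TRUE.
demo_df <- as.data.frame(demo_tbl)
is.factor(demo_df[, "Species"])

# Same formula interface as in the question, now on a plain data.frame.
# ntree is kept small only to make the sketch quick to run.
fit <- rfsrc(Species ~ ., data = demo_df, ntree = 50)
print(fit)
```

If this is indeed the cause, the equivalent change for the data in the question would be to pass as.data.frame(train3) as the data argument and leave the formula classe ~ . unchanged.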