Is this the correct way to get in-sample and out-of-sample predictions / performance in the R caret package?

I want to know how to get both in-sample and out-of-sample predictions (and performance) in the R caret package.

I wrote a simple (reproducible) example that trains a random forest on the iris data to demonstrate this.

library(caret)
set.seed(1234)
data("iris")

# create an output variable with 1/0 values
iris$Class <- as.factor(ifelse(iris$Species == "setosa", 1, 0))

# remove Species column
iris$Species <- NULL

# shuffle iris data
iris <- iris[sample(nrow(iris)), ]

# split data into training and test sets
train_index <- 1:as.integer(nrow(iris) * 0.7)
train_iris <- iris[train_index, ]
test_iris <- iris[-train_index, ]

# train control object
myTrainControl <- trainControl(method = "none",
                               savePredictions = TRUE)

random_forest_model <- train(x = train_iris[, colnames(train_iris) != "Class"], 
                             y = train_iris$Class, 
                             method = "rf", 
                             trControl = myTrainControl, 
                             tuneLength = 1, 
                             preProcess = c("center", "scale"))

predictions_on_complete_data <- extractProb(models = list(random_forest_model), 
                                            testX = test_iris[, colnames(test_iris) != "Class"], 
                                            testY = test_iris$Class)

in_sample_performance <- predictions_on_complete_data[predictions_on_complete_data$dataType == "Training", ]
out_of_sample_performance <- predictions_on_complete_data[predictions_on_complete_data$dataType == "Test", ]
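
The two data frames above hold predictions rather than performance numbers. To turn predictions into metrics, I assume something like the following would work. This is only a minimal, self-contained sketch: it uses predict() and postResample() instead of extractProb(), and the names fit, predictors and the *_metrics variables, as well as the fixed mtry = 2, are my own illustrative choices. Note that the in-sample predictions of a random forest made on its own training rows are not out-of-bag predictions, so the in-sample numbers will probably look optimistic.

```r
library(caret)

set.seed(1234)
data("iris")

# Binary outcome as in the example above (1 = setosa, 0 = everything else)
iris$Class <- as.factor(ifelse(iris$Species == "setosa", 1, 0))
iris$Species <- NULL

# Shuffle, then take the first 70% of rows for training
iris <- iris[sample(nrow(iris)), ]
train_index <- seq_len(as.integer(nrow(iris) * 0.7))
train_iris <- iris[train_index, ]
test_iris <- iris[-train_index, ]

# Names of the predictor columns (everything except the outcome)
predictors <- setdiff(colnames(train_iris), "Class")

# method = "none" fits a single model without resampling,
# so the tuning grid must contain exactly one row (mtry = 2 is an assumption)
fit <- train(x = train_iris[, predictors],
             y = train_iris$Class,
             method = "rf",
             trControl = trainControl(method = "none"),
             tuneGrid = data.frame(mtry = 2),
             preProcess = c("center", "scale"))

# predict() on a train object applies the stored pre-processing to new data
in_sample_pred <- predict(fit, newdata = train_iris[, predictors])
out_of_sample_pred <- predict(fit, newdata = test_iris[, predictors])

# For factor inputs, postResample() returns Accuracy and Kappa
in_sample_metrics <- postResample(pred = in_sample_pred, obs = train_iris$Class)
out_of_sample_metrics <- postResample(pred = out_of_sample_pred, obs = test_iris$Class)

print(in_sample_metrics)
print(out_of_sample_metrics)
```

If a full breakdown is needed, confusionMatrix(data = out_of_sample_pred, reference = test_iris$Class) should give the confusion table together with sensitivity and specificity.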

Is the above approach a correct way to achieve this?
