Error in train.default(x, y, weights = w, ...): final tuning parameters could not be determined
I am very new to machine learning and I am working on the forest cover prediction competition on Kaggle, but I got stuck early on. I get the following error when I run the code below.
Error in train.default(x, y, weights = w, ...) : final tuning parameters could not be determined
In addition: There were 50 or more warnings (use warnings() to see the first 50)
# Load the libraries
library(ggplot2); library(caret); library(AppliedPredictiveModeling)
library(pROC)
library(Amelia)
set.seed(1234)
# Load the forest cover dataset from the csv file
rawdata <- read.csv("train.csv",stringsAsFactors = F)
#this data won't be used in model evaluation. It will only be used for the submission.
test <- read.csv("test.csv",stringsAsFactors = F)
########################
### DATA PREPARATION ###
########################
#create a training and test set for building and evaluating the model
samples <- createDataPartition(rawdata$Cover_Type, p = 0.5,list = FALSE)
data.train <- rawdata[samples, ]
data.test <- rawdata[-samples, ]
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf", prox = "TRUE")
The following should work:
model1 <- train(as.factor(Cover_Type) ~ Elevation + Aspect + Slope + Horizontal_Distance_To_Hydrology,
                data = data.train,
                method = "rf", tuneGrid = data.frame(mtry = 3))
It is usually best to specify the tuneGrid argument, a data frame of candidate values for the tuning parameters. Take a look at ?randomForest and ?train for more information. rf has only one tuning parameter, mtry, which controls the number of predictors randomly sampled as candidates at each split.
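To illustrate the tuneGrid idea with more than one candidate value, here is a minimal sketch. It uses the built-in iris data rather than the competition data, and the 5-fold cross-validation and ntree = 100 settings are illustrative assumptions, not recommendations.

```r
library(caret)

set.seed(1234)

# Candidate values for mtry; iris has 4 predictors, so 1 to 4 are all valid
grid <- expand.grid(mtry = 1:4)

# 5-fold cross-validation to compare the candidates (an arbitrary choice here)
ctrl <- trainControl(method = "cv", number = 5)

# ntree is passed through to randomForest(); 100 keeps the example quick
fit <- train(Species ~ ., data = iris,
             method = "rf",
             tuneGrid = grid,
             trControl = ctrl,
             ntree = 100)

# One row of resampled performance per candidate, and the selected value
print(fit$results)
print(fit$bestTune)
```

train fits one model per row of the grid and keeps the value with the best resampled accuracy, so a one-row grid such as data.frame(mtry = 3) simply skips the search.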
You can also run modelLookup to list the tuning parameters of a model:
> modelLookup("rf")
#   model parameter                         label forReg forClass probModel
# 1    rf      mtry #Randomly Selected Predictors   TRUE     TRUE      TRUE
I also enter Kaggle competitions and use the "caret" package to help me choose the "best" model parameters. After hitting this error many times, I looked at the code behind the scenes and found a call to a function named "class2ind", which I could not find anywhere in my installation. Eventually I found a similar function called "class.ind" in the "nnet" package. I decided to define a local function called "class2ind" with the body of "class.ind". And lo and behold, it worked!
# fix for caret
class2ind <- function(cl)
{
  n <- length(cl)
  cl <- as.factor(cl)
  x <- matrix(0, n, length(levels(cl)))
  x[(1:n) + n * (unclass(cl) - 1)] <- 1
  dimnames(x) <- list(names(cl), levels(cl))
  x
}
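To see what this helper produces, here is a small sketch that repeats the definition above and applies it to a made-up label vector (the labels are hypothetical, chosen only for illustration). Each row of the result marks the class of one observation with a single 1.

```r
# Same definition as above, repeated so the example runs on its own
class2ind <- function(cl) {
  n <- length(cl)
  cl <- as.factor(cl)
  x <- matrix(0, n, length(levels(cl)))
  # Column-major indexing: row i, column k lives at position i + n * (k - 1)
  x[(1:n) + n * (unclass(cl) - 1)] <- 1
  dimnames(x) <- list(names(cl), levels(cl))
  x
}

# Hypothetical class labels
labels <- c("a", "b", "a", "c")
ind <- class2ind(labels)
print(ind)
```

The result has one column per factor level and one row per observation, so every row sums to 1.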