Multiple imputation on new / prediction data
Can someone please help me understand how to handle missing values in new / unseen data? I've looked into several imputation packages in R, and all of them seem to impute only the training and test set (at the same time). How do you process new, unlabeled data for scoring in the same way as the train / test data? Basically, I want to use multiple imputation (MI) for missing values in the training / test set, and then apply the same model / method to the prediction data. Based on my research on multiple imputation (I'm not an expert), it does not seem possible to do this with MI. With caret, by contrast, you can easily reuse the same preprocessing model that was fit on the train / test data for new data. Any help would be greatly appreciated. Thanks.
**Edit**
Basically, my dataset contains a lot of missing values. Deleting rows is not an option, as that would drop most of my train / test set. Up to this point, I have encoded the categorical variables and removed near-zero-variance and highly correlated variables. After this preprocessing, I was able to easily apply imputation with the mice package:
library(mice)
m <- mice(sg.enc)  # sg.enc is the encoded, preprocessed data frame
At this point, I can fit the model on the imputed datasets and combine the results with the pool command. This works great. However, I know that future data will also have missing values, and I would like to somehow apply this same MI to that data going forward.
It doesn't do multiple imputation, but the yaImpute package has a predict() function to impute values for new data. I ran a test using training data (including NAs) to create a "yai" object, and then used that object via predict() to impute values in a new test dataset. Unlike caret's preProcess(), yaImpute supports factor variables (at least for imputing values for them) in its kNN algorithm. I have not yet tested whether factors can be part of the "predictors" for missing targets. yaImpute also supports other imputation methods besides kNN.
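A rough sketch of that idea, using yaImpute's documented workflow of `yai()`, `newtargets()`, and `impute()` rather than `predict()`. The iris split below is purely illustrative (the names `refs`, `x`, `y`, and `new_rows` are my own, not from the question), and it assumes the predictor columns are fully observed while the columns to impute are treated as missing for the new rows:

```r
library(yaImpute)

set.seed(12345)
# Illustrative "training" rows: 50 reference observations from iris
refs <- sample(rownames(iris), 50)
x <- iris[refs, 1:3]  # predictors: Sepal.Length, Sepal.Width, Petal.Length
y <- iris[refs, 4:5]  # variables to impute: Petal.Width, Species (a factor)

# Build the yai object from the reference data only
fit <- yai(x = x, y = y, method = "mahalanobis")

# "New" data: rows that were not used to build the yai object
new_rows <- iris[!(rownames(iris) %in% refs), ]

# Find nearest neighbours for the new rows, then pull values from them
fit_new <- newtargets(fit, new_rows)
imputed <- impute(fit_new, vars = yvars(fit_new))
head(imputed)
```

Note that this gives a single nearest-neighbour imputation per row, not multiple imputations.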
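If you would rather stay within mice, one possibility (a sketch only, and assuming a recent version of mice where `mice()` accepts the `ignore` argument) is to stack the new rows under the training rows and flag them as ignored. Ignored rows are still imputed, but they are not used to fit the imputation models. The split of the `nhanes` example data and the names `train`, `new`, and `combined` are hypothetical:

```r
library(mice)

# Hypothetical split of the nhanes example data shipped with mice
train <- nhanes[1:20, ]
new   <- nhanes[21:25, ]

combined <- rbind(train, new)

# TRUE = row is imputed but NOT used to build the imputation models
ignore <- c(rep(FALSE, nrow(train)), rep(TRUE, nrow(new)))

imp <- mice(combined, m = 5, ignore = ignore, seed = 1, printFlag = FALSE)

# One completed copy of the new rows per imputed dataset
new_completed <- lapply(seq_len(imp$m), function(i) {
  mice::complete(imp, i)[ignore, ]
})
```

Each element of `new_completed` can then be scored with the model fit on the corresponding imputed training set, and the predictions averaged across the m copies.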