The function tune_parameters
tunes the parameters of the implemented MERF (mixed-effects random forest) method. Essentially,
this function is a modified wrapper for train from the package caret
that treats MERFs as a custom method.
Usage
tune_parameters(
Y,
X,
data,
dName,
trControl,
tuneGrid,
seed = 11235,
gg_theme = theme_minimal(),
plot_res = TRUE,
return_plot = FALSE,
na.rm = TRUE,
...
)
Arguments
- Y
Continuous input value of target variable.
- X
Matrix or data.frame of predictive covariates.
- data
data.frame of survey sample data including the specified elements of Y and X.
- dName
Character specifying the name of domain identifier, for which random intercepts are modeled.
- trControl
Control parameters passed to train. The most important parameters are method ("repeatedcv" for repeated k-fold cross-validation), number (the number of folds) and repeats (the number of repetitions). For further details see trainControl and the example below.
- tuneGrid
A data.frame with possible tuning values. The columns must have the same names as the tuning parameters. For this tuning function the grid must comprise entries for the following parameters: num.trees, mtry, min.node.size, splitrule.
- seed
Seed that makes cross-validation and tuning reproducible. Defaults to 11235.
- gg_theme
Specify a predefined theme from ggplot2. Defaults to theme_minimal.
- plot_res
Optional logical. If TRUE, the plot with the results of cross-validation and tuning is shown. Defaults to TRUE.
- return_plot
If set to TRUE, a list containing the comparative plot produced by ggplot2 is returned for further individual customization and processing. Defaults to FALSE.
- na.rm
Logical. Whether missing values should be removed. Defaults to TRUE.
- ...
Additional parameters are directly passed to the random forest ranger and/or the training function train. For further details on possible parameters and examples see ranger or train.
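Since tuneGrid must contain a column for each of the four tuning parameters, it can help to check a grid before starting a long tuning run. The following is a minimal sketch in base R; the helper check_merf_grid and the object names are hypothetical and not part of the package.

```r
# Column names the tuning grid must provide (taken from the tuneGrid argument).
required <- c("num.trees", "mtry", "min.node.size", "splitrule")

# Hypothetical helper (not part of the package): stop early if a required
# column is missing from the grid, otherwise return the grid invisibly.
check_merf_grid <- function(grid) {
  missing_cols <- setdiff(required, names(grid))
  if (length(missing_cols) > 0) {
    stop("tuneGrid lacks column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(grid)
}

# A grid varying mtry and min.node.size; every combination becomes one row.
merf_grid <- expand.grid(
  num.trees = 50,
  mtry = c(3, 7, 9),
  min.node.size = c(5, 10),
  splitrule = "variance"
)

check_merf_grid(merf_grid)
nrow(merf_grid)
```

A grid that passes the check can then be supplied as tuneGrid.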
Value
Prints requested optimal tuning parameters and (if requested) an additional comparative plot produced by ggplot2.
Details
Tuning can be performed on the following four parameters: num.trees (the number of trees in a forest), mtry (the number of variables considered as split candidates at each node), min.node.size (the minimal node size) and splitrule (the general splitting rule). For details see ranger.
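Because each candidate row of the grid is fitted once per fold and repetition, the cost of tuning grows multiplicatively with the grid size. The sketch below counts the resampling fits for the settings used in the example on this page; the variable names are illustrative assumptions, and any model refit after resampling is not counted.

```r
# Grid and resampling settings as used in the example on this page.
merf_grid <- expand.grid(num.trees = 50, mtry = c(3, 7, 9),
                         min.node.size = 10, splitrule = "variance")
n_folds   <- 5  # corresponds to trainControl(number = 5)
n_repeats <- 1  # corresponds to trainControl(repeats = 1)

# One candidate per grid row; each is fitted in every fold of every repetition.
n_candidates    <- nrow(merf_grid)
n_resample_fits <- n_candidates * n_folds * n_repeats
n_resample_fits
```

Adding values for a second parameter, for example two settings of min.node.size, would double this count.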
Examples
# \donttest{
# Loading data
data("eusilcA_pop")
data("eusilcA_smp")
library(caret)
#> Loading required package: ggplot2
#> Loading required package: lattice
income <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]
# Specific characteristics of Cross-validation
fitControl <- trainControl(method = "repeatedcv", number = 5,
                           repeats = 1)
# Define a tuning-grid
merfGrid <- expand.grid(num.trees = 50, mtry = c(3, 7, 9),
                        min.node.size = 10, splitrule = "variance")
tune_parameters(Y = income, X = X_covar, data = eusilcA_smp,
                dName = "district", trControl = fitControl,
                tuneGrid = merfGrid)
#> Warning: model fit failed for Fold1.Rep1: num.trees=50, mtry=3, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold1.Rep1: num.trees=50, mtry=7, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold1.Rep1: num.trees=50, mtry=9, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold2.Rep1: num.trees=50, mtry=3, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold2.Rep1: num.trees=50, mtry=7, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold2.Rep1: num.trees=50, mtry=9, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold3.Rep1: num.trees=50, mtry=3, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold3.Rep1: num.trees=50, mtry=7, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold3.Rep1: num.trees=50, mtry=9, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold4.Rep1: num.trees=50, mtry=3, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold4.Rep1: num.trees=50, mtry=7, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold4.Rep1: num.trees=50, mtry=9, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold5.Rep1: num.trees=50, mtry=3, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold5.Rep1: num.trees=50, mtry=7, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: model fit failed for Fold5.Rep1: num.trees=50, mtry=9, min.node.size=10, splitrule=variance Error in initializePtr() :
#> function 'cholmod_factor_ldetA' not provided by package 'Matrix'
#> Warning: There were missing values in resampled performance measures.
#> Something is wrong; all the RMSE metric values are missing:
#> RMSE Rsquared MAE
#> Min. : NA Min. : NA Min. : NA
#> 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
#> Median : NA Median : NA Median : NA
#> Mean :NaN Mean :NaN Mean :NaN
#> 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
#> Max. : NA Max. : NA Max. : NA
#> NA's :3 NA's :3 NA's :3
#> Error: Stopping
# }