Plots model-specific characteristics of the fixed-effects random forest component of the MERF stored in an SAEforest object. A variable importance plot visualizes how much each covariate contributes to the predictive performance of the model; its arguments are passed internally to the function vip. If requested, the plot function additionally provides partial dependence plots (pdp) that visualize the impact of a given number of influential covariates on the target variable. These are produced using partial from the package pdp. Both kinds of plot are drawn with ggplot2.

Usage

# S3 method for SAEforest
plot(
  x,
  num_features = 6,
  col = "darkgreen",
  fill = "darkgreen",
  alpha = 0.8,
  include_type = TRUE,
  horizontal = TRUE,
  gg_theme = theme_minimal(),
  lsize = 1.5,
  lty = "solid",
  grid_row = 2,
  out_list = FALSE,
  pdp_plot = TRUE,
  ...
)

Arguments

x

An object of class SAEforest including a random forest model of class ranger.

num_features

Number of features for which a partial dependence plot is required.

col

Parameter specifying the color of selected plots. The argument must be specified such that it can be processed by aes. Defaults to a character name of the color "darkgreen".

fill

Parameter specifying the fill of selected plots. The argument must be specified such that it can be processed by aes. Defaults to a character name of the color "darkgreen".

alpha

Parameter specifying the transparency of fill for vip plots. The argument must be a number in [0,1].

include_type

Logical. If set to TRUE, the type of importance specified in the fitting process of the model is included in the vip plot. Defaults to TRUE.

horizontal

Logical. If set to TRUE, the importance scores appear on the x-axis. If set to FALSE, the importance scores are plotted on the y-axis. Defaults to TRUE.

gg_theme

Specify a predefined theme from ggplot2. Defaults to theme_minimal.

lsize

Parameter specifying the line size of pdp plots. The argument must be specified such that it can be processed by aes. Defaults to 1.5.

lty

Parameter specifying the line type of pdp plots. The argument must be specified such that it can be processed by aes. Defaults to "solid".

grid_row

Parameter specifying the number of rows for the joint pdp plot. Defaults to 2.

out_list

Logical. If set to TRUE, a list of individual plots produced by ggplot2 is returned for further individual customization and processing. Defaults to FALSE.

pdp_plot

Logical. If set to TRUE, partial dependence plots produced by partial from the package pdp are included. Defaults to TRUE.

...

Optional additional inputs that are ignored for this method.

Value

Plots of variable importance and/or partial dependence of covariates ranked by corresponding importance. Additionally, a list of individual plots can be returned facilitating individual customization and exporting. See the following examples for details.
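
The list output described above could be used roughly as follows. This is a minimal sketch that reuses the data and model call from the Examples section; the names and structure of the returned list are not documented on this page, so the code inspects the list and only touches elements that are ggplot objects. The titles added in the loop are purely illustrative.

```r
library(SAEforest)

data("eusilcA_pop")
data("eusilcA_smp")

income <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]

model1 <- SAEforest_model(Y = income, X = X_covar, dName = "district",
                          smp_data = eusilcA_smp, pop_data = eusilcA_pop,
                          num.trees = 50)

# Ask for the individual plots to be returned as a list
plot_list <- plot(model1, num_features = 4, out_list = TRUE)

# The element names are an unknown here, so look before indexing
str(plot_list, max.level = 1)

# Customize only those elements that really are ggplot objects
for (i in seq_along(plot_list)) {
  p <- plot_list[[i]]
  if (inherits(p, "ggplot")) {
    print(p + ggplot2::ggtitle(paste("Plot", i)))
  }
}
```

Each element that passes the `inherits()` check can also be handed to `ggplot2::ggsave()` for export.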

Details

Importance plots can only be produced if variable importance was computed during fitting. Set the importance parameter to a value other than 'none' when estimating the model with SAEforest_model.
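
The effect of the importance setting can be seen on a plain ranger forest, which is the model class stored inside an SAEforest object. The sketch below uses ranger directly on the built-in mtcars data, not SAEforest_model, and is only meant to illustrate why a forest fitted with 'none' leaves nothing to plot.

```r
library(ranger)

# Forest that records impurity-based importance while growing
fit_imp <- ranger(mpg ~ ., data = mtcars, num.trees = 50,
                  importance = "impurity")

# Forest grown without any importance measure
fit_none <- ranger(mpg ~ ., data = mtcars, num.trees = 50,
                   importance = "none")

# One score per covariate, largest first
sort(fit_imp$variable.importance, decreasing = TRUE)

# No scores are available for the second forest
length(fit_none$variable.importance) == 0
```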

Covariates of type factor or character cannot be used for partial dependence plots. Dummy variables can be used, but their pdp plots are always straight lines connecting the two effect points for 0 and 1. The most informative pdp plots are produced for continuous predictors.
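
To see why a dummy variable yields only a two-point line, it helps to compute a partial dependence curve by hand: fix the covariate at a grid value for every observation, predict, and average. The sketch below does this in base R with a linear model standing in for the random forest, so that it runs without extra packages. `partial_dep` is a hypothetical helper written for this illustration, not a function from SAEforest or pdp.

```r
# Toy model on a built-in data set: 'wt' is continuous, 'am' is a 0/1 dummy
fit <- lm(mpg ~ wt + hp + am, data = mtcars)

# Hypothetical helper: average prediction when `var` is fixed at each
# grid value for all observations
partial_dep <- function(model, data, var, grid) {
  sapply(grid, function(v) {
    data[[var]] <- v
    mean(predict(model, newdata = data))
  })
}

# Continuous covariate: many grid points give a full curve
grid_wt <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 20)
pd_wt <- partial_dep(fit, mtcars, "wt", grid_wt)

# Dummy covariate: only the values 0 and 1 exist, so only two points
pd_am <- partial_dep(fit, mtcars, "am", c(0, 1))

plot(grid_wt, pd_wt, type = "l",
     xlab = "wt", ylab = "average predicted mpg")
```

With a random forest in place of the linear model, the curve for a continuous covariate is generally not a straight line, which is what makes those plots informative.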

Examples

# \donttest{
# Loading data
data("eusilcA_pop")
data("eusilcA_smp")

income <- eusilcA_smp$eqIncome
X_covar <- eusilcA_smp[, -c(1, 16, 17, 18)]

# Example 1:
# Calculating point estimates and discussing basic generic functions

model1 <- SAEforest_model(Y = income, X = X_covar, dName = "district",
                          smp_data = eusilcA_smp, pop_data = eusilcA_pop,
                          num.trees = 50)
plot(model1)
# }