4  Step 3: Consistency and Sensitivity of Local Explanations

4.1 Consistency - Random Forest

Code
library(randomForest)
randomForest 4.7-1.2
Type rfNews() to see new features/changes/bug fixes.
Code
library(DALEX)
Welcome to DALEX (version: 2.4.3).
Find examples and detailed introduction at: http://ema.drwhy.ai/
Code
library(iml)
library(mlbench)
library(caret)
Loading required package: ggplot2

Attaching package: 'ggplot2'
The following object is masked from 'package:randomForest':

    margin
Loading required package: lattice
Code
data(PimaIndiansDiabetes)
df <- na.omit(PimaIndiansDiabetes)
set.seed(5293)

train_idx <- createDataPartition(df$diabetes, p = 0.8, list = FALSE)
train_data <- df[train_idx, ]
test_data <- df[-train_idx, ]

lime_results <- list()
shap_results <- list()

for (i in 1:10) {
  set.seed(i)
  rf_model <- randomForest(diabetes ~ ., data = train_data, ntree = 100)

  # LIME-style local attribution (note: type = "break_down" computes DALEX
  # Break Down contributions, which stand in for LIME here)
  explainer_lime <- DALEX::explain(
    rf_model,
    data = train_data[, -ncol(train_data)],
    y = train_data$diabetes  # factor y triggers the residual warnings printed below
  )
  lime_expl <- predict_parts(
    explainer_lime,
    new_observation = test_data[1, -ncol(test_data)],
    type = "break_down"
  )
  lime_results[[i]] <- lime_expl

  # SHAP
  X <- train_data[, -ncol(train_data)]
  predictor <- iml::Predictor$new(
    rf_model,
    data = X,
    y = train_data$diabetes,
    type = "prob"
  )
  shap <- iml::Shapley$new(predictor, x.interest = test_data[1, -ncol(test_data)])
  shap_results[[i]] <- shap$results
}
Preparation of a new explainer is initiated
  -> model label       :  randomForest  (  default  )
  -> data              :  615  rows  8  cols 
  -> target variable   :  615  values 
  -> predict function  :  yhat.randomForest  will be used (  default  )
  -> predicted values  :  No value for predict function target column. (  default  )
  -> model_info        :  package randomForest , ver. 4.7.1.2 , task classification (  default  ) 
  -> model_info        :  Model info detected classification task but 'y' is a factor .  (  WARNING  )
  -> model_info        :  By deafult classification tasks supports only numercical 'y' parameter. 
  -> model_info        :  Consider changing to numerical vector with 0 and 1 values.
  -> model_info        :  Otherwise I will not be able to calculate residuals or loss function.
  -> predicted values  :  numerical, min =  0 , mean =  0.3471545 , max =  0.99  
  -> residual function :  difference between y and yhat (  default  )
Warning in Ops.factor(y, predict_function(model, data)): '-' not meaningful for
factors
  -> residuals         :  numerical, min =  NA , mean =  NA , max =  NA  
  A new explainer has been created!  
(The same explainer-preparation messages, including the factor-y warning, repeat for each of the remaining nine iterations; only the predicted-value summaries change slightly, with means between about 0.344 and 0.350.)
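The loop above only stores the explanations. To gauge run-to-run consistency, we can summarise how much each variable's contribution varies across the ten retrained forests. Below is a minimal sketch, assuming the lime_results and shap_results lists built above (the class column in the Shapley output is an assumption about iml's result format for multi-class predictors):

Code
library(dplyr)
library(stringr)

# Break Down ("LIME") explanations: strip the "= value" suffix and drop the
# intercept/prediction bookkeeping rows before summarising
bd_runs <- bind_rows(lime_results, .id = "run") |>
  mutate(variable_clean = str_trim(str_extract(variable, "^[^=]+"))) |>
  filter(!variable_clean %in% c("intercept", "prediction"))

bd_runs |>
  group_by(variable_clean) |>
  summarise(sd_contribution = sd(contribution), .groups = "drop") |>
  arrange(desc(sd_contribution))

# Shapley values: assuming one row per feature and class in shap$results
shap_runs <- bind_rows(shap_results, .id = "run")
shap_runs |>
  filter(class == "pos") |>
  group_by(feature) |>
  summarise(sd_phi = sd(phi), .groups = "drop") |>
  arrange(desc(sd_phi))

Small standard deviations indicate that the explanation for the fixed test observation is stable across retrained models.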
Code
library(randomForest)
library(DALEX)
library(DALEXtra)
library(mlbench)
library(caret)
library(dplyr)

Attaching package: 'dplyr'
The following object is masked from 'package:DALEX':

    explain
The following object is masked from 'package:randomForest':

    combine
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(ggplot2)

data(PimaIndiansDiabetes)
df <- na.omit(PimaIndiansDiabetes)
df$diabetes <- factor(df$diabetes, levels = c("neg", "pos")) 


set.seed(5293)
train_idx <- createDataPartition(df$diabetes, p = 0.8, list = FALSE)
train_data <- df[train_idx, ]
test_data <- df[-train_idx, ]
X <- train_data[, -ncol(train_data)]
y <- train_data$diabetes
Code
# Logistic regression: global coefficients to compare against local explanations
logit_model <- glm(diabetes ~ ., data = train_data, family = binomial)
logit_coef <- coef(logit_model)[-1]  
coef_df <- tibble(variable = names(logit_coef),
                  logistic_coef = as.numeric(logit_coef))

# Random forest + LIME-style (break_down) explanations across 10 seeds
lime_results <- list()

for (i in 1:10) {
  set.seed(i)
  rf_model <- randomForest(x = X, y = y, ntree = 100)

  explainer <- DALEX::explain(
    model = rf_model,
    data = X,
    y = NULL, 
    predict_function = function(m, d) predict(m, d, type = "prob")[, 2],
    label = paste0("rf_seed_", i),
    verbose = FALSE
  )

  lime_expl <- predict_parts(
    explainer,
    new_observation = test_data[1, -ncol(test_data)],
    type = "break_down"
  )

  lime_results[[i]] <- lime_expl
}
Code
library(stringr)

# Rebuild coef_df with the cleaned variable-name column used for joining
coef_df <- tibble(
  variable_clean = names(logit_coef),
  logistic_coef = as.numeric(logit_coef)
)

lime_df <- bind_rows(lime_results)

lime_df <- lime_df |>
  mutate(variable_clean = str_trim(str_extract(variable, "^[^=]+")))

lime_mean_df <- lime_df |>
  group_by(variable_clean) |>
  summarise(mean_lime = mean(contribution), .groups = "drop")

consistency_df <- inner_join(coef_df, lime_mean_df, by = "variable_clean") |>
  mutate(abs_diff = abs(logistic_coef - mean_lime))

print(consistency_df)
# A tibble: 8 × 4
  variable_clean logistic_coef mean_lime abs_diff
  <chr>                  <dbl>     <dbl>    <dbl>
1 pregnant            0.148     -0.0203   0.169  
2 glucose             0.0367    -0.156    0.193  
3 pressure           -0.0123    -0.00941  0.00288
4 triceps            -0.000847   0.0251   0.0259 
5 insulin            -0.000418  -0.0409   0.0405 
6 mass                0.0895    -0.0291   0.119  
7 pedigree            1.11       0.00408  1.11   
8 age                 0.00774    0.0457   0.0380 
Code
library(ggplot2)

ggplot(consistency_df, aes(x = mean_lime, y = logistic_coef, label = variable_clean)) +
  geom_point(color = "blue", size = 3) +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "gray") +
  geom_text(nudge_y = 0.05, size = 3.5) +
  labs(title = "Comparison of LIME Mean Contribution vs Logistic Coefficient",
       x = "LIME Mean Contribution",
       y = "Logistic Regression Coefficient") +
  theme_minimal()

Code
ggplot(consistency_df, aes(x = reorder(variable_clean, -abs_diff), y = abs_diff)) +
  geom_col(fill = "red") +
  labs(title = "Absolute Difference Between LIME and Logistic Coefficients",
       x = "Variable", y = "Absolute Difference") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

4.2 Consistency Analysis - Random Forest

We trained a logistic regression model on the same dataset and compared its coefficients with the mean LIME contributions aggregated across the ten random-forest runs. Since LIME is a local explanation method while logistic regression provides global coefficients, their alignment can indicate whether local interpretability methods reflect global trends. Note that the two quantities live on different scales (log-odds per unit of a feature versus contribution to one observation's predicted probability), so agreement in sign and ranking is more informative than the raw differences.

In the scatter plot of logistic regression coefficients versus LIME mean contributions, features such as age, pressure, and triceps show small absolute differences, suggesting good consistency. In contrast, pedigree exhibits a large deviation, indicating that LIME's local explanations for this variable may not align well with the global behavior captured by the logistic model.

Overall, the analysis suggests that while LIME explanations are partially aligned with global model behavior, they may deviate for variables with complex influence.
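A single number can complement the scatter plot: the rank correlation between absolute logistic coefficients and absolute mean LIME contributions. A minimal sketch, assuming the consistency_df built above:

Code
# Rank agreement between global (logistic) and local (LIME) importance;
# Spearman is used because the two quantities are on different scales
cor(abs(consistency_df$logistic_coef),
    abs(consistency_df$mean_lime),
    method = "spearman")

A value near 1 would mean the two methods rank feature importance similarly even though their raw scales differ.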

4.3 Sensitivity - Logistic Regression

In this step, we examined the sensitivity of the model interpretation across different individuals: using a fixed logistic regression model, we applied LIME to 10 test samples and plotted a box plot of the feature contributions.

Code
# Register glm with the lime package: declare the task type and supply a
# predict method that returns per-class probabilities
model_type.glm <- function(x, ...) "classification"
predict_model.glm <- function(x, newdata, ...) {
  preds <- predict(x, newdata, type = "response")
  data.frame(`No` = 1 - preds, `Yes` = preds)
}


data(PimaIndiansDiabetes)
df <- na.omit(PimaIndiansDiabetes)
set.seed(5293)
df$diabetes <- factor(df$diabetes)
X <- df[, -ncol(df)]
y <- df$diabetes

lime_contributions <- list()
shap_contributions <- list()


train_idx <- createDataPartition(df$diabetes, p = 0.8, list = FALSE)
train_data <- df[train_idx, ]
test_data <- df[-train_idx, ]
Code
logit_model <- glm(diabetes ~ ., data = train_data, family = binomial)

lime_global <- lime::lime(
  x = train_data[, -ncol(train_data)],
  model = logit_model
)

lime_explanations <- lime::explain(
  x = test_data[1:10, -ncol(test_data)],
  explainer = lime_global,
  n_features = 8,
  n_labels = 1
)

sensitivity_df <- lime_explanations %>%
  select(case, feature, feature_weight) %>%
  rename(variable = feature, contribution = feature_weight)

library(ggplot2)

ggplot(sensitivity_df, aes(x = variable, y = contribution)) +
  geom_boxplot(fill = "#69b3a2", alpha = 0.7) +
  labs(
    title = "LIME Sensitivity: Variability Across Test Samples",
    x = "Variable",
    y = "LIME Contribution"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Code
predictor <- iml::Predictor$new(model = logit_model, data = train_data[, -ncol(train_data)], y = train_data$diabetes)
sensitivity_shap <- list()

for (i in 1:10) {
  shap <- iml::Shapley$new(predictor, x.interest = test_data[i, -ncol(test_data)])
  res <- shap$results %>% mutate(case = i)  # fresh name, so the dataset `df` is not overwritten
  sensitivity_shap[[i]] <- res
}

shap_df <- bind_rows(sensitivity_shap) %>%
  rename(variable = feature, contribution = phi)

ggplot(shap_df, aes(x = variable, y = contribution)) +
  geom_boxplot(fill = "steelblue", alpha = 0.7) +
  labs(title = "SHAP Sensitivity: Variability Across Test Samples",
       x = "Variable", y = "SHAP Contribution") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

4.4 Sensitivity Analysis - Logistic Regression

  1. LIME chart:

Glucose: its box spans the largest range (roughly 0.1 to 0.4+), indicating that the explanation is sensitive to which test case is being explained.

Mass and pregnant also vary noticeably, and the direction of their contributions can flip across cases.

Age, triceps, insulin: the boxes are very narrow and the values are almost fixed, indicating that the choice of individual has little impact on their interpretation.

  2. SHAP chart:

Glucose remains the most sensitive variable, with a wider range of variation than under LIME (roughly -1 to 1.2).

SHAP contributions for most other variables (such as age, insulin, triceps) are almost constant.

Compared with LIME, the directions agree (e.g., glucose, mass), but the amplitude of variation is larger, showing that SHAP's local explanations are more sensitive here.

LIME and SHAP explanations agree closely on which features are the most and least sensitive across test samples. Although SHAP tends to exhibit a larger range of variation by its theoretical design, the overall ranking and direction of feature contributions closely match the LIME results; a quick way to quantify this agreement is sketched below.
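To back this qualitative reading with numbers, we can compare per-feature means and spreads between the two methods. A minimal sketch, assuming the sensitivity_df (LIME) and shap_df (SHAP) data frames built above:

Code
# Per-feature mean and spread for each method, then rank agreement of the means
lime_summary <- sensitivity_df %>%
  group_by(variable) %>%
  summarise(lime_mean = mean(contribution), lime_sd = sd(contribution),
            .groups = "drop")

shap_summary <- shap_df %>%
  group_by(variable) %>%
  summarise(shap_mean = mean(contribution), shap_sd = sd(contribution),
            .groups = "drop")

agreement <- inner_join(lime_summary, shap_summary, by = "variable")
print(agreement)

# Spearman correlation of mean contributions: closer to 1 = more consistent ranking
cor(agreement$lime_mean, agreement$shap_mean, method = "spearman")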