
[FEA] Implement model interpretation/explanation #311

Open
zhiruiwang opened this issue Mar 31, 2022 · 2 comments
Labels
P2 question Further information is requested status/needs-triage

Comments

@zhiruiwang

🚀 Feature request

I would like to understand whether there is any plan to implement or integrate model interpretation/explanation techniques, such as DeepLIFT or GradientExplainer from SHAP, into Merlin.

SHAP natively supports model interpretation for TF and PyTorch, but the data and TF/PyTorch models in Merlin are wrapped in Merlin classes, so it's not straightforward to apply SHAP directly on top of the model outcome.

If your team can integrate SHAP into Merlin that would be tremendously helpful!

Motivation

As recommendation systems are generally applied to real-world business problems, the ability to make the model a white box is extremely important when presenting the model outcome to business stakeholders. SHAP already supports TF and PyTorch models; if it can be integrated into Merlin Models, the maturity of the product will go up to the next level!

@rnyak rnyak added question Further information is requested P2 and removed question Further information is requested labels Sep 13, 2022
@EvenOldridge
Member

We're working on a model evaluation framework that allows for slicing, but we don't have plans for this on our roadmap yet. @zhiruiwang is this something you'd be interested in contributing?

@zhiruiwang
Author

Hi @EvenOldridge,

I actually tried applying SHAP to Merlin models myself earlier.

Since Merlin Models modifies the Keras input, output, and layering structure (Blocks), I can't directly apply DeepExplainer or GradientExplainer to Merlin models, even though the underlying model is tf.keras. It would require someone very familiar with the internals of Merlin Models to modify the SHAP package extensively to make those two explainers fit Merlin's structure.
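The reason a workaround is still possible is that SHAP's KernelExplainer is model-agnostic: it only needs a plain function mapping a matrix of feature rows to predictions, so any framework-specific model can be hidden behind a thin adapter. Here is a minimal, framework-free sketch of that adapter idea; the toy model and all names are illustrative stand-ins, not Merlin or SHAP APIs:

```python
def toy_model_predict(dataset):
    """Stand-in for a framework model's predict(): it expects a
    column-oriented dataset (dict of column name -> list of values)."""
    return [2 * a + b for a, b in zip(dataset["a"], dataset["b"])]

# Column names the toy model expects, in the order features appear in each row.
COLUMNS = ["a", "b"]

def model_fn(rows):
    """Adapter: a model-agnostic explainer passes plain rows of features;
    rebuild the column-oriented structure the model needs, then predict."""
    dataset = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}
    return toy_model_predict(dataset)

# The explainer never sees the model's native input format, only model_fn.
print(model_fn([[1, 2], [3, 4]]))  # [4, 10]
```

The real code below applies the same pattern: numpy rows from SHAP are converted back into a schema-aware Merlin `Dataset` before calling `model.predict`.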

However, I did make it work using KernelExplainer to interpret Merlin models. I will post some quick code here; your team can take it and put it into example notebooks or integrate it into the codebase as functions.

The code below is applied to the DLRM model after it is trained in the RecSys 22 tutorial notebook 2.

import pandas as pd
import shap
from shap import KernelExplainer
from merlin.io import Dataset

# Turn the Merlin Dataset into pandas, since SHAP only accepts pd or np data
valid_pd = (valid.to_ddf().compute().to_pandas()
            [schema.column_names] # Select relevant columns
            .sample(frac=1, random_state=42) # Shuffle the dataset
            .reset_index(drop=True))

def model_fn(inputs):
    """Wrap the Merlin model in a function that takes the numpy array
    passed by SHAP and returns predictions.
    """
    # SHAP converts the data to a numpy array; convert it back to a pd DataFrame
    inputs_pd = pd.DataFrame(inputs)
    # Assign column names to the pd DataFrame
    inputs_pd.columns = schema.column_names
    # Wrap the pd DataFrame in a Merlin Dataset and attach the schema
    dataset = Dataset(inputs_pd)
    dataset.schema = schema
    # Make predictions
    return model.predict(dataset, batch_size=1024).flatten()

# Use a selection of 100 samples to represent "typical" feature values
explainer = KernelExplainer(model_fn, valid_pd.iloc[:100, :])
# Use 500 perturbation samples to estimate the SHAP values for 200 samples
shap_values = explainer.shap_values(valid_pd.iloc[300:500, :], nsamples=500)

# Plot SHAP summary plot for 200 samples
shap.summary_plot(shap_values, valid_pd.iloc[300:500,:])

[Image: SHAP summary plot for the 200 samples]
