Is there any plan to implement or integrate model interpretation/explanation techniques such as DeepLIFT or GradientExplainer from SHAP into Merlin?
SHAP natively supports model interpretation for TF and PyTorch, but in Merlin both the data and the TF/PyTorch models are wrapped in Merlin classes, so applying SHAP directly to the model output is not straightforward.
If your team could integrate SHAP into Merlin, that would be tremendously helpful!
Motivation
Recommendation systems are generally applied to real-world business problems, so the ability to make the model a white box is extremely important when presenting model outcomes to business stakeholders. SHAP already supports TF and PyTorch models; integrating it into Merlin Models would take the maturity of the product to the next level!
We're working on a model evaluation framework that allows for slicing, but we don't have plans for this on our roadmap yet. @zhiruiwang is this something you'd be interested in contributing?
I actually explored applying SHAP to Merlin models myself earlier.
Since Merlin Models modifies the Keras input, output, and layer structure (Blocks), I can't directly apply DeepExplainer or GradientExplainer to Merlin models, even though the underlying model is tf.keras. It would take someone very familiar with the internals of Merlin Models to modify the SHAP package extensively so that those two explainers fit Merlin's structure.
However, I did get KernelExplainer to work for interpreting Merlin models. I'll post some quick code here; your team can take it and put it into example notebooks or integrate it into the codebase as functions.
import pandas as pd
import shap
from merlin.io import Dataset
from shap import KernelExplainer

# `model`, `valid` (a Merlin Dataset), and `schema` are assumed to be defined already

# Turn the Merlin Dataset into pandas, since SHAP only accepts pandas or NumPy data
valid_pd = (
    valid.to_ddf().compute().to_pandas()
    [schema.column_names]             # Select relevant columns
    .sample(frac=1, random_state=42)  # Shuffle the dataset
    .reset_index(drop=True)
)

def model_fn(inputs):
    """Wrap the Merlin model in a function that takes the NumPy array
    passed by SHAP and returns predictions.
    """
    # SHAP turns the data into a NumPy array; convert it back to a pandas DataFrame
    inputs_pd = pd.DataFrame(inputs)
    # Assign column names to the DataFrame
    inputs_pd.columns = schema.column_names
    # Wrap the DataFrame in a Merlin Dataset
    dataset = Dataset(inputs_pd)
    # Assign the schema to the dataset
    dataset.schema = schema
    # Make predictions
    return model.predict(dataset, batch_size=1024).flatten()

# Use a selection of 100 samples to represent "typical" feature values
explainer = KernelExplainer(model_fn, valid_pd.iloc[:100, :])
# Use 500 perturbation samples to estimate the SHAP values for 200 samples
shap_values = explainer.shap_values(valid_pd.iloc[300:500, :], nsamples=500)
# Plot the SHAP summary plot for the same 200 samples
shap.summary_plot(shap_values, valid_pd.iloc[300:500, :])
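For anyone wondering what the `nsamples` perturbations are estimating: KernelExplainer approximates Shapley values by swapping subsets of features for background values and watching how the prediction moves. Below is a minimal, standard-library-only sketch that computes those values exactly for a tiny toy function. It is not Merlin or SHAP code; `exact_shapley` and `toy_model` are hypothetical names I made up for illustration, and `toy_model` just stands in for something like `model_fn` above.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, background):
    """Exact Shapley values of `predict` at `x`, relative to `background` rows."""
    n = len(x)

    def value(subset):
        # Average prediction when features in `subset` take x's values
        # and all other features take the values of each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(row):
    # Hypothetical stand-in for model_fn: a linear term plus an interaction
    return 2.0 * row[0] + row[1] * row[2]


background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
x = [3.0, 2.0, 4.0]
phi = exact_shapley(toy_model, x, background)
base = sum(toy_model(r) for r in background) / len(background)
```

The attributions in `phi` sum to `toy_model(x) - base`, the same additivity that the SHAP plots rely on. Exact enumeration needs 2^n coalitions, which is why KernelExplainer samples them (the `nsamples` argument) for real models with many features.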