Aligned channel overview

pegasystems · Oct 15, 2023 · 15b1334
1 parent f1050eb
commit 15b1334

Showing 2 changed files with 83 additions and 31 deletions.
diff --git a/examples/datamart/healthcheck.Rmd b/examples/datamart/healthcheck.Rmd
@@ -221,18 +221,27 @@ configurations it seems that the framework
 -   Channels with no positive feedback
 
 ```{r Channel overview}
-channelSummary <- datamart$modeldata[, .(`Responses` = max(ResponseCount), 
-                                         `Positives` = max(Positives), 
-                                         `Base rate` = sprintf("%.2f%%", 100*max(Positives)/max(ResponseCount)),
-                                         `Supported by Configurations` = paste(sort(unique(ConfigurationName)), collapse = ", "),
-                                         N = uniqueN(ConfigurationName)), 
-                                     by=c("Channel","Direction")][order(Channel)]
+
+# take the max counts per model
+channelSummaryMaxByModel <- datamart$modeldata[, .(Responses = max(ResponseCount), 
+                                                   Positives = max(Positives)),
+                                               by=c("Direction", "Channel", "ModelID", "ConfigurationName")]
+# sum up the response counts per configuration and channel
+channelSummaryCountsByConfiguration <- channelSummaryMaxByModel[, .(Responses = sum(Responses), 
+                                                                    Positives = sum(Positives)), 
+                                                                by=c("Direction", "Channel", "ConfigurationName")]
+# multiple model configurations could be driving one channel, take the max
+channelSummary <- channelSummaryCountsByConfiguration[, .(Responses = max(Responses),
+                                                          Positives = max(Positives),
+                                                          `Base rate` = sprintf("%.2f%%", 100*max(Positives)/max(Responses)),
+                                                          `Supported by Configurations` = paste(sort(unique(ConfigurationName)), collapse = ", "),
+                                                          N = uniqueN(ConfigurationName)), by=c("Direction", "Channel")][order(Channel, Direction)]
 
 channelSummary[, 1:(ncol(channelSummary)-1)] %>%
   kbl() %>%
   kable_paper(c("striped", "hover"), full_width = F) %>%
-  column_spec(3, background = ifelse(channelSummary[[3]] > 0 ,"white", "red")) %>%
-  column_spec(4, background = ifelse(channelSummary[[4]] > 0 ,"white", "red")) %>%
+  column_spec(3, background = ifelse(channelSummary$Responses > 0 ,"white", "red")) %>%
+  column_spec(4, background = ifelse(channelSummary$Positives > 0 ,"white", "red")) %>%
   column_spec(6, background = ifelse(channelSummary$N <= 2 ,"white", "orange"))
 ```
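
Review note: the hunk above replaces a single `max` per channel with a three-step aggregation (max per model, sum per configuration, max per channel). The final `max` is presumably there so that responses seen by both a channel-specific configuration and a cross-channel one are not counted twice. Below is a minimal sketch of that logic in plain Python, not the code from this commit. The snapshot rows, the configuration names and the `channel_overview` helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical snapshot rows:
# (channel, direction, configuration, model_id, response_count, positives).
# Counts are cumulative, so one model can appear in several snapshots.
snapshots = [
    ("Web", "Inbound", "Web_Click_Through_Rate", "m1", 100, 5),
    ("Web", "Inbound", "Web_Click_Through_Rate", "m1", 250, 12),
    ("Web", "Inbound", "Web_Click_Through_Rate", "m2", 400, 20),
    ("Web", "Inbound", "OmniAdaptiveModel", "m3", 300, 9),
    ("Email", "Outbound", "Email_Click_Through_Rate", "m4", 1000, 0),
]


def channel_overview(rows):
    # Step 1: take the max counts per model across its snapshots.
    per_model = {}
    for channel, direction, config, model_id, responses, positives in rows:
        key = (channel, direction, config, model_id)
        prev_r, prev_p = per_model.get(key, (0, 0))
        per_model[key] = (max(prev_r, responses), max(prev_p, positives))

    # Step 2: sum the per-model maxima within each configuration.
    per_config = defaultdict(lambda: [0, 0])
    for (channel, direction, config, _model), (r, p) in per_model.items():
        totals = per_config[(channel, direction, config)]
        totals[0] += r
        totals[1] += p

    # Step 3: several configurations may drive one channel, so take the max.
    per_channel = {}
    for (channel, direction, config), (r, p) in per_config.items():
        entry = per_channel.setdefault(
            (channel, direction),
            {"Responses": 0, "Positives": 0, "Configurations": []},
        )
        entry["Responses"] = max(entry["Responses"], r)
        entry["Positives"] = max(entry["Positives"], p)
        entry["Configurations"].append(config)

    for entry in per_channel.values():
        entry["Configurations"].sort()
        r, p = entry["Responses"], entry["Positives"]
        # Guard against channels without any responses.
        entry["Base rate"] = f"{100 * p / r:.2f}%" if r else "n/a"
    return per_channel


overview = channel_overview(snapshots)
for key, entry in sorted(overview.items()):
    print(key, entry)
```

With these made-up rows, summing across configurations would report 950 responses for Web/Inbound, while the three-step version reports 650. Note that the base rate divides the max of positives by the max of responses, as the diff does, so the two maxima can come from different configurations.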
 

diff --git a/python/pdstools/reports/HealthCheck.qmd b/python/pdstools/reports/HealthCheck.qmd
@@ -85,12 +85,6 @@ display(
 )
 ```
 
-```{python}
-xxx=ADMDatamart(
-        path = "/Users/perdo/Downloads/CrossCustomerCache", 
-        model_filename = "Achmea_-_PDC_ModelSnapshots.arrow", 
-        predictor_filename = '')
-```
 
 ```{python}
 #| tags: [initialization]
@@ -136,6 +130,10 @@ if datamart.predictorData is not None:
 else:
     datamart_all_columns = datamart.modelData.columns
 
+channel_overview_columns = [
+    col for col in ["Channel", "Direction"] if col in datamart_all_columns
+]
+
 standardNBADNames = [
     "Assisted_Click_Through_Rate",
     "CallCenter_Click_Through_Rate",
@@ -157,14 +155,26 @@ currentConfigurationNames = datamart.modelData.select(pl.col('Configuration')).u
 configurationNamesInStandardNBADModelNames = [(c in standardNBADNames) for c in currentConfigurationNames]
 ```
 
-
-
 This document gives a global overview of the Adaptive models and predictors. It is generated from a Python markdown file in the [Pega Data Scientist Tools](https://github.com/pegasystems/pega-datascientist-tools). This is open-source software and comes without guarantees. Off-line reports for individual
 models can be created as well, see [Wiki](https://github.com/pegasystems/pega-datascientist-tools/wiki).
 
-## Guidance
+We provide guidance and best practices in the form of bulleted lists of attention points. However, these are only generic practices and may or may not be applicable to the specific use case and situation of the implementation.
 
-Where possible, we try to provide guidance and best practices in the form of bulletted lists of attention points. However these are only generic practices and may or may not be applicable to the specific use case and situation of the implementation.
+```{python}
+# Start with a global bubble chart. Maybe later replace by
+# some ADM metrics, e.g. overall AUC, CTR, some other things.
+fig = datamart.plotPerformanceSuccessRateBubbleChart()
+fig.layout.coloraxis.colorscale = pega_template.success
+fig.for_each_xaxis(lambda xaxis: xaxis.update(showticklabels=True, visible=True))
+fig.for_each_xaxis(lambda xaxis: xaxis.update(dict(
+        tickmode = 'array',
+        tickvals = [50, 60, 70, 80, 90,100],
+    )))
+fig.update_layout(autosize=True, height=400, title="All ADM Models",
+xaxis_title="Model Performance", yaxis_title="Success Rate")
+fig.update_coloraxes(showscale=False)
+fig.show()
+```
 
 ::: {.callout-tip}
 The [Plotly](https://plotly.com/python/) charts have [user controls for panning,
@@ -175,7 +185,7 @@ or Box. It is preferable to view them from a browser.
 
 # Overview of Channels
 
-In a typical NBAD setup, channels are served by both one channel specific model configuration as well as a cross-channel *OmniAdaptiveModel* configuration. That *OmniAdaptiveModel* is typically used only as a fall-back option when the treatment model is not mature enough.
+In a typical NBAD setup, channels are served by both one channel specific model configuration as well as a cross-channel *OmniAdaptiveModel* configuration.
 
 If a channel has two model configurations with a naming pattern like “Adm_12345678912”, this could indicate the usage of the (no longer recommended) “2-stage model” predictions for conversion modeling, generated by Prediction Studio.
 
@@ -187,7 +197,7 @@ display(
         f"""
 The standard Pega Next Best Action Designer framework defines a number
 of standard Adaptive Models for channels. By looking at the names of the
-configurations it seems that the framework {framework_usage}.
+configurations it seems that the framework **{framework_usage}**.
 """
     )
 )
@@ -200,32 +210,65 @@ configurations it seems that the framework {framework_usage}.
 :::
 
 ```{python}
-channel_overview_columns = [
-    col for col in ["Channel", "Direction"] if col in datamart_all_columns
-]
 channel_overview = (
-    last_data
+    datamart.modelData
+    # first, take max per model ID
     .group_by(["Configuration", "ModelID"] + channel_overview_columns)
     .agg(
         pl.max("ResponseCount"), 
         pl.max("Positives")
-    ).group_by(channel_overview_columns)
+    )
+    # then, take sum of model max per configuration
+    .group_by(["Configuration"] + channel_overview_columns)
     .agg(
-        pl.sum("ResponseCount").alias("Responses"),
-        pl.sum("Positives"),
-        pl.format("{}%", ((pl.sum("Positives") * 100 / pl.sum("ResponseCount")).round(3))).alias("Base rate"),
-        # TODO format as a comma separated not with the square brackets
+        pl.sum("ResponseCount"), 
+        pl.sum("Positives")
+    )
+    # finally, take the max of the configurations per channel
+    .group_by(channel_overview_columns)
+    .agg(
+        pl.max("ResponseCount").cast(pl.Int32).alias("Responses"), 
+        pl.max("Positives").cast(pl.Int32),
+        pl.format("{}%", ((pl.max("Positives") * 100 / pl.max("ResponseCount")).round(3))).alias("Base rate"),
         pl.col("Configuration").unique().alias("Supported by Configurations")
     )
+    .collect()
 )
 
-show(channel_overview.to_pandas())
+df = channel_overview.to_pandas()
+
+def get_background_style_channel_overview(values):
+    def get_background(value):
+        if isinstance(value, int) or isinstance(value, float):
+            if value == 0:
+                return("background: red")
+            elif value < 200:
+                return("background: orange")
+            else:
+                return(None)
+        # else:
+        #     # warning for > 2 configurations to drive one channel
+        #     if len(value) > 2:
+        #         return("background: orange")
+        #     else:
+        #         return(None)
+        return(None)
+
+    return([get_background(v) for v in values])
+
+# work in progress..
+# df_styled = (
+#     df.style
+#     .apply(get_background_style_channel_overview, axis=1, subset=["Positives", "Responses"])
+# )
+
+show(df)
 ```
 
 
 # Overview of the Actions
 
-In a standard setup, the offers/conversations are presented as treatments for actions in a hierarchical structure setup in NBA Designer. The recommendation is to have multiple treatments for an action. Treatments are often channel specific and you would typically expect more unique treatments than there are actions.
+In a standard setup, the offers/conversations are presented as treatments for actions in a hierarchical structure setup in NBA Designer. Treatments are often channel specific and you would typically expect more unique treatments than there are actions.
 
 Adaptive Models are created per treatment (at least in the default setup) and the recommendation is to stick to the default context keys of the models.
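
Review note on the colour-coding helper added to `HealthCheck.qmd` in the hunk above (`get_background_style_channel_overview`, still marked as work in progress): the thresholds can be tried out on their own before wiring them into a pandas Styler. The sketch below is a standalone variant, not the code from this commit. The `low_volume` parameter and the `style_row` name are assumptions introduced here. It checks against `numbers.Real` instead of `int`/`float` because, as far as I know, values read from a pandas DataFrame are NumPy scalars, `numpy.int64` is not a subclass of Python's `int`, and NumPy registers its scalar types with the `numbers` ABCs. That is worth verifying against the installed versions.

```python
import numbers


def get_background(value, low_volume=200):
    """Return a CSS background declaration for a count, or None for no highlight."""
    # bool is a subclass of int, so exclude it explicitly; strings and lists
    # (for example the configuration names) are never highlighted here.
    if isinstance(value, bool) or not isinstance(value, numbers.Real):
        return None
    if value == 0:
        return "background: red"  # no responses or no positives at all
    if value < low_volume:
        return "background: orange"  # some data, but very little
    return None


def style_row(values):
    # Same shape as the helper in the diff: one style entry per cell.
    return [get_background(v) for v in values]


print(style_row([0, 150, 5000, "Web"]))
```

If the Styler route is picked up again, a function with this shape should fit the commented-out `df.style.apply(..., axis=1, subset=["Positives", "Responses"])` call, since it returns one entry per value it receives.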