Get a representation of all rules found by the sklearn RandomForestClassifier. It works as follows:
- On each feature, it applies one-hot encoding that makes each column binary.
- A random forest is trained on the encoded features and a target attribute.
- All trees are extracted from the fitted random forest.
- Each decision tree is split into classification rules.
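
The steps above can be sketched directly with scikit-learn and pandas. This is a minimal illustration of the idea, not the package's actual implementation: the toy data, the column names, and the helper `tree_to_rules` are all hypothetical, and support/confidence filtering is left out.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy data (hypothetical); "play" is the target attribute.
df = pd.DataFrame({
    "outlook": ["sunny", "sunny", "rain", "rain", "overcast", "overcast"],
    "windy":   ["no", "yes", "no", "yes", "no", "yes"],
    "play":    ["no", "no", "yes", "no", "yes", "yes"],
})

# Step 1: one-hot encode every feature so each column is binary.
X = pd.get_dummies(df[["outlook", "windy"]]).astype(int)
y = df["play"]

# Step 2: fit a random forest on the encoded features and the target.
forest = RandomForestClassifier(n_estimators=5, max_depth=3, random_state=0)
forest.fit(X, y)


def tree_to_rules(estimator, feature_names, class_names):
    """Turn each root-to-leaf path of one fitted tree into (conditions, class)."""
    tree = estimator.tree_
    rules = []

    def walk(node, conditions):
        if tree.children_left[node] == -1:  # leaves have no children (-1)
            predicted = class_names[tree.value[node][0].argmax()]
            rules.append((tuple(conditions), predicted))
            return
        name = feature_names[tree.feature[node]]
        # For binary columns the left branch (value <= threshold) means 0,
        # the right branch means 1.
        walk(tree.children_left[node], conditions + [f"{name}=0"])
        walk(tree.children_right[node], conditions + [f"{name}=1"])

    walk(0, [])
    return rules


# Steps 3 and 4: extract every tree, then split each one into rules.
all_rules = []
for estimator in forest.estimators_:
    all_rules.extend(tree_to_rules(estimator, list(X.columns), forest.classes_))

for conditions, predicted in all_rules:
    print(" AND ".join(conditions) or "(always)", "=>", predicted)
```

Each printed line is one rule: the conjunction of binary conditions along a path, followed by the class predicted at the leaf. A tree that consists of a single leaf yields one rule with no conditions.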
https://github.com/lukassykora/randomForestRules
pip install randomForestRules-lukassykora
from randomForestRules import RandomForestRules
import pandas as pd
df = pd.read_csv("data/audiology.csv")
df.columns = df.columns.str.replace("_", "-") # underscores are not allowed in column names
# All feature columns
cols = []
for col in df.columns:
    if col != 'binaryClass':
        cols.append(col)
# Initialize
randomForest = RandomForestRules()
# Load data
randomForest.load_pandas(df)
# Fit
randomForest.fit(antecedent = cols, consequent = 'binaryClass', supp=0.005, conf=50)
# Get result
frame = randomForest.get_frame()