Fairness in data and machine learning algorithms is critical to building safe and responsible AI systems.

w4ester/AI-Ethics_edu

 
 

Learn to Adopt Responsible AI That Will Help You Build Ethical Models

Workshop Resources

Table of Contents

Prerequisites

Sign-up/Login to IBM Cloud

If you are an existing user, please log in to IBM Cloud through http://ibm.biz/aiethics

And if you are not, don't worry! We have got you covered! There are 3 steps to create your account on IBM Cloud:

  1. Enter your email and password.
  2. Verify your account using the verification link sent to your registered email.
  3. Fill in the personal information fields. Note: Please make sure you select the country you are in when asked at any step of the registration process.


About the Workshop

How do we remove bias from machine learning models and ensure that predictions are fair? What are the three stages at which a bias mitigation solution can be applied? This code pattern answers these questions and more to help developers, data scientists, and stakeholders make informed decisions when consuming the results of predictive models.

Fairness in data and machine learning algorithms is critical to building safe and responsible AI systems from the ground up, by design. Both technical and business AI stakeholders are in constant pursuit of fairness to ensure they meaningfully address problems like AI bias. While accuracy is one metric for evaluating the performance of a machine learning model, fairness gives us a way to understand the practical implications of deploying the model in a real-world situation.
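To make the difference between accuracy and fairness concrete, here is a minimal sketch in plain Python. The toy labels, the group names `priv` and `unpriv`, and the helper function names are all assumptions made up for illustration; they are not taken from the workshop notebooks or from AIF360. It computes two commonly used group fairness measures, statistical parity difference and disparate impact, alongside accuracy.

```python
# Toy illustration (hypothetical data): a model can be perfectly accurate
# and still give very different favorable-outcome rates to two groups.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def favorable_rate(y_pred, groups, group, favorable=1):
    """Share of members of `group` that received the favorable outcome."""
    picked = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(1 for p in picked if p == favorable) / len(picked)

def statistical_parity_difference(y_pred, groups, privileged, unprivileged):
    """Unprivileged rate minus privileged rate; 0 means parity."""
    return (favorable_rate(y_pred, groups, unprivileged)
            - favorable_rate(y_pred, groups, privileged))

def disparate_impact(y_pred, groups, privileged, unprivileged):
    """Unprivileged rate divided by privileged rate; 1 means parity.
    Raises ZeroDivisionError if the privileged rate is zero."""
    return (favorable_rate(y_pred, groups, unprivileged)
            / favorable_rate(y_pred, groups, privileged))

# Hypothetical example: four people per group, 1 = favorable outcome.
groups = ["priv"] * 4 + ["unpriv"] * 4
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = list(y_true)  # a "perfect" model that reproduces the labels

acc = accuracy(y_true, y_pred)
spd = statistical_parity_difference(y_pred, groups, "priv", "unpriv")
di = disparate_impact(y_pred, groups, "priv", "unpriv")

print(f"accuracy={acc:.2f}  parity difference={spd:.2f}  disparate impact={di:.2f}")
```

In this toy case the model is fully accurate, yet the unprivileged group receives the favorable outcome far less often, which is the kind of gap that accuracy alone does not reveal.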

How does the fairness algorithm work?

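Bias mitigation algorithms can be applied at three different stages of model building: pre-processing, in-processing, and post-processing. Pre-processing changes the training data, in-processing changes the learning algorithm, and post-processing changes the model's predictions. The diagram below demonstrates how it works.

(Diagram: algorithm-working)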

The AIF360 Python package contains nine different algorithms, developed by the broader algorithmic fairness research community, to mitigate unwanted bias. They can all be called in a standard way, very similar to scikit-learn’s fit/predict paradigm. In this way, we hope that the package is not only a way to bring all of us researchers together, but also a way to translate our collective research results to data scientists, data engineers, and developers deploying solutions in a variety of industries. You can learn more about AIF360 here.
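As an illustration of what a pre-processing algorithm with a fit/transform style interface might look like, the sketch below implements the reweighing idea in plain Python: each (group, label) combination gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The class name `ReweighingSketch`, the helper `weighted_favorable_rate`, and the toy data are hypothetical; this is not AIF360's implementation or API.

```python
from collections import Counter

class ReweighingSketch:
    """Hypothetical, simplified reweighing in a fit/transform style."""

    def fit(self, groups, labels):
        n = len(labels)
        group_counts = Counter(groups)
        label_counts = Counter(labels)
        pair_counts = Counter(zip(groups, labels))
        # weight = expected frequency under independence / observed frequency
        self.weights_ = {
            (g, y): (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
            for (g, y) in pair_counts
        }
        return self

    def transform(self, groups, labels):
        # One weight per instance; raises KeyError for a (group, label)
        # combination that was not seen during fit.
        return [self.weights_[(g, y)] for g, y in zip(groups, labels)]

def weighted_favorable_rate(groups, labels, weights, group, favorable=1):
    """Weighted share of favorable labels within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    fav = sum(w for g, y, w in zip(groups, labels, weights)
              if g == group and y == favorable)
    return fav / total

# Hypothetical data: the privileged group has more favorable labels.
groups = ["priv"] * 4 + ["unpriv"] * 4
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = ReweighingSketch().fit(groups, labels).transform(groups, labels)
rate_priv = weighted_favorable_rate(groups, labels, weights, "priv")
rate_unpriv = weighted_favorable_rate(groups, labels, weights, "unpriv")

print(f"weighted favorable rate: priv={rate_priv:.2f}, unpriv={rate_unpriv:.2f}")
```

The weights would then be passed to a learner that supports per-instance sample weights, so the data itself is left unchanged and only its influence on training is rebalanced.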

Architecture diagram

architecture-diagram

Flow

  1. Log in to Watson Studio powered by Spark, initiate Cloud Object Storage, and create a project.
  2. Upload the .csv data file to Object Storage.
  3. Load the data file in a Watson Studio notebook.
  4. Install the AIF360 toolkit in the Watson Studio notebook.
  5. Analyze the results after applying the bias mitigation algorithm during the pre-processing, in-processing, and post-processing stages.
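The last step compares results across the three stages. As a rough illustration of the post-processing stage only, the sketch below adjusts the decisions of an already trained model by choosing a separate score threshold per group so that both groups end up with the same selection rate. This is a deliberately simplified stand-in, not one of the AIF360 algorithms used in the notebooks; the function names, scores, and group names are hypothetical.

```python
def group_thresholds(scores, groups, target_rate):
    """Pick, per group, the score cutoff that selects roughly
    `target_rate` of that group's members (ties may select more)."""
    thresholds = {}
    for g in set(groups):
        group_scores = sorted(
            (s for s, gg in zip(scores, groups) if gg == g), reverse=True
        )
        k = max(1, round(target_rate * len(group_scores)))
        thresholds[g] = group_scores[k - 1]
    return thresholds

def predict_with_thresholds(scores, groups, thresholds):
    """Favorable outcome (1) when the score reaches the group's cutoff."""
    return [1 if s >= thresholds[g] else 0 for s, g in zip(scores, groups)]

# Hypothetical model scores: the model scores the privileged group higher.
groups = ["priv"] * 4 + ["unpriv"] * 4
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2]

# Baseline: a single cutoff for everyone.
baseline = [1 if s >= 0.65 else 0 for s in scores]

# Post-processing: per-group cutoffs targeting a 50% selection rate.
thresholds = group_thresholds(scores, groups, target_rate=0.5)
adjusted = predict_with_thresholds(scores, groups, thresholds)

print("baseline:", baseline)
print("adjusted:", adjusted)
```

With the single cutoff, nobody in the unprivileged group is selected; with per-group cutoffs, half of each group is. The model itself is untouched, which is what distinguishes post-processing from the other two stages.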

Step 1: Create an IBM Cloud account

Log in or sign up for an IBM Cloud account: http://ibm.biz/aiethics

Step 2: Create Watson Studio service

Go to your IBM Cloud Dashboard, type "Watson Studio" in the search box, and select the service. On the service page, click Create.

Step 3: Create a new Watson Studio project

Click on Get Started. It will take you to the "Cloud Pak for Data" dashboard.

Select Create a project.

Click on Create an empty project.

Give your project a unique name and click on Create.

Step 4: Add Data

Go to https://github.com/Anam-Mahmood/AI-Ethics and either clone the repository or download the code as a zip file. Unzip the folder.

Now that you have the code files on your computer, go to your Watson Studio project on IBM Cloud. Click on Assets, select Browse, and add the fraud_data.csv file from your file system. Repeat the step and add the Pipeline_LabelEncoder-0.1.zip file as an asset.

Step 5: Create the notebook

Note: You will create three notebooks here by repeating the steps below.

From the IBM Watson Studio dashboard, click on "Add to project" and select "Notebook".

Create a Pre-processing notebook. Select the "From URL" tab and enter a name for the notebook. Select the runtime (2 vCPU and 8 GB RAM). Enter this notebook URL for Pre-processing: https://github.com/IBM/bias-mitigation-of-machine-learning-models-using-aif360/blob/main/notebooks/Pre-processing.ipynb

After the notebook is imported, click on Not Trusted and select the option to trust the source of the notebook.

Repeat the above steps for In-processing and Post-processing.

Enter this notebook URL for In-processing: https://github.com/IBM/bias-mitigation-of-machine-learning-models-using-aif360/blob/main/notebooks/In-processing.ipynb

Enter this notebook URL for Post-processing: https://github.com/IBM/bias-mitigation-of-machine-learning-models-using-aif360/blob/main/notebooks/Post-processing.ipynb

Step 6: Insert the data as dataframe

Open the Pre-processing notebook from the dashboard and click on Edit.

Click on the 0010 icon at the top right, which brings up the data assets tab. Click on the Insert to code dropdown for fraud_data.csv and select the option Insert Pandas Dataframe.

Step 7: Run the notebook & Analyze Result

Click on the Run icon to run the code.

When a notebook is executed, each code cell in the notebook is executed in order, from top to bottom.

Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:

A blank: the cell has never been executed.
A number: the relative order in which this code step was executed.
A *: the cell is currently executing.

There are several ways to execute the code cells in your notebook:

One cell at a time.
    Select the cell, and then press the Play button in the toolbar.
Batch mode, in sequential order.
    From the Cell menu bar, there are several options available. For example, you can Run All cells in your notebook, or you can Run All Below, which starts executing from the first cell under the currently selected cell and then continues executing all cells that follow.

After we run all cells in the notebook, the results are displayed at the end of each notebook, as shown below.

Pre-processing results

We can observe that the privileged group had a 37% greater chance of getting a favorable outcome because of the bias in the dataset.

Feedback

Your feedback matters to us! Take this short survey and let us know how we are doing: www.surveygizmo.com/s3/6083679/fd8654af11e9?uid=615ab51a4260e2cf77735586

Reference

https://github.com/IBM/bias-mitigation-of-machine-learning-models-using-aif360
