The Personally Identifiable Information (PII) Data Detector is an individual machine learning project developed for the CSC532 Machine Learning course. Its goal is to detect personally identifiable information (PII) in student writing. In the web application, users enter text into a text editor; the application then highlights the words it identifies as PII and suggests removing them. Users can also save the text to view later.
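As a rough illustration of the highlighting step, the sketch below turns per-word labels into character ranges that an editor could highlight. It assumes the model emits one label per word in a BIO scheme; the function name `pii_spans` and the label names `NAME` and `EMAIL` are hypothetical and not taken from the project.

```python
from typing import List, Tuple


def pii_spans(text: str, words: List[str], labels: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, kind) character spans for words labelled as PII.

    Labels follow an assumed BIO scheme: "O" for non-PII, "B-X" to open a
    span of type X, and "I-X" to continue the span opened just before it.
    """
    spans: List[Tuple[int, int, str]] = []
    cursor = 0
    open_kind = None  # type of the span the previous word belonged to, if any
    for word, label in zip(words, labels):
        start = text.index(word, cursor)  # locate the word in the original text
        end = start + len(word)
        cursor = end
        if label == "O":
            open_kind = None
            continue
        kind = label[2:]
        if label.startswith("I-") and open_kind == kind:
            # Continuation: stretch the previous span to cover this word.
            spans[-1] = (spans[-1][0], end, kind)
        else:
            spans.append((start, end, kind))
        open_kind = kind
    return spans


text = "My name is Jane Doe and my email is jane@example.com"
words = text.split()
# Hypothetical labels, one per word.
labels = ["O", "O", "O", "B-NAME", "I-NAME", "O", "O", "O", "O", "B-EMAIL"]
spans = pii_spans(text, words, labels)
for start, end, kind in spans:
    print(kind, repr(text[start:end]))
```

Returning character offsets rather than words lets a front end highlight multi-word PII, such as a full name, as a single range.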
Programming Languages:
- Python
- TypeScript
- Go
AI/Data Science Tools:
- PEFT
- spaCy
- Transformers
- Gemma
- Faker
- NumPy
- Pandas
- Matplotlib
- Seaborn
Development Tools:
- Web Application: Next.js
- Backend APIs: Go Fiber, Flask
- Database: PostgreSQL
- Database ORM: Prisma
- 3rd Party API: Firebase Authentication
- Container Management: Docker
- Hosting: Google Cloud Run
- CI/CD: GitHub Actions
To run the server in `/pii_data_detector`:
- Download `model.safetensors` from https://drive.google.com/file/d/19gw8qc6TlHQb5Ag2Ke_e2vEPfVGCRrW3/view?usp=sharing and place it in `/pii_data_detector/model`.
- In the terminal, navigate to the `/pii_data_detector` directory.
- Run `pip install -r requirements.txt` in the terminal.
- Run `python main.py` in the terminal to start the server.
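Before starting the Python server, it can help to confirm that the downloaded model file is where the steps above expect it. The following is a minimal sketch, assuming it is run from the repository root; the helper `missing_files` is hypothetical and not part of the project.

```python
from pathlib import Path
from typing import Iterable, List


def missing_files(root: Path, required: Iterable[str]) -> List[str]:
    """Return the entries of `required` that are not regular files under `root`."""
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Paths taken from the setup steps; adjust them if your checkout differs.
    missing = missing_files(
        Path("pii_data_detector"),
        ["model/model.safetensors", "requirements.txt", "main.py"],
    )
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files are present.")
```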
To run the server in `/backend`:
- In the terminal, navigate to the `/backend` directory.
- Create `.env` in that directory. (See the example in `.env.example`.)
- Run `go run github.com/steebchen/prisma-client-go db push` in the terminal.
- Run `go run server.go` in the terminal to start the server.
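A common setup mistake is leaving a variable out of the environment file. The sketch below compares the variable names in an example file against the ones in your own file. It assumes both files use plain `KEY=VALUE` lines with optional `#` comments; the helper names are hypothetical.

```python
from pathlib import Path
from typing import Set


def env_keys(content: str) -> Set[str]:
    """Collect variable names from KEY=VALUE lines, skipping blanks and # comments."""
    keys: Set[str] = set()
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_env_keys(example_path: Path, env_path: Path) -> Set[str]:
    """Return names defined in the example file but absent from the real one."""
    return env_keys(example_path.read_text()) - env_keys(env_path.read_text())
```

For the backend this could be called as `missing_env_keys(Path("backend/.env.example"), Path("backend/.env"))`; an empty set means every variable named in the example is defined. The same check applies to `.env.local` in `/frontend`.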
To run the server in `/frontend`:
- In the terminal, navigate to the `/frontend` directory.
- Create `.env.local` in that directory. (See the example in `.env.example`.)
- Run `npm i` in the terminal.
- Run `npm run dev` in the terminal to start the server.
Note 1: You may need to install additional packages or libraries.
Note 2: All three servers must be running for the web application to be fully functional.
Note 3: You must set up the PostgreSQL database and Firebase Authentication before running the servers.
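Since all three servers have to be up at once, a quick way to see which ones are running is to test whether something is accepting connections on each port. This is a sketch only: the port numbers below are placeholders, not values taken from the project, so substitute the ports your servers actually use.

```python
import socket
from typing import Dict


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def server_status(servers: Dict[str, int], host: str = "127.0.0.1") -> Dict[str, bool]:
    """Map each server name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in servers.items()}


if __name__ == "__main__":
    # Placeholder ports (assumptions); replace with your own configuration.
    servers = {"pii_data_detector": 5000, "backend": 8080, "frontend": 3000}
    for name, up in server_status(servers).items():
        print(f"{name}: {'running' if up else 'not reachable'}")
```

A successful connection only shows that some process is listening on the port; it does not confirm that the process is the expected server.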
For more information, please refer to the "Wiki" section at https://github.com/jedipw/PIIDataDetector/wiki.