According to Wikipedia, the Enron Corpus is a large database of over 600,000 emails generated by 158 employees of the Enron Corporation and acquired by the Federal Energy Regulatory Commission during its investigation after the company's collapse.
In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. In the resulting federal investigation, a significant amount of typically confidential information entered the public record, including tens of thousands of emails and detailed financial data for top executives.
This project is part of Udacity's Data Analyst Nanodegree and Intro to Machine Learning course.
The goal of the project is to identify persons of interest (POIs) based on the financial and email data made public as a result of the Enron scandal. A POI is someone who was directly or indirectly involved in the fraud case.
Using machine learning techniques, we will predict whether a person is a POI or not. It is a binary classification problem.
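To make the classification framing concrete, here is a minimal sketch of how a POI classifier could be trained and scored with scikit-learn. It is not the project's actual pipeline: the data is synthetic, the two features (a scaled bonus and a fraction of emails exchanged with POIs) are hypothetical, the class imbalance is an assumption of the sketch, and Gaussian Naive Bayes is chosen only as a simple example model.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption): this is NOT the real Enron dataset.
rng = np.random.default_rng(42)
n_non_poi, n_poi = 120, 20  # deliberately imbalanced, as an assumption

# Two hypothetical features per person, e.g. a scaled bonus and the
# fraction of emails exchanged with known POIs.
X_non_poi = rng.normal(loc=[0.2, 0.1], scale=0.1, size=(n_non_poi, 2))
X_poi = rng.normal(loc=[0.6, 0.4], scale=0.1, size=(n_poi, 2))
X = np.vstack([X_non_poi, X_poi])
y = np.array([0] * n_non_poi + [1] * n_poi)  # 1 = POI, 0 = non-POI

# Stratify so that both splits keep the same POI / non-POI ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Fit a simple classifier and predict the label of each held-out person.
clf = GaussianNB()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# With few positive examples, precision and recall on the POI class say
# more than plain accuracy does.
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Swapping `GaussianNB` for another scikit-learn estimator only changes the line that constructs `clf`; the split, fit, predict, and scoring steps stay the same.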
- Python
- Scikit-Learn Library
- Matplotlib