A tool that summarizes and reports flaws in long agreements and similar texts. It helps users avoid accepting malicious agreements, privacy policies, terms and conditions, and so on. It uses Naive Bayes classification to make its predictions.
- The user clicks the button in the Chrome extension, and we get the URL of the current tab.
- This URL is passed to the backend through the Flask API.
- We scrape the URL with Beautiful Soup in Python to collect all the privacy policy and terms of service links present on the website.
- The URLs found by the first scraper are passed to a second scraper, which uses NLP to extract the privacy policy text.
- The text is stored in a file and given to the ML model, which uses Naive Bayes classification to predict which sentences in the privacy text are bad, if any.
- We then display all the malicious sentences in the Chrome extension itself.
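The link-collection step above can be sketched roughly as follows. This is a minimal illustration, not the project's scraper: it uses the standard library's `html.parser` as a stand-in for Beautiful Soup so that it runs without extra packages, and the keyword list, `PolicyLinkParser`, and `find_policy_links` are names assumed for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed keywords; the project's scraper may match links differently.
POLICY_KEYWORDS = ("privacy", "terms", "conditions")


class PolicyLinkParser(HTMLParser):
    """Collect <a href> targets whose URL or link text mentions a keyword."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # href of the <a> tag currently open, if any
        self._text = []    # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            haystack = (self._href + " " + "".join(self._text)).lower()
            if any(word in haystack for word in POLICY_KEYWORDS):
                # Resolve relative links against the page URL, skip duplicates.
                absolute = urljoin(self.base_url, self._href)
                if absolute not in self.links:
                    self.links.append(absolute)
            self._href = None


def find_policy_links(html, base_url):
    """Return absolute URLs of links that look like policy or terms pages."""
    parser = PolicyLinkParser(base_url)
    parser.feed(html)
    return parser.links


sample = """
<footer>
  <a href="/about">About</a>
  <a href="/legal/privacy">Privacy Policy</a>
  <a href="https://example.com/tos">Terms of Service</a>
</footer>
"""
print(find_policy_links(sample, "https://example.com"))
# ['https://example.com/legal/privacy', 'https://example.com/tos']
```

Matching on both the `href` and the visible link text catches links such as `/tos`, whose URL alone does not contain any keyword.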
- The user uploads a text document, which is saved to the backend filesystem and given to the ML model to make predictions.
- We then display all the malicious sentences in the web app.
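To make the prediction step more concrete, here is a small from-scratch sketch of Naive Bayes sentence classification. It is not the project's model: the training sentences are invented for illustration, the `good`/`bad` labels and all function names are assumptions, and a real model would be trained on the collected policy datasets.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    return re.findall(r"[a-z']+", sentence.lower())


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in tokenize(sentence):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def flag_bad_sentences(model, text):
    return [s for s in split_sentences(text) if model.predict(s) == "bad"]


# Invented toy data, for illustration only.
TRAINING = [
    ("We sell your personal data to third parties.", "bad"),
    ("We may share your location with advertisers without notice.", "bad"),
    ("We track your browsing activity across other websites.", "bad"),
    ("You can delete your account at any time.", "good"),
    ("We encrypt your data in transit.", "good"),
    ("You can request a copy of your data.", "good"),
]
sentences, labels = zip(*TRAINING)
model = NaiveBayes().fit(sentences, labels)

policy = "We share your data with advertisers. You can delete your account."
print(flag_bad_sentences(model, policy))
# ['We share your data with advertisers.']
```

Scores are summed in log space so that long sentences do not underflow, and add-one smoothing keeps a word that is missing from one class from zeroing out that class entirely.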
- Making the ML model more efficient by getting more training datasets.
- Predicting flaws in website cookies as well.
- Integrating file upload into the Chrome extension itself.
- Git.
- Node & npm (version 12 or greater).
- A fork of the repo.
- A Python 3 environment to install Flask.
- DFINITY Canister SDK package (requires access to a terminal shell on macOS or Linux).
- Clone this repo to your local machine using
git clone https://github.com/Open-Sourced-Olaf/DocVerifier
- Move to the cloned repository
cd DocVerifier
To install all the packages, follow the steps below:
- Move to the flask-api folder
cd flask-api
- Install virtualenv:
python3 -m pip install --user virtualenv
- Create a virtual environment:
python3 -m venv env
- Activate the virtual environment:
- For Mac/Linux :
source env/bin/activate
- For Windows :
.\env\Scripts\activate
- Install the dependencies and start the server (Mac/Linux):
pip3 install -r requirements.txt
flask run
The model will be served on http://127.0.0.1:5000/
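Once the server is running, a client such as the extension sends it the page URL. The sketch below shows one way to build such a request from Python using only the standard library. The `/predict` path and the `{"url": ...}` JSON payload are hypothetical, chosen for illustration; check `app.py` for the routes and fields the API actually exposes.

```python
import json
from urllib.request import Request, urlopen

API_BASE = "http://127.0.0.1:5000"  # local address the Flask app is served on


def build_request(page_url, endpoint="/predict"):
    # "/predict" and the {"url": ...} payload are hypothetical placeholders.
    body = json.dumps({"url": page_url}).encode("utf-8")
    return Request(
        API_BASE + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    # Requires the Flask server to be running; assumes it replies with JSON.
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_request("https://example.com")
print(req.get_method(), req.full_url)
# POST http://127.0.0.1:5000/predict
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before the server is up.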
- Move to the custom_greeting folder
- Install all the npm packages
npm install
- Start the Internet Computer network on your local computer by running the following command:
dfx start --background
- To deploy the App, run
dfx deploy
- To get the canister ID of the assets, run
dfx canister id custom_greeting_assets
- The deployed URL will look like this
http://127.0.0.1:8000/?canisterId=rwlgt-iiaaa-aaaaa-aaaaa-cai
- Whenever we change the code, we need to rebuild the website.
- Run
dfx build
to rebuild the project.
- Then run
dfx canister install --all --mode reinstall
to deploy the project changes.
- Stop the Internet Computer using
dfx stop
- Check if our current internet connection will allow us to connect to the Internet Computer network:
dfx ping ic
- Build and deploy the sample application to the Internet Computer by running the command
dfx deploy --network=ic
- Go to
chrome://extensions
in the browser.
- Click on "Load unpacked" and choose the
chrome-extension
folder.
- Publish it in the Chrome Web Store.
- To publish your item to the Chrome Web Store, follow these steps:
- Create your item's zip file
- Create a developer account https://chrome.google.com/webstore/devconsole/
- Upload your item
- Add assets for your listing
- Submit your item for publishing
The following is a high-level overview of relevant files and folders.
DocVerifier/
├── flask-api/
│   ├── datasets
│   ├── static/uploads
│   ├── model
│   ├── scraper
│   ├── templates
│   ├── .gitignore
│   ├── Procfile
│   ├── nltk.txt
│   ├── requirements.txt
│   ├── runtime.txt
│   ├── output.txt
│   └── app.py
├── custom_greeting/
│   ├── node_modules/
│   ├── src/
│   │   ├── custom_greeting/
│   │   │   └── main.mo
│   │   └── custom_greeting_assets/
│   │       ├── assets
│   │       └── public
│   ├── dfx.json
│   ├── package.json
│   ├── webpack.config.js
│   ├── tsconfig.json
│   ├── canister_ids.json
│   ├── README.md
│   ├── package-lock.json
│   └── .gitignore
├── chrome-extension/
│   ├── background.js
│   ├── icon.png
│   ├── manifest.json
│   ├── window.html
│   ├── icon.svg
│   └── style.css
├── images/
│   └── demo.gif
├── jupyter-notebooks/
│   ├── privacy_policy_predictor.ipynb
│   └── web_Scraping.ipynb
├── .gitignore
├── CODE_OF_CONDUCT.md
├── LICENSE
└── README.md
- Collecting good and bad policies to train our model was a time-consuming task.
- We could not find a way to get a file upload popup working in a Chrome extension.
- Fork and clone the repository
git clone https://github.com/Open-Sourced-Olaf/DocVerifier
- Create a branch
git checkout -b "branch_name"
- Make changes in that branch
- Add and commit your changes
git add . && git commit -m "your commit message"
- Then push the changes into your branch
git push origin branch_name
- Now you can create a PR using that branch in our repository.
- 🎉 you have successfully contributed to this project.
- https://mlh-fellowship.gitbook.io/fellow-handbook/sponsor-resources/dfinity
- https://sdk.dfinity.org/docs/developers-guide/tutorials/custom-custom_greeting.html
Shoutout goes to these wonderful people:
- Anjali Soni 💻
- Steven Tey 💻
- Shrill Shrestha 💻
- Rashi Sharma 💻