Machine-Learning-API

A Flask API, written in Python, that extracts text from Indian itemized receipts. Each extracted item is classified into one of 21 categories (see the 'data' dictionary in main.py).
The cumulative sum of the categorised items, along with their respective categories, is pushed to Firebase Realtime Database. The receipt image itself is fetched from Firebase Storage.
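
The per-category summation described above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function name `total_by_category`, the category names, and the prices are hypothetical, and the Firebase upload step is deliberately left out.

```python
from collections import defaultdict


def total_by_category(items):
    """Sum item prices per category.

    `items` is an iterable of (category, price) pairs, i.e. the
    output one might get after classifying each receipt line.
    """
    totals = defaultdict(float)
    for category, price in items:
        totals[category] += price
    return dict(totals)


# Hypothetical classified receipt lines (category, price in rupees).
classified_items = [("Dairy", 52.0), ("Snacks", 20.0), ("Dairy", 30.5)]
totals = total_by_category(classified_items)
print(totals)
```

A dictionary of this shape maps naturally onto a key-value store, which is presumably why the totals are kept per category before being pushed.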

Files

1. main.py - contains the API. Run it with:

python main.py
## or
python3 main.py

The frontend is a webpage with the text "Hello World".

2. Fetch_images.py - contains the Firebase configuration details, which you get by creating a new project in Firebase.

3. serviceAccount.json - contains the configuration details of the Firebase Storage database.
Note: replace the configuration details in Fetch_images.py and serviceAccount.json with your own.

4. requirements.txt - contains the libraries used for the project.

5. SVC_model.pkl - a pickle file used for categorising receipt items. It is an SVC model used for multi-class text classification, with 3 pre-processing steps applied to the text. The steps are coded as a pipeline: stopword removal, Porter stemming, and a TF-IDF vectoriser.

6. Categorization.ipynb - notebook with all the steps used to develop the SVM model, i.e. SVC_model.pkl.
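
A pipeline of the kind described for SVC_model.pkl might look like the sketch below. It is an assumption-laden illustration rather than the notebook's code: it uses scikit-learn's built-in English stopword list and NLTK's `PorterStemmer`, assumes a linear kernel, and trains on a tiny made-up dataset instead of DATA1.csv. The `preprocess` helper and the example labels are hypothetical.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, tokenise, drop stopwords, then Porter-stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(t) for t in tokens if t not in ENGLISH_STOP_WORDS]


# Passing a callable as `analyzer` makes the vectoriser use our
# stopword removal and stemming before computing TF-IDF weights.
model = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer=preprocess)),
    ("svc", SVC(kernel="linear")),  # kernel choice is an assumption
])

# Made-up training data standing in for DATA1.csv.
texts = [
    "amul butter 500g",
    "amul milk 1l",
    "mother dairy curd",
    "lays chips classic",
    "kurkure masala munch",
    "haldiram bhujia snack",
]
labels = ["Dairy", "Dairy", "Dairy", "Snacks", "Snacks", "Snacks"]

model.fit(texts, labels)
prediction = model.predict(["amul milk"])
print(prediction[0])
```

Keeping the pre-processing inside the pipeline means the same steps run at training and prediction time, so a receipt line can be passed to the model as raw text.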

Dataset

DATA1.csv - 11179 rows with 3 columns: Indian product description, sub category, and category.
