Init commit
d4rkc0de committed Sep 18, 2024
0 parents, commit 964216f
Showing 37 changed files with 15,130 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
The `.env` file stores environment variables for the backend server. It contains two sections: Text Processing Configuration and Image Processing Configuration. | ||
|
||
## Text Processing Configuration | ||
|
||
These variables are used for document (text) processing: | ||
|
||
- TEXT_API_END_POINT: Specifies the API endpoint for text processing. | ||
- TEXT_MODEL_NAME: Defines the model to be used for text processing. | ||
- TEXT_API_KEYS: A list containing the API key(s) required for authentication when making requests to the text API | ||
endpoint. **`Using multiple keys will help in avoiding rate limits.`** | ||
|
||
## Image Processing Configuration | ||
|
||
These variables are used for image processing: | ||
|
||
- IMAGE_API_END_POINT: Specifies the API endpoint for image processing. | ||
- IMAGE_MODEL_NAME: Defines the model used for image processing. | ||
- IMAGE_API_KEYS: A list containing the API key(s) for image processing requests. Using multiple keys will help in avoiding rate limits. | ||
|
||
|
||
## Examples: | ||
|
||
- **OPENAI** | ||
```bash | ||
# API and MODEL used for documents processing | ||
TEXT_API_END_POINT=https://api.openai.com/v1 | ||
TEXT_MODEL_NAME=gpt-4o | ||
TEXT_API_KEYS=["sk-xxx","sk-yyy"] | ||
|
||
# API and MODEL used for images processing | ||
IMAGE_API_END_POINT=https://api.openai.com/v1 | ||
IMAGE_MODEL_NAME=gpt-4o | ||
IMAGE_API_KEYS=["sk-xxx","sk-yyy"] | ||
``` | ||
|
||
- **GROQ** | ||
```bash | ||
# API and MODEL used for documents processing | ||
TEXT_API_END_POINT=https://api.groq.com/openai/v1 | ||
TEXT_MODEL_NAME=llama3-70b-8192 | ||
TEXT_API_KEYS=["gsk_xxx","gsk_yyy"] | ||
|
||
# API and MODEL used for images processing (no vision models for GROQ yet, so a local Ollama model is used) | ||
IMAGE_API_END_POINT=http://localhost:11434/v1 | ||
IMAGE_MODEL_NAME=moondream:latest | ||
IMAGE_API_KEYS=["ollama"] | ||
``` | ||
|
||
- **OLLAMA** | ||
```bash | ||
# API and MODEL used for documents processing | ||
TEXT_API_END_POINT=http://localhost:11434/v1 | ||
TEXT_MODEL_NAME=gemma2:latest | ||
TEXT_API_KEYS=["ollama"] | ||
|
||
# API and MODEL used for images processing | ||
IMAGE_API_END_POINT=http://localhost:11434/v1 | ||
IMAGE_MODEL_NAME=moondream:latest | ||
IMAGE_API_KEYS=["ollama"] | ||
``` | ||
|
||
|
||
- **HUGGING FACE** | ||
```bash | ||
# API and MODEL used for documents processing | ||
TEXT_API_END_POINT=https://api-inference.huggingface.co/v1 | ||
TEXT_MODEL_NAME=microsoft/Phi-3-mini-4k-instruct | ||
TEXT_API_KEYS=["hf_xxx","hf_yyy"] | ||
|
||
# API and MODEL used for images processing | ||
IMAGE_API_END_POINT=https://api-inference.huggingface.co/v1 | ||
IMAGE_MODEL_NAME=nlpconnect/vit-gpt2-image-captioning | ||
IMAGE_API_KEYS=["hf_xxx","hf_yyy"] | ||
``` |
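The note that multiple keys help avoid rate limits can be illustrated with a small sketch. It assumes the `*_API_KEYS` values are JSON-style lists, as in the examples above, and that keys are handed out round-robin. The helper names `parse_api_keys` and `KeyRotator` are hypothetical and are not taken from the FileWizardAi backend.

```python
import itertools
import json


def parse_api_keys(raw_value):
    """Parse a value such as '["sk-xxx","sk-yyy"]' into a list of strings."""
    keys = json.loads(raw_value)
    if not isinstance(keys, list) or not keys:
        raise ValueError("expected a non-empty JSON list of API keys")
    return [str(key) for key in keys]


class KeyRotator:
    """Hand out API keys in round-robin order (hypothetical helper)."""

    def __init__(self, keys):
        self._cycle = itertools.cycle(keys)

    def next_key(self):
        return next(self._cycle)


# Placeholder keys copied from the examples; real keys would come from .env.
rotator = KeyRotator(parse_api_keys('["sk-xxx","sk-yyy"]'))
first_three = [rotator.next_key() for _ in range(3)]
```

With two keys, consecutive requests alternate between them, so each key sees roughly half of the traffic.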
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# IDE and OS specific files | ||
**/.idea/ | ||
**/.vscode/ | ||
.DS_Store | ||
Thumbs.db | ||
|
||
# Frontend (Angular) specific files | ||
frontend/node_modules/ | ||
frontend/dist/ | ||
frontend/.angular/ | ||
frontend/*.js.map | ||
|
||
# Backend (FastAPI) specific files | ||
backend/__pycache__/ | ||
backend/*.pyc | ||
backend/*.pyo | ||
backend/*.pyd | ||
backend/venv/ | ||
venv/ | ||
backend/env/ | ||
**/__pycache__/ | ||
backend/app/*.db |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
# FileWizardAi | ||
|
||
## Description | ||
|
||
FileWizardAi is a Python/Angular project designed to automatically organize your files into a well-structured directory | ||
hierarchy and rename them according to their content. This tool is ideal for anyone looking to declutter their digital | ||
workspace by sorting files into appropriate folders and providing descriptive names, making it easier to manage and | ||
locate files. Additionally, it allows you to input a text prompt and instantly searches for files that are related to | ||
your query, providing you with the most relevant files based on the content you provide. | ||
|
||
The app also features a caching system to minimize API calls, ensuring that only new or modified files are processed. | ||
|
||
### Example: | ||
|
||
**Before** | ||
|
||
```bash | ||
/home/user | ||
└── Downloads | ||
    ├── 6.1 Course Curriculum v2.pdf | ||
    ├── trip_paris.txt | ||
    ├── 8d71473c-533f-4ba3-9bce-55d3d9a6662a.jpg | ||
    └── Screenshot_from_2024-06-10_21-39-24.png | ||
``` | ||
|
||
**After** | ||
|
||
```bash | ||
/home/user/Downloads | ||
├─ docs | ||
│  └─ certifications | ||
│     └─ databricks | ||
│        └─ data_engineer_associate | ||
│           └─ curriculum_v2.pdf | ||
├─ Personal Photos | ||
│  └─ 2024 | ||
│     └─ 03 | ||
│        └─ 01 | ||
│           └─ person_in_black_shirt.jpg | ||
├─ finance-docs | ||
│  └─ trip-expenses | ||
│     └─ paris | ||
│        └─ trip-justification.txt | ||
└─ project Assets | ||
   └─ instructions_screenshot.png | ||
``` | ||
|
||
### Video tutorial: | ||
|
||
[![Watch the video](./yt_video_logo.png)](https://www.youtube.com/watch?v=T1rXLox80rM) | ||
|
||
|
||
## Table of Contents | ||
|
||
- [Installation](#installation) | ||
- [Usage](#usage) | ||
- [Run in Development Mode](#run-in-development-mode) | ||
- [Technical architecture](#technical-architecture) | ||
- [License](#license) | ||
|
||
## Installation | ||
|
||
Make sure you have Python installed on your machine. | ||
|
||
First, clone the repository: | ||
|
||
```bash | ||
git clone https://github.com/AIxHunter/FileWizardAi.git | ||
``` | ||
|
||
Navigate to the backend folder and update your `.env` file according to the [documentation](.env.md). Then install the | ||
required packages, preferably in a virtual environment such as venv or conda: | ||
|
||
```bash | ||
cd backend | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## Usage | ||
|
||
Run the backend server: | ||
|
||
```bash | ||
cd backend | ||
uvicorn app.server:app --host localhost --port 8000 | ||
``` | ||
|
||
The app will be available at http://localhost:8000/ | ||
|
||
## Run in Development Mode | ||
|
||
If you are a developer and want to modify the frontend, you can run the frontend and backend separately. | ||
|
||
Install Node.js from https://nodejs.org/, then install the Angular CLI: | ||
|
||
```bash | ||
npm install -g @angular/cli | ||
``` | ||
|
||
Run frontend: | ||
|
||
```bash | ||
cd frontend | ||
npm install | ||
ng serve | ||
``` | ||
|
||
The frontend will be available at `http://localhost:4200`. | ||
|
||
To package the frontend, run: | ||
|
||
```bash | ||
ng build --base-href static/ | ||
``` | ||
|
||
Run backend: | ||
|
||
Update your `.env` file with the desired API settings (check the [documentation](.env.md)), then: | ||
|
||
```bash | ||
cd backend | ||
uvicorn app.server:app --host localhost --port 8000 --reload | ||
``` | ||
|
||
## Technical architecture | ||
|
||
<img src="filewizardai_architecture.png" alt="Online Image" width="600"/> | ||
|
||
1. The Angular frontend sends a request (e.g., organize files). | ||
2. The backend receives the request through a FastAPI REST API. | ||
3. The backend checks SQLite to see whether the file has already been processed (cached files). | ||
4. If the file was already processed, the cached summary is returned. | ||
5. If the file has not been processed before, it is sent to the LLM for summarization. | ||
6. The new summary is cached in SQLite. | ||
7. The summary is returned to the Angular frontend. | ||
8. The frontend displays the summary to the user and performs actions if needed. | ||
|
||
## License | ||
|
||
This project is licensed under the MIT License. |
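Steps 3 to 6 of the technical architecture list in the README above can be sketched as follows. This is a minimal illustration and not the project's code: the SHA-256 content hash, the `summarize_with_llm` stub, and the in-memory dict standing in for the SQLite cache are all assumptions.

```python
import hashlib

# Stand-in for the SQLite cache: file_path -> (file_hash, summary).
cache = {}
llm_calls = []  # records which paths reached the (stubbed) LLM


def content_hash(data: bytes) -> str:
    """Hash file content so modified files can be detected (assumed SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def summarize_with_llm(path: str, data: bytes) -> str:
    """Hypothetical stub; a real implementation would call the configured API."""
    llm_calls.append(path)
    return f"summary of {path} ({len(data)} bytes)"


def get_summary(path: str, data: bytes) -> str:
    """Return the cached summary, or summarize and cache on a miss."""
    file_hash = content_hash(data)
    cached = cache.get(path)
    if cached is not None and cached[0] == file_hash:
        return cached[1]  # step 4: unchanged file, reuse the cached summary
    summary = summarize_with_llm(path, data)  # step 5: new or modified file
    cache[path] = (file_hash, summary)  # step 6: store for next time
    return summary


get_summary("trip_paris.txt", b"expenses v1")  # miss: calls the stub
get_summary("trip_paris.txt", b"expenses v1")  # hit: served from the cache
get_summary("trip_paris.txt", b"expenses v2")  # content changed: calls the stub again
```

Keying the cache on the path and comparing the stored hash means that only new or modified files trigger an API call, which matches the behaviour the README describes.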
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# API and MODEL used for documents processing | ||
TEXT_API_END_POINT=https://api.groq.com/openai/v1 | ||
TEXT_MODEL_NAME=llama3-70b-8192 | ||
TEXT_API_KEYS=["gsk_xxx"] | ||
|
||
# API and MODEL used for images processing | ||
IMAGE_API_END_POINT=http://localhost:11434/v1 | ||
IMAGE_MODEL_NAME=moondream:latest | ||
IMAGE_API_KEYS=["ollama"] # Required but not used |
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
import sqlite3 | ||
|
||
|
||
class SQLiteDB: | ||
    """Thin wrapper around the SQLite cache of per-file summaries.""" | ||
|
||
    def __init__(self): | ||
        self.conn = sqlite3.connect('FileWizardAi.db') | ||
        self.cursor = self.conn.cursor() | ||
        create_table_query = "CREATE TABLE IF NOT EXISTS files_summary (file_path TEXT PRIMARY KEY,file_hash TEXT NOT NULL,summary TEXT)" | ||
        self.cursor.execute(create_table_query) | ||
        self.conn.commit() | ||
|
||
    def select(self, table_name, where_clause=None): | ||
        # table_name and where_clause are interpolated into the SQL text, so pass trusted values only. | ||
        sql = f"SELECT * FROM {table_name}" | ||
        if where_clause: | ||
            sql += f" WHERE {where_clause}" | ||
        self.cursor.execute(sql) | ||
        return self.cursor.fetchall() | ||
|
||
    def is_file_exist(self, file_path, file_hash): | ||
        # True only when the path is cached with the same content hash. | ||
        self.cursor.execute("SELECT * FROM files_summary WHERE file_path = ? AND file_hash = ?", (file_path, file_hash)) | ||
        file = self.cursor.fetchone() | ||
        return bool(file) | ||
|
||
    def insert_file_summary(self, file_path, file_hash, summary): | ||
        # Update the row if the path is already cached, otherwise insert a new one. | ||
        c = self.conn.cursor() | ||
        c.execute("SELECT * FROM files_summary WHERE file_path=?", (file_path,)) | ||
        row_exists = c.fetchone() | ||
|
||
        if row_exists: | ||
            c.execute("UPDATE files_summary SET file_hash=?, summary=? WHERE file_path=?", | ||
                      (file_hash, summary, file_path)) | ||
        else: | ||
            c.execute("INSERT INTO files_summary (file_path, file_hash, summary) VALUES (?, ?, ?)", | ||
                      (file_path, file_hash, summary)) | ||
        self.conn.commit() | ||
|
||
    def get_file_summary(self, file_path): | ||
        self.cursor.execute("SELECT summary FROM files_summary WHERE file_path = ?", (file_path,)) | ||
        result = self.cursor.fetchone() | ||
        return result[0] if result else None | ||
|
||
    def drop_table(self): | ||
        self.cursor.execute("DROP TABLE IF EXISTS files_summary") | ||
        self.conn.commit() | ||
|
||
    def get_all_files(self): | ||
        self.cursor.execute("SELECT file_path FROM files_summary") | ||
        results = self.cursor.fetchall() | ||
        files_path = [row[0] for row in results] | ||
        return files_path | ||
|
||
    def update_file(self, old_file_path, new_file_path, new_hash): | ||
        self.cursor.execute("UPDATE files_summary SET file_path = ?, file_hash = ? WHERE file_path = ?", | ||
                            (new_file_path, new_hash, old_file_path)) | ||
        self.conn.commit() | ||
|
||
    def delete_records(self, file_paths): | ||
        # Build one "?" placeholder per path so the values stay parameterized. | ||
        placeholders = ",".join("?" * len(file_paths)) | ||
        self.cursor.execute(f"DELETE FROM files_summary WHERE file_path IN ({placeholders})", file_paths) | ||
        self.conn.commit() | ||
|
||
    def close(self): | ||
        self.conn.close() |
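The select-then-update logic in `insert_file_summary` could also be written as a single statement. The sketch below uses SQLite's `INSERT OR REPLACE`, which relies on `file_path` being the primary key. It runs against an in-memory database, and the function name `upsert_file_summary` is hypothetical. It is an alternative worth considering, not the code from this commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
conn.execute(
    "CREATE TABLE IF NOT EXISTS files_summary ("
    "file_path TEXT PRIMARY KEY, file_hash TEXT NOT NULL, summary TEXT)"
)


def upsert_file_summary(connection, file_path, file_hash, summary):
    """Insert a row, or replace the existing row with the same file_path."""
    connection.execute(
        "INSERT OR REPLACE INTO files_summary (file_path, file_hash, summary) "
        "VALUES (?, ?, ?)",
        (file_path, file_hash, summary),
    )
    connection.commit()


# Writing the same path twice leaves a single row holding the latest values.
upsert_file_summary(conn, "/tmp/a.txt", "hash-1", "first summary")
upsert_file_summary(conn, "/tmp/a.txt", "hash-2", "updated summary")
rows = conn.execute(
    "SELECT file_path, file_hash, summary FROM files_summary"
).fetchall()
```

Note that `INSERT OR REPLACE` deletes the conflicting row and inserts a new one, which is harmless for a table with only these three columns but would reset any columns not listed in the statement.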