AI Image Detector

Introduction 🌟

As AI systems get better at producing lifelike images, telling AI-generated images apart from human-made ones becomes increasingly important. This project tackles that problem with a fine-tuned machine learning model that classifies an image as AI-generated or human-made.

The AI Image Detector project endeavors to provide a tool for reliably distinguishing between AI-generated and human-created images. This is especially important in areas where the authenticity of an image is critical, such as journalism and legal matters.

Overview and Model Information 🌐📊

This tool is built on Microsoft's CvT-13 model, fine-tuned to distinguish AI-generated images from human-created ones. It was trained on a dataset of about 2.5 million varied images. Integration with Hugging Face's API makes the model easier to use and adapt.

HuggingFace🤗

Coming soon...

Key Features 🚀

CvT-13 Architecture: A fine-tuned version of Microsoft's Convolutional vision Transformer (CvT-13).
Massive Dataset: Trained on a diverse set of about 2.5 million images.
Custom Evaluation Metrics: Evaluation scripts that report performance metrics and a confusion matrix.
Flexible Data Handling: A custom data loader that converts an image directory into a PyTorch dataset.

Dataset

The model was fine-tuned on a dataset originally curated by AWSAF, consisting of approximately 2.5 million AI-generated and human-created images. The dataset was further processed and organized into training and testing sets for fine-tuning the CvT-13 model.

Due to the substantial size of the training and testing datasets, they are not hosted on GitHub but are made available via Google Drive for convenience:

Training Data: Train.zip (26.47 GB)
Testing Data: Test.zip (2.93 GB)
Train Labels: train.csv (95.6 MB)
Test Labels: test.csv (10.2 MB)

To replicate the training and evaluation results of this model, please download the above datasets before proceeding with the setup.

File Structure

ai-image-detector/
│
├── models/
│   └── model_epoch_24.pth
│
└── src/
    ├── custom_dataset.py
    ├── evaluate.py
    ├── main.py
    ├── model.py
    └── train.py

Preparing for Quick Start 🛠️

Before diving into the Quick Start, ensure your environment is set up correctly. This includes installing required packages and navigating to the correct directory.

Required Libraries

Downloading Model Weights

The model weights file is too large to be hosted on GitHub and is instead available via Google Drive. Please download the model weights before proceeding with the setup:

Model Weights: Download Here (226.6 MB)

After downloading, move the file into the ai-image-detector/models/ directory. You can do this manually or by running the following command in your terminal:

mv path/to/downloaded/model_epoch_24.pth path/to/ai-image-detector/models/

Replace path/to/downloaded/model_epoch_24.pth with the actual path to the downloaded .pth file and path/to/ai-image-detector/ with the path to your local clone of the repository.

Installing Required Packages

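You'll need to install several Python packages to work with the AI Image Detector. You can install them with the following commands. The PyTorch command pulls builds for CUDA 12.1 (cu121); if your setup differs, use the install command from the official PyTorch site instead.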

pip install --upgrade transformers

pip install --upgrade mlxtend

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

pip install tqdm numpy matplotlib scikit-learn pillow

Navigating to Your Project Directory

Once the packages are installed, navigate to the directory where you've cloned or downloaded the AI Image Detector project:

cd path/to/ai-image-detector

Replace path/to/ai-image-detector with the actual path to your project directory.

Tip for Data Organization

💡Tip: PyTorch's datasets.ImageFolder sorts class directory names alphabetically and assigns label indices in that order, so directory names determine the labels. If you want to label "real" images as 0 and "fake" images as 1, choose names whose alphabetical order matches. Names like A_real and B_fake ensure that datasets.ImageFolder assigns label 0 to "real" and label 1 to "fake".

Example Structure:

data_dir/
├── A_real/  # This will be labeled as 0 by ImageFolder
│   ├── img1.jpg
│   ├── img2.jpg
│   └── ...
└── B_fake/  # This will be labeled as 1 by ImageFolder
    ├── img1.jpg
    ├── img2.jpg
    └── ...

This naming convention ensures that your dataset is correctly organized for training. The labels will be automatically and accurately assigned by ImageFolder, which is especially useful in binary classification tasks like differentiating "real" from "fake" images.
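
To make the ordering rule concrete, here is a minimal sketch that reproduces it with the standard library only, so it runs without torchvision. The helper names folder_label_map and make_demo_tree are made up for this illustration and are not part of this repository or of torchvision.

```python
import os
import tempfile


def folder_label_map(data_dir):
    """Map each class sub-directory to the index its sorted position gives it."""
    classes = sorted(
        entry.name for entry in os.scandir(data_dir) if entry.is_dir()
    )
    return {name: index for index, name in enumerate(classes)}


def make_demo_tree(root, class_names):
    """Create empty class folders under root (demo helper, hypothetical)."""
    for name in class_names:
        os.makedirs(os.path.join(root, name), exist_ok=True)


# With the A_/B_ prefixes, "real" sorts first and receives label 0.
with tempfile.TemporaryDirectory() as demo_root:
    make_demo_tree(demo_root, ["B_fake", "A_real"])
    prefixed = folder_label_map(demo_root)

# Without prefixes, "fake" sorts before "real" and the labels flip.
with tempfile.TemporaryDirectory() as demo_root:
    make_demo_tree(demo_root, ["real", "fake"])
    unprefixed = folder_label_map(demo_root)

print(prefixed)    # {'A_real': 0, 'B_fake': 1}
print(unprefixed)  # {'fake': 0, 'real': 1}
```

The second case shows why plain real/ and fake/ folder names would give "fake" the label 0.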

Setting Up Your Environment Variables

Ensure that your Python environment is set up correctly for PyTorch. If you're using a GPU, make sure your CUDA environment is properly configured. Refer to the official PyTorch documentation for detailed instructions.
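
A quick way to check whether PyTorch can see a GPU is sketched below. The function name pick_device is an assumption made for this example, not something defined in this repository. The sketch falls back to "cpu" when PyTorch is not installed or no CUDA device is visible.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

If this prints "cpu" on a machine with a GPU, the installed PyTorch build and the CUDA driver are worth checking first.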

Now, you're all set to proceed with the Quick Start section of the AI Image Detector project!

Quick Start 🚀

This section guides you through the initial steps to get up and running with the AI Image Detector. It includes procedures for data preparation, model evaluation, and making predictions with a trained model.

Preparing Your Data

To prepare your data for training or evaluation, organize it in a directory structure in which each subdirectory represents a class. The custom_dataset.py script automatically applies the necessary transformations and converts your images into a PyTorch dataset.
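
Before training or evaluating, it can help to confirm that the directory layout looks as expected. The sketch below counts image files per class folder using only the standard library. It is an illustration, not part of custom_dataset.py, and both the function name count_images_per_class and the accepted extensions are assumptions.

```python
import os
import tempfile

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of image types


def count_images_per_class(data_dir):
    """Return {class_folder_name: number_of_image_files} for data_dir."""
    counts = {}
    for entry in sorted(os.scandir(data_dir), key=lambda e: e.name):
        if not entry.is_dir():
            continue  # ignore stray files next to the class folders
        counts[entry.name] = sum(
            1
            for name in os.listdir(entry.path)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts


# Demo on a throwaway directory with two class folders of empty files.
with tempfile.TemporaryDirectory() as demo_root:
    for class_name, n_files in (("A_real", 2), ("B_fake", 3)):
        class_dir = os.path.join(demo_root, class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, f"img{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(demo_root)

print(demo_counts)  # {'A_real': 2, 'B_fake': 3}
```

A class with zero images, or an unexpected extra folder, usually points to a problem in the layout.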

Evaluating the Model

To evaluate a trained model on a test dataset, use the evaluate.py script. Provide the directory of your test data and, if desired, the path to the model weights folder:

python src/evaluate.py /path/to/your/test/data --weights_folder=/path/to/model/model_weights

If the model weights path is not specified, the script defaults to the latest model in the ./models folder. The script outputs the model's performance metrics and displays a confusion matrix for your test data.
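
For readers unfamiliar with how accuracy, precision, recall and F1 relate to a confusion matrix, here is a minimal sketch in plain Python. The counts are made-up toy values, not results from this model, and treating "fake" (label 1) as the positive class is an assumption. The actual evaluate.py may compute its metrics differently.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy counts for illustration only (positive class assumed to be "fake").
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.925
# precision: 0.900
# recall: 0.947
# f1: 0.923
```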

Making Predictions

For predictions on individual images, use the main.py script. Specify the image path and, if needed, a specific model weight file:

python src/main.py /path/to/your/image.jpg --weights_folder=/path/to/model_weight

Without a specified model weight path, the script defaults to the latest model in the ./models folder. The script outputs the predicted class and associated probabilities.
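
Class probabilities for a classifier like this are typically obtained by applying a softmax to the model's raw outputs (logits). The sketch below shows that step in plain Python. The logits are made up, the class order (0 = real, 1 = fake) follows the tip on data organization and is an assumption, and main.py may implement this step differently.

```python
import math

CLASS_NAMES = ("real", "fake")  # assumed order: index 0 = real, index 1 = fake


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most likely class name and the full probability list."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs


label, probs = predict_label([0.0, 2.0])  # made-up logits for illustration
print(label, [round(p, 3) for p in probs])  # fake [0.119, 0.881]
```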

Training the Model

Train your model using the train.py script. Define the directory for your training data and optionally set the number of epochs or a custom learning rate:

python src/train.py /path/to/your/training/data --total_epochs=50 --learning_rate=1e-4

The script resumes from the latest checkpoint and saves new checkpoints after each epoch.
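
The command above implies a command-line interface with one positional argument and two optional flags. A minimal argparse sketch of such an interface follows. It is an illustration only: the real train.py may define its arguments differently, and the defaults here are simply the values from the example command, not confirmed defaults.

```python
import argparse


def build_train_parser():
    """Build a parser mirroring the flags shown in the example command."""
    parser = argparse.ArgumentParser(
        description="Sketch of a train.py-style CLI"
    )
    parser.add_argument("data_dir", help="directory of training images")
    # Defaults below are assumptions taken from the example command.
    parser.add_argument("--total_epochs", type=int, default=50)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    return parser


# Parse the same arguments as the example command, without a shell.
args = build_train_parser().parse_args(
    ["/path/to/your/training/data", "--total_epochs=50", "--learning_rate=1e-4"]
)
print(args.data_dir, args.total_epochs, args.learning_rate)
```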

Tip for Model Saving

💡Tip: Save models in the models/ directory using the naming pattern model_epoch_*.pth. This keeps them compatible with the evaluation and prediction scripts, which automatically locate and load the latest model from this directory.
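
One way such a lookup could work is sketched below, under the assumption that "latest" means the highest epoch number in the file name. The repository's scripts may select the checkpoint differently (for example by modification time), and latest_checkpoint is a hypothetical helper name.

```python
import os
import re
import tempfile

CHECKPOINT_PATTERN = re.compile(r"^model_epoch_(\d+)\.pth$")


def latest_checkpoint(models_dir):
    """Return the path of the checkpoint with the highest epoch, or None."""
    best_epoch, best_path = -1, None
    for name in os.listdir(models_dir):
        match = CHECKPOINT_PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the naming pattern
        epoch = int(match.group(1))  # compare as numbers, not as strings
        if epoch > best_epoch:
            best_epoch, best_path = epoch, os.path.join(models_dir, name)
    return best_path


# Demo with empty placeholder files in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_models:
    for epoch in (2, 10, 24):
        open(os.path.join(demo_models, f"model_epoch_{epoch}.pth"), "w").close()
    latest_name = os.path.basename(latest_checkpoint(demo_models))

print(latest_name)  # model_epoch_24.pth
```

Comparing epochs as integers matters here: a plain string sort would rank model_epoch_9.pth after model_epoch_24.pth.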

Results 📈

Visualizations of the model's performance are below:

[Figure: Training Loss Graph]
[Figure: Confusion Matrix]

The model, evaluated on test data, achieved these metrics:

Metric	Value
Average Test Loss	0.1275
Accuracy	98.54%
Precision	0.99
Recall	0.98
F1 Score	0.98

Contributing 🤝

Contributions to the AI Image Detector are warmly welcomed. Check the contributing guidelines for details on submitting pull requests.

License

This project is open-sourced under the Apache License 2.0. See the LICENSE file for details.

Acknowledgments 🙏

Microsoft's CvT-13 Model: Microsoft/CvT
Hugging Face's CvT-13 API: huggingface.co/microsoft/cvt-13
Dataset by AWSAF
