Note: This repository contains the code for my final project for UIUC CS 445, Fall 2024.
Recent generative models like Sora have shown promise in producing high-quality images and videos from text input, but they can also hallucinate inconsistent, undesirable, and irrelevant features. Existing video hallucination detectors, like Sora Detector, rely on another language model for image captioning, which can itself hallucinate. Recent works have shown promising results in exploiting geometric cues extracted from static images to detect hallucinations, but these have not been applied to video sequences. In this work, we first present a short case study showcasing hallucinated predictions output by language models used for anomaly detection, motivating the need for structure-guided hallucination detectors. We then use state-of-the-art models to extract geometric cues from a custom-curated dataset of AI-generated and real videos and directly apply a recent geometry-based hallucination detector to each static frame. Finally, we extend the static-frame hallucination detector to account for temporal consistency across consecutive frames by developing a Kalman filter that fuses predictions from different models.
Look inside each sub-folder to see how to run that part of the project.
./
│ README.md - This file
│ frame_extraction.py - Saves raw frames from list of videos
│ keyframe_selector.py - Extracts key frames from videos
│ projective_geometry.yml - Conda environment file
│
└───baseline - Folder with code for pixel-based baseline classifier
│
└───data - Folder where extracted data is saved
│
└───kalman_filter - Folder with code for running Kalman filter
│
└───line_segment - Folder with code for predicting lines, and running line-based classifier
│
└───object_shadow - Folder with code for predicting object and shadow masks, and running object-shadow-based classifier
│
└───perspective_fields - Folder with code for predicting perspective fields, and running perspective fields-based classifier
│
└───results_extractor - Folder with code for generating test splits, and compiling result graphs
│
└───sora_detector - Folder with prompts used for Sora Detector, and output predictions from GPT-4o using Perplexity
We extract 59 AI-generated videos from the T2VHaluBench dataset and 59 real videos from the BDD100K dataset.

- All 59 T2VHaluBench videos can be found at https://github.com/TruthAI-Lab/SoraDetector/tree/main/text-to-video%20benchmark/text-to-video
- We downloaded `bdd100k_videos_test_00.zip`, found under BDD100K → Video Parts at http://bdd-data.berkeley.edu/download.html, then chose 59 videos from that set, which are listed in `./data/real_video_paths.txt`.
We use conda to manage packages for running different parts of the project. Unless otherwise stated, we run our code with the `projective-geometry` environment. Install the environment with `conda env create -f projective_geometry.yml`.
- Create and activate the `projective-geometry` conda environment
- Create `./data/fake_video_paths.txt` and `./data/real_video_paths.txt` with lists of paths to the videos to save frames from
- Run `python frame_extraction.py` to save raw frames (sketched below)
- Run `python keyframe_selector.py` to extract key frames from the raw frames (sketched below)
- Follow the directions in `./object_shadow` to predict masks and run the object-shadow-based classifier
- Follow the directions in `./perspective_fields` to predict perspective fields and run the fields-based classifier
- Follow the directions in `./line_segment` to predict lines and run the line-based classifier
- Follow the directions in `./kalman_filter` to run the Kalman filter (fusion sketched below)
- Follow the directions in `./baseline` to run the baseline classifier
- Follow the directions in `./results_extractor` to extract difficulty-based test splits and compile results
- Optional: prompt GPT-4o with the Sora Detector prompts provided in `./sora_detector/prompts.txt` and/or look at our case-study raw results in `./sora_detector/results`
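For reference, below is a minimal sketch of what the frame-extraction step might look like. It is not the code in `frame_extraction.py`: the output directory `./data/frames/`, the per-video subfolders, and the zero-padded PNG naming are assumptions made for illustration.

```python
# Minimal sketch of frame extraction (the actual frame_extraction.py may differ):
# read each video listed in a path file and save every frame as a PNG.
import os
import cv2

def extract_frames(video_list_path, out_root="./data/frames"):
    with open(video_list_path) as f:
        video_paths = [line.strip() for line in f if line.strip()]
    for video_path in video_paths:
        # One output subfolder per video (assumed layout).
        name = os.path.splitext(os.path.basename(video_path))[0]
        out_dir = os.path.join(out_root, name)
        os.makedirs(out_dir, exist_ok=True)
        cap = cv2.VideoCapture(video_path)
        idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            cv2.imwrite(os.path.join(out_dir, f"{idx:05d}.png"), frame)
            idx += 1
        cap.release()

extract_frames("./data/real_video_paths.txt")
extract_frames("./data/fake_video_paths.txt")
```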
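Similarly, one possible keyframe-selection criterion is sketched below: keep a frame whenever its grayscale histogram moves far enough from the last kept frame. The directory names and threshold are illustrative only, and the actual `keyframe_selector.py` may use a different criterion entirely.

```python
# Minimal sketch of a histogram-difference keyframe selector
# (not necessarily the method used in keyframe_selector.py).
import os
import cv2
import numpy as np

def select_keyframes(frame_dir, out_dir, threshold=0.4):
    os.makedirs(out_dir, exist_ok=True)
    last_hist = None
    for name in sorted(os.listdir(frame_dir)):
        frame = cv2.imread(os.path.join(frame_dir, name))
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        # Keep the frame if it differs enough from the last kept keyframe.
        if last_hist is None or np.linalg.norm(hist - last_hist) > threshold:
            cv2.imwrite(os.path.join(out_dir, name), frame)
            last_hist = hist

# Hypothetical paths following the layout assumed in the extraction sketch above.
select_keyframes("./data/frames/video_0001", "./data/keyframes/video_0001")
```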
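Finally, the Kalman filter step fuses per-frame predictions from the different classifiers. The snippet below is a minimal, self-contained sketch of that idea, not the code in `./kalman_filter`: it assumes each classifier emits a per-frame fakeness probability, models the video-level fakeness score as a constant hidden state, and treats each classifier output as a noisy measurement with hand-picked noise variances.

```python
# Minimal sketch: a 1-D Kalman filter that treats the video's latent "fakeness"
# score as the hidden state and fuses noisy per-frame scores from several models.
import numpy as np

def fuse_predictions(frame_scores, process_var=1e-3, meas_var=0.05):
    """frame_scores: array of shape (num_frames, num_models) with per-frame
    probabilities in [0, 1] (hypothetical input format). Returns the filtered
    fakeness estimate after each frame."""
    x, p = 0.5, 1.0                      # initial state estimate and variance
    estimates = []
    for scores in frame_scores:
        # Predict: fakeness is assumed constant over time, so the state carries
        # over and only its uncertainty grows.
        p = p + process_var
        # Update: fold in each model's measurement for this frame.
        for z in scores:
            k = p / (p + meas_var)       # Kalman gain
            x = x + k * (z - x)
            p = (1.0 - k) * p
        estimates.append(x)
    return np.array(estimates)

# Example: three classifiers scored over five frames.
scores = np.random.rand(5, 3)
print(fuse_predictions(scores))
```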
We thank the authors of the following projects for making their data/code/models available: