Research Code for RGBDGaze

This is the research repository for RGBDGaze: Gaze Tracking on Smartphones with RGB and Depth Data, presented at ACM ICMI 2022.

It contains the training code and dataset link.

Environment

docker
docker-compose
nvidia-docker
nvidia-driver

How to use

1. Download dataset and pretrained RGB model

Dataset: https://www.dropbox.com/s/iixjrxzxx7nbupl/RGBDGaze_dataset.zip?dl=0
RGB part of the Spatial Weights CNN model, pretrained on the GazeCapture dataset: https://www.dropbox.com/s/dgpn0j212l260q1/pretrained_rgb.pth?dl=0

2. Clone

$ git clone https://github.com/FIGLAB/RGBDGaze

3. Setup

$ cp .env{.example,}

In .env, set the path to your data directory.

4. Docker build & run

$ DOCKER_BUILDKIT=1 docker build -t rgbdgaze --ssh default .
$ docker-compose run --rm experiment

5. Run

Place the following files in the Docker container:

/root/datadrive/RGBDGaze/dataset/RGBDGaze_dataset
/root/datadrive/RGBDGaze/models/SpatialWeightsCNN_gazecapture/pretrained_rgb.pth
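Before preprocessing, it can help to confirm that both inputs are where the scripts expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` and the `root` parameter are assumptions for illustration, while the two relative paths are taken from the list above.

```python
from pathlib import Path

# Relative paths copied from the README; everything else here is illustrative.
REQUIRED = (
    "dataset/RGBDGaze_dataset",
    "models/SpatialWeightsCNN_gazecapture/pretrained_rgb.pth",
)


def missing_inputs(root="/root/datadrive/RGBDGaze"):
    """Return the required paths that do not exist under `root`."""
    root = Path(root)
    return [str(root / rel) for rel in REQUIRED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing inputs:")
        for path in missing:
            print("  " + path)
    else:
        print("All required inputs are in place.")
```

Running it inside the container prints any path that still needs to be copied or mounted.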

Generate the tensors used for training:

$ cd preprocess
$ python format.py

To train the RGB+D model, run:

$ python lopo.py --config ./config/rgbd.yml

To train the RGB model, run:

$ python lopo.py --config ./config/rgb.yml
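The script name `lopo.py` suggests leave-one-participant-out evaluation, although this README does not say so explicitly. Under that assumption, the splitting idea can be sketched as below. The function `leave_one_participant_out` is hypothetical and is not taken from the repository; it only illustrates how each participant would be held out once.

```python
def leave_one_participant_out(participants):
    """Yield (train_ids, held_out_id), holding out each participant once."""
    for held_out in participants:
        train = [p for p in participants if p != held_out]
        yield train, held_out


if __name__ == "__main__":
    # Hypothetical subset of participant folder names, for illustration only.
    ids = ["P1", "P2", "P3"]
    for train, held_out in leave_one_participant_out(ids):
        print(f"train on {train}, evaluate on {held_out}")
```

With N participants this produces N folds, so each participant's data is used for evaluation exactly once and never appears in its own training set.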

Dataset description

Overview

The data is organized in the following manner:

45 participants (*1)
synchronized RGB + depth images in four different contexts
- standing, sitting, walking, and lying
metadata
- corresponding gaze target on the screen
- detected face bounding box
- acceleration data
- device ID
- intrinsic camera parameters of the device
*1: The paper used data from 50 participants. However, five of them did not agree to be included in the public dataset.

Structure

The folder structure is organized like this:

RGBDGaze_dataset
│   README.txt
│   iphone_spec.csv   
│
└───P1
│   │   intrinsic.json
│   │
│   └───decoded
│       │   
│       └───standing
│       │       │   label.csv
│       │       │
│       │       └───rgb
│       │       │   1.jpg
│       │       │   2.jpg ...
│       │       │
│       │       └───depth
│       │       
│       └───sitting
│       └───walking
│       └───lying
│   
└───P2 ...
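The layout above can be traversed with a few lines of Python. This is a minimal sketch, assuming that the `sitting`, `walking`, and `lying` folders mirror the `standing` folder and that RGB frames are stored as `.jpg` files, as the tree suggests. The function name `index_dataset` is hypothetical, and the depth file format and `label.csv` columns are not assumed here.

```python
from pathlib import Path

# The four capture contexts listed in the dataset overview.
CONTEXTS = ("standing", "sitting", "walking", "lying")


def index_dataset(root):
    """Map (participant, context) to the number of RGB frames found on disk."""
    root = Path(root)
    counts = {}
    for participant in sorted(root.glob("P*")):
        if not participant.is_dir():
            continue
        for context in CONTEXTS:
            rgb_dir = participant / "decoded" / context / "rgb"
            if rgb_dir.is_dir():
                counts[(participant.name, context)] = sum(
                    1 for _ in rgb_dir.glob("*.jpg")
                )
    return counts


if __name__ == "__main__":
    # Path taken from the "Run" step; adjust it to your own data directory.
    frames = index_dataset("/root/datadrive/RGBDGaze/dataset/RGBDGaze_dataset")
    for (participant, context), n in frames.items():
        print(f"{participant}/{context}: {n} RGB frames")
```

Contexts that are missing for a participant are simply skipped, which makes the output a quick way to spot incomplete downloads.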

Reference

The paper is available at https://doi.org/10.1145/3536221.3556568.

Riku Arakawa, Mayank Goel, Chris Harrison, Karan Ahuja. 2022. RGBDGaze: Gaze Tracking on Smartphones with RGB and Depth Data. In Proceedings of the 2022 International Conference on Multimodal Interaction (ICMI '22). Association for Computing Machinery, New York, NY, USA.

@inproceedings{DBLP:conf/icmi/ArakawaG0A22,
  author    = {Riku Arakawa and
               Mayank Goel and
               Chris Harrison and
               Karan Ahuja},
  title     = {RGBDGaze: Gaze Tracking on Smartphones with {RGB} and Depth Data},
  booktitle = {International Conference on Multimodal Interaction, {ICMI} 2022, Bengaluru,
               India, November 7-11, 2022},
  pages     = {329--336},
  publisher = {{ACM}},
  year      = {2022},
  doi       = {10.1145/3536221.3556568},
  address   = {New York},
}

License

GPL v2.0. The license file is included in the repository. Please contact [email protected] if you would like another license for your use.
