
Video Style Transfer

Shrek styled with La Muse by Picasso

Group members

The unbalanced commit log is a result of us working together through Google Colab and Visual Studio Live Share.

What we have done

This report is the final delivery for our video style-transfer project in TDAT3025 - Applied Machine Learning with Project. Image style transfer tries to recreate an image as if it had been made in the style of a reference image. Video style transfer takes this a step further: styling an entire video with a single style image is not just image style transfer repeated for each frame, as issues ranging from the stability of the output frames to runtime arise and need to be addressed.

The purpose of this project is to compare existing approaches to video style transfer and to implement our own versions based on them.

Project

  • code/gatys contains our implementation of Gatys et al.
    • Here we style the video in a naive way: each frame is styled independently and the results are put together at the end (see the loss sketch after this list).
    • Example: https://youtu.be/OzwG4_BDelU
  • code/ruder contains our implementation of Ruder et al.
    • Here we style the video with a temporal constraint: the model is penalized for large deviations between adjacent frames, measured along the optical flow estimated with DeepFlow (see the temporal-loss sketch after this list).
    • Example: https://youtu.be/gAxoTpOQA6E
  • code/johnson contains our implementation of Johnson et al. combined with Ulyanov et al.
    • Here we train a neural network to learn how to style images. This is much faster, since styling each frame only takes a single forward pass through the network (see the network sketch after this list).
    • We combined this algorithm with instance normalization to speed up the optimization and reduce noise.
    • Example: https://youtu.be/krjY1u1vZc4
  • code/huang contains our implementation of Huang et al. combined with Ruder et al.
    • Here we define the style of an image as a distribution of features and use the Wasserstein metric as the distance between distributions (see the sketch after this list). This representation gives a more visually pleasing result, but computing the Wasserstein metric is expensive, which results in longer runtimes.
    • We combined this algorithm with a temporal-constraint loss function so that the video becomes much smoother.
    • Example: https://youtu.be/pujUa0c59hI
  • We have also implemented color preservation (Gatys et al.), which keeps the original colors of the video and applies only the style's texture (see the last sketch below).
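
For reference, below are small sketches of the main ideas. First, a minimal PyTorch sketch of the content and style losses in the Gatys et al. approach; the VGG layer names and the style weight here are illustrative assumptions, not necessarily the exact choices in code/gatys.

```python
import torch.nn.functional as F

def gram_matrix(features):
    # features: (batch, channels, height, width) activations from a VGG layer
    b, c, h, w = features.shape
    f = features.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def gatys_loss(gen, content, style, style_weight=1e6):
    # gen/content/style: dicts mapping VGG layer names to activations
    content_loss = F.mse_loss(gen["relu4_2"], content["relu4_2"])
    style_loss = sum(
        F.mse_loss(gram_matrix(gen[l]), gram_matrix(style[l]))
        for l in ("relu1_1", "relu2_1", "relu3_1", "relu4_1")
    )
    return content_loss + style_weight * style_loss
```

In the naive video version this loss is minimized from scratch for every frame, which is why the method is slow and the output can flicker.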
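
Next, the temporal constraint in the spirit of Ruder et al., assuming a dense backward flow field (for example estimated with DeepFlow) and a precomputed occlusion mask; the exact formulation and weighting in code/ruder may differ.

```python
import torch
import torch.nn.functional as F

def warp(prev_stylized, flow):
    # prev_stylized: (1, 3, H, W); flow: (1, 2, H, W), per-pixel (dx, dy) offsets
    _, _, h, w = prev_stylized.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=-1).float() + flow[0].permute(1, 2, 0)
    # grid_sample expects coordinates normalized to [-1, 1]
    grid[..., 0] = 2 * grid[..., 0] / (w - 1) - 1
    grid[..., 1] = 2 * grid[..., 1] / (h - 1) - 1
    return F.grid_sample(prev_stylized, grid.unsqueeze(0), align_corners=True)

def temporal_loss(cur_stylized, prev_stylized, flow, mask):
    # mask: 1 where the flow is reliable, 0 at occlusions/disocclusions
    warped = warp(prev_stylized, flow)
    return F.mse_loss(mask * cur_stylized, mask * warped)
```

Adding this term to the style and content losses penalizes the optimizer whenever the new frame deviates from the previous stylized frame in regions where the motion is known.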
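
The feed-forward approach of Johnson et al. can be illustrated with a toy transform network; the real network in code/johnson uses residual blocks and learned upsampling, so this only shows the overall shape and where instance normalization (Ulyanov et al.) enters.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, kernel, stride):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel, stride, padding=kernel // 2),
        # Instance normalization (Ulyanov et al.) normalizes each frame
        # on its own, which speeds up the optimization and reduces noise.
        nn.InstanceNorm2d(out_ch, affine=True),
        nn.ReLU(inplace=True),
    )

transform_net = nn.Sequential(
    conv_block(3, 32, 9, 1),
    conv_block(32, 64, 3, 2),    # downsample
    conv_block(64, 128, 3, 2),   # downsample
    nn.Upsample(scale_factor=4, mode="nearest"),
    nn.Conv2d(128, 3, 9, 1, padding=4),
)

# Once trained against the Gatys-style losses, styling a video is just one
# forward pass per frame: stylized = transform_net(frame)
```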
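
For the Wasserstein style loss, a common cheap approximation is the sliced Wasserstein distance over random 1-D projections; whether code/huang uses this exact estimator is an assumption on our part.

```python
import torch

def sliced_wasserstein(a, b, n_proj=64):
    # a, b: (n_samples, channels) feature vectors drawn from two images;
    # both sets must contain the same number of samples.
    c = a.shape[1]
    proj = torch.randn(c, n_proj)
    proj = proj / proj.norm(dim=0, keepdim=True)   # random unit directions
    a_sorted = torch.sort(a @ proj, dim=0).values
    b_sorted = torch.sort(b @ proj, dim=0).values
    # For sorted 1-D samples of equal size, squared W2 reduces to a mean
    # squared difference, averaged here over all projections.
    return ((a_sorted - b_sorted) ** 2).mean()
```

Even with this approximation the distance is costlier than a Gram-matrix comparison, which is consistent with the longer runtimes noted above.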
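
Finally, a luminance-only sketch of color preservation with OpenCV: keep the original frame's chrominance and swap in only the stylized luminance. The function name preserve_color is ours.

```python
import cv2

def preserve_color(original_bgr, stylized_bgr):
    # Convert both frames to YCrCb, keep the original chroma (Cr, Cb)
    # and take only the luminance channel (Y) from the stylized frame.
    orig = cv2.cvtColor(original_bgr, cv2.COLOR_BGR2YCrCb)
    styl = cv2.cvtColor(stylized_bgr, cv2.COLOR_BGR2YCrCb)
    orig[..., 0] = styl[..., 0]
    return cv2.cvtColor(orig, cv2.COLOR_YCrCb2BGR)
```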

The end.

About

Machine Learning Project in TDAT3025 at NTNU.
