How to save data

eitsupi edited this page Nov 14, 2024 · 2 revisions

Warning

The content of this wiki may be outdated. Please check the Rocker Project website for the most up-to-date information.

Docker containers are temporary instances that are usually isolated from the rest of the computing environment. While this makes containers very portable, it also leads to a common question: "How do I save my data?" Here we present several strategies so you can choose what works best for you.

Using docker commit

docker commit allows you to save a snapshot of your container as a docker image so you can return to it later. Like any docker image, it can be moved to a different machine. Docker's git-like features really shine here: you can roll back to previous commits, and pushing images with docker push is fast because it uploads only the layers that have changed.

Quick overview:

  • On the host machine that is running docker, look up the name or container id of the running container using docker ps. (You can also assign a name of your own choice to the container with --name when calling docker run, and then use that name.)

  • Save the running container as a docker image, e.g. docker commit <container-id> username/imagename. Optionally, you can include a commit message with -m. Once the container is committed, you can stop or remove it without losing data.

  • Push the image to the Docker Hub: docker push username/imagename. Be sure to use a private image (either on the Hub or by running a private registry) if necessary: just create the private repository on the Hub before pushing. (Alternatively, you can save the image as a tarball with docker save and download that for future use. This approach does not benefit from transferring only the changed layers, so it should be avoided in favor of docker push/pull if possible.)
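The steps above can be sketched as a few shell functions. This is only an illustration, not a prescribed workflow: the container name my-rstudio, the tarball name imagename.tar, and the function names are hypothetical, and username/imagename stands in for your own repository as in the text.

```shell
#!/bin/sh
# Hypothetical names for illustration; substitute your own.
CONTAINER="my-rstudio"         # name or ID reported by `docker ps`
IMAGE="username/imagename"     # repository to commit and push to

snapshot_container() {
    # List running containers to confirm the name or ID.
    docker ps
    # Save the container's filesystem as an image, with a commit message.
    docker commit -m "Save analysis state" "$CONTAINER" "$IMAGE"
    # Upload the image to the registry.
    docker push "$IMAGE"
}

# Alternative without a registry: write the image to a tarball.
export_image() {
    docker save -o imagename.tar "$IMAGE"
}

# On the destination machine: load the image back from the tarball.
import_image() {
    docker load -i imagename.tar
}
```

Calling snapshot_container while the container is running performs the three steps in order; export_image and import_image cover the tarball route mentioned in the last bullet.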

NOTE: If you start an instance with a linked volume, docker commit will not capture changes to that volume, because data in a volume is stored outside the container's own filesystem.

Using linked volumes

An alternative to using docker commit is to just use linked volumes, as described in more detail on the wiki page Sharing files with the host machine.

By sharing volumes with the host machine, you never have to remember steps like docker commit as the files will always persist locally.
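As a minimal sketch of this approach, the function below starts a container with a host directory mounted inside it. The details are assumptions for illustration and do not come from this page: the rocker/rstudio image, the container path /home/rstudio/work, the placeholder password, and the function name run_with_volume.

```shell
#!/bin/sh
# Start a container with a host directory mounted into it.
# Assumed for illustration: rocker/rstudio image, /home/rstudio/work as the
# mount point, and a placeholder password that you should change.
run_with_volume() {
    host_dir="${1:-$PWD}"    # host directory to share; defaults to the current one
    docker run -d -p 8787:8787 -e PASSWORD=changeme \
        -v "$host_dir:/home/rstudio/work" rocker/rstudio
}
```

For example, run_with_volume "$HOME/project" would make files written to /home/rstudio/work inside the container appear in $HOME/project on the host, where they remain after the container is removed.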

Other ways

  • Use docker cp to copy data from the container onto the host machine.
  • Use the 'download' option in the file pane of RStudio Server.
  • Use git to commit your changed files and push them to a server.
  • Use and commit a "volume-container". This allows you to have separate docker containers for your files and for the computational environment.
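Two of these options map directly onto docker commands, sketched below. The function names, the busybox image used for the data container, and the /data path are hypothetical choices for illustration; container names, paths, and the image to run are passed in as arguments.

```shell
#!/bin/sh
# Copy a file or directory out of a container onto the host.
# Usage: copy_from_container <container> <path-in-container> <host-path>
copy_from_container() {
    docker cp "$1:$2" "$3"
}

# Create a container whose only job is to own a volume at /data.
# Assumed for illustration: the busybox image and the /data path.
# Usage: create_data_container <name>
create_data_container() {
    docker create -v /data --name "$1" busybox
}

# Run a computational container that mounts the data container's volumes.
# Usage: run_with_data_container <data-container-name> <image>
run_with_data_container() {
    docker run --rm -ti --volumes-from "$1" "$2"
}
```

For instance, copy_from_container my-rstudio /home/rstudio/results ./results would copy a results directory to the host, assuming a container with that name and path exists.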