Skip to content
This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

dockerignore include / exclude computation doesn't match Docker's #159

Open
cjerdonek opened this issue Dec 20, 2015 · 7 comments
Open

dockerignore include / exclude computation doesn't match Docker's #159

cjerdonek opened this issue Dec 20, 2015 · 7 comments

Comments

@cjerdonek
Copy link

I encountered a situation where docker-osx-dev wasn't syncing / noticing changes to certain files that it should have. Here is a reduced test case.

.dockerignore contents:

*
!foo/

Directory contents:

.dockerignore
bar/
  buzz.txt
foo/
  bar/
    buzz.txt
  buzz.txt

However, files in foo/bar aren't synced / watched:

$ docker-osx-dev sync -s .
[INFO] Using sync paths from command line args: .
[INFO] Complete list of paths to sync: /Users/chris/temp-sync
[INFO] Using excludes from ignore file .dockerignore: bar foo
[INFO] Complete list of paths to exclude: bar foo
[INFO] Using includes from ignore file .dockerignore: foo/
[INFO] Complete list of paths to include: foo/
[INFO] Starting docker-osx-dev file syncing
[INFO] Testing if docker machine is running
[INFO] Installing rsync in the Docker Host image
[INFO] Performing initial sync of paths: /Users/chris/temp-sync
[INFO] Syncing temp-sync/: cd+++++++
[INFO] Syncing temp-sync/.dockerignore: <f+++++++
[INFO] Syncing temp-sync/foo/: cd+++++++
[INFO] Syncing temp-sync/foo/buzz.txt: <f+++++++
[INFO] Initial sync done
[INFO] Watching: /Users/chris/temp-sync

I did confirm using a Dockerfile with a line like "COPY . test/" that Docker's behavior is not to ignore foo/bar. The issue above is that docker-osx-dev is adding "bar" as an unrooted exclude pattern rather than a rooted one, and so excluding "bar" when it appears as a subdirectory.

This is from the Dockerfile docs here:

For the purposes of matching, the root of the context is considered to be both the working and the root directory. For example, the patterns /foo/bar and foo/bar both exclude a file or directory named bar in the foo subdirectory of PATH or in the root of the git repository located at URL. Neither excludes anything else.

The idea to use "*" at the beginning to specify a whitelist is something that the same docs above suggest:

Finally, you may want to specify which files to include in the context, rather than which to exclude. To achieve this, specify * as the first pattern, followed by one or more ! exception patterns.

@cjerdonek
Copy link
Author

I also noticed when changing "" in the first line to "/" (which you can do in Git to match the beginning of a path) that docker-osx-dev incorrectly interprets the pattern as beginning at the root of your drive rather than the root of the directory (which the Docker docs above say should happen). If this had worked, it would have been a valid work-around for my OP.

@brikis98
Copy link
Owner

I believe our excludes follow rsync rules, not Docker rules, as they are just set as rsync flags. Out of curiosity, does !foo/* work?

@cjerdonek
Copy link
Author

I believe our excludes follow rsync rules, not Docker rules

The docker-osx-dev code manually reimplements its own line-by-line parsing and interpretation of .gitignore and .dockerignore files (see here and here, for example). So you are free to parse and convert to rsync flags however you wish. I'm suggesting that you interpret the lines of a .dockerignore file the same way that Docker interprets it, so that docker-osx-dev will include and ignore the same directories as Docker, or as nearly the same as possible. Otherwise, the user is forced to write a .dockerignore file that conforms to two different, possibly conflicting interpretations (Docker and docker-osx-dev).

@cjerdonek
Copy link
Author

Out of curiosity, does !foo/* work?

No, that syncs even fewer files:

2015-12-20 22:08:23 [INFO] Performing initial sync of paths: /Users/chris/temp-sync
2015-12-20 22:08:23 [INFO] Syncing temp-sync/: cd+++++++
2015-12-20 22:08:23 [INFO] Syncing temp-sync/.dockerignore: <f+++++++
2015-12-20 22:08:23 [INFO] Initial sync done

@philipn
Copy link

philipn commented Dec 23, 2015

rsync doesn't appear to allow for easy include/exclude of paths, but rather just acts on substrings found anywhere in a given path. My suggestion would be:

  • For each excluded item in the .dockerignore file, store the full path to the item.
  • Take that path, and construct a set of filter arguments to rsync that iterate from the run root to the full path, including items and the finally excluding the full path. E.g.:
--filter='+ /mnt/' \
--filter='+ /mnt/src/' \
--filter='- /mnt/src/tmp/*' \
  • Do this globally, with a tree that keeps track of these includes and excludes and ultimately outputs the correct filter arguments.

I'd take a stab at this myself, but this appears to be too complicated for me to do quickly in bash. It seems reasonable to write this logic in something other than bash, and call out to it to generate the filter list.

For now, the safe solution appears to be to disable automatic exclusions with docker-osx-dev -i /dev/null ....

@cjerdonek
Copy link
Author

Shouldn't this be labeled a bug rather than an enhancement? docker-osx-dev unexpectedly doesn't sync directories that .dockerignore says to include.

It's possible that full parity might not be achievable, but the test case I gave is a relatively basic one that is doable.

@brikis98
Copy link
Owner

The original excludes behavior was added with the idea of just piggybacking on rsync's exclude behavior. Changing it to match .dockerignore behavior is more of an enhancement than a bug fix. I suppose it just depends on how you look at it :)

As I'm thinking about @philipn's ideas, I thought I'd mention something: I've tried to limit the functionality in docker-osx-dev, as a) bash is a crappy language for doing anything complicated and b) due to #7, we have no easy way to add automated integration tests for this project. If there is a relatively easy way to implement a better excludes functionality, I'm all for it, as I agree it would be better if we were consistent with .dockerignore rules. If implementing that functionality requires a moderate amount of code, then that code must have unit tests (which, fortunately, we do have). And if the implementation requires a lot of complicated code, then I would prefer to hold off until #7 is resolved.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants