OpenNMT-py v3 support #23

Open
argosopentech opened this issue Nov 6, 2022 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@argosopentech
Owner

https://forum.opennmt.net/t/opennmt-py-v3-0-is-out/5077

The vanilla transformer uses sinusoidal positional encoding (position_encoding=true). We recommend using “maximum relative positions” encoding instead (max_relative_positions=20, position_encoding=false), which again has a small overhead.

We kept the “fusedadam” optimizer (old legacy code), which provides the best speed (compared to PyTorch AMP Adam fp16 and Apex levels O1/O2). We tested the new Adam(fused=True) released with PyTorch 1.13, but it is much slower.

Always use the highest batch size possible (up to your GPU RAM capacity) and set the update interval according to the “true batch size” you want. For instance, if your GPU can accept 8192 tokens and you use accum_count=12, you will have a true batch size of 8192 × 12 = 98304 tokens.
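
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper names `effective_batch_tokens` and `accum_count_for` are hypothetical and are not part of OpenNMT-py, and the sketch assumes a single GPU.

```python
import math


def effective_batch_tokens(gpu_batch_tokens: int, accum_count: int) -> int:
    """Tokens seen per optimizer update when gradients are accumulated.

    Hypothetical helper (not an OpenNMT-py function); assumes one GPU.
    """
    return gpu_batch_tokens * accum_count


def accum_count_for(target_tokens: int, gpu_batch_tokens: int) -> int:
    """Smallest accumulation count that reaches at least target_tokens.

    Hypothetical helper (not an OpenNMT-py function); assumes one GPU.
    """
    return math.ceil(target_tokens / gpu_batch_tokens)


# Figures from the quoted example: 8192 tokens per step, accum_count=12.
print(effective_batch_tokens(8192, 12))  # 98304
print(accum_count_for(98304, 8192))      # 12
```

Going the other way, `accum_count_for` picks the update interval once you know how many tokens fit on the GPU and which true batch size you are aiming for.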

Adjust the bucket size to your CPU RAM. Most of the time a bucket between 200K and 500K examples will be suitable. The larger your bucket size, the less padding you will have, since examples are sorted within this bucket and batches are yielded from it.
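
To see why a larger bucket reduces padding, here is a rough simulation. It is a simplified sketch, not OpenNMT-py's actual batching code: it assumes batches hold a fixed number of examples (rather than a token budget), uses synthetic sentence lengths, and the function name `padding_fraction` is made up for illustration.

```python
import random


def padding_fraction(lengths, bucket_size, batch_size):
    """Fraction of padded positions when examples are sorted per bucket.

    Simplified model (an assumption, not OpenNMT-py's implementation):
    each bucket is sorted by length, cut into batches of `batch_size`
    examples, and every batch is padded to its longest example.
    """
    padded_tokens = 0
    real_tokens = 0
    for start in range(0, len(lengths), bucket_size):
        bucket = sorted(lengths[start:start + bucket_size])
        for i in range(0, len(bucket), batch_size):
            batch = bucket[i:i + batch_size]
            padded_tokens += max(batch) * len(batch)
            real_tokens += sum(batch)
    return 1 - real_tokens / padded_tokens


# Synthetic sentence lengths; the seed only makes the run repeatable.
rng = random.Random(0)
lengths = [rng.randint(5, 100) for _ in range(4096)]

for bucket_size in (32, 256, 4096):
    frac = padding_fraction(lengths, bucket_size, batch_size=32)
    print(f"bucket_size={bucket_size:5d}  padding={frac:.3f}")
```

With `bucket_size` equal to `batch_size`, sorting cannot help because each bucket is a single batch; as the bucket grows, batches are drawn from longer sorted runs and contain examples of more similar length.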

@argosopentech
Owner Author

OpenNMT/OpenNMT-py#2242

@argosopentech argosopentech added enhancement New feature or request help wanted Extra attention is needed labels Nov 6, 2022
@PJ-Finlay
Collaborator

OpenNMT/OpenNMT-py#2244
