forked from pytorch/torchrec
Implement EmbeddingOffloadScaleupProposer (pytorch#1558)
Summary: Implements a new type of Proposer that attempts to scale up fused_uvm_caches individually, according to an allocation policy based on the expected statistical distribution of the cache workload and a budget of HBM memory available for caching. Scaling fused_uvm_caches identically (e.g. using a global default load factor) is suboptimal, as we find significant differences between the cache load factors that different embedding tables need to achieve a reasonable miss rate.

This diff only implements the Proposer; it does not (yet) use it by default. To enable the new Proposer, the trainer should explicitly specify it when initializing the EmbeddingShardingPlanner.

The cost model for fused_uvm_caching does not yet fully account for the storage and perf overheads of the cache, so this proposer should not be used in conjunction with other proposers in the planner. In a later diff we will improve the cost model so that proposals generated by caching-aware proposers and existing proposers are comparable, removing this restriction.

Reviewed By: henrylhtsang

Differential Revision: D51451167
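To illustrate what scaling caches individually under an HBM budget can look like, here is a minimal, self-contained sketch. It is not the allocation policy implemented in this commit: the `CachedTable` type, the `weight` field (a stand-in for whatever the workload statistics say about the benefit of more cache), and `scale_up_load_factors` are all hypothetical names introduced for illustration. The sketch splits the budget in proportion to each table's weight, caps every table at a load factor of 1.0, and redistributes whatever a capped table could not use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedTable:
    """Hypothetical description of one UVM-cached embedding table."""
    name: str
    num_rows: int
    row_bytes: int
    load_factor: float  # fraction of rows currently cached in HBM, in [0, 1]
    weight: float       # assumed relative benefit of giving this table more cache


def scale_up_load_factors(tables: List[CachedTable], budget_bytes: int) -> Dict[str, float]:
    """Distribute budget_bytes of extra HBM across tables, proportionally to weight.

    Illustrative policy only: tables that reach load factor 1.0 are dropped
    and their unused share is redistributed among the remaining tables.
    """
    factors = {t.name: t.load_factor for t in tables}
    remaining = float(budget_bytes)
    active = [t for t in tables if t.load_factor < 1.0 and t.weight > 0]

    while active and remaining > 1e-9:
        total_weight = sum(t.weight for t in active)
        spent = 0.0
        still_active = []
        for t in active:
            table_bytes = t.num_rows * t.row_bytes
            share = remaining * t.weight / total_weight
            headroom = (1.0 - factors[t.name]) * table_bytes
            grant = min(share, headroom)
            factors[t.name] = min(1.0, factors[t.name] + grant / table_bytes)
            spent += grant
            if grant >= share:
                # Table absorbed its full share, so it may take more next round.
                still_active.append(t)
        remaining -= spent
        if len(still_active) == len(active):
            # No table was capped, so the whole remaining budget was handed out.
            break
        active = still_active

    return factors


if __name__ == "__main__":
    tables = [
        CachedTable("hot_table", num_rows=1000, row_bytes=100, load_factor=0.1, weight=3.0),
        CachedTable("cold_table", num_rows=1000, row_bytes=100, load_factor=0.1, weight=1.0),
    ]
    print(scale_up_load_factors(tables, budget_bytes=40_000))
```

With a uniform global load factor both tables above would end up with the same cache fraction; the weighted split gives the table with the higher assumed benefit a larger share of the same budget.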
1 parent decc6dd, commit 18df7d6
Showing 4 changed files with 709 additions and 5 deletions.