
[RMP] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities #887

EvenOldridge opened this issue Mar 29, 2023 · 5 comments

EvenOldridge commented Mar 29, 2023

Problem:

The GTC Recommender was built with custom code and shortcuts. We would like to leverage Merlin to make deploying the GTC Recommender much easier.

Definition of Done

New Functionality

  • Ability to update Workflows when the item catalog changes (is this anything other than retraining it?)
  • TopK functionality (in the model or in the pipeline?)

Models

Transformers4Rec

NVTabular

  • Operator for mapping between key values (likely a modification to Categorify to support an existing mapping)
  • Operators for manipulating embeddings in NVT (e.g. concat) in order to build pre-trained embedding tables
  • Reverse Categorify mapping (a minimal sketch follows this list)
  • Separate slicing and padding operators
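As a concrete illustration of the reverse-mapping item above, here is a minimal sketch, assuming the default Categorify behaviour of persisting a `unique.<column>.parquet` file per categorical column in which the row position corresponds to the encoded integer id; the column name `item_id` and the `categories/` path are illustrative, not the GTC Recommender's actual layout.

```python
# Minimal sketch of a "reverse Categorify" lookup. Assumes the default layout
# where Categorify persists unique.<column>.parquet files whose row position
# matches the encoded integer id; path and column name are illustrative.
import pandas as pd

def decode_item_ids(encoded_ids, vocab_path="categories/unique.item_id.parquet"):
    """Map Categorify-encoded integer ids back to the original item_id values."""
    vocab = pd.read_parquet(vocab_path)["item_id"]
    return vocab.iloc[encoded_ids].tolist()

# Example: turn top-k model outputs (encoded ids) back into catalog item ids.
# decode_item_ids([3, 17, 42])
```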

Dataloader

Systems

Deliverables

  • Blog post
  • Updated model that leverages tabular features
  • TensorFlow-based version of the model
  • Examples of GTC Recommender workflow
  • Work with the GTC team to onboard them and add Merlin to NVIDIA On-Demand

Constraints:

  • PyTorch and TensorFlow

Starting Point:

The existing GTC Recommender is the foundation for this work.

@EvenOldridge EvenOldridge added this to the Merlin 23.05 milestone Mar 29, 2023
@karlhigley karlhigley changed the title [Task] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities [RMP] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities Mar 31, 2023
@viswa-nvidia

@angmc, please add the Systems-related dev list here from the Slack thread.


angmc commented Apr 13, 2023

To the question of what in this project is niche and is functionality that may not be immediately needed, I would say it's anything related to the catalog swapping. This strategy worked because item/user ids were not used as inputs and because retraining the model would not have yielded significant improvements, since the items being predicted were new. I don't think this is a common implementation for customers. A lot of what we did relied on a Python back end, so it would have to change. Now the model would have to be modified outside of Triton, in the automation script, jit traced and repackaged as an ensemble, and then deployed.

I don't believe any individual feature was too difficult to work around, but the workarounds may not give the best user experience. Support for pre-trained embeddings and the use of Categorify is not, I believe, a niche problem.

In post-processing, the issue where we saw better throughput using pandas instead of cuDF needs to be explored further. Pre-built post-processing features or best practices could help customers avoid low throughput.

@bschifferer

@angmc @karlhigley @EvenOldridge

About:
Operator for mapping between key values (likely a modification to Categorify to support an existing mapping)

NVTabular has a parameter called vocab (https://github.com/NVIDIA-Merlin/NVTabular/blob/main/nvtabular/ops/categorify.py#L208). Unfortunately, our documentation (the inline docstring) doesn't explain what it does. But I think we can provide an existing mapping table to the Categorify op, so we might be able to use the current NVT operator. The question is: how can we exchange the mapping table during serving?
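For illustration, a minimal sketch of what providing an existing mapping could look like, assuming that argument accepts a dictionary of column name to a pre-built vocabulary such as a pandas Series (the column name and item ids below are made up):

```python
# Hedged sketch: pass a pre-built mapping to Categorify. Assumes the vocabs
# argument accepts {column_name: pandas.Series of unique values}; the item
# ids are made up for illustration.
import pandas as pd
import nvtabular as nvt
from nvtabular.ops import Categorify

item_vocab = pd.Series(["talk_101", "talk_102", "talk_103"], name="item_id")

cats = ["item_id"] >> Categorify(vocabs={"item_id": item_vocab})
workflow = nvt.Workflow(cats)

df = pd.DataFrame({"item_id": ["talk_102", "talk_101", "talk_103"]})
encoded = workflow.fit_transform(nvt.Dataset(df)).to_ddf().compute()
print(encoded)
```

If that works, swapping the catalog at serving time would come down to regenerating the vocabulary and re-fitting/re-exporting the workflow, which is exactly the open question above.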


bschifferer commented Apr 17, 2023

@angmc @karlhigley @EvenOldridge @viswa-nvidia

About:
Merging pre-trained embeddings in the dataloaders (#211)
Merging pre-trained embeddings at serving time (#211)

I would not use pre-trained embeddings as input features to the model. Currently, we initialise an embedding table and load the weights of the pre-trained embeddings into it. As the GTC recommender has only 5000 items, it isn't necessary to use pre-trained embeddings as input features. I think using pre-trained embeddings as input features would hurt the latency/throughput numbers, and I would not use them that way in this use case.

As @angmc wrote ("Now the model would have to be modified outside of triton, in the automation script, jit traced and repackaged as an ensemble and then deployed."), I think the process should be that the automation script updates the embedding table outside of Triton, and we keep the current architecture of loading the pre-trained embeddings into an embedding table.

A missing piece of the current proposal:
Using pre-trained embedding vectors as input features isn't sufficient. We use weight tying in the output layer to get the item scores. If we use pre-trained embedding vectors as input features, we do not have all item embeddings available for the weight-tying operation. There are two solutions:

  1. We still initialise an embedding table and load the pre-trained embeddings as weights -> then introducing pre-trained embeddings as input features only adds complexity, because we still need the same automation as we do right now.
  2. The model returns the transformer output before weight tying and Merlin Systems does an ANN lookup. This adds additional complexity: splitting the model, initializing an ANN index, etc. (as defined in [RMP] Using session-based models as query encoders (for downstream models or ANN search) #898). I am not sure if that is too much scope for this ticket.

As the GTC recommender has only 5000 items, the easiest approach is to load the pre-trained embeddings into an embedding table rather than use them as input features; this would still work with Merlin Systems.
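As a sketch of the approach described above (loading pre-trained vectors into the model's embedding table instead of passing them as input features), here is what that could look like in plain TensorFlow/Keras; the shapes, the trainable=False choice, and the random stand-in weights are assumptions for illustration, not the GTC recommender's actual configuration:

```python
# Hedged sketch: initialise an item embedding table from pre-trained vectors
# instead of passing the vectors as input features. Shapes are illustrative.
import numpy as np
import tensorflow as tf

num_items, embedding_dim = 5000, 64
pretrained = np.random.rand(num_items, embedding_dim).astype("float32")  # stand-in for real vectors

item_embedding = tf.keras.layers.Embedding(
    input_dim=num_items,
    output_dim=embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained),
    trainable=False,  # keep the pre-trained vectors fixed; set True to fine-tune
)

# When the catalog changes, an automation step outside Triton could rebuild
# `pretrained`, call item_embedding.set_weights([new_weights]), and re-export
# the model, keeping the weight-tying output layer intact.
item_ids = tf.constant([[1, 42, 4999]])
print(item_embedding(item_ids).shape)  # (1, 3, 64)
```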

@karlhigley

It seems to me that, assuming we're going to try to do this migration, we should be looking for a set of Merlin functionality which:

  1. Is sufficiently general that customers could apply it to their own use cases
  2. Can build a recommender that's broadly similar to the current GTC recommender

I don't think that we want to replicate the GTC recommender exactly—especially if it has quirks that we don't expect to reflect customer use cases—so I think we're kinda looking to make that system and our functionality meet in the middle somewhere.
