[RMP] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities #887

EvenOldridge · 2023-03-29T17:22:52Z

viswa-nvidia · 2023-04-12T16:50:06Z

@angmc, please add the Systems-related dev list from the Slack thread here.

angmc · 2023-04-13T18:34:52Z

On the question of what in this project is niche functionality that may not be needed immediately, I would say it's anything related to catalog swapping. This strategy worked because item/user ids were not used as inputs and because training the model would not have yielded significant improvements, since the items being predicted were new. I don't think this is a common implementation for customers. A lot of what we did relied on a Python back-end, so it would have to change. Now the model would have to be modified outside of triton, in the automation script, jit traced and repackaged as an ensemble and then deployed.

I don't believe any individual feature was too difficult to work around, but the workarounds may not give the best user experience. I don't believe that support for pre-trained embeddings and the use of Categorify are niche problems.

In post-processing, the issue where we saw better throughput using pandas instead of cuDF needs to be explored further. Pre-built post-processing features or documented best practices could help customers avoid low throughput.
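One way to explore the throughput question is to time each candidate post-processing implementation on request-sized batches. The following is only a minimal sketch using the standard library: `measure_throughput` and `top_k_indices` are hypothetical names, and the stand-in step would be replaced by the actual pandas and cuDF versions being compared.

```python
import time
from typing import Callable, Iterable, List, Sequence


def measure_throughput(
    postprocess: Callable[[Sequence[float]], object],
    batches: Iterable[Sequence[float]],
) -> float:
    """Return processed rows per second for one post-processing callable."""
    rows = 0
    start = time.perf_counter()
    for batch in batches:
        postprocess(batch)
        rows += len(batch)
    elapsed = time.perf_counter() - start
    return rows / elapsed if elapsed > 0 else float("inf")


def top_k_indices(scores: Sequence[float], k: int = 3) -> List[int]:
    """Hypothetical post-processing step: indices of the k highest scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Many small batches, to mimic individual serving requests (an assumption
# about the workload, not a measurement from the GTC recommender).
batches = [[0.1, 0.9, 0.4, 0.7] for _ in range(1000)]
rows_per_second = measure_throughput(top_k_indices, batches)
```

Running the same harness once per implementation, with identical batches, gives numbers that can be compared directly.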

bschifferer · 2023-04-17T09:32:06Z

@angmc @karlhigley @EvenOldridge

About:
Operator for mapping between key values (Likely a modification to categorify to support an existing mapping)

NVTabular's Categorify has a parameter called vocabs (https://github.com/NVIDIA-Merlin/NVTabular/blob/main/nvtabular/ops/categorify.py#L208 ). Unfortunately, our documentation (the inline docstring) doesn't explain what it does. But I think we can provide an existing mapping table to the Categorify op, so we might be able to use the current NVT operator. The question is: how can we exchange the mapping table during serving?
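To make the serving-time question concrete, here is a minimal pure-Python sketch of an operator that maps keys through a supplied table and lets that table be exchanged while requests are being served. It is not NVTabular code: `KeyMapper`, `transform`, `swap`, and the choice of 0 as the id for unknown keys are all assumptions for illustration.

```python
import threading
from typing import Dict, Hashable, Iterable, List


class KeyMapper:
    """Hypothetical operator: maps raw keys to contiguous ids via a given table."""

    def __init__(self, mapping: Dict[Hashable, int], unknown_id: int = 0) -> None:
        self._mapping = dict(mapping)
        self._unknown_id = unknown_id  # assumed id for keys missing from the table
        self._lock = threading.Lock()

    def transform(self, keys: Iterable[Hashable]) -> List[int]:
        # Take one reference to the current table so a concurrent swap
        # cannot change the mapping halfway through a request.
        mapping = self._mapping
        return [mapping.get(key, self._unknown_id) for key in keys]

    def swap(self, new_mapping: Dict[Hashable, int]) -> None:
        # Build the replacement first, then rebind it in a single step.
        replacement = dict(new_mapping)
        with self._lock:
            self._mapping = replacement


mapper = KeyMapper({"talk-a": 1, "talk-b": 2})
before = mapper.transform(["talk-a", "talk-x"])  # [1, 0]
mapper.swap({"talk-x": 1, "talk-y": 2})
after = mapper.transform(["talk-a", "talk-x"])   # [0, 1]
```

Because the ids change meaning after a swap, the model's embedding table would have to be exchanged at the same moment as the mapping.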

bschifferer · 2023-04-17T09:47:07Z

@angmc @karlhigley @EvenOldridge @viswa-nvidia

About:
Merging pre-trained embeddings in the dataloaders (#211)
Merging pre-trained embeddings at serving time (#211)

I would not use pre-trained embeddings as input features to the model. Currently, we initialise an embedding table and load the pre-trained embedding weights into it. As the GTC recommender has only 5000 items, it is not necessary to use pre-trained embeddings as input features. I think using pre-trained embeddings as input features would worsen latency and throughput, and I would not do that in this use case.

As @angmc wrote ("Now the model would have to be modified outside of triton, in the automation script, jit traced and repackaged as an ensemble and then deployed."), I think the process should be that the automation script updates the embedding table outside of Triton, and we keep the current architecture, which loads the pre-trained embeddings into an embedding table.
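As a rough illustration of the table-building step such an automation script would need, the sketch below arranges pre-trained vectors so that each row lines up with the encoded item id. It is pure Python with hypothetical names (`build_embedding_table`, `item_to_index`), and reserving row 0 for padding or unknown items is an assumption; loading the result into the model and re-exporting it are not shown.

```python
from typing import Dict, List, Sequence


def build_embedding_table(
    item_to_index: Dict[str, int],
    pretrained: Dict[str, Sequence[float]],
    dim: int,
) -> List[List[float]]:
    """Arrange pre-trained vectors so that row i belongs to encoded item id i.

    Row 0 is left as zeros for padding/unknown items (an assumption).
    """
    num_rows = max(item_to_index.values(), default=0) + 1
    table = [[0.0] * dim for _ in range(num_rows)]
    for item, index in item_to_index.items():
        vector = pretrained[item]
        if len(vector) != dim:
            raise ValueError(f"{item!r} has dimension {len(vector)}, expected {dim}")
        table[index] = list(vector)
    return table


# Hypothetical new catalog: the script would rebuild this table whenever
# the item catalog changes, before the model is traced and redeployed.
item_to_index = {"talk-a": 1, "talk-b": 2}
pretrained = {"talk-a": [0.1, 0.2], "talk-b": [0.3, 0.4]}
table = build_embedding_table(item_to_index, pretrained, dim=2)
```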

A missing piece of the current proposal:
Using pre-trained embedding vectors as input features isn't sufficient. We use weight-tying in the output layer to get the item scores. If we use pre-trained embedding vectors as input features, we do not have all item embeddings available for the weight-tying operation. There are two solutions:

We still initialise an embedding table and load the pre-trained embeddings as its weights. But then we have only added complexity by introducing pre-trained embeddings as input features, because we still need the same automation as we do right now.
The model returns the output of the transformer before weight-tying, and Merlin Systems does an ANN look-up. This adds the complexity of splitting the model, initializing an ANN index, etc. (as defined in [RMP] Using session-based models as query encoders (for downstream models or ANN search) #898). I am not sure whether that is too much scope for this ticket.

As the GTC recommender has only 5000 items, the easiest way is to use the pre-trained embeddings as an embedding table rather than as input features, which would still work with Systems.
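To make the weight-tying constraint concrete, here is a minimal pure-Python sketch; it is not the Transformers4Rec implementation, and all names are hypothetical. With weight tying, the score of each item is the dot product of the transformer output with that item's row in the embedding table, so every item needs a row. The second function stands in for the ANN look-up of the other option, using exact brute-force search instead of a real approximate index.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def tied_scores(hidden: Sequence[float], table: Sequence[Sequence[float]]) -> List[float]:
    """Weight tying: the output layer reuses the item embedding table,
    so every item needs a row in `table` to receive a score."""
    return [dot(hidden, row) for row in table]


def nearest_items(hidden: Sequence[float], table: Sequence[Sequence[float]], k: int = 2) -> List[int]:
    """Stand-in for an ANN look-up: exact (brute-force) top-k by dot product."""
    scores = tied_scores(hidden, table)
    return sorted(range(len(table)), key=lambda i: scores[i], reverse=True)[:k]


table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per item
hidden = [0.5, 2.0]                            # transformer output for one session
scores = tied_scores(hidden, table)            # [0.5, 2.0, 2.5]
top = nearest_items(hidden, table, k=2)        # [2, 1]
```

With only a few thousand items, scoring against the full table as in `tied_scores` stays a single small matrix product, which is consistent with keeping the embedding table inside the model.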

karlhigley · 2023-04-18T14:56:08Z

It seems to me that, assuming we're going to try to do this migration, we should be looking for a set of Merlin functionality which:

Is sufficiently general that customers could apply it to their own use cases
Can build a recommender that's broadly similar to the current GTC recommender

I don't think that we want to replicate the GTC recommender exactly—especially if it has quirks that we don't expect to reflect customer use cases—so I think we're kinda looking to make that system and our functionality meet in the middle somewhere.

EvenOldridge added this to the Merlin 23.05 milestone Mar 29, 2023

EvenOldridge assigned angmc Mar 29, 2023

karlhigley added the roadmap label Mar 31, 2023

karlhigley changed the title ~~[Task] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities~~ [RMP] Update GTC Recommender to leverage Merlin Systems and new Merlin capabilities Mar 31, 2023

EvenOldridge modified the milestones: Merlin 23.05, Merlin 23.07 Apr 12, 2023

karlhigley assigned karlhigley, bschifferer, jperez999 and nv-alaiacano Apr 13, 2023

bschifferer mentioned this issue Apr 17, 2023

[RMP] GTC Session-based Web Recommendations #700

Closed

4 tasks

EvenOldridge modified the milestones: Merlin 23.05, Merlin 23.07, Merlin Backlog Apr 25, 2023


EvenOldridge commented Mar 29, 2023

Problem:

Definition of Done

New Functionality

Models

Transformers4Rec

NVTabular

Dataloader

Systems

Deliverables

Constraints:

Starting Point:
