
Bug/Feature: NumPrevPRsTransformer in TTM model #509

Open
oindrillac opened this issue Jul 7, 2022 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@oindrillac
Member

The custom transformer NumPrevPRsTransformer is a column feature transformer that currently counts, for each PR, the number of occurrences of that PR's creator in the input dataframe (minus that particular PR) and assigns the PR that count.

I'm wondering whether this will work correctly for a model service where we expect results per PR, i.e., in cases where we are not given a batch or dump of PR rows. Will the value always be 0 when the model predicts on a single row?
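To illustrate the concern, here is a minimal sketch of the behavior described above (class and column names here are assumptions for illustration, not the actual implementation): the count is computed relative to the batch the transformer receives, so a single-row request always yields 0.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class NumPrevPRsSketch(BaseEstimator, TransformerMixin):
    """Hypothetical sketch: count other rows in the SAME input
    dataframe that share this PR's creator."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Occurrences of each creator within the batch, minus 1
        # to exclude the PR itself.
        counts = X["created_by"].map(X["created_by"].value_counts()) - 1
        return counts.to_frame("num_prev_prs")

# In a batch, creators with multiple PRs get nonzero counts:
batch = pd.DataFrame({"created_by": ["alice", "alice", "bob"]})
print(NumPrevPRsSketch().transform(batch)["num_prev_prs"].tolist())  # [1, 1, 0]

# But a per-PR request has a one-row dataframe, so the count is always 0:
single = pd.DataFrame({"created_by": ["alice"]})
print(NumPrevPRsSketch().transform(single)["num_prev_prs"].iloc[0])  # 0
```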

Proposed solutions

  • For a repo- or org-specific model, create a dictionary mapping creators to PR counts, compare each incoming PR against it, and update the dictionary iteratively.
  • Exclude it as a feature, or only enable it for a model that receives batch requests.
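The first proposed solution could be sketched roughly like this (all names here are hypothetical, and where/how the dictionary would be persisted for the model service is an open question):

```python
from collections import defaultdict

class StatefulPrevPRCounter:
    """Hypothetical sketch: a persistent creator -> PR-count mapping
    for a specific repo or org, updated iteratively per request."""

    def __init__(self):
        # creator -> number of PRs seen so far from that creator
        self.seen = defaultdict(int)

    def num_prev_prs(self, creator):
        # Return the count of previously seen PRs by this creator,
        # then record the current PR in the dictionary.
        prev = self.seen[creator]
        self.seen[creator] += 1
        return prev

counter = StatefulPrevPRCounter()
print(counter.num_prev_prs("alice"))  # 0 — first PR seen from alice
print(counter.num_prev_prs("alice"))  # 1
print(counter.num_prev_prs("bob"))    # 0
```

This would let per-PR requests see a meaningful count, at the cost of the feature becoming stateful across requests.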

@chauhankaranraj what are your thoughts on this?

@oindrillac oindrillac added the kind/bug Categorizes issue or PR as related to a bug. label Jul 7, 2022
@chauhankaranraj
Member

@chauhankaranraj what are your thoughts on this?

My initial thought is to exclude it as a feature and measure the performance difference. Based on that, we can evaluate whether it'd make sense to set up a separate process for calculating this feature. wdyt?

@oindrillac
Member Author

oindrillac commented Jul 8, 2022

Model with the feature:
[screenshot: evaluation results, 2022-07-08 2:06 PM]

Model without the feature:
[screenshot: evaluation results, 2022-07-08 1:59 PM]

Excluding it as a feature doesn't seem to make a big difference, so we can exclude it for now. In the future, we know we can revisit this avenue to improve model performance.
