Scalability fixes - Load model once per user #944
Conversation
Implementing code changes to improve scalability, as per the requirements in issue #950 in e-mission-docs. The initial approach uses the Singleton design pattern on instance variables: before attempting to load the model, check whether it has already been loaded. This should prevent the model from being reloaded if it is already present in an instance variable of the current BuiltinModelStorage instance.
Added an import for logging. Added the self parameter to pass a reference to the current instance object.
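A minimal sketch of the caching idea described above (hypothetical names: loaded_model and _read_model_from_storage are illustrative, not the actual BuiltinModelStorage internals):

```python
import logging

class BuiltinModelStorage:
    def __init__(self):
        # Cache slot for the loaded model; None until the first load
        self.loaded_model = None

    def load_model(self, user_id):
        # Only hit storage if this instance has not loaded a model yet
        if self.loaded_model is not None:
            logging.debug("Model already loaded; reusing cached instance")
            return self.loaded_model
        logging.debug("Loading model for user %s from storage", user_id)
        self.loaded_model = self._read_model_from_storage(user_id)
        return self.loaded_model

    def _read_model_from_storage(self, user_id):
        # Placeholder for the actual storage read
        raise NotImplementedError
```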
Needs redesign
Changes after refactoring the label inference pipeline to load the model only once. Will be merging with the previous PR branch.
Starting from scratch to implement the refactored label pipeline approach.
Using a dictionary of lists of predictions returned for each trip, instead of a simple list. This keeps the predictions for each trip separate, while keeping all the algorithm-wise predictions for a given trip together in one list.
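An illustration of the structure described above (keys and values here are hypothetical):

```python
# trip_id -> list of predictions, one entry per algorithm
predictions_by_trip = {
    "trip_1": ["algo_a_prediction", "algo_b_prediction"],
    "trip_2": ["algo_a_prediction", "algo_b_prediction"],
}
```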
Updated the function call to predict_labels_with_n() as per the latest changes to include trip_list and user_id. Fixed a typo in eacilp for trip.get_id().
eamur.predict_labels_with_n() still declared a trip parameter instead of a trip_list: a trip_list was passed in the previous commit, but I forgot to update the parameter name.
Receive a prediction list of (prediction, n) tuples, one per trip in trip_list. Then, for each prediction in the prediction list, run the assertion tests. Also cleaned up some debug print statements from earlier commits/PRs.
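A hedged sketch of the test flow this describes, inside a unittest.TestCase (the call signature follows the commit notes above; the specific assertions are illustrative):

```python
# One (prediction, n) tuple comes back per trip in trip_list
prediction_list = eamur.predict_labels_with_n(trip_list, user_id)
for prediction, n in prediction_list:
    # Run the assertion tests on each trip's (prediction, n) result
    self.assertGreaterEqual(n, 0)
    self.assertIsInstance(prediction, list)
```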
Cleaning up previously used debug statements
A couple of pipeline tests were failing: TestLabelInferencePipeline and TestExpectationPipeline. They failed because they were calling the placeholder_predict_n functions in eacili inferrers. Had to update these to receive a trip_list instead of a single trip.
[SOLVED]
Took a look at the logs and saw that some more functions were called when running the pipeline in these tests.
So, added a quick fix to update those functions so that they now receive a trip_list instead of a single trip, passing in a list of trips.
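A hedged before/after of that signature change (the function body here is illustrative, not the actual placeholder logic):

```python
# Before (roughly): def placeholder_predict_n(user_id, trip): ...
def placeholder_predict_n(user_id, trip_list):
    # Return one prediction list per trip, preserving trip order
    return [[{"labels": {"mode_confirm": "walk"}, "p": 1.0}] for _ in trip_list]
```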
```python
prediction_list = algorithm_fn(user_id, trip_list)
start_insert_inference_labels_time = time.process_time()
for trip, prediction in zip(trip_list, prediction_list):
    lp = ecwl.Labelprediction()
```
future fix??: my one comment here is that if we standardize on an interface for the algorithm_fn, I believe we can simplify this further.
This can become (pseudocode):
```python
for algorithm_id, algorithm_fn in primary_algorithms.items():
    algo_model = algorithm_fn.load_model()
    for trip in trip_list:
        prediction = algo_model.predict(trip)
        # and then the original code
```
Back when this code was written, trip_model, with its clean interfaces to load, predict and save, had not yet been implemented. So algorithm_fn was supposed to be an abstraction layer over the multiple algorithms.
But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction. I am open to deferring the cleanup until the next round, but I am also open to just removing one level of abstraction right now, since that will result in far fewer changes.
Right now, there is a map from algorithm name to algorithm fn, and the algorithm fn (in our case) calls predict_with_n. Instead, the map can have algorithm name to a dict of model parameters; then, inside the for loop, we can just load the model once and continue to predict trip by trip, rather than loading the model within the predict function (which works, but is not pretty).
Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting, but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S (load -> predict -> save) workflow.
If you are not comfortable with removing the algorithm_fn abstraction, you can simply change the following 2 lines of code.
```python
def predict_cluster_confidence_discounting(trip, max_confidence=None, first_confidence=None, confidence_multiplier=None):
    # load application config
    model_type = eamtc.get_model_type()
    model_storage = eamtc.get_model_storage()
    ### load model                                        <------ NEW CODE
    labels, n = eamur.predict_labels_with_n(trip, model)  # <----- CHANGED CODE
    if n <= 0:  # No model data or trip didn't match a cluster
        logging.debug(f"In predict_cluster_confidence_discounting: n={n}; returning as-is")
        return labels
    confidence_coeff = n_to_confidence_coeff(n, max_confidence, first_confidence, confidence_multiplier)
    logging.debug(f"In predict_cluster_confidence_discounting: n={n}; discounting with coefficient {confidence_coeff}")
    labels = copy.deepcopy(labels)
    for l in labels:
        l["p"] *= confidence_coeff
    return labels
```
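A minimal sketch of what that new load step might look like, assuming a loader exposed by eamur (the loader name and signature are assumptions; the commit notes below only say that load_model() of eamur is used). Note that user_id would also need to be threaded into this function, which is what the later "missed user_id parameter" fix addresses:

```python
### load model (sketch; loader name and signature are assumptions)
model = eamur.load_model(user_id, model_type, model_storage)
labels, n = eamur.predict_labels_with_n(trip, model)
```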
I have a few queries that are still confusing to me.
I can go ahead with the minimal change of loading the model in eacili, but I wanted to at least try to understand the scope and effort of the changes required to reduce the abstractions before making the simpler changes.

> But now that we have the abstractions in trip_model, I am not sure that we need two levels of abstraction.

By "two levels", I believe you are referring to:
eacili.predict_cluster_confidence_discounting -> eamur.predict_labels_with_n -> model.predict()
I see that trip_model is an abstract class and one child class / concrete implementation is the GreedySimilarityBinning class.
Do we intend to remove the calls to the first two functions in the above chain and directly call only model.predict(), which is in effect a call to GreedySimilarityBinning.predict(), depending on the model parameters?
> Right now, there is a map that has algorithm name to algorithm fn and then the algorithm fn (in our case) calls predict_with_n. Instead, the map can have algorithm name to dict of model parameters...
>
> Alternatively, we can keep the algorithm function, eacili.predict_cluster_confidence_discounting, but load the model in there. But I don't see the need for the second level of abstraction if we assume that all our algorithms will follow the L -> P -> S workflow.
I believe I've understood that the same trip model will be loaded for each user, irrespective of the algorithm, if we move the model loading and model parameters up into eacilp.pipeline.
If this is correct, then yes, using a map of algorithm_name to dict_model_parameters is useful:
```python
primary_algorithms = {
    ecwl.AlgorithmTypes.CONFIDENCE_DISCOUNTED_CLUSTER: dict_model_params
}

# In the for loop
# Load model
for trip in trip_list:
    prediction = model.predict(trip)
```
But in doing this (in continuation with my 1st question), this will call model.predict(), and I fail to see how our algorithm_name will have a unique implementation if we remove the call to its respective function.
For instance, on removing the call to eacili.predict_cluster_confidence_discounting, we are no longer doing the confidence_coeff calculations.
So what would be the significance of the algorithm name in the map in this case?
We also have placeholder_predictor() functions in inferrers.py, which are also called in TestLabelInferencePipeline.py through compute_and_save_algorithms(). I believe that loading a model and making predictions won't be useful for these functions, as they don't load the model and just return predictions.
I see that you made the relevant changes - is this still confusing? If so, we can discuss in a 1:1.
Fetching the user_id from the trip_list itself. Added an assertion to ensure that exactly one unique user_id is present across all trips. The assertion fails if multiple user_ids are found in the trip_list.
Removed from the test files and from inferrers.py, which contained the placeholder_predictor functions.
Refactored the code to pass the trip model directly to predict_labels_with_n() in eamur. Moved the load-model step to eacil.inferrers by using load_model() of eamur. Modified TestRunGreedyModel to use this refactored function.
A test was failing due to a missed user_id parameter. Passed user_id to rectify.
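A minimal sketch of the refactored flow those commit notes describe (hedged: load_model() is named per the commit message; the exact signature is an assumption):

```python
# In eacil.inferrers (sketch; signatures assumed):
model_type = eamtc.get_model_type()
model_storage = eamtc.get_model_storage()
# Load the model once per user, outside the per-trip loop
model = eamur.load_model(user_id, model_type, model_storage)
# Pass the loaded model directly to the prediction function
labels_n_list = eamur.predict_labels_with_n(trip_list, model)
```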
I am going to merge this for now, but please note the "future fix" comments and address them in a cleanup pass. You should also look through all the logs that you have added and make sure that you are logging as much detail as possible, so it can actually be useful in understanding the problem.
```python
else:
    predictions, n = model.predict(trip)
    predictions_list.append((predictions, n))
print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for trip_list in time = {time.process_time() - start_predict_time}")
```
nit, future fix: we should log the length of the list
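For instance, a one-line tweak to the print above (assuming predictions_list is the list in question):

```python
print(f"{arrow.now()} Inside predict_labels_n: Predictions complete for {len(predictions_list)} trips in time = {time.process_time() - start_predict_time}")
```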
```python
inferred_trip_list.append(inferred_trip)
# ...
if inferred_trip_list:
```
isn't this always going to be true, given that you define inferred_trip_list in line 62?
Are you actually testing against the "no trip" case?
I suspect this doesn't matter, since we just iterate through the trips downstream, but then we can just skip this check altogether given that it will always be true.
I did test this out, and Python evaluates an empty list to False when it is used in an if condition.
So, it does not enter the if block unless trips are loaded into inferred_trip_list in the for loop above the if block; once the list becomes non-empty, it evaluates to True.
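A standalone demonstration of that behavior:

```python
inferred_trip_list = []
if inferred_trip_list:
    print("non-empty")  # not reached: an empty list is falsy
inferred_trip_list.append("inferred_trip")
if inferred_trip_list:
    print("non-empty")  # reached once the list has at least one element
```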
```python
user_id_list = []
for trip in trip_list:
    user_id_list.append(trip['user_id'])
assert user_id_list.count(user_id_list[0]) == len(user_id_list), "Multiple user_ids found for trip_list, expected unique user_id for all trips"
```
nit, future fix: you should add some additional details to the log - what were the unique user IDs, and how many is "multiple"? Otherwise, if I see this log, I am not sure how to debug it.
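A hedged sketch of a more informative assertion message (names follow the snippet above; the exact wording is an assumption):

```python
unique_user_ids = set(user_id_list)
assert len(unique_user_ids) == 1, (
    f"Expected a single user_id for all {len(user_id_list)} trips in trip_list, "
    f"found {len(unique_user_ids)} unique user_ids: {unique_user_ids}"
)
```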
@MukuFlash03 I am squash-merging this to avoid code churn.
Cleanup fixes including log statements for PR #944
The changes in this iteration are improvements in the tests for the forest model: 1. Following last week's discussion, the regression test (`TestForestModel.py`) was removed, since it won't be useful when model performance improves. Instead, the structure of the predictions is checked; this check was merged into `TestForestModel.py`. 2. After e-mission#944, `predict_labels_with_n` in `run_model.py` expects a list and then iterates over it. The forest model and the rest of the tests were updated accordingly.