
Fast TopDown/BottomUp #109

Closed
wants to merge 12 commits into from

Conversation

@kdgutier (Collaborator) commented Nov 2, 2022

  • Added the option of sparse reconciliation to the BottomUp and TopDown methods.
  • The approach resulted in 15x speed gains.
  • Added a unit test checking equality against the previous version.

Pending:
The ERM method by design creates a sparse P matrix, so we should be able to see the same speed gains during inference. Check whether the gains come from clever ordering of the matrix operations; if so, update all such matrix operations accordingly.
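The ordering hypothesis is easy to check in isolation. A minimal sketch with random stand-ins for S, P, and y_hat (the shapes are hypothetical, not the project's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_total, n_bottom, horizon = 1200, 1000, 24
S = rng.random((n_total, n_bottom))     # stand-in summing matrix
P = rng.random((n_bottom, n_total))     # stand-in projection matrix
y_hat = rng.random((n_total, horizon))  # stand-in base forecasts

# (S @ P) @ y_hat materializes an n_total x n_total intermediate,
# while S @ (P @ y_hat) only materializes an n_bottom x horizon one,
# which is where the ordering speed-up comes from.
left = (S @ P) @ y_hat
right = S @ (P @ y_hat)
assert np.allclose(left, right)  # same result, very different cost
```

On large hierarchies the right-associated product avoids the dense n_total × n_total intermediate entirely.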

@review-notebook-app commented: Check out this pull request on ReviewNB.

@AzulGarza AzulGarza self-requested a review November 2, 2022 22:41
@AzulGarza (Member) left a comment

Please keep the changes as modular and clean as possible. Some changes from this PR are included in this new PR. Please separate them.

@@ -44,6 +44,7 @@
"\n",

Line #17.            res = {'mean': np.matmul(S @ P, y_hat)}

np.matmul(S @ P, y_hat) is equivalent to S.dot(P.dot(y_hat)) when using numpy arrays, right? If that's the case, we could use {'mean': S.dot(P.dot(y_hat))} for all cases and leave the transformation to sparse arrays to the methods (to avoid adding a new argument).
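For dense numpy arrays the two spellings do give the same result, since matrix multiplication is associative. A quick check with toy shapes (the matrices here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.random((6, 3))       # toy summing matrix
P = rng.random((3, 6))       # toy projection matrix
y_hat = rng.random((6, 4))   # toy base forecasts

a = np.matmul(S @ P, y_hat)  # (S P) y_hat
b = S.dot(P.dot(y_hat))      # S (P y_hat)
assert np.allclose(a, b)     # equal up to floating-point error
```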



@kdgutier (Collaborator, Author) replied Nov 3, 2022

Sounds good; I added an issue to change the corresponding matrix multiplications to the dot method.

I would keep the tests and the sparsity parameter for now to track this and other methods' speed gains; they can help in future development.

Currently it is a default parameter that is not exposed to the higher-level methods.

@@ -44,6 +44,7 @@
"\n",

Line #24.                      sparsity: Optional[bool] = True):

sparsity could be an attribute of the class instead of an extra argument. If self.sparsity=True, then P and S could be passed to _reconcile as sparse matrices.
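A minimal sketch of that suggestion, assuming a simplified BottomUp class and a module-level _reconcile helper (both are illustrative, not the library's actual API):

```python
import numpy as np
from scipy import sparse

def _reconcile(S, P, y_hat):
    # The @ operator dispatches correctly for both dense and sparse inputs.
    return {'mean': S @ (P @ y_hat)}

class BottomUp:
    def __init__(self, sparsity: bool = True):
        # Stored as an attribute instead of threading an extra
        # argument through every method call.
        self.sparsity = sparsity

    def reconcile(self, S, P, y_hat):
        if self.sparsity:
            # Send P and S to _reconcile as sparse matrices.
            S = sparse.csr_matrix(S)
            P = sparse.csr_matrix(P)
        return _reconcile(S, P, y_hat)
```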



@kdgutier (Collaborator, Author) replied

My thought on skipping the self.sparsity attribute is to avoid adding extra complexity to the methods. Some reconciliation methods might report gains consistently while others don't.

We might want to make the selection a default.

@@ -44,6 +44,7 @@
"\n",

The nbs/methods.ipynb file is intended to work only with numpy arrays. If you want to test a large summing matrix, please create it from scratch using numpy instead of downloading and importing a dataframe. Since the inclusion of CodeTimer is not reviewed yet, you could test the performance of the change using the time module:

from time import time

# performance without sparsity
init_wo_sparsity = time()
cls_bu = BottomUp()
cls_bu.reconcile(...)
end_wo_sparsity = time()

# performance with sparsity
init_sparsity = time()
cls_bu = BottomUp(sparsity=True)
cls_bu.reconcile(...)
end_sparsity = time()

# test that sparsity helps
assert (end_sparsity - init_sparsity) < (end_wo_sparsity - init_wo_sparsity)




@@ -44,6 +44,7 @@
"\n",

See previous comment on the sparsity attribute. If it works, it should be included in all the methods.



@AzulGarza AzulGarza self-requested a review November 2, 2022 23:11
@AzulGarza (Member) left a comment

I like the idea. I think it would be better if the possibility of using sparse matrices were an argument of HierarchicalReconciliation instead of an argument/attribute of each class, for example HierarchicalReconciliation(sparsity=True, ...). In that case, the transformation of S is paid only once and all methods receive the same transformed matrix.
Subsequently, the _reconcile function could determine whether the matrix S is sparse and, in that case, transform P.
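A sketch of how that could look, assuming a heavily simplified HierarchicalReconciliation whose reconcilers are callables that build each method's P matrix (the names mirror the discussion; the real signatures differ):

```python
import numpy as np
from scipy import sparse

def _reconcile(S, P, y_hat):
    # If S arrived sparse, convert P here too so the whole chain stays sparse.
    if sparse.issparse(S) and not sparse.issparse(P):
        P = sparse.csr_matrix(P)
    return {'mean': S @ (P @ y_hat)}

class HierarchicalReconciliation:
    def __init__(self, reconcilers, sparsity: bool = False):
        self.reconcilers = reconcilers  # callables S -> P (illustrative)
        self.sparsity = sparsity

    def reconcile(self, S, y_hat):
        if self.sparsity:
            # The transformation of S is paid only once; every method
            # then receives the same sparse matrix.
            S = sparse.csr_matrix(S)
        return [_reconcile(S, make_P(S), y_hat) for make_P in self.reconcilers]
```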

@kdgutier kdgutier closed this Nov 5, 2022
@kdgutier kdgutier deleted the fast_topdown_bottomup branch November 5, 2022 04:15
@kdgutier (Collaborator, Author) commented Nov 5, 2022

While doing internal testing of computation speed, I realized that the speed gains come from two sources:

  1. Ordering of the matrix multiplications: (S P) y_[a,b] vs S (P y_[a,b])
  2. Sparsity of S and P (which comes with a fixed conversion cost)

The creation of the sparse S should be done once in HierarchicalForecast.reconcile, and the creation of the sparse P should be done once per method (BottomUp, TopDown, ERM-lasso), rather than in every call to the methods._reconcile function.
