Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add non-negative reconciliation heuristic for MinTraceSparse #284

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

christophertitchen
Copy link
Contributor

This PR adds a non-negative reconciliation heuristic for sparse minimum trace reconciliation, which currently is not supported in the fit method for MinTraceSparse. I have implemented something similar in my job as the time series we deal with are inherently non-negative in nature, as they are across many industries, and thought it would be useful. It is fast and guarantees non-negative coherence at the expense of producing an approximate solution with potentially ill-defined mathematical properties. However, the goal of such a transformation is to align forecasts with their use in practice for the users of our forecasts, not necessarily improve accuracy, which is a nice additional benefit that in my experience actually happens rather frequently.

We can expand this further in the future by implementing a sparse non-negative least squares solver—there seem to be a few nice examples of this in the signal processing space.

Step 1: Clip the base forecasts to remove any negative point forecasts, which of course does not guarantee non-negativity in the minimum trace method, but nonetheless provides a good starting point.

$\alpha_{h} = \begin{cases} \hat{y}_{i,h} & \text{if } \hat{y}_{i,h} \gt 0 \\ 0 & \text{otherwise} \end{cases}$

Step 2: Although it is now sufficient to ensure that all of the entries in $P$ are positive, as it is implemented as a linear operator, we solve the sparse linear system iteratively as usual using BiCGSTAB. We could also try changing this to MINRES to exploit the symmetry of $\left( S'W^{-1}_{h}S \right)^{-1}$ in the future which has been more performant in my experience, but it can be finicky to find the relative tolerance that best balances speed with maximum error.

$\beta_{h} = S\left( S'W^{-1}_{h}S \right)^{-1}S'W^{-1}_{h}\alpha_{h}$

Step 3: We now have reconciled point forecasts which minimise the error variances of the coherent forecasts, but which may be negative, so we need to check if any of the coherent bottom level point forecasts are negative. If so, the bottom level negative forecasts are clipped.

$b_{h} = \begin{cases} \beta_{j,h} & \text{if } \beta_{j,h} \gt 0 \\ 0 & \text{otherwise} \end{cases}$

$j = \left\{ n_{a}, n_{a} + 1, \cdots , n_{a} + n_{b} - 1, n_{a} + n_{b} \right\}$

Step 4: Restore coherence by aggregating the non-negative bottom level reconciled forecasts across the data structure, which leads to an approximate solution as we used the leverages of the negative-containing $SP$ in the minimum variance unbiased estimator to generate $b_{h}$.

$\tilde{y}_{h}= Sb_{h}$

Step 5: This directly forces projection onto the non-negative coherent subspace, which can be equivalently and subsequently achieved by using the bottom-up reconciliation projection matrix. We also set $W$ appropriately for probabilistic reconciliation if required.

$\tilde{y}_{h} = S\begin{bsmallmatrix} 0_{n_{b}, n_{a}} \quad I_{n_{b}} \end{bsmallmatrix} \tilde{y}_{h}$

$W = I_{n}$

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@elephaint elephaint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work again - I took the liberty to review it. Minor comment in the code.

Apart from that, I'm sure you checked it but it may be worth checking the types of the sparse matrices in the various matrix multiplications (some may be csc, others csr) and whether it's better to transpose operations or not. This can often lead to easy speedups.

Other than that, I think mostly what is needed are a few unit tests, comparable to the non-sparse MinT tests.

@@ -1186,7 +1181,7 @@ def get_P_action(y):
(b.size, b.size), matvec=lambda v: R @ (S @ v)
)

x_tilde, exit_code = sparse.linalg.bicgstab(A, b, atol="legacy")
x_tilde, exit_code = sparse.linalg.bicgstab(A, b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you remove this as SciPy no longer supports it? (I don't see it being mentioned as a valid value on bicgstab page)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was an issue on Github with this too, I think this argument is indeed deprecated and needs to removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants