-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More efficient method for check_parallel_jacobian #1395
base: main
Are you sure you want to change the base?
Changes from 12 commits
62a6346
6a46229
5e4da39
8fbb339
da664b0
9b8a25d
7fe9c5e
ab30049
8c9959c
5021f29
e4ce707
8438d09
e048a22
6d53c8d
12109f9
e175405
8cedc01
a8da700
345a813
330953f
5939198
e30c6d5
50903cc
a54f189
9bd6ce8
0cd2d2d
42e00a2
940d253
c922565
fb1bb91
e226647
1aed06f
02db37e
631e7c5
bf5f26d
ffd0071
10250a5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,7 +27,7 @@ | |
import numpy as np | ||
from scipy.linalg import svd | ||
from scipy.sparse.linalg import svds, norm | ||
from scipy.sparse import issparse, find | ||
from scipy.sparse import issparse, find, triu, diags | ||
|
||
from pyomo.environ import ( | ||
Binary, | ||
|
@@ -3619,7 +3619,109 @@ def ipopt_solve_halt_on_error(model, options=None): | |
) | ||
|
||
|
||
def check_parallel_jacobian(model, tolerance: float = 1e-4, direction: str = "row"): | ||
def check_parallel_jacobian( | ||
model, | ||
tolerance: float = 1e-8, | ||
direction: str = "row", | ||
zero_norm_tolerance: float = 1e-8, | ||
jac=None, | ||
nlp=None, | ||
): | ||
""" | ||
Check for near-parallel rows or columns in the Jacobian. | ||
|
||
Near-parallel rows or columns indicate a potential degeneracy in the model, | ||
as this means that the associated constraints or variables are (near) | ||
duplicates of each other. | ||
|
||
This method is based on work published in: | ||
|
||
Klotz, E., Identification, Assessment, and Correction of Ill-Conditioning and | ||
Numerical Instability in Linear and Integer Programs, Informs 2014, pgs. 54-108 | ||
https://pubsonline.informs.org/doi/epdf/10.1287/educ.2014.0130 | ||
|
||
Args: | ||
model: model to be analysed | ||
tolerance: tolerance to use to determine if constraints/variables are parallel | ||
direction: 'row' (default, constraints) or 'column' (variables) | ||
|
||
Returns: | ||
list of 2-tuples containing parallel Pyomo components | ||
|
||
""" | ||
# Thanks to Robby Parker for the sparse matrix implementation and | ||
# significant performance improvements | ||
|
||
if direction not in ["row", "column"]: | ||
raise ValueError( | ||
f"Unrecognised value for direction ({direction}). " | ||
"Must be 'row' or 'column'." | ||
) | ||
|
||
if nlp is None or jac is None: | ||
jac, nlp = get_jacobian(model, scaled=False) | ||
|
||
# Get vectors that we will check, and the Pyomo components | ||
# they correspond to. | ||
if direction == "row": | ||
components = nlp.get_pyomo_constraints() | ||
mat = jac.tocsr() | ||
|
||
elif direction == "column": | ||
components = nlp.get_pyomo_variables() | ||
mat = jac.transpose().tocsr() | ||
|
||
# Take product of all rows/columns with all rows/columns by taking outer | ||
# product of matrix with itself | ||
outer = mat @ mat.transpose() | ||
|
||
# Along the diagonal of the outer product you get the dot product of the row | ||
# with itself, which is equal to the norm squared. | ||
norms = np.sqrt(outer.diagonal()) | ||
|
||
zero_norm_indices = np.nonzero(np.abs(norms) <= zero_norm_tolerance) | ||
|
||
# Want to ignore indices with zero norm. If we give them | ||
dallan-keylogic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
# norms of infinity, we scale the rows and columns to zero | ||
# which allows us to ignore them | ||
norms[zero_norm_indices] = float("inf") # 1/float('inf') == 0 | ||
|
||
# Divide each row and each column by the vector norm. This leaves | ||
# the entries as dot(u, v) / (norm(u) * norm(v)). The exception is | ||
# entries with "zero norm", whose corresponding rows and columns are | ||
# set to zero. | ||
scaling = diags(1 / norms) | ||
outer = scaling * outer * scaling | ||
|
||
# Get rid of duplicate values by only taking upper triangular part of | ||
# resulting matrix | ||
upper_tri = triu(outer) | ||
|
||
# Set diagonal elements to zero | ||
# Subtracting diags(upper_tri.diagonal()) is a more reliable | ||
# method of getting the entries to exactly zero than subtracting | ||
# an identity matrix, where one can encounter values of 1e-16 | ||
upper_tri = upper_tri - diags(upper_tri.diagonal()) | ||
dallan-keylogic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
# Get the nonzero entries of upper_tri in three arrays, | ||
# corresponding to row indices, column indices, and values | ||
rows, columns, values = find(upper_tri) | ||
|
||
# We have that dot(u,v) == norm(u) * norm(v) * cos(theta) in which | ||
# theta is the angle between u and v. If theta is approximately | ||
# 0 or pi, sqrt(2*(1 - abs(dot(u,v)) / (norm(u) * norm(v)))) approximately | ||
# equals the number of radians from 0 or pi. A tolerance of 1e-8 corresponds | ||
# to a tolerance of sqrt(2) * 1e-4 radians | ||
parallel_1D = np.nonzero(1 - abs(values) < tolerance)[0] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the motivation for checking There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Both the previous implementation and the Gurobi implementation checked against a semi-normalized epsilon:
Note that this formulation doesn't scale the tolerance with We could implement the check like
It's addition and subtraction that can cause problems with catastrophic cancellation. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would advocate for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's the intention with the I don't think avoiding vectorized operations is going to be a major improvement in performance, especially since we'll be adding two multiplications ( |
||
|
||
parallel = [ | ||
(components[rows[idx]], components[columns[idx]]) for idx in parallel_1D | ||
] | ||
|
||
return parallel | ||
|
||
|
||
def check_parallel_jacobian_old(model, tolerance: float = 1e-4, direction: str = "row"): | ||
dallan-keylogic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
""" | ||
Check for near-parallel rows or columns in the Jacobian. | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous implementation was designed to avoid taking every possible dot product by first sorting vectors by their nonzero structure. I would have expected this to be slow due to unnecessary dot products, but in some testing, it doesn't seem slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sparse matrix multiplication carried out by Scipy. The key here is that 1) Sorting through which products to take should be carried out in the sparse matrix multiplication routine, not manually and
2) The looping required to do this sorting and multiplication is done in compiled C++ code, not Python code, and (should) be much faster than doing it in Python code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scipy matrix product is taking every possible combination of non-zero dot products, while the previous implementation takes only dot products between vectors with the same sparsity structure. This is fewer dot products, although, in practice, probably not by a wide margin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I don't entirely follow. Scipy's matrix multiply, as near as I can tell, does the same thing. It implements the SMMP algorithm which occurs in two steps: the first to determine the nonzero structure of the matrix, the second to actually compute the values (as near as I can tell). That's basically what you're doing, except 1) it isn't filtering out entries as being "too small to contribute" and 2) it's doing it in C++ instead of Python, which is much faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, if two rows are
(1, 2, 3)
and(1, 2, 0)
, they cannot possibly be parallel. The previous implementation will not compute this dot product, while the new implementation will.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're ruling out rows being parallel based on one having an element for the third index (3) and the other having zero for the third index? That can fail if the first row was
(1, 2, 3e-8)
instead of(1, 2, 3)
:(1, 2, 3e-8)
is still effectively parallel to(1, 2, 0)
even if they structurally differ.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this after filtering out small entries.