
xORBDB6 performs numerically unadvisable operations #634

Closed
2 tasks done
christoph-conrads opened this issue Oct 25, 2021 · 10 comments · Fixed by #647
Comments

@christoph-conrads
Contributor

Description

xORBDB5 and xORBDB6 compute vector norms differently, causing disagreement about when a vector is numerically zero.

Given an isometric matrix Q and a vector x, xORBDB5 computes a vector x' that is orthogonal to the span of the columns of Q. (Q isometric means that the following two properties hold: Q has at most as many columns as rows and Q^* Q = I.) Internally, xORBDB5 calls xORBDB6. Given an isometric matrix Q and a vector x, xORBDB6 projects x onto the complement of the column span of Q. The code contains only the ominous comment that

Kahan's "twice is enough" criterion

is applied. This probably means that xORBDB6 uses at most two iterations of classical Gram-Schmidt orthogonalization to compute its result. This approach is known as CGS2 (see BarlowS2011 or GiraudLR2002) and it matches the matrix-vector multiplications inside this function. The problem is the computation of the vector norms.
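For readers unfamiliar with CGS2, the two-pass scheme can be illustrated with a short Python sketch. This is an illustrative analogue, not the LAPACK code: `cgs2_project` projects x onto the orthogonal complement of span(Q) by applying classical Gram-Schmidt twice.

```python
def cgs2_project(Q, x):
    """Project x onto the orthogonal complement of span(Q) using
    classical Gram-Schmidt applied twice ("twice is enough").
    Q is a list of orthonormal columns; x is a vector of floats.
    Illustrative sketch only, not the xORBDB6 implementation."""
    y = list(x)
    for _ in range(2):  # at most two CGS passes
        # Coefficients of y in the basis Q: c_j = q_j^T y
        coeffs = [sum(q[i] * y[i] for i in range(len(y))) for q in Q]
        for c, q in zip(coeffs, Q):
            for i in range(len(y)):
                y[i] -= c * q[i]  # y <- y - Q (Q^T y)
    return y

# Example: e_2 is already orthogonal to the single column e_1.
Q = [[1.0, 0.0, 0.0]]
x = [0.0, 1.0, 0.0]
print(cgs2_project(Q, x))  # [0.0, 1.0, 0.0]
```

The second pass removes the components of span(Q) reintroduced by rounding errors in the first pass; deciding whether the result is numerically zero is exactly where the norm computations below come in.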

In xORBDB5, the norm of the vector computed by xORBDB6 is computed with SNRM2:

lapack/SRC/sorbdb5.f

Lines 223 to 226 in 5d4180c

IF( SNRM2(M1,X1,INCX1) .NE. ZERO
$ .OR. SNRM2(M2,X2,INCX2) .NE. ZERO ) THEN
RETURN
END IF

For xORBDB6, the vector norm is computed as follows:

lapack/SRC/sorbdb6.f

Lines 241 to 247 in 5d4180c

SCL1 = REALZERO
SSQ1 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
SCL2 = REALZERO
SSQ2 = REALONE
CALL SLASSQ( M2, X2, INCX2, SCL2, SSQ2 )
NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

Consider the input to xORBDB6 below:

c := 2^-45
μ := c ε

    | 0  1 -μ |
Q = | 0  0 +μ |
    | 1  0  0 |
    | 0  μ  1 |

With x = e_2 (i.e., the second standard basis vector), the xORBDB6 variable NORMSQ2 is zero whereas SNRM2 returns a nonzero value, causing xORBDB5 to return an incorrect vector.
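The underlying effect is easy to reproduce outside LAPACK. The Python sketch below uses double precision and a hypothetical tiny vector: the explicitly squared norm underflows to zero while a scaled norm computation (here via math.hypot) stays nonzero, which is the same kind of disagreement as between NORMSQ2 and SNRM2 above.

```python
import math

# A vector whose norm is representable but whose squared norm underflows.
x = [1e-200, 1e-200]

normsq = sum(v * v for v in x)  # xORBDB6-style: the squares underflow to 0
norm = math.hypot(*x)           # SNRM2-style: scaled, no underflow

print(normsq)  # 0.0 -> vector judged numerically zero
print(norm)    # ~1.414e-200 -> vector judged nonzero
```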

Checklist

  • I've included a minimal example to reproduce the issue
  • I'd be willing to make a PR to solve this issue
@christoph-conrads christoph-conrads changed the title xORBDB5, xORBDB6 disagree on criterion for zero vector xORBDB5, xORBDB6 disagree when vectors are numerically zero Oct 25, 2021
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Oct 26, 2021
@christoph-conrads
Contributor Author

A test highlighting the problem can be found in the christoph-conrads/qr+cs-gsvd branch in commit 6c576c4. The test does not trigger yet in double precision because NORMSQ2 in DORBDB6 is not zero, while it is zero in SORBDB6.

@christoph-conrads
Contributor Author

lapack/SRC/sorbdb6.f

Lines 241 to 247 in 5d4180c

SCL1 = REALZERO
SSQ1 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
SCL2 = REALZERO
SSQ2 = REALONE
CALL SLASSQ( M2, X2, INCX2, SCL2, SSQ2 )
NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

This should be

      SCL = REALZERO
      SSQ = REALONE
      CALL SLASSQ( M1, X1, INCX1, SCL, SSQ )
      CALL SLASSQ( M2, X2, INCX2, SCL, SSQ )
      NORMSQ2 = SCL**2 * SSQ
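The effect of carrying a single (SCL, SSQ) pair across both segments can be sketched in Python; `lassq_update` below is a hypothetical analogue of SLASSQ, written for illustration, that maintains the invariant sum of squares = scl**2 * ssq without ever forming a square that can underflow:

```python
def lassq_update(x, scl, ssq):
    """Update the scaled sum of squares (scl, ssq) with the entries of x,
    maintaining: accumulated sum of squares == scl**2 * ssq.
    Hypothetical Python analogue of LAPACK's xLASSQ."""
    for v in x:
        a = abs(v)
        if a == 0.0:
            continue
        if scl < a:
            # Rescale the running sum to the new, larger scale.
            ssq = 1.0 + ssq * (scl / a) ** 2
            scl = a
        else:
            ssq += (a / scl) ** 2
    return scl, ssq

# One (scl, ssq) pair across both segments, as in the proposed fix:
scl, ssq = 0.0, 1.0
scl, ssq = lassq_update([3.0], scl, ssq)  # segment X1
scl, ssq = lassq_update([4.0], scl, ssq)  # segment X2
norm = scl * ssq ** 0.5
print(norm)  # 5.0
```

Combining two independently scaled pairs as SCL1**2*SSQ1 + SCL2**2*SSQ2 defeats the scaling, because each SCLi**2 can underflow on its own.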

@christoph-conrads christoph-conrads changed the title xORBDB5, xORBDB6 disagree when vectors are numerically zero xORBDB6 performs numerically unadvisable operations Oct 27, 2021
@christoph-conrads
Contributor Author

christoph-conrads commented Oct 27, 2021

In my opinion, the functions xORBDB6/xUNBDB6 should be modified to enhance numerical stability.

The first issue is that there are no restrictions on the norm of X. Right now this is not a problem because the only caller, xORBDB5, passes unit-length vectors. Either the documentation should require unit-length vectors or the function should rescale vectors as necessary.

NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

The expression might over- or underflow. Underflow is one of the causes of the problem observed in the pull request #406 test regression_preprocessing_20210606. The fix is as follows:

      SCL = REALZERO
      SSQ = REALONE
      CALL SLASSQ( M1, X1, INCX1, SCL, SSQ )
      CALL SLASSQ( M2, X2, INCX2, SCL, SSQ )
      NORM = SCL * SQRT(SSQ)

lapack/SRC/sorbdb6.f

Lines 257 to 259 in 5d4180c

IF( NORMSQ2 .EQ. ZERO ) THEN
RETURN
END IF

edit For xORBDB6, the right-hand side constant could be REALZERO for consistency.

This branch is only taken if X is exactly zero after the first Gram-Schmidt orthogonalization, which will practically never be the case (unless the norm computation underflows). The condition should be replaced with something like

if norm_after_gs <= (m1 + m2) * eps * norm:
    x(:) = 0
    return
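A Python sketch of such a relative test follows; the factor (m1 + m2) * eps is the heuristic tolerance proposed here, not a proven error bound:

```python
import sys

def is_numerically_zero(norm_after_gs, norm_before, m1, m2):
    """Decide whether the projected vector should be truncated to zero.
    Sketch of the proposed criterion; the tolerance (m1 + m2) * eps is a
    heuristic from this discussion, not a proven bound."""
    eps = sys.float_info.epsilon
    return norm_after_gs <= (m1 + m2) * eps * norm_before

print(is_numerically_zero(1e-18, 1.0, 3, 1))  # True: below tolerance
print(is_numerically_zero(0.5, 1.0, 3, 1))    # False: clearly nonzero
```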

lapack/SRC/sorbdb6.f

Lines 283 to 289 in 5d4180c

SCL1 = REALZERO
SSQ1 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
SCL2 = REALZERO
SSQ2 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

After the second Gram-Schmidt orthogonalization, SCL2 and SSQ2 are not updated properly.

lapack/SRC/sorbdb6.f

Lines 290 to 295 in 5d4180c

*
* If second projection is sufficiently large in norm, then do
* nothing more. Alternatively, if it shrunk significantly, then
* truncate it to zero.
*
IF( NORMSQ2 .LT. ALPHASQ*NORMSQ1 ) THEN

This comparison might fail if the right-hand side underflows; this can happen if the vector X was already small in norm after the first Gram-Schmidt orthogonalization. X should be re-scaled.

For completeness, a pull request should fix the vector increments as Brian Sutton correctly noted and also check or handle negative increments like other LAPACK functions.

@briansutton @langou @weslleyspereira Please comment on my proposal.

edit ALPHASQ must be updated

christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Oct 27, 2021
@weslleyspereira
Collaborator

Hi @christoph-conrads. Thanks for taking the time to explain all of this!

I talked to @langou last week. He had something else to say about the precision of the final orthogonal vector depending on how far Q is from being a truly orthogonal matrix. I'm not sure if he will be able to comment on this thread this week, so here are my comments:

In my opinion, the functions xORBDB6/xUNBDB6 should be modified to enhance numerical stability.

The first issue is that there are no restrictions on the norm of X. Right now this is not a problem because the only caller, xORBDB5, passes unit-length vectors. Either the documentation should require unit-length vectors or the function should rescale vectors as necessary.

I agree. I would go with updating the documentation.

NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

The expression might over- or underflow. Underflow is one of the causes of the problem observed in the pull request #406 test regression_preprocessing_20210606. The fix is as follows:

      SCL = REALZERO
      SSQ = REALONE
      CALL SLASSQ( M1, X1, INCX1, SCL, SSQ )
      CALL SLASSQ( M2, X2, INCX2, SCL, SSQ )
      NORM = SCL * SQRT(SSQ)

I agree.

lapack/SRC/sorbdb6.f

Lines 257 to 259 in 5d4180c

IF( NORMSQ2 .EQ. ZERO ) THEN
RETURN
END IF

edit For xORBDB6, the right-hand side constant could be REALZERO for consistency.

I agree.

This branch is only taken if X is exactly zero after the first Gram-Schmidt orthogonalization, which will practically never be the case (unless the norm computation underflows). The condition should be replaced with something like

if norm_after_gs <= (m1 + m2) * eps * norm:
    x(:) = 0
    return

I agree the condition should either be removed or changed to another one involving some tolerance. I'm not sure which tolerance. I think it makes sense to use something like norm_after_gs <= eps * norm. I don't understand why we would need the factor (m1 + m2). Also, I think this:

lapack/SRC/sorbdb5.f

Lines 223 to 226 in 5d4180c

IF( SNRM2(M1,X1,INCX1) .NE. ZERO
$ .OR. SNRM2(M2,X2,INCX2) .NE. ZERO ) THEN
RETURN
END IF

should be modified using the same pattern.

lapack/SRC/sorbdb6.f

Lines 283 to 289 in 5d4180c

SCL1 = REALZERO
SSQ1 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
SCL2 = REALZERO
SSQ2 = REALONE
CALL SLASSQ( M1, X1, INCX1, SCL1, SSQ1 )
NORMSQ2 = SCL1**2*SSQ1 + SCL2**2*SSQ2

After the second Gram-Schmidt orthogonalization, SCL2 and SSQ2 are not updated properly.

Oh... this is a bug!

lapack/SRC/sorbdb6.f

Lines 290 to 295 in 5d4180c

*
* If second projection is sufficiently large in norm, then do
* nothing more. Alternatively, if it shrunk significantly, then
* truncate it to zero.
*
IF( NORMSQ2 .LT. ALPHASQ*NORMSQ1 ) THEN

This comparison might fail if the right-hand side underflows; this can happen if the vector X was already small in norm after the first Gram-Schmidt orthogonalization. X should be re-scaled.

Suppose we use a more appropriate check for norm_after_gs after the first Gram-Schmidt orthogonalization, as you suggest. Then X won't need to be re-scaled, right?

For completeness, a pull request should fix the vector increments as Brian Sutton correctly noted and also check or handle negative increments like other LAPACK functions.

I agree.

@briansutton @langou @weslleyspereira Please comment on my proposal.

edit ALPHASQ must be updated

I agree. The new value would be ALPHASQ = 0.1E0.

@christoph-conrads
Contributor Author

christoph-conrads commented Nov 11, 2021

This branch is only taken if X is exactly zero after the first Gram-Schmidt orthogonalization, which will practically never be the case (unless the norm computation underflows). The condition should be replaced with something like

if norm_after_gs <= (m1 + m2) * eps * norm:
    x(:) = 0
    return

[snip] I don't understand why we would need the factor (m1 + m2).

The factor is essentially the error bound n ε/2 |x^* y| on the result of a single inner product x^* y, multiplied by a magic constant convenient for programming. Strictly speaking, the right-hand side should be sqrt(n^2) ε/2; I mixed up the number of rows and columns in the expression.

In Higham, Accuracy and Stability of Numerical Algorithms (2002), the norm-wise error bound for matrix-vector multiplication y = Ax is given as ||y - y'||₂ ≤ sqrt(min(m, n)) γ(n) ||A||₂ ||x||₂, where

  • γ(n) = (n ε) / (2 - n ε),
  • m x n are the dimensions of the matrix A, and
  • y' is the result of the matrix computation in floating-point arithmetic (see §3.5 Matrix Multiplication).

With the improved error bounds in Jeannerod, Rump: Improved Error Bounds for Inner Products in Floating-Point Arithmetic (2013), it might be possible to take γ(n) = n ε/2. Strictly speaking, this holds only for the one, maximum, and Frobenius norms (see §5 Concluding Remarks). The expression in §5 Concluding Remarks can also be used for the case p = 2 (the spectral and Euclidean norm, respectively) because the matrix is isometric; hence ||A||_F = sqrt(n) ||A||₂.

Also, I think this:

lapack/SRC/sorbdb5.f

Lines 223 to 226 in 5d4180c

IF( SNRM2(M1,X1,INCX1) .NE. ZERO
$ .OR. SNRM2(M2,X2,INCX2) .NE. ZERO ) THEN
RETURN
END IF

should be modified using the same pattern.

If xORBDB6 considers a vector to be numerically zero, then it is set to zero; see SRC/sorbdb6.f, line 295 ff. Absent a dedicated boolean return value indicating whether the vector is considered to lie in the range of the matrix, I think this is the best solution because it spares the caller from having to know the exact test employed by xORBDB6 to check for numerical zero. This fix was also proposed by Brian D. Sutton in pull request 406.

edit ALPHASQ must be updated

I agree. The new value would be ALPHASQ = 0.1E0.

ALPHASQ should also be renamed to ALPHA.

@christoph-conrads
Contributor Author

christoph-conrads commented Jan 30, 2022

Suppose we use a more appropriate check for norm_after_gs after the first Gram-Schmidt orthogonalization, as you suggest. Then X won't need to be re-scaled, right?

I do not think so. If the input vector x initially has unit norm, then the algorithm will terminate early if the norm of (I - Q Q^*) x is less than n ε, where n is the dimension of x. Hence, if the projection is repeated, then ||(I - Q Q^*)^2 x||_2 is at least n^2 ε^2, which is much larger than any denormalized number for IEEE floats. Consequently, I would not expect underflows and hence see no need for re-scaling.
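The magnitude claim is easy to check numerically; the sketch below compares n^2 ε^2 with the smallest positive subnormal double, assuming IEEE 754 binary64:

```python
import sys

eps = sys.float_info.epsilon  # 2^-52 for binary64
smallest_subnormal = 5e-324   # smallest positive subnormal double

for n in (10, 1000, 10**6):
    bound = (n * eps) ** 2    # claimed lower bound after repeated projection
    # Even for modest n, the bound sits far above the subnormal range.
    print(n, bound, bound > smallest_subnormal)
```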

@langou
Contributor

langou commented Apr 12, 2022

I talked to @langou last week. He had something else to say about the precision of the final orthogonal vector depending on how far Q is from being a truly orthogonal matrix. I'm not sure if he will be able to comment on this thread this week

I have been wondering why we do not use Householder reflections for these routines, as opposed to Gram-Schmidt. This seems doable, and we would not have to worry about numerical stability.

The 2005 proof of the stability of CGS2 requires the assumption that the input matrix is numerically nonsingular. (So "twice is enough" holds if something like "epsilon * kappa < 1".) I do not know whether this condition holds in this piece of code.

There are a few places in LAPACK where we use Gram-Schmidt and I think we should make an effort to use Householder reflections instead.

Cheers,
Julien.

@christoph-conrads
Contributor Author

christoph-conrads commented Apr 24, 2022

The 2005 proof of the stability of CGS2 requires the assumption that the input matrix is numerically nonsingular. (So "twice is enough" holds if something like "epsilon * kappa < 1".) I do not know whether this condition holds in this piece of code.

The caller must ensure that Q has orthonormal columns; this requirement is implied by the function name (xORBDB6) and is clarified in the documentation as well.

@christoph-conrads
Contributor Author

I have been wondering why we do not use Householder reflections for these routines, as opposed to Gram-Schmidt. This seems doable, and we would not have to worry about numerical stability.

The matrix Q must have orthonormal columns, so Gram-Schmidt can be applied directly without negative consequences.

@langou
Contributor

langou commented Apr 24, 2022

Hi Christoph,

Even if Q is a perfectly orthonormal basis, we can imagine running into issues with a two-projection Gram-Schmidt scheme.

If x has many components in Span(Q) (||Q^T x|| ~ ||x||), then the first projection will essentially return a random vector. It is possible (unlikely, but possible) that this random vector is again in Span(Q). In that case the second projection is again a random vector, which is unlikely to be orthogonal to Span(Q). Therefore we can imagine that we do not have orthogonality after two steps.

The trick to proving that twice is enough is to assume that Q^T x is not too small. The assumption is essentially needed to prove that the noise of the first projection is smaller than the magnitude of the first projection, so that whatever pollution remains after the first projection is a problem for orthogonality but is small enough to guarantee that the first projection is not in Span(Q).

With all that being said, I think the code works with high probability and is very well written, so all good with me.

Julien.

christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Jul 14, 2022
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Dec 7, 2023
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Dec 19, 2023
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Dec 29, 2023
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Jan 17, 2024
christoph-conrads added a commit to christoph-conrads/lapack that referenced this issue Jan 19, 2024