Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs in fixed division #5698

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

hexagonrecursion
Copy link
Contributor

Fixes #5697

Note: the return statement at the end does convert Uint64 to Sint64, which is implementation-defined in C++17 if the value is > INT64_MAX. Here is why I think this is probably OK:

  1. My biggest argument why I think this is OK is that C++20 guarantees wraparound modulo 2^64 in this case.
  2. gcc guarantees the right result
  3. Microsoft C++ guarantees the right result
  4. clang unfortunately does not document its implementation-defined behaviour

@hexagonrecursion
Copy link
Contributor Author

hexagonrecursion commented Dec 27, 2023

I believe I have found a way to implement unsigned to signed conversion with wraparound in C++ without relying on implementation defined behavour. Would this be overkill?

#include <cstdint>

typedef int64_t Sint64;
typedef uint64_t Uint64;

Sint64 good(Uint64 num) {
    if (num > (Uint64) INT64_MAX) {
        // Implement wrapardound
        num -= (Uint64) INT64_MAX;
        num -= 1;
        Sint64 result = num;
        result += INT64_MIN;
        return result;
    }
    return num;
}

@impaktor
Copy link
Member

impaktor commented Jan 5, 2024

@hexagonrecursion Since you seem to have been playing around with our code base, I'm just informing you we'll soon feature freeze (-ish), for the annual February 03 release.

@hexagonrecursion
Copy link
Contributor Author

Thanks. I have a lot on my plate though so I don't expect to make more pioneer pull requests any time soon.

@sturnclaw
Copy link
Member

sturnclaw commented Jan 7, 2024

Would this be overkill?

The primary concern of the fixed-point function library is determinism between GCC-on-Linux and MSVC-on-Windows. Performance is a very close second, so if GCC and MSVC provide uniform outcomes of their implementation-defined behavior (and Clang can be tested to comply with the expected behavior) then I would strongly advise against introducing additional branches into math code that is expected to have extremely high throughput.

As a corollary, given this PR addresses very niche/specific behavior which can be implementation or platform dependent and thus remain "invisible" if broken, I'd strongly recommend adding a new test case validating the outcome of the specific division cases you're addressing to our unit testing suite. We define unit tests in src/test using DocTest, and it should be extremely simple to add new tests to that suite.

EDIT: I had not yet reached the point in the diff where a test case doing exactly that had been added, disregard!

Copy link
Member

@sturnclaw sturnclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'm quite glad someone is addressing this. I believe the fixed-point code to either predate or be written around the time C++11 was introduced, and certainly well before Pioneer began adopting the standard, much less C++17.

Thank you for taking the time to ensure it is numerically correct and stable, I'm sure it will save some time in the long run otherwise spent hunting down ghost bugs!

I've left one change request that I'd like to see addressed before merge - minimizing branches where possible in high-throughput code is strongly preferred, and the system generation code performance as a whole is primarily dependent on the fixed-point math library.

src/fixed.h Outdated
Comment on lines 152 to 161
Uint64 abs_a = a.v, abs_b = b.v;
bool is_neg = false;
if (a.v < 0) {
abs_a = -abs_a;
is_neg = !is_neg;
}

if (b.v < 0) {
abs_b = -abs_b;
is_neg = !is_neg;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly recommend using bitwise ops and std::abs rather than explicit branches, as it will generally compile into optimized code under more scenarios. E.g. something like:

uint64_t abs_a = std::abs(a.v), abs_b = std::abs(b.v);
bool is_neg = int(a.v < 0) ^ int(b.v < 0);

(Note: this is well-defined, true is defined to promote to the integral value 1.)
See this Godbolt for example of how this affects code that would be generated in debug mode: https://godbolt.org/z/6MsTPv8dz

Yes, under -O3 or equivalent this code would produce the same generated assembly, but the above suggestion will both produce better performance in debug mode (wasting less programmer time) and more clearly expresses what is logically occurring in the if statements without the extra cognitive overhead of the order-dependent logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly recommend using bitwise ops and std::abs rather than explicit branches, as it will generally compile into optimized code under more scenarios. E.g. something like:

uint64_t abs_a = std::abs(a.v), abs_b = std::abs(b.v);
bool is_neg = int(a.v < 0) ^ int(b.v < 0);

(Note: this is well-defined, true is defined to promote to the integral value 1.) See this Godbolt for example of how this affects code that would be generated in debug mode: https://godbolt.org/z/6MsTPv8dz

Yes, under -O3 or equivalent this code would produce the same generated assembly, but the above suggestion will both produce better performance in debug mode (wasting less programmer time) and more clearly expresses what is logically occurring in the if statements without the extra cognitive overhead of the order-dependent logic.

Done. Please note that with the requested changes dividing fixed(INT64_MIN) by anything or dividing anything by fixed(INT64_MIN) causes undefined behaviour because std::abs(INT64_MIN) is undefined behaviour.

@@ -0,0 +1,33 @@
#include <limits>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add:

// Copyright © 2008-2024 Pioneer Developers. See AUTHORS.txt for details
// Licensed under the terms of the GPL v3. See licenses/GPL-3.txt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I forgot

@impaktor
Copy link
Member

@hexagonrecursion Did you have time to address the requested changes? We might do a release next month, so just checking status on this PR.

Note: this is technically undefined behavior if a.v or b.v is _exactly_ INT64_MIN,
but the upside that this compiles to faster code even under -Og
@hexagonrecursion hexagonrecursion marked this pull request as draft April 26, 2024 08:45
@hexagonrecursion hexagonrecursion marked this pull request as ready for review April 26, 2024 09:58
Copy link
Member

@sturnclaw sturnclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest a "canary" test be added to the test suite to check for the behavior of division tests involving INT64_MIN which rely on undefined behavior. Otherwise, this looks good to me - thanks for addressing the review feedback.

Because this has a non-zero chance of implicitly altering procedural generation determinism when merged, I'm going to defer merge of this PR until we've fully started the development cycle for the next major release.

@hexagonrecursion
Copy link
Contributor Author

I'd suggest a "canary" test be added to the test suite to check for the behavior of division tests involving INT64_MIN which rely on undefined behavior

@Web-eWorks I do not understand. At the time of writing none of the tests added by this pull request (to the best of my knowledge) rely on undefined behavior. Are you suggesting we add a test that deliberately triggets undefined behaviour? What would this test assert?

The use of std::abs() is to the best of my knowledge the only source of undefined behaviour in my current implementation of fixedf operator/(fixedf,fixedf). This can only be triggered by passing fixed(INT64_MIN) as a numerator or demoninator to the / operator - all other values should be fine.

undefined behaviour-free version:

int64_t a = INT64_MIN;
uint64_t abs_a = a;
if (a < 0 /* true */) {
    abs_a = -abs_a;  // Defined: -uint64_t(INT64_MIN) == uint64_t(INT64_MIN)
}

undefined behaviour version:

int64_t a = INT64_MIN;
// Undefined behaviour because the return type of std::abs() is
// int, long or long long (depending on overload resolution)
// can't represent the absolute value of INT64_MIN
uint64_t abs_a = std::abs(a);  
  1. I thought you suggested std::abs() because you were confident fixed(INT64_MIN) will never occur in practice
  2. It is impossible to "test" undefined behaviour - a passing test proves nothing.
  3. A test that "tests" undefined behaviour is problematic because it will be flagged when someone runs the test suite with sanitizers.

@hexagonrecursion
Copy link
Contributor Author

hexagonrecursion commented Apr 28, 2024

What do you think I should do?

Brainstorming alternatives:

  1. std::abs(a.v) - the current implementation
    • undefined behaviour if the absolute value of a.v is outside the range of the return type of abs().
    • For C++20 there should be only one such value: INT64_MIN
    • For gcc (regardless of C++ standard) there should be only one such value: INT64_MIN because "GCC supports only two’s complement integer types"
    • For msvc (regardless of C++ standard) there should be only one such value: INT64_MIN because "Signed integers are represented in two's-complement form"
    • So don't divide by fixed(INT64_MIN) and don't divide fixed(INT64_MIN) by anything
  2. if (a.v < 0) abs_a = -abs_a;
    • Free of UB
    • Adds a branch under -Og - may or may not have a measurable performance impact because the same function contains a loop with 128 iterations - the cost of the loop may overshadow the cost of the branch. I could look into doing a performance benchmark.
  3. I could look into the possibility of changing optimization level for one function
  4. I could look into the possibility of implementing the calculation of the absloute value without branches using inline assembly with a portable C++ fallback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bugs in fixed.h division
3 participants