Skip to content

Commit

Permalink
Revert "Implement NSDate as a small object (tagged pointer) - clang/l…
Browse files Browse the repository at this point in the history
…ibobjc2 only (#451)" because of GUI breakage: apps hang when loading NSMenu

This reverts commit 8fd2c06.
  • Loading branch information
rmottola committed Oct 31, 2024
1 parent fd5ec29 commit cac43f7
Show file tree
Hide file tree
Showing 6 changed files with 438 additions and 833 deletions.
17 changes: 1 addition & 16 deletions ChangeLog
Original file line number Diff line number Diff line change
Expand Up @@ -15,22 +15,7 @@
* Source/NSFileManager.m: Create an NSError object when we fail to
copy because the destination item already exists.

2024-10-10: Hugo Melder <[email protected]>

* Source/NSDate.m:
* Source/NSDatePrivate.h:
NSDate is now a small object in slot 6, when configuring GNUstep with the
clang/libobjc2 toolchain. This is done by compressing the binary64 fp
holding the seconds since reference date (because someone at Apple thought
it would be a good idea to represent a time interval as a fp). Note that
this assumes that IEEE 754 is used.
* Source/NSCalendarDate.m:
Remove access to the NSDate private concrete Class
and implement -timeIntervalSinceReferenceDate instead.
Previously, all methods of the concrete date class were
added to NSCalendarDate via GSObjCAddClassBehavior.

2024-23-09: Hugo Melder <[email protected]>
2024-09-23: Hugo Melder <[email protected]>

* Headers/Foundation/NSThread.h:
* Source/NSString.m:
Expand Down
9 changes: 4 additions & 5 deletions Source/NSCalendarDate.m
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,9 @@ @interface GSTimeZone : NSObject // Help the compiler
@class GSAbsTimeZone;
@interface GSAbsTimeZone : NSObject // Help the compiler
@end
@class NSGDate;
@interface NSGDate : NSObject // Help the compiler
@end


#define DISTANT_FUTURE 63113990400.0
Expand Down Expand Up @@ -393,6 +396,7 @@ + (void) initialize
absAbrIMP = (NSString* (*)(id,SEL,id))
[absClass instanceMethodForSelector: abrSEL];

GSObjCAddClassBehavior(self, [NSGDate class]);
[pool release];
}
}
Expand Down Expand Up @@ -459,11 +463,6 @@ + (id) dateWithYear: (NSInteger)year
return AUTORELEASE(d);
}

- (NSTimeInterval) timeIntervalSinceReferenceDate
{
return _seconds_since_ref;
}

/**
* Creates and returns a new NSCalendarDate object by taking the
* value of the receiver and adding the interval in seconds specified.
Expand Down
Loading

12 comments on commit cac43f7

@triplef
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmottola can you provide steps to reproduce the issue or a link to an open ticket for this?

I feel it’s a bit drastic to just revert someone else’s work here, especially when it was reviewed and approved in a pull request. It would be great if we could find the underlying issue and fix that instead.

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@triplef I do apologize if you felt as though this action was harsh, @rmottola acted on my instructions, so I take absolute and full and complete responsibility.

As chief maintainer, I reserve the right to summarily back out a change when it breaks the build or causes an issue where applications crash or won't start. It is not acceptable to leave the breakage in place until it is fixed as that could take an undefined amount of time and could impact the progress/work of others. This is something that is rarely, if ever, done, but when done it is done for good reason and never when only one person can reproduce the issue. This is usually an indication of our shortcomings with respect to testing or some other way of mitigating the issue, see my suggestions below.

Both @rmottola and I were able to reproduce this issue... the steps to reproduce for me were as follows:

  1. Delete /usr/GNUstep or wherever you have it installed
  2. from the directory you have the gnustep code installed in, for me, I did this:
    ./tools-scripts/clang-build clang clang++
    This script builds the entire project using the options given in the script...
  3. I am using Debian 12, with clang 14.0.6 and the tip of both libobjc2 and libdispatch (the apple version).
  4. I then built Gorm from scratch and installed it.
    RESULT: Upon invoking Gorm it waited a few seconds and got a segmentation fault. If you look at @rmottola 's comments,
    he provides a backtrace, mine was pretty much the same. The issue seemed to be that when the services menu was being
    updated the application recursively tried to update all of the menu items and fell into an infinite recursion.

OBSERVATIONS:

  • I saw this issue on Debian and @rmottola found it on OpenBSD.
  • This issue is easy to miss in base as we don't have any tests that test what happens to applications when base is modified. We, likely, should.
  • This issue seems confined to applications that contain a services menu

SUGGESTIONS:

  • We need to start testing apps on the libs-base side, at least in CI
  • Said apps need to have a services menu

Had either of these been in place, this issue would have been caught at the CI step instead of making it this far. Just FYI, this is not the first issue to have triggered this... the last time was about 15 years ago.

@hmelder Please do not take this in any negative way. Your work is excellent. @triplef I will create an issue for this so that we can more directly track commentary instead of here in the commit log. GC

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is here...
#457

Please many any additional comments there. I have assigned all interested parties.

@rmottola
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@triplef well, continuous integration is not a source of truth, just an aid. I tested on Linux and all gui apps were broken. I tested on Linux/Intel and NetBSD / sparc32 and both were broken. Gregory could reproduce the issue on his clang setup of which I do not have details.
I did the hard work of pinpointing the exact commit (having first looked ad gui for obvious reasons) and the consulted with Gregory.

Master trunk was broken in a "heavy" way, I feel it is justified to revert a single commit. It is not a judgment on @hmelder 's work nor on the review work, just a temporary measure. Back out and replaying of a commit (possibly with changes or with preliminary dependencies) is an accepted practice in other project. We just rarely resorted to this.

@triplef
Copy link
Member

@triplef triplef commented on cac43f7 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details and finding this issue.

Of course any change can break things in unexpected ways, despite the best testing and CI. I’m sure this has happened to all of us.

Personally I would just have expected a note in the pull request or a new issue to give whoever wrote the code that caused the issue a chance to chime in and ideally fix the issue. Otherwise any larger change could end up being reverted and reapplied with fixes multiple times, which I don’t think is a great way to move forward. Just my thoughts.

@fredkiefer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I would like to see to come out of this are more tests. Of course adding application test to base as Greg suggested isn't what I have in mind. But now we know which specific usage of -earlierDate: was broken and should be fairly simple to add a few test case to the file Tests/base/NSDate/general.m that reflect the new usage pattern and ensure it won't get broken in the future.
My feeling there is that the tiny object change for NSDate itself did not break that code but rather the code was already wrong and somehow the change triggered this issue. And Richards latest change seem to confirm that.

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should have a test for the issue in -earlierDate:. I also agree that the code in the GUI was likely already wrong, as @fredkiefer pointed out.

I had suggested adding a test in CI that tests an application because, had there been one, this issue would have been found before review. Sort of the "shotgun" approach. Naturally, we can never know beforehand what issues might arise, so it might not catch everything.

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@triplef well, continuous integration is not a source of truth, just an aid. I tested on Linux and all gui apps were broken. I tested on Linux/Intel and NetBSD / sparc32 and both were broken. Gregory could reproduce the issue on his clang setup of which I do not have details. I did the hard work of pinpointing the exact commit (having first looked ad gui for obvious reasons) and the consulted with Gregory.

Master trunk was broken in a "heavy" way, I feel it is justified to revert a single commit. It is not a judgment on @hmelder 's work nor on the review work, just a temporary measure. Back out and replaying of a commit (possibly with changes or with preliminary dependencies) is an accepted practice in other project. We just rarely resorted to this.

Just to clarify... when I say that I take responsibility I am only saying I gave the go-ahead to do the reversion. @rmottola did the work of finding the exact commit by bisecting the code.

When there is a breaking change that can hold up the work of others the responsibility for correcting the issue lies with the original committers or (failing that) the maintainer of whatever package is affected. This is why it was reverted.

@triplef
Copy link
Member

@triplef triplef commented on cac43f7 Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is a breaking change that can hold up the work of others the responsibility for correcting the issue lies with the original committers [...]

That’s fine, but my point is that (as far as I know) the original committer was not notified of the issue. IMO that should have been the first step after finding the issue.

Anyway, thank you @rfm for fixing the issue and reapplying the patch so quickly.

Edit: Nevermind, I just saw the comments attached to this commit where Hugo was in fact notified.

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rfm what tests do you believe should come out of this to prevent this type of breakage in the future? You've seen the suggestions by myself and @fredkiefer. Do you have any thoughts?

@rfm
Copy link
Contributor

@rfm rfm commented on cac43f7 Nov 4, 2024 via email

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gcasa
Copy link
Member

@gcasa gcasa commented on cac43f7 Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed 100%. That is the best we can do. As I pointed out also (as you did as well) this was an EXTREMELY easy thing to miss. Especially if you're working in a bubble where you are solving one problem. The issue came up only in this specific case and ONLY affected certain applications (not all apps have service menus). So I guess the determination is to create tests in base that would catch the specific issue you mentioned and keep expanding our tests to prevent regression. @rfm @fredkiefer @hmelder Thank you guys for being on top of this and for making this project better and better.

Please sign in to comment.