-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use monotonic time (clock_gettime) for cups_enum_dests #1083
base: master
Are you sure you want to change the base?
Conversation
Using gettimeofday here is not safe against clock jumps (e.g. a manual setting of the clock). Use clock_gettime instead. Signed-off-by: Adam Williamson <[email protected]>
For context, see https://bugzilla.redhat.com/show_bug.cgi?id=2316066 . In Fedora 41 validation testing we found a problem where live images sometimes weren't working properly for the first minute or so after boot. We worked out that this is associated with a system time jump that often happens during boot of live images on systems with the hardware clock set to UTC. Our live images assume the hardware clock is set to localtime and the system timezone is US Eastern, so they set the system clock accordingly; when chrony kicks in during boot and finds the real current time, this can cause a system clock jump of several hours. If we adjust things so no clock jump occurs, the bug doesn't happen. Steve tracked it down to the org.gnome.SettingsDaemon.PrintNotifications.service user service timing out on startup, which blocked some other things from starting up properly too, especially xdg-desktop-portal-gnome.service , which is likely the source of the observed problems. I then noticed the backtrace of that service (after systemd kills it for exceeding its startup timeout) shows it apparently waiting for a response from I built a live image with a scratch build of our cups package with this patch applied (backported to 2.4.11), and in some initial testing by Steve it seems to have resolved the problem. I do note there's a lot of other calls to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose using '/ 1000000' instead of dividing /1000 twice, but otherwise the changes look good to me.
I've checked the other usages of gettimeofday in libcups which could cause similar issue and IMO cups_dnssd_resolve()
and cups_dnssd_resolve_cb()
should be using the monotonic clock as well, but since we can't reproduce it at the moment, we migrate it later.
I've checked the CI errors as well - MacOS fails due unrelated reason, however Windows does not know the function or macro - probably we have to use a different way for Windows.
gettimeofday(&nt, NULL); | ||
|
||
msecs = (int)(1000 * (nt.tv_sec - t->tv_sec) + (nt.tv_usec - t->tv_usec) / 1000); | ||
msecs = (int)(1000 * (nt.tv_sec - t->tv_sec) + (nt.tv_nsec - t->tv_nsec) / 1000 / 1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we can use '/ 1000000' instead dividing by 1000 twice.
@@ -3058,15 +3058,14 @@ cups_dnssd_unquote(char *dst, // I - Destination buffer | |||
// | |||
|
|||
static int // O - Elapsed time in milliseconds | |||
cups_elapsed(struct timeval *t) // IO - Previous time | |||
cups_elapsed(struct timespec *t) // IO - Previous time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation issue, but it is not a blocker.
Yes, this needs cleanup and we need to determine the proper functions to use on Windows for the same purpose (if the current code isn't suitable). I also want to check on how long this has been supported as some of these POSIX functions are "standard" but with loads of optional bits - I have a vague memory of Solaris not supporting the monotonic clock, among others, so we'll need to research this a bit more before we "pull the trigger"... |
Roger, those all sound like excellent improvements. I'll try and get back and work on this once we're through Fedora 41 final freeze (which is where this came from). FWIW though I'd say this is a bug fix, not an enhancement. To be clear, I found this as the cause of a fairly visible bug in Fedora - https://bugzilla.redhat.com/show_bug.cgi?id=2316066 . It was causing GNOME's print notifications daemon to get blocked on startup until it timed out and systemd killed it; during that time, various other things didn't work, because some other necessary stuff doesn't get started up till the print notifications daemon is running... |
From fish-shell/fish-shell#6440 it seems like macOS is probably OK since 10.12, and that's a long way out of support lifetime now. https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-gettickcount64 seems like possibly the best option on Windows. Googling around suggests CLOCK_MONOTONIC does exist on Solaris (as an alias to CLOCK_HIGHRES - see vim/vim#10647 ), but does not exist on HP-UX - vim/vim#10873 . nmap/nmap#180 has even more icky details, including some differences in CLOCK_MONOTONIC behaviour across platforms. Various people who've run into this mess before suggest things like falling back to gettimeofday or setitimer/getitimer. There's a library at https://github.com/ThomasHabets/monotonic_clock which tries to handle various platforms but only has tested implementations for Linux, OpenBSD and Solaris, plus apparently untested mach and Windows support. In summary I don't know what to do! Maybe get/setitimer would be viable here? I dunno. |
@AdamWill What I've pushed thus far uses The new proposed API is a simple "elapsed time in seconds" API with a defined resolution of at least 10ms: double cupsGetClock(void); The use of "double" allows for 54 bits of precision and effectively unlimited run time support. The implementation bases the return values on the first call to [master 3395443] Add cupsGetClock API. With the new API, it should be fairly straight-forward to refactor the dest and other timing-specific code and eliminate a bunch of the platform-specific code that gets sprinkled throughout. |
I will also note, if this gets back-ported to 2.4.x we will name it |
Looks good to me! Some of the references I found suggested CLOCK_MONOTONIC_RAW might be preferred over CLOCK_MONOTONIC for this purpose if it's present, but not 100% sure. |
It looks like the difference is I had never heard of it until last week, when I noticed dbus-broker was using it. GLib uses vanilla |
Using gettimeofday here is not safe against clock jumps (e.g. a manual setting of the clock). Use clock_gettime instead.