video_core: Use adaptive mutex on Linux #2105

Open
wants to merge 1 commit into
base: main

ngoquang2708
Contributor

@ngoquang2708 ngoquang2708 commented Jan 9, 2025

Fix performance regression with #1973 on SteamDeck (#2048).

Tested with reduced resolution to reduce GPU load and vblank divider set to 2.

SteamDeck: [images: main, pm+wm]

Laptop: [attachments: main.mp4, pm+wm.mp4]

@kalaposfos13
Contributor

I tested four games: Bloodborne, GRR, GR2, and Nier: Automata. This PR made no major difference in the last three, but it made Bloodborne run at 15-20 FPS in the Hunter's Dream, down from a stable 60 on main (tested 3 times, so it's consistent). It should be noted, though, that even on main the framerate deteriorates to the same level after a few minutes (and this happens in all games).
Specs: 3060 Laptop GPU (6GB), 12700h laptop CPU, 16 GB RAM, Linux Mint 22 with KDE Plasma 5

@GHU7924

GHU7924 commented Jan 9, 2025

I tested your PR on Steam Deck.
I should say up front that I used patches, but in theory these patches should not greatly affect the game (with one exception, the resolution patch).
I don't know how it will be in the future, but for now, playing without patches on the Steam Deck spoils the experience.

I played at 12 W. Vblank = 2. Profile: 30 FPS (90 Hz). Play time: 1 h 30 min.
I used the following patches:
20250109111214_1

Here are some screenshots:
20250109113257_1
20250109113409_1
20250109113948_1
20250109114202_1

In general, the game tries to run at about 30 FPS, but the FPS drops due to delays.
The performance has improved and become more stable.

@DemoJameson
Contributor

DemoJameson commented Jan 9, 2025

I got worse performance on the PR

CPU: Intel 13600kf
GPU: NVIDIA 4080S
RAM: 32GB
OS : Windows 11 24H2

Main:
main

PR:
pr

@JvictorVentura

This PR fixed the performance regression for me. My FPS has increased to 50 from 17.
CPU: Intel i5-9400F (6) @ 4.100GHz
GPU: GTX 1050 Ti 4GB VRAM
OS: Linux Mint 22 MATE
RAM: 16GB

@diegolix29

diegolix29 commented Jan 9, 2025

For me it also reduced performance by 8-10 FPS overall.
Intel i9-12900K
3080, 12 GB VRAM
64 GB RAM
Windows 10

diegolix29 added a commit to diegolix29/shadPS4 that referenced this pull request Jan 9, 2025
@smiRaphi

smiRaphi commented Jan 9, 2025

I went from 125-135 FPS to 120-125 FPS with this PR.
Ryzen 9 7950X3D
RTX 4090
96GB RAM
win 11

@ngoquang2708
Contributor Author

ngoquang2708 commented Jan 10, 2025

Some interesting test results. So basically:

  • Using a mutex instead of a spinlock helps weaker CPUs, especially APUs: it lets them "rest" more and draw less power, which is shared with the GPU.
  • More powerful CPUs can just "brute-force" their way to more frames with a spinlock.

I wonder if we can find a middle ground, something that spins on the lock a number of times and then sleeps, like an adaptive mutex?
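
A minimal sketch of that middle ground, assuming a plain `std::mutex` underneath: try the lock a bounded number of times, then fall back to a blocking lock so the thread can sleep. The names `SpinThenBlockMutex`, `max_spins`, and `run_counter_demo` are hypothetical and only for illustration; this is not code from this PR.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration of "spin a few times, then sleep".
// Not the implementation used by this PR.
class SpinThenBlockMutex {
public:
    explicit SpinThenBlockMutex(int max_spins = 100) : max_spins_(max_spins) {}

    void lock() {
        // Spin phase: cheap if the owner releases the lock quickly.
        for (int i = 0; i < max_spins_; ++i) {
            if (inner_.try_lock()) {
                return;
            }
        }
        // Blocking phase: let the OS put this thread to sleep until the lock is free.
        inner_.lock();
    }

    bool try_lock() { return inner_.try_lock(); }
    void unlock() { inner_.unlock(); }

private:
    std::mutex inner_;
    int max_spins_;
};

// Small demo: several threads increment a shared counter under the lock.
long run_counter_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinThenBlockMutex mutex;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<SpinThenBlockMutex> guard(mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Because the class provides `lock`, `try_lock`, and `unlock`, it works with `std::lock_guard` and `std::unique_lock`. The spin count is the tuning knob: a small value behaves like a regular mutex, a large one like a spinlock.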

@ngoquang2708 ngoquang2708 changed the title from "video_core: Use std::mutex" to "video_core: Use adaptive mutex on Linux" Jan 10, 2025
@ngoquang2708
Contributor Author

I have changed this PR to use an adaptive mutex on Linux, so it should not affect other OSes. It needs a retest from Linux users.
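
For readers unfamiliar with the term: glibc on Linux offers an adaptive mutex type through pthreads, which spins briefly on contention before sleeping. The sketch below wraps it behind the standard lock interface and falls back to `std::mutex` elsewhere. Whether this PR uses exactly this mechanism is an assumption, and `HostAdaptiveMutex` and `run_host_mutex_demo` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>
#if defined(__linux__)
#include <pthread.h>
#endif

// Hypothetical wrapper; an assumption about how a Linux-only adaptive mutex
// could look, not code taken from this PR.
#ifdef PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP
class HostAdaptiveMutex {
public:
    HostAdaptiveMutex() = default;
    HostAdaptiveMutex(const HostAdaptiveMutex&) = delete;
    HostAdaptiveMutex& operator=(const HostAdaptiveMutex&) = delete;
    ~HostAdaptiveMutex() { pthread_mutex_destroy(&mutex_); }

    void lock() { pthread_mutex_lock(&mutex_); }
    bool try_lock() { return pthread_mutex_trylock(&mutex_) == 0; }
    void unlock() { pthread_mutex_unlock(&mutex_); }

private:
    // glibc-specific static initializer for an adaptive (spin, then sleep) mutex.
    pthread_mutex_t mutex_ = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP;
};
#else
// Other platforms keep the regular mutex, so their behavior is unchanged.
using HostAdaptiveMutex = std::mutex;
#endif

// Small demo: several threads increment a shared counter under the lock.
long run_host_mutex_demo(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    HostAdaptiveMutex mutex;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<HostAdaptiveMutex> guard(mutex);
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

Guarding on the glibc macro rather than on `__linux__` alone means the fallback also covers Linux systems whose C library lacks the extension.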

Fix performance regression with shadps4-emu#1973 on SteamDeck
@DiGik05

DiGik05 commented Jan 10, 2025

> I have changed this PR to use an adaptive mutex on Linux, so it should not affect other OSes. It needs a retest from Linux users.

I would like to test it on SteamOS, but I am not sure how to get the AppImage to do that. Can you let me know?

@JvictorVentura

> This PR fixed the performance regression for me. My FPS has increased to 50 from 17.
> CPU: Intel i5-9400F (6) @ 4.100GHz
> GPU: GTX 1050 Ti 4GB VRAM
> OS: Linux Mint 22 MATE
> RAM: 16GB

I didn't see any difference in the performance, or at least I didn't notice any.

@JvictorVentura

> > I have changed this PR to use an adaptive mutex on Linux, so it should not affect other OSes. It needs a retest from Linux users.

> I would like to test it on SteamOS, but I am not sure how to get the AppImage to do that. Can you let me know?

On the "Actions" tab, look for "video_core: Use adaptive mutex on Linux", click on the most recent run, then download the linux-qt version.

@DiGik05

DiGik05 commented Jan 10, 2025

shad_log.txt
I can confirm a performance improvement on Steam Deck.

@nr1971

nr1971 commented Jan 10, 2025

While I'm not seeing any increase in FPS, this really helps smooth out my frame rate by getting rid of the low dips. For example, while standing at the Central Yharnam lamp, my FPS would hover around 65 but drop to 52-53 every 5 seconds or so. This is gone now.
Ryzen 5600x, RX 6700xt, 32GB ram, Ubuntu 24.04

@Missake212

I tested a bit in Central Yharnam, comparing latest main with this PR. I didn't notice any performance difference on Windows like other people said. I think with a little more testing this might be good to go.

@ngoquang2708
Contributor Author

> Tested a bit in Central Yharnam comparing latest main with this, didn't notice any performance difference on Windows like other people said, think with a little more testing this might be good to go.

This PR now only affects Linux.

@roamic
Collaborator

roamic commented Jan 11, 2025

@ngoquang2708 AFAIK we have an adaptive mutex implementation for Windows in src/core/libraries/kernel/threads/mutex.cpp. You can try to use it.

@ngoquang2708
Contributor Author

> @ngoquang2708 AFAIK we have an adaptive mutex implementation for Windows in src/core/libraries/kernel/threads/mutex.cpp. You can try to use it.

Isn't that for guest-side use? This PR uses an adaptive mutex on the host side, and only on Linux, since I haven't seen any complaints about a performance regression from Windows users.

@roamic
Collaborator

roamic commented Jan 11, 2025

> > @ngoquang2708 AFAIK we have an adaptive mutex implementation for Windows in src/core/libraries/kernel/threads/mutex.cpp. You can try to use it.

> Isn't that for guest-side use? This PR uses an adaptive mutex on the host side, and only on Linux, since I haven't seen any complaints about a performance regression from Windows users.

Ah, you're right, my bad.

@ngoquang2708
Contributor Author

@kalaposfos13 Can you help retest the new changes on your system?

@kalaposfos13
Contributor

The performance is now yet again similar to main. Side note, I was able to significantly lessen the performance degradation and lag spikes by manually installing the 565 Nvidia drivers (Mint/apt only comes with 550), but it didn't completely solve the problem.

@ngoquang2708
Contributor Author

@kalaposfos13 Thanks.
