Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: run SaveGameStats in a job #5934

Merged
merged 3 commits into from
Oct 17, 2024

Conversation

mwerle
Copy link
Contributor

@mwerle mwerle commented Oct 9, 2024

##' Bugs / Feature requests

Attempts to fix #5929

Background

The idea is to run this expensive task in an asynchronous job and notify the Lua side once it is completed. This is to prevent the stutter from occurring every time the SaveGameStats API is called from Lua.

Details

The code was modified to call the expensive part of the SaveGameStats API (loading and parsing the game file) into a Job. Once finished, the data is marshalled into a LuaTable and a LuaEvent is triggered to pass the data to the Lua side.

The Lua side registers an Event listener, which parses the data and updates the SaveGame cache entry for the data.

Practically, this is achieved by the following changes :

  • add a job manager to the LuaManager class which can run any job on
    the Lua job queue.
  • add a LoadGameToJsonJob implementation in SaveGameManager, and a new function, LoadGameToJsonAsync(), which schedules an instance of this job on the Lua job queue.
  • call this new function from Lua, as well as registering a callback for when the job completes, in saveloadgame.lua.

Many thanks to @Web-eWorks for feedback throughout the development of this change (details of which have been squashed, but the implementation went through several iterations) as well as providing the Lua profiler and removing the no longer required SaveLoadWindow::update() callback.

Copy link
Member

@sturnclaw sturnclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial feedback - instead of adding the new LuaEvent class to piecewise construct an argument table, construct a LuaTable object inside of l_game_savegame_stats_callback, add values to it as normal, and pass that LuaTable handle as the args parameter to LuaEvent::Queue. (Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

If that doesn't currently compile, let me know and I'll do the template metaprogrammery to get it to work.

A note - we tend to reserve the l_<namespace>_<method> naming convention for functions following the Lua_Function API prototype, i.e. static int l_name(lua_State *l). This is just a "regular" callback and can have any particular name.

Regarding profiling, we have an in-process profiler for C++ which can be used via the PROFILE_SCOPED() macro (with the appropriate include of profiler/Profiler.h). Details on usage are in the dev docs.

src/lua/LuaGame.cpp Outdated Show resolved Hide resolved
@sturnclaw
Copy link
Member

Thanks for working on this! We unfortunately don't have good (or any) Lua-side profiling utilities at current, but I'll pull the branch and take a closer look to see what's responsible for the stutters.

@sturnclaw
Copy link
Member

Finally, as a suggestion - you may want to move the JobSet into a more central API like Pi::App() or Lua::manager as a shared "Lua Jobs" interface. That would avoid the need to keep a static object at file scope, and allow object deinitialization to be controlled as well as initialization.

@sturnclaw
Copy link
Member

Hacked in a manual lua-side profiler to test performance. Note that printing to the console has an overhead of about 0.065ms - I've disabled most of the print statements for the sake of measuring real performance.

With hot caches:

Info: PROFILE | SaveLoadWindow:makeFilteredList() took 0.0085ms
Info: PROFILE | SaveLoadWindow:makeFileList() took 0.1471ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0033ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0036ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0038ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0040ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0039ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0050ms
Info: error in JSON file 'savefiles/_autosave9': [json.exception.parse_error.110] parse error at 1: unexpected end of input
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0049ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0038ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0029ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0041ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0054ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0034ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0034ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0041ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0113ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0042ms

With fully cold cache (echo 3 | sudo tee /proc/sys/vm/drop_caches):

Info: PROFILE | SaveLoadWindow:makeFilteredList() took 0.0092ms
Info: PROFILE | SaveLoadWindow:makeFileList() took 5.8037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0030ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0033ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0043ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0036ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0028ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0048ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0069ms
Info: error in JSON file 'savefiles/_autosave9': [json.exception.parse_error.110] parse error at 1: unexpected end of input
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0060ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0027ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0031ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0045ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0039ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0042ms

Looking at the profile from that run (Engine.RequestProfileFrame()), we're almost entirely disk bound inside of SaveGameManager::ListSaves.
image

With the manual rate-limiting removed from SaveLoadWindow:update(), loading information about saves is almost instant as all saves are processed at once. I'm not seeing the major stutter you mentioned either - even with cold caches loading the list of saves from disk (with an N of 18) causes at most a single frame skip at 60Hz.

This does raise the follow-on question of also moving Game.ListSaves() to an off-thread operation - as I imagine a user with a decent number of savefiles (say, 150) in combination with a legacy HDD or eMMC boot drive could potentially have an observable stutter in that function lasting a significant fraction of a second.

Would you like me to push my profiler implementation to this branch?

@mwerle
Copy link
Contributor Author

mwerle commented Oct 10, 2024

Thank you for having a look and providing feedback!

Initial feedback - instead of adding the new LuaEvent class to piecewise construct an argument table, construct a LuaTable object inside of l_game_savegame_stats_callback, add values to it as normal, and pass that LuaTable handle as the args parameter to LuaEvent::Queue.

I'm afraid I still have extremely limited knowledge of how the Lua stack works and what can and can't be constructed on it and passed around. It's not really made any clearer (to me) by the wrappers you guys have developed which have various restrictions such as the ScopedTable. If there is a book or other training material you can point me at so I can learn more about it that would be appreciated.

So I didn't think a LuaTable could be constructed on the stack and then passed around and modified etc before invoking a Lua function while other parts of the code also potentially modify the Lua stack (ie, another Job in another thread).

Inside l_game_savegame_stats_callback I don't have a lua_State which can be used to create a table.

In the "normal" functions the lua_State comes directly as a parameter, but I didn't know if that state would survive being passed into another thread and used/modified there while other Lua functions run in other threads. I kinda figured the Lua stuff would probably all need to be on the same thread without adding a lot of synchronisation.

This is why I made that LuaEvent class to construct the table and manage the lua_State. But looking at it now, it probably boils down to the same thing.. as I said, C++ isn't my forte, and the whole Lua thing is brand-new to me.

(Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

???

A note - we tend to reserve the l_<namespace>_<method> naming convention for functions following the Lua_Function API prototype, i.e. static int l_name(lua_State *l). This is just a "regular" callback and can have any particular name.

Noted.

Regarding profiling, we have an in-process profiler for C++ which can be used via the PROFILE_SCOPED() macro (with the appropriate include of profiler/Profiler.h). Details on usage are in the dev docs.

I saw the macro, but haven't had time to look into it. Thanks for the link, I'll have a read.

@mwerle
Copy link
Contributor Author

mwerle commented Oct 10, 2024

Would you like me to push my profiler implementation to this branch?

Yes please. No need for me to reinvent the wheel..

@sturnclaw
Copy link
Member

I'm afraid I still have extremely limited knowledge of how the Lua stack works and what can and can't be constructed on it and passed around.

This is partially why the Lua*** wrappers were developed, to simplify interacting with Lua from the programmer's perspective. I'm sorry they're not clear and easy to use either - we (at this point, I) probably should write a primer on interacting with Lua and add it to the dev docs.

So I didn't think a LuaTable could be constructed on the stack and then passed around and modified etc before invoking a Lua function while other parts of the code also potentially modify the Lua stack (ie, another Job in another thread).

but I didn't know if that state would survive being passed into another thread and used/modified there while other Lua functions run in other threads.

Lua is implicitly and explicitly single-threaded only. Only the main thread is allowed to call any Lua API functions on the main Lua state (Lua::manager->GetLuaState()). Inside of a Job / Task's main execution method, you generally will never see any Lua API being called.

I should have mentioned this when I was originally describing the problem to you, sorry. 😅

The OnFinish method of the Job class specifically always runs on the main thread for this reason, to allow interaction with non-threaded resources like OpenGL or Lua.

Other Lua states may be constructed for use on separate threads, but they operate under a share-nothing model in relation to the main Lua state from both a C and Lua-side perspective.

As mentioned in another reply, from the "main" thread you can lua_State *l = Lua::manager->GetLuaState(); to interact with the game's Lua context at (almost*) any time.

* Startup and shutdown excluded, for obvious reasons.

This is why I made that LuaEvent class to construct the table and manage the lua_State. But looking at it now, it probably boils down to the same thing.. as I said, C++ isn't my forte, and the whole Lua thing is brand-new to me.

Don't worry about it at all - this is why I'm taking the time to review and provide feedback in such depth on your PRs. My hope is to help smooth along the learning process; feel free to mention me (as sturnclaw) in IRC with questions if that's a more expedient way to learn.

(Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

???

Here's an annotated example:

// Create a new table on the Lua stack.
// This is a shorthand to call lua_newtable(l); LuaTable(l, -1).
// Passing a stack index to the constructor instead aliases an existing table.
LuaTable tab { l };

// Interact with the table as normal, don't need to bother with the underlying Lua API.
tab.Set("myKey", myVal);

// Pass this table to a Lua event. The LuaTable wrapper aliases the underlying table on
// the Lua stack when copied.
// LuaEvent::Queue pushes all parameters to Lua and then destroys its arguments. You
// generally don't have to worry about lifetime concerns as long as you're not passing
// a ScopedTable handle.
// LuaEvent::Queue is guaranteed to leave the stack in the same state as when you called it.
LuaEvent::Queue("myEvent", tab);

// Because LuaTable aliases a table on the stack, it does not remove that value from
// the stack when it is destroyed. Instead, you have to manually call lua_pop() to remove
// the table implicitly created via LuaTable(lua_State *l).
lua_pop(l, 1);

data/pigui/modules/saveloadgame.lua Outdated Show resolved Hide resolved
src/lua/LuaEvent.h Outdated Show resolved Hide resolved
data/pigui/modules/saveloadgame.lua Outdated Show resolved Hide resolved
@mwerle
Copy link
Contributor Author

mwerle commented Oct 11, 2024

Finally, as a suggestion - you may want to move the JobSet into a more central API like Pi::App() or Lua::manager as a shared "Lua Jobs" interface. That would avoid the need to keep a static object at file scope, and allow object deinitialization to be controlled as well as initialization.

Sure; I first wanted to see if it would work at all so just hacked it into the LuaGame.cpp.

I moved the Job itself into the SaveGameManager class and simply exposed a function to create a new job.

I moved the jobset into Lua::manager as suggested.

The issue is that the Lua::Manager implementation now has a #include "Pi.h" which I seem to remember is undesired..?

@sturnclaw
Copy link
Member

I moved the Job itself into the SaveGameManager class and simply exposed a function to create a new job.

Looks good. Naming is a little stilted, I might personally try to name it something like LoadGameToJsonAsync and allow the return value type to do the talking on what it returns.

I moved the jobset into Lua::manager as suggested.

The issue is that the Lua::Manager implementation now has a #include "Pi.h" which I seem to remember is undesired..?

this is fine

More seriously, this isn't ideal. I suspect the editor will likely crash on startup because of this, as Pi::App doesn't exist outside of "main game" contexts. Just one of those details about the codebase you'll trip over in the process of learning it.

Preferably, the LuaManager constructor takes a JobQueue *asyncJobQueue parameter instead. As a knock-on effect, this means that Lua::Init() now needs to either be passed a LuaManager for it to take ownership of on startup, or be passed a pointer to the running application's AsyncJobQueue to then pass to LuaManager. The latter is likely much easier to implement and thus what I'd suggest.

@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

Preferably, the LuaManager constructor takes a JobQueue *asyncJobQueue parameter instead. As a knock-on effect, this means that Lua::Init() now needs to either be passed a LuaManager for it to take ownership of on startup, or be passed a pointer to the running application's AsyncJobQueue to then pass to LuaManager. The latter is likely much easier to implement and thus what I'd suggest.

This is proving to be marginally challenging as the other Applications do not have any Job Queues..

@sturnclaw
Copy link
Member

JobQueue *GetAsyncJobQueue();

@mwerle mwerle force-pushed the refactor/savegame_stats_job branch from 20a7010 to 63c95d1 Compare October 14, 2024 23:40
@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

JobQueue *GetAsyncJobQueue();

No idea how or why I missed this.. I was grubbing around inside the implementation files and must have missed the base class.

Proposed fix in 63c95d1

@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

Looks good. Naming is a little stilted, I might personally try to name it something like LoadGameToJsonAsync and allow the return value type to do the talking on what it returns.

Fixed in 1a871c8

@sturnclaw
Copy link
Member

No idea how or why I missed this.. I was grubbing around inside the implementation files and must have missed the base class.

Proposed fix in 63c95d1

Looks good. Will give this a whirl tomorrow.

Copy link
Member

@sturnclaw sturnclaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far this looks good! I've left one comment where a bit more cleanup is needed, and I've pushed a commit to the branch removing the manual rate-limiting code in SaveLoadWindow:update() (and that overridden function entirely). Feel free to rebase/squash/etc. that commit - I'm not worried about the author/committer information.

@@ -243,6 +258,7 @@ end
--=============================================================================

function SaveLoadWindow:makeFilteredList()
local profileEndScope = utils.profile("SaveLoadWindow::makeFilteredList()")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover profiling call here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is impressive, even Debug builds now load the list of saves almost instantly, down from about 14 seconds, on my PC with about 40 saves.

I'll squash/merge/etc and mark the PR as ready then. Thank you for all of your input!

sturnclaw and others added 3 commits October 17, 2024 07:34
- Simple manual profiler for checking the time cost of certain Lua methods
- Add Engine.nowTime, which returns the exact time since an unspecified epoch at the moment it's evaluated
The idea is to run this expensive task in an asynchronous job
and notify the Lua side once it is completed. This is to prevent
the stutter from occurring every time the SaveGameStats API is
called from Lua.

This commit implements this idea by:

* add a job manager to the LuaManager class which can run any job on
  the Lua job queue.
* add a LoadGameToJsonJob implementation in SaveGameManager, and a new
  function, LoadGameToJsonAsync(), which schedules an instance of this
  job on the Lua job queue.
* call this new function from Lua, as well as registering a callback for
  when the job completes, in saveloadgame.lua.

Anecdotal testing in debug builds shows an impressive speed increase
when loading the list of savegames.
- Prior code limited synchronous Game.SaveGameStats calls to once-per-frame
- Since the function is now async, we can dispatch as many jobs as we want and their results will be processed on completion
@mwerle mwerle force-pushed the refactor/savegame_stats_job branch from 6ebf296 to f20eb57 Compare October 16, 2024 22:44
@mwerle mwerle marked this pull request as ready for review October 16, 2024 22:45
@mwerle
Copy link
Contributor Author

mwerle commented Oct 16, 2024

Squashed commits, rebased on main, ready for review, thank you.

@sturnclaw
Copy link
Member

My prior approval stands. Good work, and thank you for implementing this! Hopefully it's given you a bit better visibility into a "vertical slice" of the codebase - I'll chalk that up as a win for shamelessly asking other contributors to do my TODO items for me. 😄

In all seriousness, this is a good implementation and I'm glad you proposed the idea of the SaveGameManager - I likely wouldn't have come up with that on my own.

@sturnclaw sturnclaw merged commit 61f1c16 into pioneerspacesim:master Oct 17, 2024
4 checks passed
@mwerle
Copy link
Contributor Author

mwerle commented Oct 17, 2024

Thank you, yes it was a great learning exercise and I think I gained some insight into how the codebase, especially interactions with Lua, work.

@mwerle mwerle deleted the refactor/savegame_stats_job branch October 17, 2024 08:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improvement - move calls to Game.SaveGameStats into Jobs
2 participants