Skip to content
This repository has been archived by the owner on Jul 19, 2018. It is now read-only.

Tobin secondary cb synch tracking #2066

Open
wants to merge 35 commits into
base: master
Choose a base branch
from

Conversation

tobine
Copy link
Contributor

@tobine tobine commented Sep 11, 2017

Replaces #2049

This is all the same code rebased to latest master plus it adds memory access checking for secondary command buffers and streamlines the descriptor update so that perf of validation with memory accesses enabled is on-par with validation when these checks are disabled for "smoketest --validate -s".

I've kept the callback disabled for now but added a commented-out "//#define ENABLE_MEMORY_ACCESS_CALLBACK" in vk_layer_config.h that can be un-commented to enable the callback and disabled tracking.

I think it's in a good state to land as a baseline to make it easier to merge code going forward as I continue to improve the synch tracking.

Copy link
Collaborator

@mark-lunarg mark-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but yikes!

@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from 658e576 to 6e7a47c Compare September 12, 2017 20:38
@mark-lunarg
Copy link
Collaborator

@tobine, Master runs fine on Dota2 and a release build starts up in ~10 seconds with validation enabled. With the sync changes added, my windows machine blue-screened after about 4 minutes. What information can I get for you?

@tobine
Copy link
Contributor Author

tobine commented Sep 13, 2017

@mark-lunarg does that mean you also never saw anything on screen? Can you kill the last CL so that ENABLE_MEMORY_ACCESS_CALLBACK is not defined and re-try? If that works it's a big flag about where the issue is.

@mark-lunarg
Copy link
Collaborator

Yep, that worked.

@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from 6e7a47c to 4267210 Compare September 13, 2017 20:15
@tobine
Copy link
Contributor Author

tobine commented Sep 13, 2017

@mark-lunarg thanks, did perf seem on-par with master?

I'm pretty sure the issue is with Draw tracking in ValidateCmdDrawType() function. Specifically this block

for (const auto &set_binding_pair : pPipe->active_slots) {
    uint32_t setIndex = set_binding_pair.first;
    // Pull the set node
    cvdescriptorset::DescriptorSet *descriptor_set = state.boundDescriptorSets[setIndex];
    if (descriptor_set) {
        descriptor_set->AddReadWriteBuffersAndImages(set_binding_pair.second, cmd_type, read_accesses, write_accesses);
    }
}
skip |= ValidateRWMemoryAccesses(dev_data->report_data, cmd_buffer, (*cb_state)->mem_accesses, read_accesses, write_accesses, caller, false, nullptr, nullptr);

If you're up for a couple other tests here's what I'd be interested in. If you have time and are willing, please try this on top of the running config that you have (last CL of this PR killed):

  1. Kill the early return in "ValidateCmdDrawType()" so that the code I flagged will run. If that crashes, as I expect then
  2. Comment out the last line in the code above where ValidateRWMemoryAccesses() is called.

If you get to the 2nd experiment and it passes, then that we grab the Draw updates, but fail on checking them for conflicts. If the second experiment doesn't pass (which I expect) then almost certainly something is broken in AddReadWriteBuffersAndImages(). That's the code that I'm going over now, but those 2 experiments would verify I'm on the right track.

@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from 4267210 to ae690c5 Compare September 13, 2017 20:56
@tobine
Copy link
Contributor Author

tobine commented Sep 13, 2017

I may have found the issue. Forgot to update LUT when I added AMD draw commands. I just pushed fix. @mark-lunarg can you pull latest and see if that was the problem?

@mark-lunarg
Copy link
Collaborator

Just got out of a meeting, will check it now and look at perf if it works.

@mark-lunarg
Copy link
Collaborator

Yeah, that fixed it. Time to first pixels is 27 sec with the PR, 10 without.

@mark-lunarg
Copy link
Collaborator

@tobine, is there anything to be done on this one now, or should we move it to inactive state and hold onto the branch?

@tobine
Copy link
Contributor Author

tobine commented Dec 15, 2017

@mark-lunarg yes, moved to inactive for now. Want to hang onto this until I have a good landing spot for the barrier checking.

@zeux
Copy link

zeux commented Dec 19, 2017

It would be amazing to get this in at some point - missing sync validation is currently one of the biggest pain points for writing correct Vulkan code. Keep up the good work!

@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from 91b6474 to 4a14b22 Compare March 29, 2018 20:45
@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from 4a14b22 to f9c7517 Compare April 2, 2018 21:48
This is the initial version of a single-case update to handle
synchronization tracking for the cmd sequence from GH892. As the code
stands it's only primarily useful for this single cmd sequence and
isn't catching any variety of memory conflicts.

The intention is that if the general idea is sound then I will build on
this initial code to expand/generalize the algorithms used here for all
memory conflict cases.
tobine added 28 commits April 3, 2018 07:55
Add tracking for the memory accesses in CmdBlitImage.
Initially just adding a read & write access for the src & dest image
memory that conservatively references whole binding and is recorded
as imprecise so that it only will cause warnings.
Add tracking for the memory access in CmdFillBuffer.
This is a single write to a buffer for which we have precise info.
Add tracking for the memory accesses in CmdClear[DS|Color]Image.
Initially just adding imprecise write access covering the whole image
binding area.
Update CmdClearAttachments to record a write memory access per image
attachment that's cleared. This is an imprecise access across the
entire image binding for now.
Update global mem access state when CmdClearAttachments is called.
Add tracking for the memory accesses in CmdResolveImage.
Initially just adding a read & write access for the src & dest image
memory that conservatively references whole binding and is recorded
as imprecise so that it only will cause warnings.
Just moving code to prep for adding mem access support to
CmdCopyQueryPoolResults. Refactor it to use the common PreValidate* &
PostRecord* pattern.
Add the buffer write access for CmdCopyQueryPoolResults. Updated query
pool data struct to store number of elements in the query based on
query type and then use that number when calculating total data size.
Update the Draw/Dispatch validate/update framework to account for all
memory accesses through descriptors. At validate time grab the complete
set of active read and write memory bindings from descriptors.
Then verify the memory accesses to flag any conflicts.
During state update, record all of the accesses into the cmd buffer
state.

All of these accesses are imprecise and will only result in warnings if
there is a potential synch issue.
A PipelineBarrier command that occurs within a renderPass is only for
handling a subpass self-dependency. That case is not currently included
in the synch model so updating code to only record barriers that are
outside of a renderPass for now.
Moved existing code into RecordBarrierMemoryAccess() function. This
generalizes to code so it can more easily be multi-purposed in order to
record barriers for CmdWaitEvents().
This is a race condition test to check both a RaW and WaR conflict with
buffers used in a Draw. A src buffer is copied to a dst buffer, then,
without a barrier, the src is used as a storage buffer in a draw and
the dst is used as a uniform buffer in the same draw. This results in a
RaW conflict for the Dst/Uniform buffer and WaR conflict for the
Src/Storage buffer.
Make any memory access race condition conflict messages warning-level
for initial release. There are still some holes in modeling the synch
primitives in validation so setting callback to warning allows for a
soft release where people can ignore or post bugs on incorrect warnings
while the synch model is updated in validation.

The current warning message is meant to deter developers from sinking
too much time into debugging these warnings, and promotes feedback for
known-bad warning cases.
Record barriers in CmdWaitEvents using same method used for barriers
from CmdPipelineBarriers. This involves merging any previous barriers
to handle dependency chains and then recording the barriers into any
previous memory accesses that fall into the source synch scope.
Modify ValidateMemoryAccesses() so that it takes map of prev accesses
directly. This is preparation to share the function with command buffer
submission validation where we'll deal with sets of initial accesses
that may cross many command buffers.
At QueueSubmit time, check for any memory conflicts between command
buffers in a single submit.

Here's the high-level algorithm:
1. While submitting CBs, store vector of CB state for each CB
2. For each CB, create vector of mem accesses that are "live" meaning
that no barrier within the CB made them visible
3. Check live vector against all other live mem accesses up to this
point in submit.
3a. If there are no potential conflicts, save live access vector
into submit access map and continue with next submit
3b. If there are potential conflicts, we need to replay commands up to
this point so continue to step 4.
4. Gather all cmds submitted up to this point in a single vector
5. Note any synch commands that occur between the early and late mem
access conflict commands
6. Replay mem access commands, including synch commands between the
points of interest
7. Following replay, re-check for conflicts and if they still occur,
flag an error

Also update the race condition warning message between two separate
command buffers with additional information about second cmd buffer.
Added TwoCommandBufferUpdateBufferRaWDependencyMissingBarrier test that
is an alternate version of UpdateBufferRaWDependencyMissingBarrier test
that separates the CmdUpdateBuffer and CmdDrawIndirect commands across
two separate command buffers and verifies that the RaW race condition
is flagged.
Merging synch barriers may be a bit naive. Note potential bug for
future fix.
Add MultiCBDrawWithBufferWaRandRaWSynchChain test. This features three
command buffers with mem access dependencies and synch commands that
cross CB boundaries. This is a positive test to verify that no errors
occur.
Mis-match between size_t and uint32_t was causing some Windows compiler
warnings. Use size_t uniformly where vector size() function involved.
Move the code validate inter-cmd-buffer synch conflicts at QueueSubmit
time into its own function. This also fixes a bug where the vector of
cmds to replay was not being reset when more than one replay session
was required for a single commit.
It also streamlines the code a bit by making the update of mem access
map with latest live mem accesses common code between fast and slow
path.
This commit stores memory accesses into separate read and write maps.
This provides a perf boost for checking potential memory access faults
by eliminating uneccessary checks when previous access maps that would
be checked are empty so we don't iterate over maps that have no
potential for any conflicts.

A couple of other changes to cleanup memory access checks:
1. Merge "Add*MemoryAccess()" and "ValidateMemoryAccesses()" functions
into a single "CreateAndValidate*MemoryAccess()" function. This creates
the local access struct and validates it, but it won't be recorded into
the CB until/unless validation passes.
2. Split Mem access violation reporting into a separate function that
can be shared by a couple of different cases that flag errors.
This is a conservative measure to simplify landing and integration of
the memory access checks. For now turning off the callback and also
disabling some draw-time checks that seem to have ~10-20% perf hit in
limited testing. Turning off callback also means turning off tests that
rely on the callback.
Will continue advancing this code and testing with checks enabled to
reach desired functionality before enabling code on master.
At CmdExcuteCommands() time, check between secondary cmd buffers being
executed to see if there are any potential memory conflicts.
Add TwoSecondaryCBUpdateBufferRaWDependencyMissingBarrier test to flag
error between two separate secondary command buffers that are issued in
the same CmdExecuteCommands() call and have a RaW conflict.
Put memory access updates for active descriptors directly into the read
and write vectors.
This kills an extra data structure transformation where we moved the
accesses through sets before putting them into vectors. There was no
benefit to the extra overhead and this update brings the performance of
validation with memory access checks on-par with validation when the
checks are disabled.

Still have the warning callback disabled for now.

The ENABLE_MEMORY_ACCESS_CALLBACK define in vk_layer_config.h can be
un-commented to enable the warning callback, as well as the descriptor
memory access tracking.
Turning on all of the memory tracking code as there is almost no perf
hit on "smoketest --validate -s" now. Also the warning callback has a
large caveat statement to help developers share suspected issues and
disregard as they see fit.
These extended draw types were added without updating the read/write
flag LUT. Doh! Initially adding data as copies of matching non-AMD cmds
which should work. Added comments to double-check and make sure those
are the correct flags.

Also added a magic number entry at the end of the LUT CommandToFlags
array and a static_assert to make sure this array is correctly sized at
compile time.
@tobine tobine force-pushed the tobin_secondary_cb_synch_tracking branch from f9c7517 to 96a3862 Compare April 3, 2018 14:19
@mark-lunarg
Copy link
Collaborator

@tobine, this repository will close for write access on Sunday, 5/13/2018. If it is pushed before that time it will be present in the follow-on Vulkan-ValidationLayers repository on Monday, otherwise a new PR will be required in the new repo.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants