This repository has been archived by the owner on Jan 7, 2023. It is now read-only.

CTS dEQP issue fix based on mesa 19.0.5 #116

Open
wants to merge 356 commits into
base: master

Conversation

renchenglei
Contributor

We have upgraded our Mesa to 19.0.5. On the new code base we see some CTS issues; the following patches help fix some of them.
@tpalli @strassek @js0701 @YuanjunHuang Please help review them.
Kevin, let's freeze our code at 19.0.5 and investigate issues on that base. I will monitor the upstream branch; if a fix for a current issue lands there, I will share the patch and backport it to our GitHub. We have run the full CTS dEQP and GFX tests on these changes, and no regression was found.
Bug track:
Bug ID OAM-80938, fix patch 3b7825a
Bug ID OAM-80942, fix patch 480e1cc
Bug ID OAM-80943, fix patch efac387; ef1e217; 99274f6

Kevin, please feel free to reject my PR and reorganize or re-backport some of these patches from upstream.

gfxstrand and others added 30 commits February 25, 2019 13:29
Reviewed-by: Alejandro Piñeiro <[email protected]>
(cherry picked from commit 30b548f)
Fixes: 19064b8 "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <[email protected]>
(cherry picked from commit 8f0fe71)
Instead of going to all the work of combining them into one array, just
make two arrays and use location_frac to colocate them within CLIP0.
Then the back-end can sort things out and stack them on top of each
other.  Thanks to ef99f4c, we also don't need to set compact anymore.

Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 4e69fba)
Conflicts resolved by Dylan

Conflicts:
	src/compiler/nir/nir_lower_clip_cull_distance_arrays.c
We needed to better handle cases where a chunk of a variable starts at
some non-zero location_frac and rolls over into the next slot but may
not be more than 4 dwords.  For example, if gl_CullDistance is an array
of 3 things and has location_frac = 2, it will span across two vec4s but
is not, itself, bigger than a vec4.  If you ignore the clip/cull special
case, it's not allowed to happen for anything else because the only
things that can span more than one slot are dvec3 and dvec4 and they're
both bigger than a vec4.  The current code uses this attrib_slot thing
where we count attribute slots and iterate over them.  However, that
doesn't work in the case above because gl_CullDistance will have an
attrib_slot count of 1 even though it does span two slots.  We could fix
this by adjusting attrib_slot but we already have comp_mask and it's
easier to just handle it that way.

Reviewed-by: Alejandro Piñeiro <[email protected]>
(cherry picked from commit 558c314)
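The gl_CullDistance case described in this commit message can be checked with a little bit arithmetic. The following is only a sketch, not the code from the patch: `slot_comp_mask` is a hypothetical helper, and it assumes a compact array of dwords where each slot is a vec4 with a 4-bit component mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): for a compact array of
 * `num_comps` dwords that starts at component `location_frac` of its
 * first slot, return the 4-bit component mask covered in the slot that
 * lies `slot_offset` slots after the first one.
 */
static uint8_t slot_comp_mask(unsigned location_frac, unsigned num_comps,
                              unsigned slot_offset)
{
    assert(location_frac < 4);
    assert(num_comps <= 8);
    assert(slot_offset < 3);

    /* One bit per dword, shifted to where the variable starts. */
    uint32_t mask = ((1u << num_comps) - 1u) << location_frac;

    /* Each slot owns four consecutive bits of that mask. */
    return (uint8_t)((mask >> (slot_offset * 4)) & 0xf);
}
```

With `location_frac = 2` and three components, the helper returns `0xc` for the first slot and `0x1` for the second, so the variable touches two slots even though it is smaller than a vec4. A per-variable slot count of 1 cannot express that, while a component mask can.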
This makes us properly handle gl_ClipDistance and gl_CullDistance.

Fixes: 19064b8 "nir: Add a pass for gathering transform feedback info"
Reviewed-by: Alejandro Piñeiro <[email protected]>
(cherry picked from commit 1a93fc3)
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

That MR keeps the clip and cull arrays split.

So we have to handle
 - compact arrays with location_frac != 0
 - VARYING_SLOT_CLIP_DIST1

Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 1ef2855)
Copy+paste error.  It was supposed to test cull and not clip.

Fixes: 4e69fba "nir: Rewrite lower_clip_cull_distance_arrays..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109717
Reviewed-by: Lionel Landwerlin <[email protected]>
(cherry picked from commit f98fd9d)
nir_lower_clip_cull_distance_arrays() marks the combined clip/cull
distance array as compact.  However, when translating in from GLSL
or SPIR-V, we were not marking the original float[] arrays as compact.

We should do so.  That way, we can detect these corner cases properly.

Reviewed-by: Timothy Arceri <[email protected]>
(cherry picked from commit ef99f4c)
Reviewed-by: Jason Ekstrand <[email protected]>
The cherry-pick dropped a chunk.

Fixes: e735173 "radv: Fix float16 interpolation set up."
Reviewed-by: Samuel Pitoiset <[email protected]>
spirv_to_nir can generate input/output variables which are illegal
for the current shader stage, which would cause nir_validate_shader
to balk.  After my recent commit to start decorating arrays as compact,
dEQP-VK.spirv_assembly.instruction.graphics.module.same_module started
hitting validation errors due to outputs in a TCS (not intended for the
TCS at all) not being per-vertex arrays.

Thanks to Jason Ekstrand for suggesting this approach.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109573
Fixes: ef99f4c compiler: Mark clip/cull distance arrays as compact before lowering.
Reviewed-by: Juan A. Suarez <[email protected]>
(cherry picked from commit 6775665)
Fix the logic for buffer full check on alloc.

This patch just takes the fix Nicolai attached to the bug report
and updates it to work on master.

Fixes: e0f0d36 ("radeonsi: factor si_query_buffer logic out of si_query_hw")

Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561
(cherry picked from commit 603206d)
Fixes following valgrind warning:

   ==27561== Conditional jump or move depends on uninitialised value(s)
   ==27561==    at 0x667856B: value_set_ssa_components (nir_opt_copy_prop_vars.c:78)
   ==27561==    by 0x667A1C4: copy_prop_vars_block (nir_opt_copy_prop_vars.c:797)

Fixes: 62332d1 "nir: Add a local variable-based copy propagation pass"
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
(cherry picked from commit 22267fe)
We shouldn't increment the buffer list pointers twice.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.

Cc: 18.3 19.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 9256e0a)
Sampler descriptors don't have a buffer list.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.

Cc: 18.3 19.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 4924dfc)
Earlier commit introduced support for haiku yet did not properly
annotate the loader/xmlconfig dependencies.

Thus we ended up adding inc_loader for each !haiku platform - see
659910e 9a96bf0 c731508 ec6cb01.

One piece remained though - the wayland platform. Hence the following
would fail:

 meson -Dgallium-drivers=etnaviv -Ddri-drivers=''\
       -Dtools=etnaviv -Dplatforms=wayland -Dglx=disabled \
       build/

Cc: Alexander von Gluck IV <[email protected]>
Reported-by: Boris Brezillon <[email protected]>
Fixes: 834d221 ("meson: Add Haiku platform support v4")
Signed-off-by: Emil Velikov <[email protected]>
Tested-by: Boris Brezillon <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
(cherry picked from commit f0a7b46)
Calculating the scissor rectangle fields with the y flipped (0 on top)
can generate negative values that will cause assertion failure later on
as the scissor fields are all unsigned. We must clamp the bbox values
again to make sure they don't exceed the fb_height. Also fixed a
calculation error.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999
          https://bugs.freedesktop.org/show_bug.cgi?id=109594

v2:
   - I initially clamped the values inside the if (Y is flipped) case
   and I made a mistake in the calculation: the clamp of the bbox[2] should
   be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I
   shouldn't have changed the ScissorRectangleYMax calculation. As the
   fixed code is equivalent with using CLAMP instead of MAX2 at the top of
   the function when bbox[2] and bbox[3] are calculated, and the 2nd is more
   clear, I replaced it. (Nanley Chery)

v3:
   - Reversed the CLAMP change in bbox[3] as the API guarantees that the
   viewport height is positive. (Nanley Chery)

v4:
  - Added nomination for the mesa-stable branch and the link to the second
  bugzilla bug (Nanley Chery)

CC: <[email protected]>
Tested-by: Paul Chelombitko <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
(cherry picked from commit fd37a19)
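A simplified model of the problem this commit describes: when Y is flipped, an unclamped bounding box that reaches past the framebuffer height produces a negative value, which then wraps when stored in an unsigned scissor field. The sketch below is not the driver code; `flip_scissor_y` and `struct scissor_y` are hypothetical names, and it clamps both edges for simplicity and assumes `y0 <= y1`.

```c
#include <assert.h>

#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Hypothetical stand-in for the unsigned scissor rectangle fields. */
struct scissor_y {
    unsigned y_min;
    unsigned y_max;
};

/*
 * Flip a [y0, y1] range (inclusive, assumed y0 <= y1) so that row 0 is
 * at the top. Clamping to the framebuffer first guarantees that neither
 * subtraction below can go negative.
 */
static struct scissor_y flip_scissor_y(int y0, int y1, int fb_height)
{
    assert(fb_height > 0);

    y0 = CLAMP(y0, 0, fb_height - 1);
    y1 = CLAMP(y1, 0, fb_height - 1);

    struct scissor_y s;
    s.y_min = (unsigned)(fb_height - 1 - y1);
    s.y_max = (unsigned)(fb_height - 1 - y0);
    return s;
}
```

Without the clamp, `y1 = 700` on a 600-row framebuffer would give `600 - 1 - 700 = -101` before the conversion to unsigned.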
If no framebuffer is bound, get the number of samples and the
image format from the render pass.

This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer.

Cc: 18.3 19.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 5671f38)
Conflicts resolved by Dylan

Conflicts:
	src/amd/vulkan/radv_meta_clear.c
Seems like dxvk used integer builtins without setting the flat
interpolation decoration.

I believe in the current spec the app is required to set these,
but in the meantime to avoid breaking things in stable releases
(and so close to release for 19.0), only expand the interpolation
to float16 and struct (which cannot be builtins as our spirv parser
lowers the builtin block).

Fixes: f324784 "radv: Allow interpolation on non-float types."
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit c011047)
This fixes a failed assertion in glDeleteLists() for the following
case:

list = glGenLists(1);
glDeleteLists(list, 1);

when those are the first display list commands issued by the
application.

When we generate display lists, we plug in empty lists created with
the make_list() helper.  This function uses the OPCODE_END_OF_LIST
opcode but does not call dlist_alloc() which would set the
InstSize[OPCODE_END_OF_LIST] element to non-zero.

When the empty list was deleted, we failed the InstSize[opcode] > 0
assertion.

Typically, display lists are created with glNewList/glEndList so we
set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc().  That's why
this bug wasn't found before.

To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
element in make_list().

The game oolite was hitting this.

Fixes: OoliteProject/oolite#325
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 6dabcb5)
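The failure mode in this commit message can be modeled in a few lines. This is a reduced sketch under assumptions, not Mesa's display list code: `InstSize`, `OPCODE_END_OF_LIST` and `make_list()` are names taken from the message, while `struct dlist` and `delete_list()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

enum opcode { OPCODE_END_OF_LIST, OPCODE_COUNT };

/* Size of each opcode in nodes; zero means it was never initialized. */
static unsigned InstSize[OPCODE_COUNT];

/* Hypothetical, heavily reduced display list: just its first opcode. */
struct dlist {
    enum opcode head;
};

/* Create an empty list consisting only of the end-of-list opcode. */
static struct dlist *make_list(void)
{
    struct dlist *l = malloc(sizeof(*l));
    if (!l)
        return NULL;
    l->head = OPCODE_END_OF_LIST;

    /* The fix described above: do not rely on an allocation path
     * having set this size earlier. */
    InstSize[OPCODE_END_OF_LIST] = 1;
    return l;
}

/* Hypothetical deletion path that checks the opcode size. */
static void delete_list(struct dlist *l)
{
    assert(InstSize[l->head] > 0);
    free(l);
}
```

If the assignment in `make_list()` is removed, calling `make_list()` followed directly by `delete_list()` trips the assertion, which corresponds to the glGenLists/glDeleteLists sequence in the report.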
This change applies the workaround suggested by Bill Deegan on the
affected SCons versions.

It also adds a comment with the URL explaining why we were
customizing the decider and max_drift in the first place, as I had
forgotten all about it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109443
Tested-by: [email protected]
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
There is only room for 3 vertices now (RECT has 3 vertices).

Fixes: 6ef7700

Signed-off-by: Jonathan Marek <[email protected]>
(cherry picked from commit 357313a)
Fixes: cb2322c

Signed-off-by: Jonathan Marek <[email protected]>
(cherry picked from commit 8eca6df)
In freedreno_gmem.c, gmem_align of 0x8000 is used. Alignment used here
should be the same.

Fixes: 912a9c8

Signed-off-by: Jonathan Marek <[email protected]>
(cherry picked from commit 4f23767)
Fixes: 3a273a4

Signed-off-by: Jonathan Marek <[email protected]>
(cherry picked from commit 6c0fefb)
Now that freedreno has create_with_modifiers(), this "hack" is needed to
make some cases work. Copied from vc4.

Fixes: 41ddf1d

Signed-off-by: Jonathan Marek <[email protected]>
(cherry picked from commit e3591b0)
The optimization in 4cd1a0b introduced a replacement of :

cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D

By :

cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D

The first cmp instruction is storing in x while the second mov is
sourcing from y. We need to take into account where the replacement on
the scan_inst destination is going to store things so that the
replacement mov can source things from the correct location.

Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 4cd1a0b ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
Reviewed-by: Ian Romanick <[email protected]>
(cherry picked from commit 6e18414)
Added check for higher compat profile being allowed
before assigning certain extensions.

Fixes: 272fe94 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile)

Signed-off-by: Danylo Piliaiev <[email protected]>
Signed-off-by: Yevhenii Kolesnikov <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
(cherry picked from commit 07f4b4e)
Some types of params such as some builtins are always padded. We
need to keep track of this so we can restore the list correctly.

Here we also remove a couple of cache entries that are not actually
required as they get rebuilt by the _mesa_add_parameter() calls.

This patch fixes a bunch of arb_texture_multisample and
arb_sample_shading piglit tests for the radeonsi NIR backend.

Fixes: edded12 ("mesa: rework ParameterList to allow packing")

Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 7536af6)
call XShmDetach to allow X server to free shared memory

Fixes: bcd80be "drisw/glx: use XShm if possible"
Signed-off-by: Ray Zhang <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
(cherry picked from commit b344e32)
Min He and others added 16 commits May 23, 2019 11:44
To avoid blocking other EGL calls, release the display mutex before
calling update_buffers(), which will call droid_window_dequeue_buffer().

This patch fixes some failure cases in android graphics cts test.

Signed-off-by: Min He <[email protected]>
Signed-off-by: Chenglei Ren <[email protected]>
This fixes crash due to NULL window when swap interval is set
for pbuffer surface.

Jira: 61995
Test: CtsDisplayTestCases pass

Signed-off-by: samiuddi <[email protected]>

(am from https://patchwork.freedesktop.org/patch/235697/)
Signed-off-by: Kalyan Kondapally <[email protected]>
GLSL ES 320 technically allows #line to have arbitrary expression trees
rather than integer literal constants, unlike the C and C++ preprocessor.
This is likely a completely unused feature that does not make sense.

However, Android irritatingly mandates this useless behavior, so this
patch implements a hack to try and support it.

We handle a single expression:

    #line <line number expression>

but we avoid handling the double expression:

    #line <line number expression> <source string expression>

because this is an ambiguous grammar.  Instead, we handle the case that
wraps both in parenthesis, which is actually well defined:

    #line (<line number expression>) (<source string expression>)

With this change following tests pass:

   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_expression_fragment
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_vertex
   dEQP-GLES3.functional.shaders.preprocessor.builtin.line_and_file_expression_fragment

Signed-off-by: Tapani Pälli <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>

BUG=b:33352633
BUG=b:33247335
TEST=affected tests passing on CTS 7.1_r1 sentry

Change-Id: I7afbbb386bd4a582e3f241014a83eaccad1d50d9
Reviewed-on: https://chromium-review.googlesource.com/427305
Tested-by: Haixia Shi <[email protected]>
Reviewed-by: Ilja H. Friedel <[email protected]>
Commit-Queue: Haixia Shi <[email protected]>
Trybot-Ready: Haixia Shi <[email protected]>
GPA requires a null renderer query which disables all rendering. This
feels fairly at odds with the spirit of the INTEL_performance_query
extension.

Note:

Considering that the INTEL_blackhole_render implementation
(https://www.khronos.org/registry/OpenGL/extensions/INTEL/INTEL_blackhole_render.txt,
https://patchwork.freedesktop.org/series/40035/) needs test case changes
and also needs time for upstream review, we keep this patch for now to
meet an urgent project milestone.

Test: Pass mdapi test_GfxDrv_DriverAcceptance test case
      GfxDrv_DriverAcceptanceQuery.GL_NULL_HARDWARE and has no reg issue
Signed-off-by: Landwerlin, Lionel <[email protected]>
This change makes following test pass:
	dEQP-VK.api.info.device.extensions

Originally-from: Tapani Pälli <[email protected]>
Test: [CTS 9.0_r8] dEQP-VK.api.info.device.extensions
Signed-off-by: Kevin Strasser <[email protected]>
(cover letter https://patchwork.freedesktop.org/series/51006/)

FROMLIST: i965: SIMD32 heuristics debug flag

Added a new DEBUG_HEUR32 flag to INTEL_DEBUG flags for enabling SIMD32
selection heuristics.

(am from https://patchwork.freedesktop.org/patch/256764/)

FROMLIST: i965: SIMD32 heuristics control data

Added a new structure for holding SIMD32 heuristics control data. The
control data itself will be fetched from drirc.

(am from https://patchwork.freedesktop.org/patch/256806/)

FROMLIST: i965: SIMD32 heuristics control data from drirc

To be able to test the heuristics with different parameters, they can be
controlled via environment variables through drirc.

(am from https://patchwork.freedesktop.org/patch/256788/)

FROMLIST: mesa: Helper functions for counting set bits in a mask

(am from https://patchwork.freedesktop.org/patch/256765/)

FROMLIST: i965/fs: Save the instruction count of each dispatch width

The SIMD32 selection heuristics will use this information for deciding whether
SIMD32 shaders should be used.

(am from https://patchwork.freedesktop.org/patch/256793/)

FROMLIST: i965/fs: SIMD32 selection heuristic based on grouped texture fetches

The function goes through the compiled shader and checks how many grouped
texture fetches there are. This is a simple heuristic which gets rid of most
of the regressions when enabling SIMD32 shaders but still retains some of
the benefits.

(am from https://patchwork.freedesktop.org/patch/256798/)

FROMLIST: i965/fs: Enable all SIMD32 heuristics

There are three simple heuristics for SIMD32 shader enabling:

- How many MRTs does the shader write into?
- How many grouped texture fetches does the shader have?
- How many instructions does the SIMD32 shader have compared to the SIMD16
   shader?

For testing purposes, the heuristics can be controlled via these environment
variables:

simd32_heuristic_mrt_check
- Enables MRT write check
- Default: true

simd32_heuristic_max_mrts
- How many MRT writes the heuristic allows
- Default: 1

simd32_heuristic_grouped_check
- Enables grouped texture fetch check
- Default: true

simd32_heuristic_grouped_sends
- How many grouped texture fetches the heuristic allows
- Default: 6

simd32_heuristic_inst_check
- Enables SIMD32 vs. SIMD16 instruction count check
- Default: true

simd32_heuristic_inst_ratio
- SIMD32 vs. SIMD16 instruction count ratio the heuristic allows
- Default: 2.3

SIMD32 shaders will not be compiled also when SIMD16 compilation fails or
spills.

(am from https://patchwork.freedesktop.org/patch/256766/)
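The three checks and their defaults listed in this commit message could be combined roughly as follows. This is a sketch under assumptions, not the patch itself: the struct, the function name and the exact comparisons (for example, whether a value equal to the limit is still allowed) are hypothetical; only the option names and default values come from the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Control data mirroring the options listed above. */
struct simd32_heuristics {
    bool     mrt_check;      /* simd32_heuristic_mrt_check */
    unsigned max_mrts;       /* simd32_heuristic_max_mrts */
    bool     grouped_check;  /* simd32_heuristic_grouped_check */
    unsigned grouped_sends;  /* simd32_heuristic_grouped_sends */
    bool     inst_check;     /* simd32_heuristic_inst_check */
    float    inst_ratio;     /* simd32_heuristic_inst_ratio */
};

/* Defaults as documented in the commit message. */
static const struct simd32_heuristics simd32_defaults = {
    true, 1, true, 6, true, 2.3f
};

/*
 * Hypothetical decision function: SIMD32 is rejected as soon as one
 * enabled check exceeds its limit. Treating "equal to the limit" as
 * allowed is an assumption.
 */
static bool simd32_allowed(const struct simd32_heuristics *h,
                           unsigned mrt_writes, unsigned grouped_fetches,
                           unsigned simd32_insts, unsigned simd16_insts)
{
    if (h->mrt_check && mrt_writes > h->max_mrts)
        return false;
    if (h->grouped_check && grouped_fetches > h->grouped_sends)
        return false;
    if (h->inst_check) {
        assert(simd16_insts > 0);
        if ((float)simd32_insts > h->inst_ratio * (float)simd16_insts)
            return false;
    }
    return true;
}
```

Under the defaults, a shader with one MRT write, six grouped fetches and twice as many SIMD32 as SIMD16 instructions passes all three checks, while a second MRT write alone is enough to reject SIMD32.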
This is needed to be in agreement with spec requirements:
KhronosGroup/OpenGL-API#46

Piers Daniell:
   "We discussed this in the OpenGL/ES working group meeting
    and agreed that eliminating unused elements from the interface
    block array is not desirable. There is no statement in the spec
    that this takes place and it would be highly implementation
    dependent if it happens. If the application has an "interface"
    in the shader they need to match up with the API it would be
    quite confusing to have the binding point get compacted.
    So the answer is no, the binding points aren't affected by
    unused elements in the interface block array."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
Reported-By: Ilia Mirkin <[email protected]>
Signed-off-by: Andrii Simiklit <[email protected]>

TEST=[CTS 9.0r6] dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers#18
(am from https://gitlab.freedesktop.org/mesa/mesa/merge_requests/332)
Signed-off-by: Kevin Strasser <[email protected]>
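To make the working group answer quoted in this commit message concrete, the sketch below contrasts the two ways binding points could be assigned to the elements of an interface block array. Both functions are hypothetical illustrations, not Mesa code. The first follows the quoted answer (each element takes the next consecutive binding after the declared one, used or not); the second shows the compaction that the answer rules out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Binding of element `index` in a block array declared with binding
 * `base`: consecutive bindings, independent of which elements the
 * shader actually uses.
 */
static unsigned block_array_binding(unsigned base, unsigned index,
                                    unsigned array_size)
{
    assert(index < array_size);
    return base + index;
}

/*
 * The rejected alternative: unused elements are skipped, so the binding
 * of an element depends on how many earlier elements are used.
 */
static unsigned compacted_binding(unsigned base, const bool *used,
                                  unsigned index)
{
    unsigned binding = base;
    for (unsigned i = 0; i < index; i++) {
        if (used[i])
            binding++;
    }
    return binding;
}
```

For an array of three blocks declared at binding 2 where only elements 0 and 2 are used, the first function places element 2 at binding 4, while compaction would move it to binding 3 and no longer match what the application set up through the API.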
…of ssbo/ubo

This is needed to fix these tests:
piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_frag
piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_comp

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
Reported-By: Ilia Mirkin <[email protected]>
Signed-off-by: Andrii Simiklit <[email protected]>

TEST=[CTS 9.0r6] dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers#18
(am from https://gitlab.freedesktop.org/mesa/mesa/merge_requests/332)
Signed-off-by: Kevin Strasser <[email protected]>
… build

The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure
it gets generated as a dependency before building them.

Signed-off-by: Chenglei Ren <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Cc: <[email protected]>
(cherry picked from commit 13b38ca)
This reverts commit ca36eb1.

This change helps resolve the timeout issue in the
dEQP-VK.spirv_assembly.instruction.* tests.

Test: No regression on CTS test
Signed-off-by: Chenglei Ren <[email protected]>
As some extensions have been added to the CTS whitelist, we can now
add them back.

Test: No regression on CTS test
Signed-off-by: Chenglei Ren <[email protected]>
On low-power platforms (APL), the ssbo and ubo scalar tests need much
more time to finish, which causes CTS test cases to time out. Here we
disable VK_EXT_scalar_block_layout so that they pass.

Test: No regression on CTS test
Signed-off-by: Chenglei Ren <[email protected]>
While the number of ACPs is generally not huge compared to the number of
blocks, 16 does seem a bit small.  Bumping it to 64 takes the execution
time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized
debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2.

Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
If the destination of an ACP entry exists only within this block, then
there's no need to keep it for dataflow analysis.  We can delete it from
the out_acp table and avoid growing the bitsets any bigger than we
absolutely have to.  This reduces the maximum number of global ACP
entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630
to 3942 and takes the execution time of the piglit vs-isnan-dvec test
from about 1:16.2 on an unoptimized debug build (what we run in CI) with
NIR_VALIDATE=0 to about 56.4 seconds.

Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
…ction

In order to set up KILL sets, the dataflow code was walking the entire
array of ACPs for every instruction.  If you assume the number of ACPs
increases roughly with the number of instructions, this is O(n^2).  As
it turns out, regions_overlap() is not nearly as cheap as one would like
and shows up as a significant chunk on perf traces.

This commit changes things around and instead first builds an array of
exec_lists which it uses like a hash table (keyed off ACP source or
destination) similar to what's done in the rest of the copy-prop code.
By first walking the list of ACPs and populating the table and then
walking instructions and only looking at ACPs which probably have the
same VGRF number, we can reduce the complexity to O(n).  This takes the
execution time of the piglit vs-isnan-dvec test from about 56.4 seconds
on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to
about 38.7 seconds.

Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
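The idea in this commit message, bucketing ACP entries by register number so that each instruction only looks at the entries that can possibly match, can be sketched as follows. This is an illustration under assumptions rather than the compiler's code: all type and function names are hypothetical, plain singly linked chains stand in for exec_lists, and the bucket count of 64 simply mirrors the value mentioned in the earlier commit.

```c
#include <assert.h>
#include <stddef.h>

#define ACP_HASH_SIZE 64 /* assumed bucket count */

/* Hypothetical, reduced ACP entry: a copy from src_vgrf to dst_vgrf. */
struct acp_entry {
    unsigned dst_vgrf;
    unsigned src_vgrf;
    struct acp_entry *next; /* next entry in the same bucket */
};

/* Array of chains used like a hash table keyed off the destination. */
struct acp_table {
    struct acp_entry *by_dst[ACP_HASH_SIZE];
};

static void acp_table_init(struct acp_table *t)
{
    for (unsigned i = 0; i < ACP_HASH_SIZE; i++)
        t->by_dst[i] = NULL;
}

/* First pass: walk the ACP list once and populate the table. */
static void acp_table_add(struct acp_table *t, struct acp_entry *e)
{
    assert(e != NULL);
    unsigned bucket = e->dst_vgrf % ACP_HASH_SIZE;
    e->next = t->by_dst[bucket];
    t->by_dst[bucket] = e;
}

/*
 * Second pass, per instruction: count the entries whose destination is
 * overwritten by a write to `vgrf`. Only one bucket is scanned instead
 * of the whole ACP array.
 */
static unsigned acp_count_killed(const struct acp_table *t, unsigned vgrf)
{
    unsigned killed = 0;
    for (const struct acp_entry *e = t->by_dst[vgrf % ACP_HASH_SIZE];
         e != NULL; e = e->next) {
        if (e->dst_vgrf == vgrf)
            killed++;
    }
    return killed;
}
```

Entries that merely share a bucket (for example VGRF 5 and VGRF 69 with 64 buckets) still need the exact comparison, which is why the message says the remaining candidates "probably" have the same VGRF number. A real implementation would also key a second table off the source and run the overlap test on the candidates.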

@sysopenci sysopenci left a comment

Autobuild started from pull-request-changes on this PR.

FAILURE: CheckBug Bad comments/Bugs

For more information, see: /absp/builders/celadon-autobuild/builds/914

@tpalli
Contributor

tpalli commented Jun 10, 2019

@renchenglei FYI, for issue INTERNAL: Revert "Revert "intel/compiler: More peephole select"" there is the real fix now in Mesa master branch. It is commit f4ef34f, "intel/fs: Add an UNDEF instruction to avoid excess live ranges". I think we should use that commit (if feasible) instead of revert.

@renchenglei
Contributor Author

@tpalli, thanks a lot for the info! Let me have a try, :)

@strassek
Contributor

We may have already covered this somewhere, but for the vk extension stuff can we test the api level in the script? It seems like the extension whitelist for Android is set for each release, we might as well do the same in Mesa.

@renchenglei
Contributor Author

Thanks @strassek, that makes sense! I am currently blocked by some other work; once that is done, I will take a look at this.

@renchenglei
Contributor Author

> @renchenglei FYI, for issue INTERNAL: Revert "Revert "intel/compiler: More peephole select"" there is the real fix now in Mesa master branch. It is commit f4ef34f, "intel/fs: Add an UNDEF instruction to avoid excess live ranges". I think we should use that commit (if feasible) instead of revert.

We tried this commit (f4ef34f), but it seems to introduce regressions in:
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.at_sample_id*
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_ignores_centroid.137_191_1*
There may be other patches it depends on. Once I have time, I will try again with the master branch.
