Commit graph

206 commits

Author SHA1 Message Date
Fernando Sahmkow 194579bc4f ShaderCache: Fix Phi Nodes Type on OGL. 2021-11-01 22:26:17 +01:00
Fernando Sahmkow c50ad56bf5 ShaderCache: Order Phi Arguments from farthest away to nearest. 2021-10-31 19:34:15 +01:00
Fernando Sahmkow e5291e2031 TexturePass: Fix clamping of images as this allowed negative indices. 2021-10-24 20:46:36 +02:00
Fernando Sahmkow 3f4444b552 Shader Compiler: avoid overflowed indices on indixed samplers. 2021-10-17 03:38:09 +02:00
Morph db07ca6c7f
Merge pull request #6767 from ReinUsesLisp/fold-float-pack
shader: Fold UnpackFloat2x16 and PackFloat2x16
2021-07-30 02:07:52 -04:00
bunnei a98f14e9b0
Merge pull request #6722 from ReinUsesLisp/xmad-opts
shader: Fold integer FMA from Nvidia's pattern
2021-07-29 18:45:37 -07:00
ReinUsesLisp 8c9febe8f7 shader: Fold UnpackFloat2x16 and PackFloat2x16
Simplifies the code a bit when possible. These instructions should be
no-ops codegen wise.
2021-07-29 21:22:52 -03:00
ReinUsesLisp 1bb46b7d64 shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructions
Fixes instances where fp16 types are not declared on SPIR-V but they are
used. This shouldn't happen on master, as it's been uncovered by an
additional optimization pass.
2021-07-27 21:33:05 -03:00
ReinUsesLisp 66a0cedba3 shader: Fold integer FMA from Nvidia's pattern
Fold shaders doing "a * b + c" on integers from the pattern generated by
Nvidia's GL compiler.

On a somewhat complex compute shader it reduces the code size by 16
instructions from 2 matches on Turing GPUs.

On Intel as extracted from KHR_pipeline_executable_properties:
Before the optimization:
```
Instruction Count: 2057
Basic Block Count: 45
Scratch Memory Size: 14752
Spill Count: 232
Fill Count: 261
SEND Count: 610
Cycle Count: 11325
```

After the optimization:
```
Instruction Count: 2046
Basic Block Count: 44
Scratch Memory Size: 13728
Spill Count: 219
Fill Count: 268
SEND Count: 604
Cycle Count: 11367
```
2021-07-26 04:58:02 -03:00
ReinUsesLisp 09fb41dc63 shader: Use TryInstRecursive on XMAD multiply folding
Simplify a bit the logic.
2021-07-26 04:15:27 -03:00
ReinUsesLisp bf2956d77a shader: Avoid usage of C++20 ranges to build in clang 2021-07-22 21:51:40 -04:00
lat9nq 49946cf780 shader_recompiler, video_core: Resolve clang errors
Silences the following warnings-turned-errors:
-Wsign-conversion
-Wunused-private-field
-Wbraced-scalar-init
-Wunused-variable

And some other errors
2021-07-22 21:51:40 -04:00
ameerj 41c6cb70f9 glsl: Fix tracking of info.uses_shadow_lod 2021-07-22 21:51:40 -04:00
ameerj 57f222c56e dual_vertex_pass: Clang format 2021-07-22 21:51:40 -04:00
ReinUsesLisp 7dafa96ab5 shader: Rework varyings and implement passthrough geometry shaders
Put all varyings into a single std::bitset with helpers to access it.

Implement passthrough geometry shaders using host's.
2021-07-22 21:51:39 -04:00
lat9nq 257d2aab74 lower_int64_to_int32: Add missing include 2021-07-22 21:51:39 -04:00
ReinUsesLisp d8d5501459 shader: Add int64 to int32 lowering pass 2021-07-22 21:51:39 -04:00
ReinUsesLisp 04ef2160f9 shader: Teach global memory base tracker to follow vectors 2021-07-22 21:51:39 -04:00
ReinUsesLisp 97e80dda55 shader: Add constant propagation to integer vectors 2021-07-22 21:51:39 -04:00
ReinUsesLisp 808ef97a08 shader: Move loop safety tests to code emission 2021-07-22 21:51:39 -04:00
ameerj a0365217f5 texture_pass: Fix is_read image qualification
Atomic operations are considered to have both read and write access. This was not  being accounted for.
2021-07-22 21:51:38 -04:00
ReinUsesLisp 0cd08b3e72 shader: Align constant buffer sizes to 16 bytes
WAR for AMD reading zeroes on uniform buffers of size 2.
2021-07-22 21:51:38 -04:00
ReinUsesLisp 374eeda1a3 shader: Properly manage attributes not written from previous stages 2021-07-22 21:51:38 -04:00
ameerj d36f667bc0 glsl: Address rest of feedback 2021-07-22 21:51:38 -04:00
ameerj a0d0704aff glsl: Conditionally add EXT_texture_shadow_lod 2021-07-22 21:51:38 -04:00
ameerj 6aa1bf7b6f glsl: Implement legacy varyings 2021-07-22 21:51:38 -04:00
ameerj 9ccbd74991 glsl: Fix ATOM and implement ATOMS 2021-07-22 21:51:37 -04:00
ameerj 5399906c26 glsl: Track S32 atomics 2021-07-22 21:51:36 -04:00
ameerj 11ba190462 glsl: Revert ssbo aliasing. Storage Atomics impl 2021-07-22 21:51:36 -04:00
ameerj 3d9ecbe998 glsl: Wip storage atomic ops 2021-07-22 21:51:36 -04:00
ReinUsesLisp 7ac55c2a75 shader: Fix loop safety to SSA pass 2021-07-22 21:51:35 -04:00
lat9nq 373f75d944 shader: Add shader loop safety check settings
Also add a setting for enable Nsight Aftermath.
2021-07-22 21:51:35 -04:00
FernandoS27 562af30181 shader: Fix VertexA Shaders. 2021-07-22 21:51:34 -04:00
ReinUsesLisp 4a2361a1e2 buffer_cache: Reduce uniform buffer size from shader usage
Increases performance significantly on certain titles.
2021-07-22 21:51:34 -04:00
ReinUsesLisp 5539b13c5a shader,glasm: Implement legacy texcoord loads 2021-07-22 21:51:34 -04:00
ReinUsesLisp ac0f5d2ab6 shader: Track legacy varyings 2021-07-22 21:51:34 -04:00
ReinUsesLisp 457dda69cc shader: Clang-format secondary textures 2021-07-22 21:51:34 -04:00
ReinUsesLisp 627161c38e shader: Fix secondary textures 2021-07-22 21:51:34 -04:00
ReinUsesLisp fbf5cdcba0 shader: Fix FSwizzleAdd folding when going through phi nodes 2021-07-22 21:51:34 -04:00
ReinUsesLisp 77ee733c3a glasm: Remove unintentionally committed fmt::prints 2021-07-22 21:51:33 -04:00
ReinUsesLisp bf5e48ffe4 glasm: Initial implementation of phi nodes on GLASM 2021-07-22 21:51:31 -04:00
ReinUsesLisp d54d7de40e glasm: Rework control flow introducing a syntax list
This commit regresses VertexA shaders, their transformation pass has to
be adapted to the new control flow.
2021-07-22 21:51:31 -04:00
ReinUsesLisp c4fd6b55bc glasm: Implement shuffle and vote instructions on GLASM 2021-07-22 21:51:31 -04:00
FernandoS27 ee61ec2c39 shader: Optimize NVN Fallthrough 2021-07-22 21:51:30 -04:00
ameerj 7ecc6de56a shader: Implement Int32 SUATOM/SURED 2021-07-22 21:51:30 -04:00
FernandoS27 c49d56c931 shader: Address feedback 2021-07-22 21:51:29 -04:00
FernandoS27 b541f5e5e3 shader: Implement VertexA stage 2021-07-22 21:51:29 -04:00
ameerj 20e86fd615 shader: Fix BFE s32 undefined check
Our unit tests were hitting this exception.
2021-07-22 21:51:29 -04:00
ReinUsesLisp 50eb03382e shader: Fix error checking in bitfieldExtract and implement bitfieldInsert folding 2021-07-22 21:51:29 -04:00
ReinUsesLisp 0c7230a606 shader: Add more strict validation the pass 2021-07-22 21:51:29 -04:00
ReinUsesLisp 25949b864c shader: Fix forward referencing identity instructions when inserting phi 2021-07-22 21:51:29 -04:00
ReinUsesLisp 92a01984e6 shader: Remove invalidated blocks in dead code elimination pass 2021-07-22 21:51:29 -04:00
ReinUsesLisp d10cf55353 shader: Implement indexed textures 2021-07-22 21:51:28 -04:00
ReinUsesLisp 23182fa59c shader: Intrusively store in a block if it's sealed or not 2021-07-22 21:51:28 -04:00
ReinUsesLisp 050e81500c shader: Move microinstruction header to the value header 2021-07-22 21:51:28 -04:00
ReinUsesLisp 4209828646 shader: Intrusively store register values in block for SSA pass 2021-07-22 21:51:28 -04:00
ReinUsesLisp dd860b684c shader: Implement D3D samplers 2021-07-22 21:51:28 -04:00
ReinUsesLisp a8d46a5eae shader: Add constant propagation for arithmetic right shifts 2021-07-22 21:51:28 -04:00
ReinUsesLisp 7018e524f5 shader: Add NVN storage buffer fallbacks
When we can't track the SSBO origin of a global memory instruction,
leave it as a global memory operation and assume these pointers are in
the NVN storage buffer slots, then apply a linear search in the shader's
runtime.
2021-07-22 21:51:28 -04:00
FernandoS27 f69d0b91ff shader: Address feedback 2021-07-22 21:51:28 -04:00
FernandoS27 080857b60e shader: Add coarse derivatives 2021-07-22 21:51:28 -04:00
FernandoS27 04c459fc8d shader: Implement fine derivates constant propagation 2021-07-22 21:51:28 -04:00
ReinUsesLisp 50f8007172 shader: Fix Phi node types 2021-07-22 21:51:28 -04:00
ReinUsesLisp 80940b1706 shader: Implement SampleMask 2021-07-22 21:51:28 -04:00
ReinUsesLisp 95815a3883 shader: Implement PIXLD.MY_INDEX 2021-07-22 21:51:28 -04:00
ReinUsesLisp e3514bcd6b spirv: Implement ViewportMask with NV_viewport_array2 2021-07-22 21:51:28 -04:00
ReinUsesLisp b0f1255c8c shader: Implement PrimitiveId 2021-07-22 21:51:27 -04:00
ReinUsesLisp 183855e396 shader: Implement tessellation shaders, polygon mode and invocation id 2021-07-22 21:51:27 -04:00
ReinUsesLisp 34519d3fc6 shader: Mark atomic instructions as writes 2021-07-22 21:51:27 -04:00
ReinUsesLisp 416e1b7441 spirv: Implement image buffers 2021-07-22 21:51:27 -04:00
ReinUsesLisp d8ec99dada spirv: Implement Layer stores 2021-07-22 21:51:27 -04:00
ReinUsesLisp fa75b9b062 spirv: Rework storage buffers and shader memory 2021-07-22 21:51:27 -04:00
ReinUsesLisp 2597cee85b shader: Add constant propagation for *&^| binary operations 2021-07-22 21:51:27 -04:00
ReinUsesLisp 23b8714732 spirv: Define StorageImageWriteWithoutFormat capability when used 2021-07-22 21:51:27 -04:00
ReinUsesLisp 5c61e860e4 shader: Implement SR_THREAD_KILL 2021-07-22 21:51:27 -04:00
ameerj 3db2b3effa shader: Implement ATOM/S and RED 2021-07-22 21:51:27 -04:00
ReinUsesLisp ab543f1821 spirv: Guard against typeless image reads on unsupported devices 2021-07-22 21:51:27 -04:00
ReinUsesLisp 9280cd649a shader: Move LaneId to the warp emission file and fix AMD 2021-07-22 21:51:27 -04:00
ReinUsesLisp 7cb2ab3585 shader: Implement SULD and SUST 2021-07-22 21:51:26 -04:00
lat9nq 5bfcafa0a2 shader: Address feedback + clang format 2021-07-22 21:51:26 -04:00
lat9nq 0bb85f6a75 shader_recompiler,video_core: Cleanup some GCC and Clang errors
Mostly fixing unused *, implicit conversion, braced scalar init,
fpermissive, and some others.

Some Clang errors likely remain in video_core, and std::ranges is still
a pertinent issue in shader_recompiler

shader_recompiler: cmake: Force bracket depth to 1024 on Clang
Increases the maximum fold expression depth

thread_worker: Include condition_variable

Don't use list initializers in control flow

Co-authored-by: ReinUsesLisp <reinuseslisp@airmail.cc>
2021-07-22 21:51:26 -04:00
ReinUsesLisp 1f3eb601ac shader: Implement texture buffers 2021-07-22 21:51:26 -04:00
FernandoS27 dcaf0e9150 shader: Address feedback 2021-07-22 21:51:26 -04:00
FernandoS27 73cb17f41b shader: Implement indexed Position and ClipDistances 2021-07-22 21:51:26 -04:00
FernandoS27 1d51803169 shader: Implement indexed attributes 2021-07-22 21:51:26 -04:00
ReinUsesLisp 417fb5d385 shader: Move recursive SSA rewrite to the heap 2021-07-22 21:51:26 -04:00
ReinUsesLisp da6cf2632c shader: Add subgroup masks 2021-07-22 21:51:26 -04:00
ReinUsesLisp 85795de99f shader: Abstract breadth searches and use the abstraction 2021-07-22 21:51:26 -04:00
ReinUsesLisp 3f594dd86b shader: Reimplement GetCbufU64 as GetCbufU32x2
It may generate better code on some compilers and it's easier to handle.
2021-07-22 21:51:26 -04:00
ReinUsesLisp 9a342f5605 shader: Rework global memory tracking to use breadth-first search 2021-07-22 21:51:26 -04:00
FernandoS27 ed6a1b1a3d shader: Address feedback 2021-07-22 21:51:26 -04:00
FernandoS27 baec84247f shader: Address Feedback 2021-07-22 21:51:26 -04:00
FernandoS27 45d547af11 shader: Implement SR_LaneId 2021-07-22 21:51:26 -04:00
FernandoS27 ecb30c9072 shader: Improve VOTE.VTG stub 2021-07-22 21:51:25 -04:00
FernandoS27 12f5f32098 shader: Mark SSBOs as written when they are 2021-07-22 21:51:25 -04:00
FernandoS27 d819ba4489 shader: Implement ViewportIndex 2021-07-22 21:51:25 -04:00
FernandoS27 bee8188799 shader: Fold composite extract 2021-07-22 21:51:25 -04:00
FernandoS27 c3bace756f shader: Fold comparisons and Pack/Unpack16 2021-07-22 21:51:25 -04:00
ReinUsesLisp 5f22cd89e2 shader: Fix constant propagation to use reverse post order 2021-07-22 21:51:25 -04:00
FernandoS27 0c4cf3b9eb shader: Implement ClipDistance 2021-07-22 21:51:25 -04:00