Commit Graph

696 Commits

Author SHA1 Message Date
Fernando Sahmkow
9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
ReinUsesLisp
a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti
26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow
be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti
a0d79085c4
Merge pull request #3027 from lioncash/lookup
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti
d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
ReinUsesLisp
78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp
fa2c297f3e const_buffer_locker: Minor style changes 2019-10-25 09:01:31 -04:00
ReinUsesLisp
7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
1244f2d368 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:31 -04:00
Fernando Sahmkow
a05120ec0b Shader_IR: Correct typo in Consistent method. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow
acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
Fernando Sahmkow
1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
Fernando Sahmkow
2ef696c85a Shader_IR: Implement BRX tracking. 2019-10-25 09:01:29 -04:00
Lioncash
382717172e shader_ir: Use std::array with pair instead of unordered_map
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash
1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow
1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp
1ea07954fb shader_ir/memory: Ignore global memory when tracking fails
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.

In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.

The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
Lioncash
074b38b7a9 video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions
These can also trivially be made const member functions, with the
addition of a few consts.
2019-10-17 20:59:48 -04:00
Lioncash
222f4b45eb video_core/shader/ast: Make ASTManager::Print a const member function
Given all visiting functions never modify the nodes, we can trivially
make this a const member function.
2019-10-17 20:56:39 -04:00
Lioncash
7831e86c34 video_core/shader/ast: Make ExprPrinter members private
This member already has an accessor, so there's no need for it to be
public.
2019-10-17 20:39:36 -04:00
Lioncash
a2eccbf075 video_core/shader/ast: Make Indent() return a string_view
The returned string is simply a substring of our constexpr tabs
string_view, so we can just use a string_view here as well, since the
original string_view is guaranteed to always exist.

Now the function is fully non-allocating.
2019-10-17 20:29:00 -04:00
Lioncash
15d177a6ac video_core/shader/ast: Make Indent() private
It's never used outside of this class, so we can narrow its scope down.
2019-10-17 20:26:13 -04:00
Lioncash
7f6a8a33d4 video_core/shader/ast: Rename Ident() to Indent()
This can be confusing, given "ident" is generally used as a shorthand
for "identifier".
2019-10-17 20:26:13 -04:00
Lioncash
081530686c video_core/shader/ast: Make use of fmt where applicable
Makes a few strings nicer to read and also eliminates a bit of string
churn with operator+.
2019-10-17 20:26:10 -04:00
bunnei
9fe8072c67
Merge pull request #2980 from lioncash/warn
maxwell_3d: Silence truncation warnings
2019-10-17 14:02:16 -04:00
Lioncash
77b4916b33 control_flow: Silence truncation warnings
This can be trivially fixed by making the input size a size_t.
CFGRebuildState's constructor parameter is already a std::size_t, so
this just makes the size type fully conform with it.
2019-10-15 19:10:28 -04:00
Lioncash
67658dd6e8 shader/node: std::move Meta instance within OperationNode constructor
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
ReinUsesLisp
3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp
632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG 2019-10-07 14:48:58 -03:00
Lioncash
f883cd4f0e video_core/control_flow: Eliminate variable shadowing warnings 2019-10-05 09:14:27 -04:00
Lioncash
25702b6256 video_core/control_flow: Eliminate pessimizing moves
These can inhibit the ability of a compiler to perform RVO.
2019-10-05 09:14:27 -04:00
Lioncash
d82b181d44 video_core/ast: Unindent most of IsFullyDecompiled() by one level 2019-10-05 09:14:27 -04:00
Lioncash
6c41d1cd7e video_core/ast: Make ShowCurrentState() take a string_view instead of std::string
Allows the function to be non-allocating in terms of the output string.
2019-10-05 09:14:27 -04:00
Lioncash
3c54edae24 video_core/ast: Eliminate variable shadowing warnings 2019-10-05 09:14:26 -04:00
Lioncash
5a0a9c7449 video_core/ast: Replace std::string with a constexpr std::string_view
Same behavior, but without the need to heap allocate
2019-10-05 09:14:26 -04:00
Lioncash
3a20d9734f video_core/ast: Default the move constructor and assignment operator
This is behaviorally equivalent and also fixes a bug where some members
weren't being moved over.
2019-10-05 09:14:26 -04:00
Lioncash
43503a69bf video_core/{ast, expr}: Organize forward declaration
Keeps them alphabetically sorted for readability.
2019-10-05 09:14:26 -04:00
Lioncash
50ad745585 video_core/expr: Supply operator!= along with operator==
Provides logical symmetry to the interface.
2019-10-05 09:14:26 -04:00
Lioncash
8eb1398f8d video_core/{ast, expr}: Use std::move where applicable
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05 09:14:23 -04:00
Lioncash
8e0c80f269 video_core/ast: Supply const accessors for data where applicable
Provides const equivalents of data accessors for use within const
contexts.
2019-10-05 08:22:03 -04:00
Fernando Sahmkow
e6eae4b815 Shader_ir: Address feedback 2019-10-04 18:52:57 -04:00
Fernando Sahmkow
3c09d9abe6 Shader_Ir: Address Feedback and clang format. 2019-10-04 18:52:57 -04:00
Fernando Sahmkow
7c756baa77 Shader_IR: clean up AST handling and add documentation. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow
5ea740beb5 Shader_IR: Correct OutwardMoves for Ifs 2019-10-04 18:52:54 -04:00
Fernando Sahmkow
b3c46d6948 Shader_IR: corrections and clang-format 2019-10-04 18:52:53 -04:00
Fernando Sahmkow
2e9a810423 Shader_IR: allow else derivation to be optional. 2019-10-04 18:52:52 -04:00
Fernando Sahmkow
ca9901867e vk_shader_compiler: Implement the decompiler in SPIR-V 2019-10-04 18:52:51 -04:00
Fernando Sahmkow
0366c18d87 Shader_IR: mark labels as unused for partial decompile. 2019-10-04 18:52:51 -04:00
Fernando Sahmkow
47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. 2019-10-04 18:52:50 -04:00
Fernando Sahmkow
38fc995f6c gl_shader_decompiler: Implement AST decompiling 2019-10-04 18:52:50 -04:00
Fernando Sahmkow
6fdd501113 shader_ir: Declare Manager and pass it to appropiate programs. 2019-10-04 18:52:49 -04:00
Fernando Sahmkow
8be6e1c522 shader_ir: Corrections to outward movements and misc stuffs 2019-10-04 18:52:48 -04:00
Fernando Sahmkow
4fde66e609 shader_ir: Add basic goto elimination 2019-10-04 18:52:48 -04:00
Fernando Sahmkow
c17953978b shader_ir: Initial Decompile Setup 2019-10-04 18:52:47 -04:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Rodrigo Locatti
9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
Fernando Sahmkow
527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
bunnei
88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow
4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
Fernando Sahmkow
7606da5611 VideoCore: Corrections to the MME Inliner and removal of hacky instance management. 2019-09-19 11:41:29 -04:00
bunnei
b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp
0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
bunnei
34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
ReinUsesLisp
1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp
3a450c1395 kepler_compute: Implement texture queries 2019-09-05 20:35:51 -03:00
ReinUsesLisp
4de04eba39 shader_ir: Implement LD_S
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
f17415d431 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp
dfae2d141a half_set_predicate: Fix predicate assignments 2019-09-04 01:54:23 -03:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
Rodrigo Locatti
4d4f9cc104 video_core: Silent miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silent -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silent unused variable warnings

* buffer_cache: Silent -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
bunnei
f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
bunnei
ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
ReinUsesLisp
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp
ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
ReinUsesLisp
77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
bunnei
b0ff3179ef
Merge pull request #2739 from lioncash/cflow
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei
4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
ReinUsesLisp
104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp
45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Lioncash
c1c89411da video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash
1780e0e3d0 video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash
a162a844d2 video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash
56bc11d952 video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash
e7b39f47f8 video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash
6885e7e7ec video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash
45fa12a05c video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash
47df844338 video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash
3df9558593 video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash
1109db86b7 video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei
63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow
5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow
0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the sevirity of asserts for FP precision and 
rounding as this are well known and have little to no consequences in 
gpu's accuracy.
2019-07-18 08:17:19 -04:00
Fernando Sahmkow
223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash
bebbdc2067 shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.

This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash
60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash
44d87ff641 shader_ir: Remove unused includes
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow
d614193e49 Shader_Ir: Correct tracking to track from right to left 2019-07-16 15:06:59 -04:00
Fernando Sahmkow
b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash
e2d7dda166 shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
ReinUsesLisp
afa8096df5 shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
f2549739d1 shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Fernando Sahmkow
2de7649311 shader_ir: limit explorastion to best known program size. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
e7c6045a03 control_flow: Correct block breaking algorithm. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow
dc4a93594c control_flow: Assert shaders bigger than limit. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
e7a88f0ab3 control_flow: Address feedback. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow
34357b110c shader_ir: Correct parsing of scheduling instructions and correct sizing 2019-07-09 08:14:41 -04:00
Fernando Sahmkow
cfb3db1a32 shader_ir: Correct max sizing 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
d45fed3030 shader_ir: Remove unnecessary constructors and use optional for ScanFlow result 2019-07-09 08:14:40 -04:00
Fernando Sahmkow
01b21ee1e8 shader_ir: Corrections, documenting and asserting control_flow 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
926b80102f shader_ir: Decompile Flow Stack 2019-07-09 08:14:38 -04:00
Fernando Sahmkow
459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
c218ae4b02 shader_ir: Remove the old scanner. 2019-07-09 08:14:36 -04:00
Fernando Sahmkow
8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias
be020f7621
Delete decode_integer_set.cpp 2019-07-07 21:40:33 +02:00
ReinUsesLisp
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
ReinUsesLisp
10a83653ee decode/texture: Address feedback 2019-06-24 02:05:05 -03:00
Fernando Sahmkow
d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow
b7de31ac97 shader_ir: Fix image copy rebase issues 2019-06-20 21:38:34 -03:00
ReinUsesLisp
9097301d92 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp
fe8e6618f2 shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp
769a50661a shader/node: Minor changes
Reflect std::shared_ptr nature of Node on initializers and remove
constant members in nodes.

Add some commentaries.
2019-06-06 20:03:33 -03:00
ReinUsesLisp
e1b3be7ced shader: Move Node declarations out of the shader IR header
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp
bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei
e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
bunnei
1a2d90ab09
Merge pull request #2485 from ReinUsesLisp/generic-memory
shader/memory: Implement generic memory stores and loads (ST and LD)
2019-05-24 18:24:26 -04:00
Lioncash
b6dcb1ae4d shader/shader_ir: Make Comment() take a std::string by value
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.

e.g. previously:

Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
        cbuf->GetIndex(), cbuf_offset));

Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).

Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
2019-05-23 03:01:55 -03:00
Lioncash
228e58d0a5 shader/decode/*: Add missing newline to files lacking them
Keeps the shader code file endings consistent.
2019-05-23 02:55:52 -03:00
Lioncash
87b4c1ac5e shader/decode/*: Eliminate indirect inclusions
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
2019-05-23 02:55:52 -03:00
Lioncash
195b54602f shader/decode/memory: Remove left in debug pragma 2019-05-22 17:08:50 -04:00
ReinUsesLisp
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
ReinUsesLisp
9c3461604c shader: Implement S2R Tid{XYZ} and CtaId{XYZ} 2019-05-20 16:36:49 -03:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Lioncash
e310d943b8 shader/shader_ir: Remove unnecessary inline specifiers
constexpr internally links by default, so the inline specifier is
unnecessary.
2019-05-19 08:23:15 -04:00
Lioncash
212b148923 shader/shader_ir: Simplify constructors for OperationNode
Many of these constructors don't even need to be templated. The only
ones that need to be templated are the ones that actually make use of
the parameter pack.

Even then, since std::vector accepts an initializer list, we can supply
the parameter pack directly to it instead of creating our own copy of
the list, then copying it again into the std::vector.
2019-05-19 08:23:14 -04:00
Lioncash
81e7e63080 shader/shader_ir: Remove unnecessary template parameter packs from Operation() overloads where applicable
These overloads don't actually make use of the parameter pack, so they
can be turned into regular non-template function overloads.
2019-05-19 08:23:14 -04:00
Lioncash
e09ee0ff23 shader/shader_ir: Mark tracking functions as const member functions
These don't actually modify instance state, so they can be marked as
const member functions
2019-05-19 08:23:09 -04:00
Lioncash
ce04ab38bb shader/shader_ir: Place implementations of constructor and destructor in cpp file
Given the class contains quite a lot of non-trivial types, place the
constructor and destructor within the cpp file to avoid inlining
construction and destruction code everywhere the class is used.
2019-05-19 04:02:02 -04:00
Lioncash
e43ba3acd4 video_core/shader/decode/texture: Remove unused variable from GetTld4Code() 2019-05-09 18:49:56 -04:00
Lioncash
9e15193ef8
shader/decode/texture: Remove unused variable
This isn't used anywhere, so we can get rid of it.
2019-05-04 02:10:38 -04:00
ReinUsesLisp
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
28bffb1ffa shader_ir/memory: Assert on non-32 bits ALD.PHYS 2019-05-02 21:46:25 -03:00
ReinUsesLisp
fe700e1856 shader: Add physical attributes commentaries 2019-05-02 21:46:25 -03:00
ReinUsesLisp
c6f9e651b2 gl_shader_decompiler: Implement GLSL physical attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
06b363c9b5 shader: Remove unused AbufNode Ipa mode 2019-05-02 21:46:25 -03:00
ReinUsesLisp
002ecbea19 shader_ir/memory: Emit AL2P IR 2019-05-02 21:46:25 -03:00
bunnei
91e239d66f
Merge pull request #2435 from ReinUsesLisp/misc-vc
shader_ir: Miscellaneous fixes
2019-04-28 22:29:43 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
bunnei
9a3737120d
Merge pull request #2423 from FernandoS27/half-correct
Corrections on Half Float operations: HADD2 HMUL2 and HFMA2
2019-04-28 22:24:22 -04:00
ReinUsesLisp
2156e52014 shader_ir: Move Sampler index entry in operand< to sort declarations 2019-04-26 01:13:05 -03:00
ReinUsesLisp
b77b4b76bb shader_ir: Add missing entry to Sampler operand< comparison 2019-04-26 01:11:24 -03:00
ReinUsesLisp
0b91087a1e shader_ir/texture: Fix sampler const buffer key shift 2019-04-26 01:09:29 -03:00
Fernando Sahmkow
623b2e4b8f Corrections Half Float operations on const buffers and implement saturation. 2019-04-20 21:11:33 -04:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
650d9b1044
Merge pull request #2409 from ReinUsesLisp/half-floats
shader_ir/decode: Miscellaneous fixes to half-float decompilation
2019-04-19 21:31:52 -04:00
ReinUsesLisp
fbe8d1ceaa video_core: Silent -Wswitch warnings 2019-04-18 15:54:39 -03:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
ReinUsesLisp
f43995ec53 shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp
64613db605 shader_ir/decode: Implement half float saturation 2019-04-15 21:16:10 -03:00
ReinUsesLisp
90cbf89303 shader_ir/decode: Reduce severity of unimplemented half-float FTZ 2019-04-15 21:16:09 -03:00
ReinUsesLisp
acf618afbc renderer_opengl: Implement half float NaN comparisons 2019-04-15 21:13:26 -03:00
ReinUsesLisp
ae46ad48ed shader_ir: Avoid using static on heap-allocated objects
Using static here might be faster at runtime, but it adds a heap
allocation called before main.
2019-04-15 21:12:43 -03:00
Fernando Sahmkow
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
ef8be408d3 Adapt Bindless to work with AOFFI 2019-04-08 12:07:56 -04:00
Fernando Sahmkow
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
c60b0b8432 Fix TMML 2019-04-08 11:35:22 -04:00
Fernando Sahmkow
fd4e994de3 Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters 2019-04-08 11:35:18 -04:00
Fernando Sahmkow
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
189bd1980c Implement TMML_B 2019-04-08 11:29:49 -04:00
Fernando Sahmkow
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow
fe392fff24 Unify both sampler types. 2019-04-08 11:23:45 -04:00
Fernando Sahmkow
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
ReinUsesLisp
04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
Mat M
da02946f4f
shader_ir/decode: Silent implicit sign conversion warning
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-03-31 00:12:54 -03:00
ReinUsesLisp
cb68ce7c2f shader_ir/decode: Implement AOFFI for TEX and TLD4 2019-03-30 02:53:29 -03:00
ReinUsesLisp
cf4ecc1945 shader_ir: Implement immediate register tracking 2019-03-30 02:53:16 -03:00
ReinUsesLisp
5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
Lioncash
c1b2e35625 shader/track: Resolve variable shadowing warnings 2019-02-25 09:10:59 -05:00
bunnei
c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
Fernando Sahmkow
10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
ReinUsesLisp
e60d4d70bc gl_shader_decompiler: Re-implement TLDS lod 2019-02-12 17:03:07 -03:00
bunnei
444231a83d
Merge pull request #2108 from FernandoS27/fix-cc
Fix incorrect value for CC bit in IADD
2019-02-12 10:39:03 -05:00
bunnei
c1accfefde
Merge pull request #2109 from FernandoS27/fix-f2i
Corrected F2I None mode to RoundEven.
2019-02-12 10:20:29 -05:00
Fernando Sahmkow
f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
Fernando Sahmkow
edd668047c Fix incorrect value for CC bit in IADD 2019-02-11 16:44:43 -04:00
ReinUsesLisp
889c646ac0 shader_ir: Remove F4 prefix to texture operations
This was originally included because texture operations returned a vec4.
These operations now return a single float and the F4 prefix doesn't
mean anything.
2019-02-07 17:36:46 -03:00
ReinUsesLisp
d62b0a9e29 shader_ir: Clean texture management code
Previous code relied on GLSL parameter order (something that's always
ill-formed on an IR design). This approach passes spatial coordiantes
through operation nodes and array and depth compare values in the the
texture metadata. It still contains an "extra" vector containing generic
nodes for bias and component index (for example) which is still a bit
ill-formed but it should be better than the previous approach.
2019-02-07 00:46:13 -03:00
bunnei
f09d1dffd1
Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking
shader/track: Add a more permissive global memory tracking
2019-02-06 21:56:14 -05:00
ReinUsesLisp
cfb20c4c9d gl_shader_disk_cache: Save GLSL and entries into the precompiled file 2019-02-06 22:23:39 -03:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Fernando Sahmkow
0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp
dfa7be5ddf shader_ir/memory: Add ST_L 64 and 128 bits stores 2019-02-03 19:08:10 -03:00
ReinUsesLisp
0d1d755086 shader/track: Search inside of conditional nodes
Some games search conditionally use global memory instructions. This
allows the heuristic to search inside conditional nodes for the source
constant buffer.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
42b75e8be8 shader_ir: Rename BasicBlock to NodeBlock
It's not always used as a basic block. Rename it for consistency.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
6a6fabea58 shader_ir: Pass decoded nodes as a whole instead of per basic blocks
Some games call LDG at the top of a basic block, making the tracking
heuristic to fail. This commit lets the heuristic the decoded nodes as a
whole instead of per basic blocks.

This may lead to some false positives but allows it the heuristic to
track cases it previously couldn't.
2019-02-03 17:21:20 -03:00
ReinUsesLisp
f61c1ed246 shader_ir/memory: Add LD_L 128 bits loads 2019-02-03 00:35:34 -03:00
ReinUsesLisp
9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp
0be835132c shader_ir/memory: Add LD_L 64 bits loads 2019-02-03 00:25:40 -03:00
ReinUsesLisp
477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
ReinUsesLisp
3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
Lioncash
b2b98b2f44 shader/shader_ir: Amend three comment typos
Given we're in the area, these are three trivial typos that can be
corrected.
2019-01-28 07:52:04 -05:00
Lioncash
62e08c30b7 shader/shader_ir: Amend constructor initializer ordering for AbufNode
Orders the class members in the same order that they would actually be
initialized in. Gets rid of two compiler warnings.
2019-01-28 07:50:34 -05:00
Lioncash
3e1a9a45a6 shader/decode: Avoid a pessimizing std::move within DecodeRange()
std::moveing a local variable in a return statement has the potential to
prevent copy elision from occurring, so this can just be converted into
a regular return.
2019-01-28 07:43:23 -05:00
ReinUsesLisp
a63d7c49fc shader_ir: Fixup clang build 2019-01-15 21:06:05 -03:00
ReinUsesLisp
1c9c4eefeb shader_decode: Fixup XMAD 2019-01-15 17:54:53 -03:00
ReinUsesLisp
170c8212bb shader_ir: Pass to decoder functions basic block's code 2019-01-15 17:54:53 -03:00
ReinUsesLisp
2d6c064e66 shader_decode: Improve zero flag implementation 2019-01-15 17:54:53 -03:00
ReinUsesLisp
d911740e5d shader_ir: Remove composite primitives and use temporals instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp
50195b1704 shader_decode: Use proper primitive names 2019-01-15 17:54:53 -03:00
ReinUsesLisp
2faad9bf23 shader_decode: Use BitfieldExtract instead of shift + and 2019-01-15 17:54:53 -03:00
ReinUsesLisp
52223313b1 shader_ir: Remove Ipa primitive 2019-01-15 17:54:53 -03:00
ReinUsesLisp
af5d7e2c49 video_core: Rename glsl_decompiler to gl_shader_decompiler 2019-01-15 17:54:53 -03:00
ReinUsesLisp
d9118d324a shader_ir: Remove RZ and use Register::ZeroIndex instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp
5af82a8ed4 shader_decode: Implement TEXS.F16 2019-01-15 17:54:53 -03:00
ReinUsesLisp
c68c13e1aa shader_decode: Fixup R2P 2019-01-15 17:54:53 -03:00