1290 Commits

Author SHA1 Message Date
bunnei
1b41b875dc shaders: Add NumTextureSamplers const, remove unused #pragma. 2018-04-14 18:50:06 -04:00
bunnei
e6224fec27 shaders: Address PR review feedback. 2018-04-14 16:01:41 -04:00
bunnei
eabeedf6af gl_shader_decompiler: Cleanup log statements. 2018-04-14 16:01:41 -04:00
bunnei
0d408b965b shaders: Fix GCC and clang build issues. 2018-04-14 16:01:40 -04:00
bunnei
86135864da gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup. 2018-04-14 16:01:40 -04:00
bunnei
7639667562 shader_bytecode: Add FSETP and KIL to GetInfo. 2018-04-14 16:01:40 -04:00
bunnei
5a47832221 shader_bytecode: Add SubOp decoding. 2018-04-14 16:01:40 -04:00
bunnei
50023bdae7 gl_shader_decompiler: Add shader stage hint. 2018-04-14 16:01:39 -04:00
bunnei
a992aac5eb renderer_opengl: Fix Morton copy byteswap, etc. 2018-04-14 16:01:39 -04:00
bunnei
0ca8fce9d0 gl_shader_manager: Implement SetShaderSamplerBindings. 2018-04-13 23:48:30 -04:00
bunnei
beddc8afd2 gl_rasterizer: Generate shaders and upload uniforms. 2018-04-13 23:48:29 -04:00
bunnei
85d77a3d24 gl_shader_decompiler: Basic impl. for very simple vertex shaders.
- Tested with Puyo Puyo Tetris and Cave Story+
2018-04-13 23:48:28 -04:00
bunnei
51f37f5061 gl_shader_manager: Cleanup and consolidate uniform handling. 2018-04-13 23:48:28 -04:00
bunnei
35aca0bf1f maxwell_3d: Make memory_manager public. 2018-04-13 23:48:27 -04:00
bunnei
33bb53571b maxwell_3d: Fix shader_config decodings. 2018-04-13 23:48:26 -04:00
bunnei
5617831d5f gl_rasterizer: Use shader program manager, remove test shader. 2018-04-13 23:48:26 -04:00
bunnei
459826a705 renderer_opengl: Add gl_shader_manager class. 2018-04-13 23:48:25 -04:00
bunnei
8aa21a03b3 maxwell_to_gl: Add a few types, etc. 2018-04-13 23:48:24 -04:00
bunnei
10953495c1 gl_shader_gen: Add hashable setup/config structs. 2018-04-13 23:48:23 -04:00
bunnei
2fcbb35ad2 gl_shader_util: Add missing includes. 2018-04-13 23:48:23 -04:00
bunnei
da1114ca59 renderer_opengl: Use OGLProgram instead of OGLShader. 2018-04-13 23:48:21 -04:00
bunnei
4f2b2d0bc5 gl_shader_util: Grab latest upstream. 2018-04-13 23:48:21 -04:00
bunnei
dbfd106ba0 gl_resource_manager: Grab latest upstream. 2018-04-13 23:48:20 -04:00
bunnei
ed7e597b44 gl_shader_decompiler: Add skeleton code from Citra for shader analysis. 2018-04-13 23:48:20 -04:00
bunnei
4e7e0f8112 shader_bytecode: Add initial module for shader decoding. 2018-04-13 23:48:19 -04:00
James Rowe
0b855f1c21 Fix clang format issues 2018-04-06 22:00:48 -06:00
Subv
dcc27d6dc1 GPU: Assert when finding a texture with a format type other than UNORM. 2018-04-06 20:44:46 -06:00
Subv
b0ca330e14 GL: Set up the textures used for each draw call.
Each Maxwell shader stage can have an arbitrary number of textures, but we're limited to a certain number in OpenGL. We try to use only the minimum number of host textures by not keeping a 1:1 relation between guest texture ids and host texture ids, i.e., guest texture id 8 can be host texture id 0 if it's the only texture used in the guest shader program.
This mapping will have to be passed to the shader decompiler so it can rewrite the texture accesses.
2018-04-06 20:44:46 -06:00
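The guest-to-host texture id mapping described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `TextureUnitMap`, `HostUnit`, and `Size` are hypothetical names, and the sketch assumes the list of guest texture ids used by a shader is already known.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical helper (not from the repository): assigns each guest texture
// id used by a shader the next free host texture unit, starting at 0.
class TextureUnitMap {
public:
    explicit TextureUnitMap(const std::vector<std::uint32_t>& used_guest_ids) {
        for (std::uint32_t guest_id : used_guest_ids) {
            // emplace keeps the first assignment if an id is listed twice.
            guest_to_host.emplace(guest_id,
                                  static_cast<std::uint32_t>(guest_to_host.size()));
        }
    }

    // Returns the host unit for a guest id, or std::nullopt if the shader
    // does not use that texture.
    std::optional<std::uint32_t> HostUnit(std::uint32_t guest_id) const {
        const auto it = guest_to_host.find(guest_id);
        if (it == guest_to_host.end())
            return std::nullopt;
        return it->second;
    }

    // Number of host texture units needed for this shader.
    std::size_t Size() const { return guest_to_host.size(); }

private:
    std::map<std::uint32_t, std::uint32_t> guest_to_host;
};
```

With only guest texture id 8 in use, `HostUnit(8)` yields host unit 0, matching the example in the commit message. A shader decompiler could consult such a table when rewriting texture accesses.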
Subv
cb3183212d GL: Bind the textures to the shaders used for drawing. 2018-04-06 20:44:46 -06:00
Subv
65faeb9b2a GLCache: Specialize the MortonCopy function for the DXT1 texture format.
It will now use the UnswizzleTexture function instead of the MortonCopyPixels128, which doesn't seem to work for textures.
2018-04-06 20:44:46 -06:00
Subv
b258403f0d GLCache: Implemented GetTextureSurface. 2018-04-06 20:44:45 -06:00
Subv
65ea52394b GLCache: Support uploading compressed textures to the GPU.
Compressed texture formats like DXT1, DXT2, DXT3, etc will use this to ease the load on the CPU.
2018-04-06 20:44:45 -06:00
Subv
73eaef9c05 GL: Remove remaining references to 3DS-specific pixel formats 2018-04-06 20:44:42 -06:00
Subv
b305646c44 RasterizerCache: Remove 3DS-specific pixel formats.
We're only left with RGB8 and DXT1 for now. More will be added as they are needed.
2018-04-06 20:40:24 -06:00
Subv
c28ed85875 GL: Create the sampler objects when starting up the GL rasterizer. 2018-04-06 20:40:24 -06:00
Subv
ca96b04a0c GL: Ported the SamplerInfo struct from citra. 2018-04-06 20:40:24 -06:00
Subv
0171ec606b GL: Rename PicaTexture to MaxwellTexture. 2018-04-06 20:40:24 -06:00
Subv
f73a280eeb GL: Added functions to convert Maxwell tex filters and wrap modes to OpenGL. 2018-04-06 20:40:23 -06:00
Subv
ad1810e895 Textures: Added a helper function to know if a texture is blocklinear or pitch. 2018-04-06 20:40:23 -06:00
N00byKing
d1d7582a5b rasterizer_interface.h: Update from citra to yuzu 2018-04-04 23:07:58 +02:00
N00byKing
27dbbd8227 gl_rasterizer_cache.cpp: Update from citra to yuzu 2018-04-04 23:05:10 +02:00
N00byKing
cfc28e0c1a gl_rasterizer_cache.h: Update from citra to yuzu 2018-04-04 23:04:24 +02:00
N00byKing
ca17f581f5 renderer_opengl.h: Update from citra to yuzu 2018-04-04 23:03:02 +02:00
Subv
11b4ab9685 GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them. 2018-04-01 12:07:26 -05:00
Subv
1ec8d2123d GPU: Implemented a gpu macro interpreter.
The Ryujinx macro interpreter and envydis were used as reference.

Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-04-01 12:07:26 -05:00
bunnei
5e343edc9e renderer_opengl: Use better naming for DrawScreens and DrawSingleScreen. 2018-03-26 21:17:07 -04:00
bunnei
c33abac275 gl_rasterizer: Move code to bind framebuffer surfaces before draw to its own function. 2018-03-26 21:17:05 -04:00
bunnei
d30110348b gl_rasterizer: Add a SyncViewport method. 2018-03-26 21:17:04 -04:00
bunnei
67bc2f5ecd gl_rasterizer: Move PrimitiveTopology check to MaxwellToGL. 2018-03-26 21:17:03 -04:00
bunnei
666d53299c graphics_surface: Fix merge conflicts. 2018-03-26 21:17:03 -04:00
bunnei
ac19e3d061 gl_rasterizer: Use ReadBlock instead of GetPointer for SetupVertexArray. 2018-03-26 21:17:02 -04:00
bunnei
a6cab532f8 gl_rasterizer: Normalize vertex array data as appropriate. 2018-03-26 21:17:02 -04:00
bunnei
527ce12ce4 maxwell_to_gl: Fix string formatting in log statements. 2018-03-26 21:17:01 -04:00
bunnei
d89bfec5f5 rasterizer: Rename DrawTriangles to DrawArrays. 2018-03-26 21:17:00 -04:00
bunnei
1bfc0dc2db gl_rasterizer: Use passthrough shader for SetupVertexShader. 2018-03-26 21:17:00 -04:00
bunnei
0a5832798a renderer_opengl: Logging, etc. cleanup. 2018-03-26 21:16:59 -04:00
bunnei
7504df52fc renderer_opengl: Remove framebuffer RasterizerFlushVirtualRegion hack. 2018-03-26 21:16:58 -04:00
bunnei
c1ccbf332f gl_rasterizer_cache: Implement UpdatePagesCachedCount. 2018-03-26 21:16:58 -04:00
bunnei
c2dbdefedf gl_rasterizer: Implement SetupVertexArray. 2018-03-26 21:16:56 -04:00
bunnei
cd8bb6ea9b gl_rasterizer_cache: Fix an ASSERT_MSG. 2018-03-26 21:16:56 -04:00
bunnei
4369af6b7e maxwell_to_gl: Add module and function for decoding VertexType. 2018-03-26 21:16:55 -04:00
bunnei
3754e0fdfd maxwell_3d: Use names that match envytools for VertexType. 2018-03-26 21:16:55 -04:00
bunnei
15925b8293 maxwell_3d: Add VertexAttribute struct and cleanup. 2018-03-26 21:16:54 -04:00
bunnei
0ee38e1363 gl_rasterizer: Use 32 texture units instead of 3. 2018-03-26 21:16:53 -04:00
bunnei
0162a2d5cb gl_rasterizer: Implement DrawTriangles. 2018-03-26 21:16:53 -04:00
bunnei
33c0bf9dc5 Maxwell3D: Call AccelerateDrawBatch on DrawArrays. 2018-03-26 21:16:52 -04:00
bunnei
ed2134784e gl_rasterizer: Implement AnalyzeVertexArray. 2018-03-26 21:16:52 -04:00
bunnei
8041d72a1f gl_rasterizer_cache: MortonCopy Switch-style. 2018-03-26 21:16:51 -04:00
bunnei
170ac3f9ee gl_rasterizer_cache: Implement GetFramebufferSurfaces. 2018-03-26 21:16:51 -04:00
bunnei
94c70693f9 maxwell: Add RenderTargetFormat enum. 2018-03-26 21:16:49 -04:00
bunnei
1a9df83535 renderer_opengl: Only draw the screen if a framebuffer is specified. 2018-03-26 21:16:49 -04:00
Subv
4697025b73 GPU: Load the sampler info (TSC) when retrieving active textures. 2018-03-26 15:46:49 -05:00
Subv
56e2013c1f GPU: Added the TSC structure. It contains information about the sampler. 2018-03-26 15:45:05 -05:00
Subv
6afe9e0105 GPU: Added more fields to the TIC structure. 2018-03-26 15:44:20 -05:00
Subv
0ce52b1da2 GPU: Make the debug_context variable a member of the frontend instead of a global. 2018-03-24 23:35:06 -05:00
Subv
2c785bd06c GPU: Added a function to retrieve the active textures for a shader stage.
TODO: A shader may not use all of these textures at the same time, shader analysis should be performed to determine which textures are actually sampled.
2018-03-24 11:31:53 -05:00
Subv
39e60cfeb1 Frontend: Updated the surface view debug widget to work with Maxwell surfaces. 2018-03-24 11:31:53 -05:00
Subv
1c31e2b3d2 GPU: Implement the Incoming/FinishedPrimitiveBatch debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv
1ad97c75a0 GPU: Implement the MaxwellCommandLoaded/Processed debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv
77fd0d47e7 Frontend: Ported the GPU breakpoints and surface viewer widgets from citra. 2018-03-24 11:31:49 -05:00
Subv
1b8d798835 GPU: Added a method to unswizzle a texture without decoding it.
Allow unswizzling of DXT1 textures.
2018-03-24 11:30:56 -05:00
Subv
71ebc3e90d GPU: Preliminary work for texture decoding. 2018-03-24 11:30:56 -05:00
Subv
9b9de30086 GPU: Added viewport registers to Maxwell3D's reg structure. 2018-03-24 01:22:19 -05:00
bunnei
d561e4acc8 gl_rasterizer: Fake render in green, because it's cooler. 2018-03-23 22:27:53 -04:00
bunnei
4ed54738fc gl_rasterizer: Log warning instead of sync'ing unimplemented funcs. 2018-03-23 22:24:16 -04:00
bunnei
b7da9d5a54 gl_rasterizer_cache: Add missing include for vm_manager. 2018-03-23 16:54:20 -04:00
bunnei
0f8401906b renderer_opengl: Only invalidate the framebuffer region, not flush. 2018-03-23 15:52:14 -04:00
bunnei
054393917e renderer_opengl: Fixes for properly flushing & rendering the framebuffer. 2018-03-23 15:49:04 -04:00
bunnei
b36b627d4d RasterizerCacheOpenGL: FlushAll should flush full memory region. 2018-03-23 15:25:16 -04:00
bunnei
11047d7fd5 rasterizer: Flush and invalidate regions should be 64-bit. 2018-03-23 15:01:45 -04:00
bunnei
cdf541fb5b renderer_opengl: Add framebuffer_transform_flags member variable. 2018-03-23 14:59:14 -04:00
bunnei
ec4e1a3685 renderer_opengl: Better handling of framebuffer transform flags. 2018-03-23 14:58:27 -04:00
bunnei
c2c55e0811 renderer_opengl: Use accelerated framebuffer load with LoadFBToScreenInfo. 2018-03-22 23:28:37 -04:00
bunnei
a0b1235f82 gl_rasterizer: Implement AccelerateDisplay method from Citra. 2018-03-22 23:06:54 -04:00
bunnei
f61b9f7338 LoadGLBuffer: Use bytes_per_pixel, not bits. 2018-03-22 23:01:57 -04:00
bunnei
6ced80bb47 gl_rasterizer_cache: LoadGLBuffer should do a morton copy. 2018-03-22 22:54:04 -04:00
bunnei
740310113b video_core: Move MortonCopyPixels128 to utils header. 2018-03-22 22:52:40 -04:00
bunnei
8a250de987 video_core: Remove usage of PAddr and replace with VAddr. 2018-03-22 21:13:46 -04:00
bunnei
bfe45774f1 video_core: Move FramebufferInfo to FramebufferConfig in GPU. 2018-03-22 21:04:30 -04:00
bunnei
c6362543d4 gl_rasterizer: Replace a bunch of UNIMPLEMENTED with ASSERT. 2018-03-22 20:19:34 -04:00
bunnei
f707c2dac4 gl_rasterizer: Add a simple passthrough shader in lieu of shader generation. 2018-03-22 20:00:41 -04:00
bunnei
7c3a263839 gpu: Expose Maxwell3D engine. 2018-03-22 19:48:20 -04:00
bunnei
3a6604e8fa maxwell_3d: Add some format decodings and string helper functions. 2018-03-22 19:47:28 -04:00
bunnei
656de23d93 renderer: Create rasterizer and cleanup. 2018-03-22 19:46:37 -04:00
Subv
c450d264eb GPU: Added vertex attribute format registers. 2018-03-21 09:26:47 -05:00
Subv
ae28a52277 GPU: Added registers for the number of vertices to render. 2018-03-20 23:28:06 -05:00
bunnei
0b3ab30762 Merge pull request #254 from bunnei/port-citra-renderer
Port Citra OpenGL rasterizer code
2018-03-20 21:37:43 -04:00
bunnei
6e3222363c renderer_gl: Port boilerplate rasterizer code over from Citra. 2018-03-20 00:07:32 -04:00
bunnei
9c468e0c55 gl_shader_util: Sync latest version with Citra. 2018-03-20 00:07:31 -04:00
bunnei
d7b1ebe4a8 renderer_gl: Port over gl_shader_gen module from Citra. 2018-03-20 00:07:30 -04:00
Mat M
f4700ccabf Merge pull request #253 from Subv/rt_depth
GPU: Added registers for color and Z buffers.
2018-03-19 23:37:47 -04:00
bunnei
4bdb46e4c2 renderer_gl: Port over gl_shader_decompiler module from Citra. 2018-03-19 23:14:03 -04:00
bunnei
a3e10b1a72 renderer_gl: Port over gl_rasterizer_cache module from Citra. 2018-03-19 23:14:03 -04:00
bunnei
db0cfb8e8b gl_resource_manager: Sync latest version with Citra. 2018-03-19 23:14:02 -04:00
bunnei
0e4b9cdde4 renderer_gl: Port over gl_stream_buffer module from Citra. 2018-03-19 23:14:02 -04:00
bunnei
6a0902e56d gl_state: Sync latest version with Citra. 2018-03-19 23:13:49 -04:00
Subv
7a27a11770 GPU: Added Z buffer registers to Maxwell3D's reg structure. 2018-03-19 16:55:33 -05:00
Subv
21d9519032 GPU: Added the render target (RT) registers to Maxwell3D's reg structure. 2018-03-19 16:46:29 -05:00
N00byKing
1d8b6ad13b Clang Fixes 2018-03-19 17:53:35 +01:00
N00byKing
ef875d6a35 Clean Warnings (?) 2018-03-19 17:07:08 +01:00
Subv
dcae0c9a4f GPU: Added the TSC registers to the Maxwell3D register structure. 2018-03-19 00:36:25 -05:00
Subv
cff7b29bba GPU: Added the TIC registers to the Maxwell3D register structure. 2018-03-19 00:32:57 -05:00
Subv
03156d0c9a GPU: Implement macro 0xE1A BindTextureInfoBuffer in HLE.
This macro simply sets the current CB_ADDRESS to the texture buffer address for the input shader stage.
2018-03-18 19:03:40 -05:00
Subv
7b6868e908 GPU: Implement the BindStorageBuffer macro method in HLE.
This macro binds the SSBO Info Buffer as the current ConstBuffer.
This buffer is usually bound to c0 during shader execution.
Games seem to use this macro instead of directly writing the address for some reason.
2018-03-18 16:50:42 -05:00
Subv
85d820b1b4 GPU: Handle writes to the CB_DATA method.
Writing to this method will cause the written value to be stored in the currently-set ConstBuffer plus CB_POS.

This method is usually used to upload uniforms or other shader-visible data.
2018-03-18 15:23:24 -05:00
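A minimal sketch of the CB_DATA behavior described in the commit above, under stated assumptions: `ConstBufferState` and `WriteCbData` are hypothetical names, GPU memory is modeled as a flat byte vector, and advancing `cb_pos` after each write is an assumption that the commit message does not state.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for guest GPU memory plus the const buffer registers.
struct ConstBufferState {
    std::vector<std::uint8_t> memory; // flat stand-in for GPU memory
    std::uint64_t cb_address = 0;     // currently-set ConstBuffer address
    std::uint32_t cb_pos = 0;         // byte offset into that buffer (CB_POS)

    // Stores `value` at cb_address + cb_pos.
    void WriteCbData(std::uint32_t value) {
        const std::uint64_t dest = cb_address + cb_pos;
        if (dest + sizeof(value) > memory.size())
            return; // out-of-range writes are ignored in this sketch
        std::memcpy(memory.data() + dest, &value, sizeof(value));
        // Assumption: the position advances so that consecutive writes
        // fill consecutive words.
        cb_pos += static_cast<std::uint32_t>(sizeof(value));
    }
};
```

Under that assumption, a sequence of CB_DATA writes uploads a run of uniform words without rewriting CB_POS in between.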
Subv
a64b936cbe GPU: Move the GPU's class constructor and destructors to a cpp file.
This should reduce recompile times when editing the Maxwell3D register structure.
2018-03-18 15:23:24 -05:00
Subv
aa586fa268 GPU: Store uploaded GPU macros and keep track of the number of method parameters. 2018-03-18 11:51:46 -05:00
Subv
7ac8657432 GPU: Macros are specific to the Maxwell3D engine, so handle them internally. 2018-03-18 11:51:45 -05:00
Subv
ccb8da1512 GPU: Renamed ShaderType to ShaderStage as that is less confusing. 2018-03-17 18:32:57 -05:00
Subv
88698c156f GPU: Store shader constbuffer bindings in the GPU state. 2018-03-17 18:32:57 -05:00
Subv
66dae22790 GPU: Corrected some register offsets and removed superfluous macro registers. 2018-03-17 18:32:56 -05:00
Subv
1d9d9c16e8 GPU: Make the SetShader macro call do the same as the real macro's code.
It'll now set the CB_SIZE, CB_ADDRESS and CB_BIND registers when it's called.

Presumably this SetShader function is binding the constant shader uniforms to buffer 1 (c1[]).
2018-03-17 18:32:55 -05:00
Subv
579000e747 GPU: Corrected the parameter documentation for the SetShader macro call.
Register 0xE24 is actually a macro that sets some shader parameters in the register structure.

Macros are uploaded to the GPU at startup and have their own ISA; we'll probably write an interpreter for this in the future.
2018-03-17 13:55:42 -05:00
bunnei
516ef4f19f Merge pull request #242 from Subv/set_shader
GPU: Handle the SetShader method call (0xE24) and store the shader config.
2018-03-17 00:34:17 -04:00
Subv
f93d769a1c GPU: Handle the SetShader method call (0xE24) and store the shader config. 2018-03-16 22:51:06 -05:00
Subv
d2888f7e90 GPU: Added the vertex array registers. 2018-03-16 22:47:45 -05:00
bunnei
cd4e8a989c Merge pull request #241 from Subv/gpu_method_call
GPU: Process command mode 5 (IncreaseOnce) differently from other commands
2018-03-16 22:28:22 -04:00
Subv
29feece4b8 GPU: Process command mode 5 (IncreaseOnce) differently from other commands.
Accumulate all arguments before calling the desired method.

Note: Maybe we should do the same for the NonIncreasing mode?
2018-03-16 20:32:44 -05:00
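The "accumulate all arguments before calling the desired method" idea from the commit above can be sketched like this. The class and its members are hypothetical, and the sketch assumes the argument count is known from the command header. It is not the repository's command processor.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Callback invoked once per complete call, with all accumulated arguments.
using MethodHandler =
    std::function<void(std::uint32_t method, const std::vector<std::uint32_t>& params)>;

// Hypothetical accumulator for one IncreaseOnce-style command.
class IncreaseOnceCall {
public:
    IncreaseOnceCall(std::uint32_t method_id, std::size_t expected_args,
                     MethodHandler on_complete)
        : method(method_id), arg_count(expected_args), handler(std::move(on_complete)) {}

    // Feeds one argument word. The handler runs only when the last expected
    // argument has arrived, never once per word.
    void PushArgument(std::uint32_t word) {
        params.push_back(word);
        if (params.size() == arg_count) {
            handler(method, params);
            params.clear();
        }
    }

private:
    std::uint32_t method;
    std::size_t arg_count;
    MethodHandler handler;
    std::vector<std::uint32_t> params;
};
```

Deferring the call this way lets the handler see the whole parameter list at once, which is what a macro-style method with several parameters needs.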
Subv
bf310a41b8 GPU: Assert that we get a 0 CODE_ADDRESS register in the 3D engine.
Shader address calculation depends on this value to some extent; we do not currently know what it being 0 entails.
2018-03-16 19:24:41 -05:00
Subv
cbec739e7b GPU: Added Maxwell registers for Shader Program control. 2018-03-16 19:23:11 -05:00
Subv
5fb4c718cc GPU: Intercept writes to the VERTEX_END_GL register.
This is the register that gets written after a game calls DrawArrays().

We should collect all GPU state and draw using our graphics API here.
2018-03-04 19:14:04 -05:00
Lioncash
490d0e36a0 maxwell_3d: Make constructor explicit 2018-02-13 23:47:51 -05:00
bunnei
af8ae770ef Merge pull request #187 from Subv/maxwell3d_query
GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
2018-02-13 23:25:07 -05:00
bunnei
be5ba4d952 Merge pull request #178 from Subv/command_buffers
GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines
2018-02-12 13:51:52 -05:00
Subv
ac61a7d1e6 GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
Only QueryMode::Write is supported at the moment.
2018-02-12 12:34:41 -05:00
Subv
6cddf9d88e Make a GPU class in VideoCore to contain the GPU state.
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.
2018-02-11 23:44:12 -05:00
Subv
e01a8f2187 GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines. 2018-02-11 22:42:48 -05:00
bunnei
deadcb39c2 renderer_opengl: Support framebuffer flip vertical. 2018-02-11 21:03:55 -05:00
MerryMage
738f91a57d memory: Replace all memory hooking with Special regions 2018-01-27 15:16:39 +00:00
James Rowe
096be16636 Format: Run the new clang format on everything 2018-01-20 16:45:11 -07:00
Lioncash
e710a1b989 CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables, and then
construct the target in most cases.
2018-01-17 21:51:43 -05:00
MerryMage
e35644c005 clang-format 2018-01-16 18:05:21 +00:00
bunnei
92801b1c34 renderer_gl: Clear screen to black before rendering framebuffer. 2018-01-15 00:20:19 -05:00
bunnei
ebd613c2cc renderer: Render previous frame when no new one is available. 2018-01-14 23:54:56 -05:00
MerryMage
e86bdb1601 Fix build on macOS and linux 2018-01-13 22:38:52 +00:00
James Rowe
389979018c Remove gpu debugger and get yuzu qt to compile 2018-01-12 19:11:04 -07:00
James Rowe
1d28b2e142 Remove references to PICA and rasterizers in video_core 2018-01-12 19:11:03 -07:00
bunnei
11adef4843 renderer_opengl: Fix LOG_TRACE in LoadFBToScreenInfo. 2018-01-11 22:32:44 -05:00
bunnei
ee4691297f renderer_opengl: Support rendering Switch framebuffer. 2018-01-10 23:28:59 -05:00
bunnei
236d463c52 render_base: Add a struct describing framebuffer metadata. 2018-01-10 23:28:56 -05:00
bunnei
866e66dc31 renderer_opengl: Add MortonCopyPixels function for Switch framebuffer. 2018-01-10 23:28:53 -05:00
bunnei
9e2ad45c98 renderer_opengl: Update DrawScreens for Switch. 2018-01-10 23:28:49 -05:00
bunnei
93480b10ef core/video_core: Fix a bunch of u64 -> u32 warnings. 2018-01-01 15:40:35 -05:00
bunnei
960a1416de hle: Initial implementation of NX service framework and IPC. 2017-10-14 22:18:42 -04:00
Huw Pascoe
b3b34a1e76 Extracted the attribute setup and draw commands into their own functions 2017-10-04 01:08:29 +01:00
Huw Pascoe
a13ab958cb Fixed type conversion ambiguity 2017-09-30 09:34:35 +01:00
Subv
a321bce378 Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-27 09:06:41 -05:00
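One common way to disable unary minus for unsigned element types, as the commit above describes, is to constrain the operator with `std::enable_if`. The sketch below is an assumption about the general technique, using a simplified `Vec2` and a hypothetical `HasUnaryMinus` detection trait. The actual change to `Math::Vec2/Vec3/Vec4` may differ.

```cpp
#include <type_traits>
#include <utility>

// Simplified two-component vector, standing in for Math::Vec2.
template <typename T>
struct Vec2 {
    T x{};
    T y{};

    // Participates in overload resolution only when the element type is
    // signed (std::is_signed is also true for floating-point types).
    template <typename U = T, typename = std::enable_if_t<std::is_signed<U>::value>>
    Vec2<U> operator-() const {
        return {static_cast<U>(-x), static_cast<U>(-y)};
    }
};

// Hypothetical detection trait: true when `-v` compiles for a value of type V.
template <typename V, typename = void>
struct HasUnaryMinus : std::false_type {};

template <typename V>
struct HasUnaryMinus<V, decltype(void(-std::declval<V>()))> : std::true_type {};
```

With this constraint, `-Vec2<int>{1, -2}` compiles, while negating a `Vec2<unsigned>` is rejected at compile time, so the caller has to cast to a signed type first.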
B3n30
dc6a365337 Merge pull request #2951 from huwpascoe/perf-4
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe
903906da3b Optimized Float<M,E> multiplication
Before:

ucomiss xmm1, xmm1
jp      .L9
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm0, xmm2
setp    al
cmovne  eax, edx
test    al, al
jne     .L9
.L3:
movaps  xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp      .L10
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm1, xmm2
setp    al
cmovne  eax, edx
test    al, al
je      .L3

After:

movaps  xmm2, xmm1
mulss   xmm2, xmm0
ucomiss xmm2, xmm2
jnp     .L3
ucomiss xmm1, xmm0
jnp     .L11
.L3:
movaps  xmm0, xmm2
ret
.L11:
pxor    xmm2, xmm2
jmp     .L3
2017-09-25 00:54:02 +01:00
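Read back into C++, the "After" listing above appears to multiply first and only inspect the operands when the product is NaN. The sketch below is a reconstruction from that assembly, not the committed source: `MultiplyNanToZero` is a hypothetical free function on plain `float`, whereas the commit changes `Float<M,E>` multiplication.

```cpp
#include <cmath>

// Hypothetical free-function version of the multiplication in the "After"
// listing, reconstructed from the assembly.
inline float MultiplyNanToZero(float lhs, float rhs) {
    float result = lhs * rhs; // mulss
    // Fast path: a non-NaN product is returned untouched (first ucomiss/jnp).
    // If the product is NaN but neither input is NaN (second ucomiss/jnp),
    // the NaN came from 0 * inf, and zero is returned instead.
    if (std::isnan(result) && !std::isnan(lhs) && !std::isnan(rhs))
        result = 0.0f;
    return result;
}
```

The common case then costs one multiply and one comparison, which is consistent with the shorter listing. A NaN input still propagates as NaN.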
Huw Pascoe
876aa82c29 Optimized Morton 2017-09-24 22:27:14 +01:00
James Rowe
93930a966f Merge pull request #2921 from jroweboy/batch-fix-2
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe
19d41dcc6e Remove pipeline.gpu_mode and fix minor issues 2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner
a7758b0b36 Merge pull request #2928 from huwpascoe/master
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe
a234e4c200 Improved performance of FromAttributeBuffer
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.

I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe
6a110ac5f5 Fixed framebuffer warning 2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner
699c920991 Merge pull request #2900 from wwylele/clip-2
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe
ad0b57f407 GPU: Add draw for immediate and batch modes
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei
11baa40d75 Merge pull request #2865 from wwylele/gs++
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei
ff4941fb3a Merge pull request #2914 from wwylele/fresnel-fix
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele
12fbc8c8df pica/lighting: only apply Fresnel factor for the last light 2017-09-03 08:22:03 +03:00
wwylele
e2c41a5891 video_core: report telemetry for gas mode 2017-08-31 12:54:17 +03:00
bunnei
f0e461bf6f Merge pull request #2891 from wwylele/sw-bump
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang
647f017c6d Merge pull request #2892 from Subv/warnings2
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv
da88f3b8f0 Warnings: Fixed a few missing-return warnings in video_core. 2017-08-26 11:58:22 -05:00
wwylele
417cb45e3f SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL 2017-08-25 07:26:45 +03:00
wwylele
addbcd5784 gl_rasterizer: implement custom clip plane 2017-08-25 07:26:45 +03:00
wwylele
ea51a3af26 SwRasterizer: implement custom clip plane 2017-08-24 15:34:27 +03:00
wwylele
17c6104d2a gl_rasterizer/lighting: more accurate CP formula 2017-08-22 09:34:44 +03:00
wwylele
b5aa570354 SwRasterizer/Lighting: implement LUT input CP 2017-08-22 09:34:44 +03:00
wwylele
3e478ca131 SwRasterizer/Lighting: implement bump mapping 2017-08-22 09:34:44 +03:00
wwylele
63b6e802cd swrasterizer: remove invalid TODO
This function is called in clipping, before the perspective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele
72b26ac32f swrasterizer/clipper: remove tested TODO
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele
5a4af616c6 gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader 2017-08-21 08:03:07 +03:00
wwylele
1eca380886 gl_rasterizer: add clipping plane z<=0 defined in PICA 2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner
46d1ca768d Merge pull request #2872 from wwylele/sw-geo-factor
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe
8afa81ac1b Merge pull request #2871 from wwylele/sw-spotlight
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele
0f35755572 pica/command_processor: build geometry pipeline and run geometry shader
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has four known modes:
 - no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
 - GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
 - GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive to determine the primitive size.
 - GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class to support both drawing modes.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in the `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfiguration is triggered on the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is slightly modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both the geometry pipeline and the GS unit rely on state preservation across draw calls, so they are put into the global state. In the future, the other three vertex shader units should also be placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele
8285ca4ad8 pica/shader/jit: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
36981a5aa6 pica/primitive_assembly: Handle winding for GS primitive
hwtest shows that, although GS always emits a group of three vertices as one primitive, it still respects the topology type, as if the three vertices were input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for the Shader topology type, which is believed to be the actual difference between List and Shader (hence the removed TODO). However, only the Shader topology type is observed in official games when GS is in use, so the other modes seem to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele
bb63ae3052 correct constness 2017-08-19 10:13:20 +03:00
wwylele
28128348f2 pica/shader/interpreter: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
46c6973d2b pica/shader: extend UnitState for GS
Among the four shader units in pica, a special unit can be configured to run both VS and GS programs. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it a standard-layout type for the JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits.
2017-08-19 10:13:20 +03:00
wwylele
686fb3e78c gl_shader_gen: don't call SampleTexture when bump map is not used 2017-08-11 18:35:00 +03:00
wwylele
945f9a1b04 SwRasterizer/Lighting: implement spot light 2017-08-11 01:19:10 +03:00
wwylele
14ee32c46a SwRasterizer/Lighting: implement geometric factor 2017-08-11 01:18:43 +03:00
wwylele
5d9d42f0d0 SwRasterizer/Lighting: use make_tuple instead of constructor
The implicit tuple constructor is a C++17 feature, which is not supported by some not-so-old libraries. Play it safe for now.
2017-08-10 12:19:58 +03:00
wwylele
db309b2423 pica/regs: layout geometry shader configuration regs
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang
792dee47a7 Merge pull request #2822 from wwylele/sw_lighting-2
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele
baa24f4ea9 pica: upload shared shader code to both units 2017-08-07 10:30:05 +03:00
wwylele
2252a63f80 SwRasterizer/Lighting: shorten file name 2017-08-03 13:51:22 +03:00
wwylele
eda28266fb SwRasterizer/Lighting: move to its own file 2017-08-02 22:20:40 +03:00
wwylele
48b4105871 SwRasterizer/Lighting: reduce confusion 2017-08-02 22:07:15 +03:00
wwylele
c59ed47608 SwRasterizer/Lighting: move quaternion normalization to the caller 2017-08-02 22:05:53 +03:00
wwylele
c89f804a01 pica/shader_interpreter: fix off-by-one in LOOP 2017-07-27 13:48:27 +03:00
Sebastian Valle
c6a2e519ef Merge pull request #2816 from wwylele/proctex-lutlutlut
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00
Sebastian Valle
e646bd902d Merge pull request #2834 from wwylele/depth-enable-fix
gl_rasterizer_cache: fix using_depth_fb
2017-07-22 23:02:59 -05:00
bunnei
df8b9863f9 telemetry: Log performance, configuration, and system data. 2017-07-17 21:32:28 -04:00
wwylele
4feff63ffa SwRasterizer/Lighting: dist atten LUT input needs to be clamped 2017-07-11 22:19:00 +03:00
wwylele
56e5425e59 SwRasterizer/Lighting: unify float suffix 2017-07-11 22:15:35 +03:00
wwylele
e415558a4f SwRasterizer/Lighting: get rid of nested return 2017-07-11 22:15:35 +03:00
wwylele
c6d1472513 SwRasterizer/Lighting: refactor GetLutValue into a function.
Merges similar patterns. Also makes the code more similar to the GL one.
2017-07-11 22:15:35 +03:00
wwylele
f13cf506e0 SwRasterizer: only interpolate quat and view when lighting is enabled 2017-07-11 21:35:57 +03:00
wwylele
efc655aec0 SwRasterizer/Lighting: pass lighting state as parameter 2017-07-11 20:06:26 +03:00
Subv
9906feefbd SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body. 2017-07-11 19:39:15 +03:00
Subv
7526af5e52 SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function. 2017-07-11 19:39:15 +03:00
Subv
b8229a7684 SwRasterizer/Lighting: Do not use global registers state in ComputeFragmentsColors. 2017-07-11 19:39:15 +03:00
Subv
7bc467e872 SwRasterizer/Lighting: Do not use global state in LookupLightingLut. 2017-07-11 19:39:15 +03:00
Subv
37ac2b6657 SwRasterizer/Lighting: Fixed a bug where the distance attenuation bias was being set to the dist atten scale. 2017-07-11 19:39:15 +03:00
Subv
6250f52e93 SwRasterizer: Fixed a few conversion warnings and moved per-light values into the per-light loop. 2017-07-11 19:39:15 +03:00
Subv
2d69a9b8bf SwRasterizer: Run clang-format 2017-07-11 19:39:15 +03:00
Subv
73566ff7a9 SwRasterizer: Flip the vertex quaternions before clipping (if necessary). 2017-07-11 19:39:15 +03:00
Subv
2a75837bc3 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
f2d4d5c219 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
80b6fc592e SwRasterizer: Fixed the lighting lut lookup function. 2017-07-11 19:39:15 +03:00
Subv
10b0bea060 SwRasterizer: Calculate fresnel for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
46b8c8e1da SwRasterizer: Calculate specular_1 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
be25e78b07 SwRasterizer: Calculate specular_0 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
b2f472a2b1 SwRasterizer: Implement primary fragment color. 2017-07-11 19:39:15 +03:00
wwylele
8482933db8 gl_rasterizer: use texture buffer for proctex LUT 2017-07-01 11:02:48 +03:00
wwylele
8978ecb09c gl_rasterizer: use texture buffer for fog LUT 2017-06-22 20:41:00 +03:00
wwylele
f1e377f57e gl_rasterizer: create the texture before applying the state
This is a rebasing error from #2792. It doesn't affect much, though, because a later Apply() call fixes/hides it.
2017-06-22 17:47:46 +03:00
wwylele
457659fe01 gl_state: reset 1d textures 2017-06-21 23:13:06 +03:00
wwylele
42f7ca7412 gl_rasterizer: fix glGetUniformLocation type 2017-06-21 23:13:06 +03:00
wwylele
be9e952bdc gl_rasterizer: manage texture ids in one place 2017-06-21 23:13:06 +03:00
wwylele
ab60414122 gl_rasterizer/lighting: fix LUT interpolation 2017-06-21 23:13:06 +03:00
Yuri Kunde Schlesner
d0888f8548 Merge pull request #2776 from wwylele/geo-factor
Fragment lighting: implement geometric factor
2017-06-18 14:18:48 -07:00
wwylele
5a454173a8 gl_rasterizer/lighting: use the formula from the paper for geometric factor 2017-06-18 10:29:02 +03:00
Yuri Kunde Schlesner
f6715f98f5 Stop using reserved operator names (and/or/xor) with Xbyak
Also has the Dynarmic upgrade with the same change
2017-06-17 12:20:22 -07:00
wwylele
7052d43a67 gl_rasterizer/lighting: implement geometric factor 2017-06-15 14:59:01 +03:00
Yuri Kunde Schlesner
da1bec121a Merge pull request #2762 from wwylele/light-cp-tangent
Fragment lighting: implement lut input 5 (CP) and tangent mapping
2017-06-14 20:08:26 -07:00