mirrors/yuzu

mirror of https://github.com/yuzu-emu/yuzu.git synced 2024-07-04 23:31:19 +01:00

Author	SHA1	Message	Date
ReinUsesLisp	fe931ac976	{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).	2020-04-28 02:18:12 -03:00
Fernando Sahmkow	b87422a86f	VideoCore/GPU: Delegate subchannel engines to the dma pusher.	2020-04-27 22:07:21 -04:00
Fernando Sahmkow	90e5694230	VideoCore/Engines: Refactor Engines CallMethod.	2020-04-27 21:47:58 -04:00
ReinUsesLisp	bb1ed66d99	maxwell_3d: Fix depth clamping register Using deko3d as reference: `4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)` We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.	2020-04-27 20:50:14 -03:00
Fernando Sahmkow	1517cba8ca	Merge pull request #3766 from ReinUsesLisp/renderpass-cache-key vk_renderpass_cache: Pack renderpass cache key and unify keys	2020-04-27 16:05:14 -04:00
Fernando Sahmkow	a65e9ad552	Merge pull request #3756 from ReinUsesLisp/integrated-devices vk_memory_manager: Remove unified memory model flag	2020-04-27 16:04:22 -04:00
bunnei	6c7d8073be	Merge pull request #3742 from FernandoS27/command-list Optimize GPU Command Lists and Introduce Fast GPU Time Option	2020-04-27 00:18:46 -04:00
ReinUsesLisp	8da16cf9fb	texture_cache: Reintroduce preserve_contents accurately This reverts commit `94b0e2e5da`. preserve_contents proved to be a meaningful optimization. This commit reintroduces it, this time properly implemented on OpenGL. We have to make sure the clear removes all the previous contents of the image. It's not currently implemented on Vulkan because we can do smarter things there, which are better introduced in a separate commit.	2020-04-26 19:53:02 -03:00
Rodrigo Locatti	7e38dd580f	Merge pull request #3753 from ReinUsesLisp/ac-vulkan {gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers	2020-04-26 01:55:43 -03:00
ReinUsesLisp	ddd82ef42b	shader/memory_util: Deduplicate code Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.	2020-04-26 01:38:51 -03:00
ReinUsesLisp	e895a4e2d7	shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented IADD.X Rd.CC requires some extra logic that is not currently implemented. Abort when this is hit.	2020-04-25 22:58:33 -03:00
ReinUsesLisp	2a96bea6a7	shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow Signed integer addition overflow might be undefined behavior. It's free to change operations to UAdd and use unsigned integers to avoid potential bugs.	2020-04-25 22:57:54 -03:00
ReinUsesLisp	c788f9c0bd	shader/arithmetic_integer: Implement IADD.X IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.	2020-04-25 22:56:11 -03:00
ReinUsesLisp	255197e643	shader/arithmetic_integer: Implement CC for IADD	2020-04-25 22:55:26 -03:00
ReinUsesLisp	ffc5ec6fa8	decode/register_set_predicate: Implement CC P2R CC takes the state of condition codes and puts them into a register. We already have this implemented for PR (predicates). This commit implements CC over that.	2020-04-25 22:54:42 -03:00
ReinUsesLisp	d523734266	decode/register_set_predicate: Use move for shared pointers Avoid atomic counters used by shared pointers.	2020-04-25 22:54:14 -03:00
bunnei	c5bf693882	Merge pull request #3721 from ReinUsesLisp/sort-devices vulkan/wrapper: Sort physical devices	2020-04-25 03:27:40 -04:00
bunnei	4e37825dab	Merge pull request #3734 from ReinUsesLisp/half-float-mods decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits	2020-04-25 00:41:43 -04:00
ReinUsesLisp	527a1574c3	vk_rasterizer: Pack texceptions and color formats on invalid formats Sometimes for unknown reasons NVN games can bind a render target format of 0. This may be a yuzu bug. With the commits before this the formats were specified without being "packed", assuming all formats and texceptions will be written like in the color_attachments vector. To address this issue, iterate all render targets and pack them as they are valid. This way they will match color_attachments. - Fixes validation errors and graphical issues on Breath of the Wild.	2020-04-24 22:21:29 -03:00
bunnei	7c8acb0025	Merge pull request #3749 from ReinUsesLisp/lea-imm shader/arithmetic_integer: Fix LEA_IMM encoding	2020-04-24 14:30:13 -04:00
Fernando Sahmkow	d8a961cd6c	Revert: shader_decode: Fix LD, LDG when track constant buffer.	2020-04-24 11:00:54 -04:00
Markus Wick	e717a1df20	Fix -Wdeprecated-copy warning.	2020-04-24 09:33:04 +02:00
Markus Wick	c499c22cf7	Fix -Werror=conversion error.	2020-04-24 09:33:04 +02:00
ReinUsesLisp	dbaebd8582	decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits The encoding for negation and absolute value was wrong. Extracting is now done manually. Similar instructions having different encodings is the rule, not the exception. To keep sanity and readability I preferred to extract the desired bit manually. This is implemented against nxas: `8dbc389957/table.h (L68)` That is itself tested against nvdisasm (Nvidia's official disassembler).	2020-04-23 18:29:38 -03:00
ReinUsesLisp	4fb921ff6b	shader/texture: Support multiple unknown sampler properties This allows deducing some properties from the texture instruction before asking the runtime. By doing this we can handle type mismatches in some instructions from the renderer instead of the shader decoder. Fixes texelFetch issues with games using 2D texture instructions on a 1D sampler.	2020-04-23 18:04:13 -03:00
ReinUsesLisp	72deb773fd	shader_ir: Turn classes into data structures	2020-04-23 18:00:06 -03:00
ReinUsesLisp	3e35101895	vk_rasterizer: Fix framebuffer creation validation errors Framebuffer creation was ignoring the number of color attachments.	2020-04-23 17:34:16 -03:00
ReinUsesLisp	8c37cd1af6	vk_pipeline_cache: Unify pipeline cache keys into a single operation This allows us to call Common::CityHash and std::memcmp only once for GraphicsPipelineCacheKey. While we are at it, do the same for compute.	2020-04-23 17:34:16 -03:00
ReinUsesLisp	f665c92114	vk_renderpass_cache: Pack renderpass cache key to 12 bytes	2020-04-23 17:34:16 -03:00
bunnei	ff0c49e1ce	kernel: memory: Improve implementation of device shared memory. (#3707) * kernel: memory: Improve implementation of device shared memory. * fixup! kernel: memory: Improve implementation of device shared memory. * fixup! kernel: memory: Improve implementation of device shared memory.	2020-04-23 11:37:12 -04:00
Fernando Sahmkow	5c9feaebb6	Clang Format.	2020-04-23 08:52:58 -04:00
Fernando Sahmkow	b8aef40c56	GPU: Add Fast GPU Time Option.	2020-04-23 08:52:57 -04:00
Fernando Sahmkow	18a88d19dc	Maxwell3D: Process Macros on MultiMethod.	2020-04-23 08:52:56 -04:00
Fernando Sahmkow	3fedcc2f6e	DMAPusher: Propagate multimethod writes into the engines.	2020-04-23 08:52:55 -04:00
bunnei	2409fedacf	Merge pull request #3697 from lioncash/declarations CMakeLists: Enable -Wmissing-declarations on Linux builds	2020-04-23 02:18:52 -04:00
bunnei	bf2ddb8fd5	Merge pull request #3677 from FernandoS27/better-sync Introduce Predictive Flushing and Improve ASYNC GPU	2020-04-22 22:09:38 -04:00
ReinUsesLisp	d9463f4562	vk_pipeline_cache: Fix unintentional memcpy into optional The intention behind this was to assign a float from a uint32_t, but it was unintentionally being copied directly into the std::optional. Copy to a temporary and assign that temporary to std::optional. This can be replaced with std::bit_cast<float> once we are in C++20.	2020-04-22 21:36:05 -03:00
Fernando Sahmkow	c043ac4f13	GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop.	2020-04-22 20:34:32 -04:00
Fernando Sahmkow	afae40a99e	Merge pull request #3653 from ReinUsesLisp/nsight-aftermath renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows	2020-04-22 11:39:01 -04:00
Fernando Sahmkow	4e37f1b113	Address Feedback.	2020-04-22 11:36:27 -04:00
Fernando Sahmkow	39e5b72948	Async GPU: Correct flushing behavior to be similar to old async GPU behavior.	2020-04-22 11:36:26 -04:00
Fernando Sahmkow	1b3be8a8f8	MaxwellDMA: Correct copying on accuracy level.	2020-04-22 11:36:25 -04:00
Fernando Sahmkow	644588fd88	ShaderCache/PipelineCache: Cache null shaders.	2020-04-22 11:36:25 -04:00
Fernando Sahmkow	f616dc0b59	Address Feedback.	2020-04-22 11:36:24 -04:00
Fernando Sahmkow	ec2f3e48e1	Fix GCC error.	2020-04-22 11:36:23 -04:00
Fernando Sahmkow	b3e5f177ba	QueryCache: Only do async flushes on async gpu.	2020-04-22 11:36:21 -04:00
Fernando Sahmkow	f4ab223ef0	Async GPU: Only do reactive flushing on Extreme Level.	2020-04-22 11:36:20 -04:00
ReinUsesLisp	b752faf2d3	vk_fence_manager: Initial implementation	2020-04-22 11:36:19 -04:00
Fernando Sahmkow	0649f05900	QueryCache: Implement Async Flushes.	2020-04-22 11:36:18 -04:00
Fernando Sahmkow	131b342130	OpenGL: Guarantee writes to Buffers.	2020-04-22 11:36:18 -04:00
Fernando Sahmkow	1fb516cd97	GPU: Implement Flush Requests for Async mode.	2020-04-22 11:36:17 -04:00
Fernando Sahmkow	b7bc3c2549	FenceManager: Manage syncpoints and rename fences to semaphores.	2020-04-22 11:36:16 -04:00
Fernando Sahmkow	96bb961a64	BufferCache: Refactor async managing.	2020-04-22 11:36:15 -04:00
Fernando Sahmkow	b10db7e4a5	FenceManager: Implement async buffer cache flushes on High settings	2020-04-22 11:36:15 -04:00
Fernando Sahmkow	4adfc9bb08	Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan.	2020-04-22 11:36:14 -04:00
Fernando Sahmkow	a081a7c855	GPU: Fix rebase errors.	2020-04-22 11:36:13 -04:00
Fernando Sahmkow	e84eb64e51	Rasterizer: Disable fence managing in synchronous gpu.	2020-04-22 11:36:12 -04:00
Fernando Sahmkow	165ae823f5	ThreadManager: Sync async reads on accurate gpu.	2020-04-22 11:36:12 -04:00
Fernando Sahmkow	57fdbd9b89	FenceManager: Implement should wait.	2020-04-22 11:36:11 -04:00
Fernando Sahmkow	1f345ebe3a	GPU: Implement a Fence Manager.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	487379c593	OpenGL: Implement Fencing backend.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	ed7e965712	TextureCache: Flush linear textures after finishing rendering.	2020-04-22 11:36:09 -04:00
Fernando Sahmkow	339d0d9d6c	GPU: Delay Fences.	2020-04-22 11:36:08 -04:00
Fernando Sahmkow	8b1eb44b3e	BufferCache: Implement OnCPUWrite and SyncGuestHost	2020-04-22 11:36:07 -04:00
Fernando Sahmkow	da8f17715d	GPU: Refactor synchronization on Async GPU	2020-04-22 11:36:06 -04:00
Fernando Sahmkow	a60a22d9c2	Texture Cache: Implement OnCPUWrite and SyncGuestHost	2020-04-22 11:36:05 -04:00
Fernando Sahmkow	084ceb925a	UI: Replace accurate GPU option with GPU Accuracy Level	2020-04-22 11:36:04 -04:00
ReinUsesLisp	6f47bd9641	vk_memory_manager: Remove unified memory model flag All drivers (even Intel) seem to have a device local memory type that is not host visible. Remove this flag so all devices follow the same path. This fixes a crash when trying to map to host device local memory on integrated devices.	2020-04-21 22:06:38 -03:00
bunnei	d64290884a	Merge pull request #3714 from lioncash/copies gl_shader_decompiler: Avoid copies where applicable	2020-04-21 20:16:02 -04:00
ReinUsesLisp	488ed8bd02	vk_rasterizer: Add lazy default buffer maker and use it for empty buffers Introduce a default buffer getter that lazily constructs an empty buffer. This is intended to match OpenGL's buffer 0. Use this for disabled vertex and uniform buffers. While we are at it, include vertex buffer usages for staging buffers to silence validation errors.	2020-04-21 19:55:52 -03:00
ReinUsesLisp	0bbae63300	gl_rasterizer: Fix buffers without size On NVN buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a size. `1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)`	2020-04-21 19:55:44 -03:00
Rodrigo Locatti	f293b15611	Merge pull request #3718 from ReinUsesLisp/better-pipeline-state fixed_pipeline_state: Pack structure, use memcmp and CityHash on it	2020-04-21 18:17:58 -03:00
bunnei	9bf3abcb63	Merge pull request #3698 from lioncash/warning General: Resolve minor assorted warnings	2020-04-21 14:11:18 -04:00
bunnei	d3e0cefa60	Merge pull request #3695 from ReinUsesLisp/default-attributes maxwell_3d: Initialize format attributes constant as one	2020-04-20 21:40:18 -04:00
ReinUsesLisp	8734ccb0cb	shader/arithmetic_integer: Fix LEA_IMM encoding The operand order in LEA_IMM was flipped compared to nvdisasm. Fix that using nxas as reference: `8dbc389957/table.h (L122)`	2020-04-20 21:54:59 -03:00
Mat M	cb5b8ca886	Merge pull request #3733 from ambasta/patch-2 Initialize quad_indexed_pass before uint8_pass	2020-04-20 20:36:46 -04:00
Fernando Sahmkow	ec2f8f4272	Merge pull request #3700 from ReinUsesLisp/stream-buffer-sizes vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers	2020-04-20 09:37:42 -04:00
Amit Prakash Ambasta	5324b1d01e	Initialize quad_indexed_pass before uint8_pass Fixes Werror=reorder in gcc	2020-04-20 04:53:52 +05:30
Rodrigo Locatti	4932010c6f	Merge pull request #3729 from lioncash/globals dma_pusher: Remove reliance on the global system instance	2020-04-19 19:12:40 -03:00
bunnei	85c17a2c35	Merge pull request #3694 from ReinUsesLisp/indexed-quads vk_compute_pass: Implement indexed quads	2020-04-19 16:52:40 -04:00
Lioncash	44e959157b	dma_pusher: Remove reliance on the global system instance With this, the video core now has no calls to the global system instance at all.	2020-04-19 16:12:08 -04:00
bunnei	2ea7a70da0	Merge pull request #3686 from lioncash/table texture_cache/format_lookup_table: Fix incorrect green, blue, and alpha indices	2020-04-19 15:33:33 -04:00
bunnei	73db83c0ab	Merge pull request #3679 from lioncash/track track: Eliminate redundant copies	2020-04-19 01:22:47 -04:00
Jan Beich	afcc84a172	renderer_vulkan: assume X11 if not Windows/macOS after `bf1d66b7c0` Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateInstance:131: Presentation not supported on this platform Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateSurface:378: Presentation not supported on this platform Core <Critical> core/core.cpp:Load:199: Failed to initialize system (Error 5)!	2020-04-19 00:32:23 +00:00
ReinUsesLisp	c81bf06d03	vulkan/wrapper: Sort physical devices Sort discrete GPUs over the rest, Nvidia over AMD, AMD over Intel, Intel over the rest. This gives us a somewhat consistent order when Optimus is removed (renderdoc does this when it's attached). This can break the configuration of users with an Intel GPU who manually remove Optimus on yuzu. That said, it's very unlikely to happen.	2020-04-18 21:31:15 -03:00
ReinUsesLisp	d62f57cf5a	fixed_pipeline_state: Hash and compare the whole structure Pad FixedPipelineState's size to 384 bytes to be a multiple of 16. Compare the whole struct with std::memcmp and hash with CityHash. Using CityHash instead of a naive hash should reduce the number of collisions. Improve used type traits to ensure this operation is safe. With these changes the improvements to the hashable pipeline state are: Optimized structure: Hash: 89 ns; Comparison: 103 ns; Construction: 164 ns; Struct size: 384 bytes. Original structure: Hash: 148 ns; Comparison: 174 ns; Construction: 281 ns; Struct size: 1384 bytes. (Attribute state initialization is not measured.) These measurements are averages taken with std::chrono::high_resolution_clock on the MSVC shipped with Visual Studio 16.6.0 Preview 2.1.	2020-04-18 19:57:26 -03:00
ReinUsesLisp	b571c92dfd	fixed_pipeline_state: Pack blending state Reduce FixedPipelineState's size to 364 bytes.	2020-04-18 19:23:35 -03:00
ReinUsesLisp	548dd27f45	fixed_pipeline_state: Pack rasterizer state Reduce FixedPipelineState's size to 600 bytes.	2020-04-18 19:22:57 -03:00
ReinUsesLisp	7790144a55	fixed_pipeline_state: Pack depth stencil state Reduce FixedPipelineState's size to 632 bytes.	2020-04-18 19:22:11 -03:00
ReinUsesLisp	ab6704f20c	fixed_pipeline_state: Pack attribute state Reduce FixedPipelineState's size from 1384 to 664 bytes	2020-04-18 19:21:19 -03:00
Mat M	5305806071	Merge pull request #3716 from bunnei/fix-another-impl-fallthrough video_core: gl_shader_decompiler: Fix implicit fallthrough errors.	2020-04-18 15:17:52 -04:00
bunnei	03726fb7f5	video_core: gl_shader_decompiler: Fix implicit fallthrough errors.	2020-04-18 15:15:21 -04:00
Lioncash	bf328ed35a	gl_shader_decompiler: Avoid copies where applicable Avoids unnecessary reference count increments where applicable and also avoids reallocating a vector. Unlikely to make a huge difference, but given how trivial of an amendment it is, why not?	2020-04-17 20:48:52 -04:00
Markus Wick	07fbef1776	video_core: Fix implicit switch fallthrough. Since yesterday, this breaks the build on Linux. So let's fix it.	2020-04-17 23:43:35 +02:00
ReinUsesLisp	a7b6bd56d7	vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers Nvidia recently introduced a new memory type for data streaming (awesome!), but yuzu was assuming that all heaps had enough memory for the assumed stream buffer size (256 MiB). This worked fine on AMD but Nvidia's new memory heap was smaller than 256 MiB. This commit changes this assumption and allocates a bit less than the size of the preferred heap, with a maximum of 256 MiB (to avoid allocating all system memory on integrated devices). - Fixes a crash on NVIDIA 450.82.0.0	2020-04-17 18:12:48 -03:00
Rodrigo Locatti	990c0b184f	Revert "gl_shader_cache: Use CompileDepth::FullDecompile on GLSL"	2020-04-17 17:41:48 -03:00
bunnei	b8f5c71f2d	Merge pull request #3666 from bunnei/new-vmm Implement a new virtual memory manager	2020-04-17 16:33:08 -04:00
bunnei	ca3af2961c	Merge pull request #3682 from lioncash/uam gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator	2020-04-17 01:24:08 -04:00
bunnei	32fc2aae3c	video_core: memory_manager: Updates for Common::PageTable changes.	2020-04-17 00:59:34 -04:00
bunnei	4caff51710	core: memory: Move to Core::Memory namespace. - helpful to disambiguate Kernel::Memory namespace.	2020-04-17 00:59:28 -04:00
Lioncash	e2d8be1ca2	General: Resolve warnings related to missing declarations	2020-04-16 23:43:34 -04:00
Lioncash	678ac54749	decode/memory: Resolve unused variable warning Only the first element of the returned pair is ever used.	2020-04-16 22:45:44 -04:00
Lioncash	d159643fd7	decode/texture: Resolve unused variable warnings. Some variables aren't used, so we can remove these. Unfortunately, diagnostics are still reported on structured bindings even when annotated with [[maybe_unused]], so we need to unpack the elements that we want to use manually.	2020-04-16 22:45:41 -04:00
Lioncash	f522abd8ab	decode/texture: Collapse loop down into std::generate Same behavior, less code.	2020-04-16 22:29:07 -04:00
Lioncash	7e2d60de26	decode/texture: Eliminate trivial missing field initializer warnings We can just specify the initializers.	2020-04-16 22:27:21 -04:00
bunnei	79c1269f0f	Merge pull request #3673 from lioncash/extra CMakeLists: Specify -Wextra on linux builds	2020-04-16 21:12:33 -04:00
ReinUsesLisp	238c6016f9	maxwell_3d: Initialize format attributes constant as one nouveau expects this to be true but it doesn't set it.	2020-04-16 21:15:07 -03:00
ReinUsesLisp	c961770900	vk_compute_pass: Implement indexed quads Implement indexed quads (GL_QUADS used with glDrawElements*) with a compute pass conversion. The compute shader converts from uint8/uint16/uint32 indices to uint32. The format is passed through push constants to avoid having different variants of the same shader. - Used by Fast RMX - Used by Xenoblade Chronicles 2 (it still has graphical issues due to synchronization problems on Vulkan)	2020-04-16 21:12:32 -03:00
Fernando Sahmkow	c81f256111	Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache buffer_cache: Return handles instead of pointer to handles	2020-04-16 19:58:13 -04:00
ReinUsesLisp	090fd3fefa	buffer_cache: Return handles instead of pointer to handles The original idea of returning pointers is that handles can be moved. The problem is that the implementation didn't take that into account and made everything harder to work with. This commit drops pointers to handles and returns the handles themselves. While it is still true that handles can be invalidated, this way we get an old handle instead of a dangling pointer. This problem can be solved in the future with sparse buffers.	2020-04-16 02:33:34 -03:00
Rodrigo Locatti	a5a2ee8766	Merge pull request #3689 from lioncash/unused-var decode/shift: Remove unused variable within Shift()	2020-04-16 02:05:54 -03:00
Rodrigo Locatti	d196ce0f71	Merge pull request #3688 from lioncash/nequal surface_view: Add missing operator!= to ViewParams	2020-04-16 01:39:51 -03:00
Rodrigo Locatti	4209dba1f6	Merge pull request #3680 from lioncash/static gl_device: Mark stage_swizzle as constexpr	2020-04-16 01:26:23 -03:00
Rodrigo Locatti	60e8de7c95	Merge pull request #3687 from lioncash/constness surface_base: Make IsInside() a const member function	2020-04-16 01:22:50 -03:00
Rodrigo Locatti	612966399b	Merge pull request #3685 from lioncash/copies control_flow: Make use of std::move in TryInspectAddress()	2020-04-16 01:22:40 -03:00
Lioncash	cd2a12e78f	decode/shift: Remove unused variable within Shift() Removes a redundant variable that is already satisfied by the IsFull() utility function.	2020-04-16 00:16:06 -04:00
Lioncash	5fbe8785d2	surface_view: Add missing operator!= to ViewParams Provides logical symmetry to the interface.	2020-04-16 00:03:12 -04:00
Lioncash	d551c910bb	surface_base: Make IsInside() a const member function This doesn't modify internal state, so this can be made const.	2020-04-15 23:59:35 -04:00
bunnei	319df1db77	Merge pull request #3683 from lioncash/docs video_core: Amend doxygen comment references	2020-04-15 23:54:58 -04:00
Lioncash	636c8ab85b	texture_cache/format_lookup_table: Fix incorrect green, blue, and alpha indices Previously these were all using the red component to derive the indices, which is definitely not intentional.	2020-04-15 23:50:46 -04:00
Lioncash	72a224d3fc	control_flow: Make use of std::move in TryInspectAddress() Eliminates redundant atomic reference count increments and decrements.	2020-04-15 23:31:22 -04:00
Lioncash	11837e8f13	video_core: Amend doxygen comment references Fixes broken documentation references.	2020-04-15 22:33:29 -04:00
Lioncash	3a60f19eaf	gl_query_cache: Resolve use-after-move in CachedQuery move assignment operator Prevents potentially invalid junk data from being read.	2020-04-15 22:20:06 -04:00
Lioncash	71fb156611	gl_device: Mark stage_swizzle as constexpr Previously this was mutable even though it shouldn't be.	2020-04-15 21:59:13 -04:00
Lioncash	e15ec2705c	track: Eliminate redundant copies Two variables can be references, while two others can be std::moved. Makes for 4 fewer atomic reference count increments and decrements.	2020-04-15 21:50:09 -04:00
Lioncash	1c340c6efa	CMakeLists: Specify -Wextra on linux builds Allows reporting more cases where logic errors may exist, such as implicit fallthrough cases, etc. We currently ignore unused parameters, since we currently have many cases where this is intentional (virtual interfaces). While we're at it, we can also tidy up any existing code that causes warnings. This also uncovered a few bugs as well.	2020-04-15 21:33:46 -04:00
Rodrigo Locatti	65cbb122ea	Merge pull request #3649 from FernandoS27/3d-fix Texture Cache: Read current data when flushing a 3D segment.	2020-04-15 17:06:55 -03:00
Fernando Sahmkow	e33196d4e7	Merge pull request #3612 from ReinUsesLisp/red shader/memory: Implement RED.E.ADD and minor changes to ATOM	2020-04-15 15:03:49 -04:00
Lioncash	213fff67bc	CMakeLists: Make -Wreorder a compile-time error This can result in silent logic bugs within code, and given the number of times these kinds of warnings are caused, they should be flagged at compile-time so no new code is submitted with them.	2020-04-15 14:14:41 -04:00
Mat M	64b5985f0a	Merge pull request #3662 from ReinUsesLisp/constant-attrs gl_rasterizer: Implement constant vertex attributes	2020-04-15 11:54:50 -04:00
Fernando Sahmkow	6789d88a9c	Texture Cache: Read current data when flushing a 3D segment. This PR corrects flushing of 3D segments when data of other segments is mixed, this aims to preserve the data in place.	2020-04-15 11:46:17 -04:00
Mat M	9208d555b7	Merge pull request #3668 from ReinUsesLisp/vtx-format-16ui maxwell_to_vk: Add uint16 vertex formats	2020-04-15 11:43:52 -04:00
Mat M	ab72696beb	Merge pull request #3656 from ReinUsesLisp/glsl-full-decompile gl_shader_cache: Use CompileDepth::FullDecompile on GLSL	2020-04-15 03:17:46 -04:00
Mat M	4878d6bb49	Merge pull request #3654 from ReinUsesLisp/fix-fb-attach gl_texture_cache: Fix layered texture attachment base level	2020-04-15 03:17:18 -04:00
Mat M	50c0a92db8	Merge pull request #3663 from ReinUsesLisp/fcmp-rc shader/arithmetic: Add FCMP_CR variant	2020-04-15 03:16:56 -04:00
Mat M	13331a3a32	Merge pull request #3664 from ReinUsesLisp/fe3h-black-squares Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"	2020-04-15 03:14:28 -04:00
ReinUsesLisp	3036067047	maxwell_to_vk: Add uint16 vertex formats	2020-04-15 04:06:30 -03:00
ReinUsesLisp	b4e43c64c8	maxwell_to_vk: Add missing breaks Avoid invalid fallbacks.	2020-04-15 04:05:33 -03:00
ReinUsesLisp	0ca456830f	vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo When the dynamic state is specified, pViewports and pScissors are ignored, quoting the specification: pViewports is a pointer to an array of VkViewport structures, defining the viewport transforms. If the viewport state is dynamic, this member is ignored. That said, AMD's proprietary driver itself seems to read it regardless of what the specification says.	2020-04-15 03:30:08 -03:00
Rodrigo Locatti	0b132e8cc1	Merge pull request #3657 from ReinUsesLisp/viewport-zero vk_rasterizer: Default to 1 viewports with a size of 0	2020-04-15 01:51:17 -03:00
Fernando Sahmkow	daddbeffd1	Texture Cache: Only do buffer copies on accurate GPU. (#3634) This is a simple optimization as Buffer Copies are mostly used for texture recycling. They are, however, useful when games abuse undefined behavior but most 3D APIs forbid it.	2020-04-14 23:21:00 -04:00
ReinUsesLisp	fd6371eba7	Revert "gl_shader_decompiler: Implement merges with bitfieldInsert" This reverts commit `05cf270836`. Apparently the first approach using floats instead of bitfieldInsert worked better for Fire Emblem: Three Houses. Reverting to get that behavior back.	2020-04-14 21:24:33 -03:00
ReinUsesLisp	fefe7f18f9	shader/arithmetic: Add FCMP_CR variant Adds another variant of FCMP.	2020-04-14 19:11:04 -03:00
ReinUsesLisp	6dfcabc800	gl_rasterizer: Implement constant vertex attributes Credits go to gdkchan from Ryujinx for finding constant attributes are used in retail games.	2020-04-14 17:58:53 -03:00
ReinUsesLisp	37e5c4fa7c	vk_rasterizer: Default to 1 viewports with a size of 0 Silence validation layer errors.	2020-04-14 04:44:34 -03:00
ReinUsesLisp	453d7419d9	gl_shader_cache: Use CompileDepth::FullDecompile on GLSL From my testing on a Splatoon 2 shader that takes 3800ms on average to compile changing to FullDecompile reduces it to 900ms on average. The shader decoder will automatically fallback to a more naive method if it can't use full decompile.	2020-04-14 01:34:20 -03:00
ReinUsesLisp	0e232cfdc1	renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows Adds optional support for Nsight Aftermath. It is enabled through ENABLE_NSIGHT_AFTERMATH in cmake. A path to the SDK has to be provided by the environment variable NSIGHT_AFTERMATH_SDK. Nsight Aftermath allows an application to generate "minidumps" of the GPU state when a device loss happens. By analysing these on Nsight we can know what a game was doing and why it triggered a device loss. The dump is generated inside %APPDATA%\yuzu\log\gpucrash and this directory is deleted every time a new instance is initialized with Nsight enabled. To enable it on yuzu there has to be a driver and device capable of running Nsight Aftermath on Vulkan. That means only Turing based GPUs on the latest stable driver; beta drivers won't work for now. It is manually enabled in Configuration>Debug>Enable Graphics Debugging because when using all debugging capabilities there is a runtime cost.	2020-04-14 00:39:21 -03:00
ReinUsesLisp	21dc842171	gl_texture_cache: Fix layered texture attachment base level The base level is already included in the texture view. If we specify the base level in the texture again, this will end up in the incorrect level and potentially out of bounds.	2020-04-13 18:24:56 -03:00
ReinUsesLisp	6cfe2a7246	renderer_vulkan: Remove Nvidia checkpoints	2020-04-13 17:33:59 -03:00
ReinUsesLisp	16105c6a66	renderer_vulkan: Catch device losses in more places	2020-04-13 17:33:59 -03:00
Rodrigo Locatti	7e4a132a77	Merge pull request #3636 from ReinUsesLisp/drop-vk-hpp renderer_vulkan: Drop Vulkan-Hpp	2020-04-13 17:08:04 -03:00
Mat M	fbf13d3f48	Merge pull request #3651 from ReinUsesLisp/line-widths gl_rasterizer: Implement line widths and smooth lines	2020-04-13 10:19:59 -04:00
Mat M	08266d70ba	Merge pull request #3638 from ReinUsesLisp/remove-preserve-contents texture_cache: Remove preserve_contents	2020-04-13 10:19:01 -04:00
Mat M	c4001225f6	Merge pull request #3631 from ReinUsesLisp/more-astc texture/astc: More small ASTC optimizations	2020-04-13 10:17:32 -04:00
Mat M	7b62212461	Merge pull request #3619 from ReinUsesLisp/i2i shader/conversion: Implement I2I sign extension, saturation and selection	2020-04-13 10:17:07 -04:00
Mat M	3351e1e94f	Merge pull request #3627 from ReinUsesLisp/layered-view gl_texture_cache: Attach view instead of base texture for layered attachments	2020-04-13 10:16:18 -04:00
Mat M	d37d899431	Merge pull request #3646 from ReinUsesLisp/fix-glsl-turing gl_shader_decompiler: Improve generated code in HMergeH*	2020-04-13 10:15:12 -04:00
Mat M	47036859eb	Merge pull request #3633 from ReinUsesLisp/clean-texdec shader/texture: Remove type mismatches management from shader decoder	2020-04-13 10:13:05 -04:00
ReinUsesLisp	76615b9f34	gl_rasterizer: Implement line widths and smooth lines Implements "legacy" features from OpenGL present on hardware such as smooth lines and line width.	2020-04-13 01:30:34 -03:00
ReinUsesLisp	05cf270836	gl_shader_decompiler: Implement merges with bitfieldInsert This also fixes Turing issues but it avoids doing more bitcasts. This should improve the generated code while also avoiding more points where compilers can flush floats.	2020-04-12 22:39:59 -03:00
Fernando Sahmkow	3d91dbb21d	Merge pull request #3578 from ReinUsesLisp/vmnmx shader/video: Partially implement VMNMX	2020-04-12 10:44:03 -04:00
ReinUsesLisp	75eb953575	gl_shader_decompiler: Improve generated code in HMergeH* Avoiding bitwise expressions, this fixes Turing issues in shaders using half float merges that affected several games.	2020-04-12 05:06:55 -03:00
ReinUsesLisp	76f178ba6e	shader/video: Partially implement VMNMX Implements the common usages for VMNMX. Inputs with a different size than 32 bits are not supported and sign mismatches aren't supported either. VMNMX works as follows: It grabs Ra and Rb and applies a maximum/minimum on them (this is defined by .MX), taking the input sign into account. This result can then be saturated. After the intermediate result is calculated, it applies another operation on it using Rc. These operations are merges, accumulations or another min/max pass. This instruction allows implementing, for instance, GCN's min3 and max3 instructions with a more flexible approach.	2020-04-12 00:34:42 -03:00
ReinUsesLisp	a7baf6fee4	video_core: Add MSAA registers in 3D engine and TIC This adds the registers used for multisampling. It doesn't implement anything for now.	2020-04-12 00:21:27 -03:00
ReinUsesLisp	94b0e2e5da	texture_cache: Remove preserve_contents preserve_contents was always true. We can't assume we don't have to preserve clears because scissored and color masked clears exist. This removes preserve_contents and assumes it as true at all times.	2020-04-11 01:51:02 -03:00
ReinUsesLisp	2905142f47	renderer_vulkan: Drop Vulkan-Hpp	2020-04-10 22:49:02 -03:00
bunnei	51c6688e21	Merge pull request #3594 from ReinUsesLisp/vk-instance yuzu: Drop SDL2 and Qt frontend Vulkan requirements	2020-04-10 20:06:55 -04:00
ReinUsesLisp	a87b16da9a	shader/texture: Remove type mismatches management from shader decoder Since commit `e22816a5bb` we handle type mismatches from the CPU. We don't need to hack our shader decoder due to game bugs anymore. Removed in this commit.	2020-04-10 00:57:32 -03:00
Fernando Sahmkow	7182ef31c9	Merge pull request #3622 from ReinUsesLisp/srgb-texture-border video_core/texture: Use a LUT to convert sRGB texture borders	2020-04-09 18:01:48 -04:00
ReinUsesLisp	6bf5d2b011	astc: Hard code bit depth changes to 8 and use fast replicate	2020-04-09 18:37:12 -03:00
Rodrigo Locatti	36f607217f	Merge pull request #3610 from FernandoS27/gpu-caches Refactor all the GPU Caches to use VAddr for cache addressing	2020-04-09 17:59:21 -03:00
ReinUsesLisp	bd2c1ab8a0	astc: Use boost's static_vector to avoid heap allocations	2020-04-09 05:27:57 -03:00
ReinUsesLisp	5de130beea	astc: Implement a fast precompiled alternative for Replicate	2020-04-09 03:58:25 -03:00
ReinUsesLisp	6b4d4473be	astc: Move Replicate to a constexpr LUT when possible	2020-04-09 03:35:07 -03:00
ReinUsesLisp	d22a689250	astc: Make InputBitStream constexpr	2020-04-09 02:54:05 -03:00
ReinUsesLisp	0efc230381	astc: OutputBitStream style changes and make it constexpr	2020-04-09 02:37:51 -03:00
bunnei	b96fd0bd0e	Merge pull request #3601 from ReinUsesLisp/some-shader-encodings video_core/shader: Add some instruction and S2R encodings	2020-04-09 00:17:39 -04:00
ReinUsesLisp	6c8f9f40d7	gl_texture_cache: Attach view instead of base texture for layered attachments This way we are not ignoring the base layer of the current texture.	2020-04-08 22:20:25 -03:00
Fernando Sahmkow	7cd6daf115	VkRasterizer: Eliminate Legacy code.	2020-04-08 18:59:09 -04:00
Fernando Sahmkow	1c18dc6577	Memory: Correct GCC errors.	2020-04-08 18:09:16 -04:00
Fernando Sahmkow	913f42a3a7	Memory: Address Feedback.	2020-04-08 13:40:46 -04:00
Fernando Sahmkow	e00d992848	GPUMemoryManager: Improve safety of memory reads.	2020-04-08 12:08:06 -04:00
ReinUsesLisp	a209d464f9	video_core/textures: Move GetMaxAnisotropy to cpp file	2020-04-07 20:47:31 -03:00
ReinUsesLisp	d7db088180	video_core/texture: Use a LUT to convert sRGB texture borders This is a reversed look up table extracted from https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62 that is used in `04d4e9e587/source/maxwell/tsc_generate.cpp (L38)` Games usually bind 0xFD expecting a float texture border of 1.0f. The conversion previous to this commit was multiplying the uint8 sRGB texture border color by 255. This is close to 1.0f but when that difference matters, some graphical glitches appear. This look up table is manually changed in the edges, clamping towards 0.0f and 1.0f. While we are at it, move this logic to its own translation unit.	2020-04-07 20:38:14 -03:00
bunnei	f316911248	Merge pull request #3599 from ReinUsesLisp/revert-3499 Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"	2020-04-07 16:51:41 -04:00
ReinUsesLisp	bf1d66b7c0	yuzu: Drop SDL2 and Qt frontend Vulkan requirements Create Vulkan instances and surfaces from the Vulkan backend.	2020-04-07 16:32:19 -03:00
Rodrigo Locatti	487f9ba525	Merge pull request #3489 from namkazt/patch-2 shader: implement SULD.D bits32/64	2020-04-07 16:21:09 -03:00
Nguyen Dac Nam	935648ffa9	address nit.	2020-04-07 18:29:30 +07:00
ReinUsesLisp	bc1b4b85b0	renderer_vulkan: Query device names from the backend	2020-04-07 02:23:23 -03:00
ReinUsesLisp	da706cad25	shader/conversion: Implement I2I sign extension, saturation and selection Reimplements I2I adding sign extension, saturation (clamp source value to the destination), selection and destination sizes that are not 32 bits wide. It doesn't implement CC yet.	2020-04-07 02:19:44 -03:00
Nguyen Dac Nam	bf1174c114	Apply suggestions from code review Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>	2020-04-07 07:55:49 +07:00
Fernando Sahmkow	f9d5718c4b	Clang Format.	2020-04-06 09:23:08 -04:00
Fernando Sahmkow	ea535d9470	Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing.	2020-04-06 09:23:07 -04:00
Fernando Sahmkow	3dd5c07454	Query Cache: Use VAddr instead of physical memory for addressing.	2020-04-06 09:23:07 -04:00
Fernando Sahmkow	7fcd0fee6d	Buffer Cache: Use vAddr instead of physical memory.	2020-04-06 09:23:06 -04:00
Fernando Sahmkow	6ee316cb8f	Texture Cache: Use vAddr instead of physical memory for caching.	2020-04-06 09:23:05 -04:00
Fernando Sahmkow	9c0f40a1f5	GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddr	2020-04-06 09:21:46 -04:00
Fernando Sahmkow	588a20be3f	Merge pull request #3513 from ReinUsesLisp/native-astc video_core: Use native ASTC when available	2020-04-06 09:21:11 -04:00
namkazy	2c98e14d13	shader_decode: SULD.D using std::pair instead of out parameter	2020-04-06 13:46:55 +07:00
namkazy	9efa51311f	shader_decode: SULD.D avoid duplicate code block.	2020-04-06 13:34:06 +07:00

4456 commits