Ryujinx

mirror of https://github.com/Ryujinx/Ryujinx.git synced 2024-12-21 10:32:01 +00:00

Author	SHA1	Message	Date
riperiperi	470be03c2f	GPU: Add fallback when 16-bit formats are not supported (#4108) * Add conversion for 16 bit RGBA formats (not supported in Rosetta) * Rebase fix Rebase fix * Forgot to remove this * Fix RGBA16 format conversion * Add RGBA4 -> RGBA8 conversion * Handle host stride alignment * Address Feedback Part 1 * Can't count * Don't zero out rgb when alpha is 0 * Separate RGBA4 and 5-bit component formats Not sure of a better way to name them... * Add A1B5G5R5 conversion * Put this in the right place. * Make format naming consistent for capabilities * Change method names	2022-12-26 15:50:27 -03:00
riperiperi	bf77d1cab9	GPU: Pass SpanOrArray for Texture SetData to avoid copy (#3745) * GPU: Pass SpanOrArray for Texture SetData to avoid copy Texture data is often converted before upload, meaning that an array was allocated to perform the conversion into. However, the backend SetData methods were being passed a Span of that data, and the Multithreaded layer does `ToArray()` on it so that it can be stored for later! This method can't extract the original array, so it creates a copy. This PR changes the type passed for textures to a new ref struct called SpanOrArray, which is backed by either a ReadOnlySpan or an array. The benefit here is that we can have a ToArray method that doesn't copy if it is originally backed by an array. This will also avoid a copy when running the ASTC decoder. On NieR this was taking 38% of texture upload time, which it does a _lot_ of when you move between areas, so there should be a 1.6x performance boost when strictly uploading textures. No doubt this will also improve texture streaming performance in UE4 games, and maybe a small reduction with video playback. From the numbers, it's probably possible to improve the upload rate by a further 1.6x by performing layout conversion on GPU. I'm not sure if we could improve it further than that - multithreading conversion on CPU would probably result in memory bottleneck. This doesn't extend to buffers, since we don't convert their data on the GPU emulator side. * Remove implicit cast to array.	2022-10-08 12:04:47 -03:00
riperiperi	cda659955c	Texture Sync, incompatible overlap handling, data flush improvements. (#2971) * Initial test for texture sync * WIP new texture flushing setup * Improve rules for incompatible overlaps Fixes a lot of issues with Unreal Engine games. Still a few minor issues (some caused by dma fast path?) Needs docs and cleanup. * Cleanup, improvements Improve rules for fast DMA * Small tweak to group together flushes of overlapping handles. * Fixes, flush overlapping texture data for ASTC and BC4/5 compressed textures. Fixes the new Life is Strange game. * Flush overlaps before init data, fix 3d texture size/overlap stuff * Fix 3D Textures, faster single layer flush Note: nosy people can no longer merge this with Vulkan. (unless they are nosy enough to implement the new backend methods) * Remove unused method * Minor cleanup * More cleanup * Use the More Fun and Hopefully No Driver Bugs method for getting compressed tex too This one's for metro * Address feedback, ASTC+ETC to FormatClass * Change offset to use Span slice rather than IntPtr Add * Fix this too	2022-01-09 13:28:48 -03:00
gdkchan	a87f7f2029	Fix DMA copy fast path line size when xCount < stride (#2942)	2021-12-26 13:05:26 -03:00
riperiperi	4b60371e64	Return mapped buffer pointer directly for flush, WriteableRegion for textures (#2494) * Return mapped buffer pointer directly for flush, WriteableRegion for textures A few changes here to generally improve performance, even for platforms not using the persistent buffer flush. - Texture and buffer flush now return a ReadOnlySpan<byte>. It's guaranteed that this span is pinned in memory, but it will be overwritten on the next flush from that thread, so it is expected that the data is used before calling again. - As a result, persistent mappings no longer copy to a new array - rather the persistent map is returned directly as a Span<>. A similar host array is used for the glGet flushes instead of allocating new arrays each time. - Texture flushes now do their layout conversion into a WriteableRegion when the texture is not MultiRange, which allows the flush to happen directly into guest memory rather than into a temporary span, then copied over. This avoids another copy when doing layout conversion. Overall, this saves 1 data copy for buffer flush, 1 copy for linear textures with matching source/target stride, and 2 copies for block textures or linear textures with mismatching strides. * Fix tests * Fix array pointer for Mesa/Intel path * Address some feedback * Update method for getting array pointer.	2021-07-19 19:10:54 -03:00
riperiperi	ca5ac37cd6	Flush buffers and texture data through a persistent mapped buffer. (#2481) * Use persistent buffers to flush texture data * Flush buffers via copy to persistent buffers. * Log error when timing out, small refactoring.	2021-07-16 18:10:20 -03:00
riperiperi	dbce3455ad	Fix lineSize for LinearStrided -> Linear conversion (#2091) Fixes a possible crash when width is greater than stride, which can happen due to alignment when copying textures.	2021-03-10 01:24:46 +01:00
gdkchan	4d02a2d2c0	New NVDEC and VIC implementation (#1384) * Initial NVDEC and VIC implementation * Update FFmpeg.AutoGen to 4.3.0 * Add nvdec dependencies for Windows * Unify some VP9 structures * Rename VP9 structure fields * Improvements to Video API * XML docs for Common.Memory * Remove now unused or redundant overloads from MemoryAccessor * NVDEC UV surface read/write scalar paths * Add FIXME comments about hacky things/stuff that will need to be fixed in the future * Cleaned up VP9 memory allocation * Remove some debug logs * Rename some VP9 structs * Remove unused struct * No need to compile Ryujinx.Graphics.Host1x with unsafe anymore * Name AsyncWorkQueue threads to make debugging easier * Make Vp9PictureInfo a ref struct * LayoutConverter no longer needs the depth argument (broken by rebase) * Pooling of VP9 buffers, plus fix a memory leak on VP9 * Really wish VS could rename projects properly... * Address feedback * Remove using * Catch OperationCanceledException * Add licensing information * Add THIRDPARTY.md to release too Co-authored-by: Thog <me@thog.eu>	2020-07-12 05:07:01 +02:00
gdkchan	76e5af967a	Fix buffer to 3D texture copy (#1354)	2020-07-04 01:37:36 +02:00
riperiperi	bea1fc2e8d	Optimize texture format conversion, and MethodCopyBuffer (#1274) * Improve performance when converting texture formats. Still more work to do. * Speed up buffer -> texture copies. No longer copies byte by byte. Fast path when formats are identical. * Fix a few things, 64 byte block fast copy. * Spacing cleanup, unrelated change. * Fix base offset calculation for region copies. * Fix Linear -> BlockLinear * Fix some nits. (part 1 of review feedback) * Use a generic version of the Convert* functions rather than lambdas. This is some real monkey's paw shit. * Remove unnecessary span constructor. * Revert "Use a generic version of the Convert* functions rather than lambdas." This reverts commit `aa43dcfbe8`. * Fix bug with rectangle destination writing, better rectangle calculation for linear textures.	2020-06-13 19:31:06 -03:00
gdkchan	34d19f381c	Fix texture level offset/size calculation when sparse tile width is > 1 (#1142) * Fix texture level offset/size calculation when sparse tile width is > 1 * Sparse tile width affects layer size alignment as well	2020-04-25 23:40:20 +10:00
gdkchan	1a550e810c	Copy 16 bytes at a time for layout conversion, if possible	2020-01-09 02:13:00 +01:00
gdkchan	e25b7c9848	Initial support for the guest OpenGL driver (NVIDIA and Nouveau)	2020-01-09 02:13:00 +01:00
gdk	1bb08742c1	Calculate width from stride on texture copies	2020-01-09 02:13:00 +01:00
gdk	1876b346fe	Initial work	2020-01-09 02:13:00 +01:00

15 commits
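
Several commits above (dbce3455ad, a87f7f2029, 1bb08742c1) concern the line size used when copying between a strided linear layout and a tightly packed one. The sketch below illustrates the general idea behind the dbce3455ad fix: the number of bytes copied per row is clamped to the smaller of the source stride and the packed row size, so a row whose width exceeds the stride never reads past its own line. It is a hypothetical Python illustration, not Ryujinx's C# code; the function name `strided_to_linear` and its parameters are assumptions made for this example.

```python
def strided_to_linear(src: bytes, width: int, height: int,
                      bytes_per_pixel: int, stride: int) -> bytes:
    """Copy rows from a strided linear layout into a tightly packed one.

    Hypothetical helper for illustration only; names and signature are
    assumptions, not taken from the Ryujinx source.
    """
    out_stride = width * bytes_per_pixel
    # Clamp the per-row copy size so that a row wider than the source
    # stride does not read into the following row (or past the buffer).
    line_size = min(out_stride, stride)

    if len(src) < stride * height:
        raise ValueError("source buffer is smaller than stride * height")

    out = bytearray(out_stride * height)
    for y in range(height):
        src_off = y * stride
        dst_off = y * out_stride
        # Both slices are exactly line_size bytes long, so the
        # destination buffer keeps its size.
        out[dst_off:dst_off + line_size] = src[src_off:src_off + line_size]
    return bytes(out)
```

With `stride=8`, `width=2`, and `bytes_per_pixel=2`, each 8-byte source row contributes its first 4 bytes. When the packed row size exceeds the stride, only `stride` bytes are copied per row and the remainder of each output row stays zero-filled.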