Commit graph

984 commits

Author SHA1 Message Date
Weiyi Wang
4befbddc34
Merge pull request #3281 from jroweboy/texcache-pt2
Texture Cache Rework
2018-03-05 11:57:25 +02:00
wwylele
c2515ff39d clang-format fix 2018-03-05 11:09:20 +02:00
James Rowe
1d419bac1b Disable accelerated texture copy for Texture surfaces 2018-03-04 22:06:09 -07:00
James Rowe
18456ff9e6 Address Lioncash's comments 2018-02-05 20:31:50 -07:00
Phantom
9e16a3c449 ConvertD24S8toABGR: fix fb attachment 2018-01-31 08:55:39 -07:00
Phantom
d813bc5eb5 D24S8 to RGBA8 conversion 2018-01-31 08:55:19 -07:00
Phantom
db21154142 GetFramebufferSurfaces: Remove an assert that is no longer correct 2018-01-31 08:54:19 -07:00
James Rowe
b002511df0
citra-qt: Add customizable speed limit target (#3353)

* Update SDL config for the new frame_limit option
* Made max lag time a function of target speed percent.
* Added a checkbox to enable/disable frame limiter
* UI: Prevent frame_limit from under/overflowing
* UI: Hide target speed percent when frame limiter is off
* Disable frame limit spin box when framelimit isn't enabled
2018-01-25 22:24:40 -07:00
Phantom
88f6521511 AccelerateTextureCopy: Better support for contiguous copy 2018-01-20 18:39:27 -07:00
Yuri Kunde Schlesner
d93ee65164 Common: Add convenience function for hashing a struct 2018-01-15 13:43:37 -08:00
Dwayne Slater
41929371dc Optimize AttributeBuffer to OutputVertex conversion (#3283)

First I unrolled the inner loop, then I pushed semantics validation
outside of the hot loop.

I also added overflow slots to avoid conditional branches.

Super Mario 3D Land's intro runs at almost full speed when compiled with
Clang, and there's a noticeable speed increase in MSVC. GCC hasn't been
tested but I'm confident in its ability to optimize this code.
2018-01-02 15:32:33 -08:00
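Note on the "overflow slots" idea in the commit above: a minimal sketch of the general technique, not the code from that commit. It assumes a buffer with a few extra scratch slots so that an out-of-range index can be written unconditionally instead of being guarded by a branch. The names `SlotBuffer`, `Store`, `kValidSlots` and `kTotalSlots`, and the slot counts, are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes: 4 slots carry real data, 4 more exist only to absorb
// writes whose index falls outside the valid range.
constexpr std::size_t kValidSlots = 4;
constexpr std::size_t kTotalSlots = 8; // power of two, so a mask bounds the index

struct SlotBuffer {
    std::array<float, kTotalSlots> slots{}; // zero-initialized
};

// Branch-free store: the index is masked into [0, kTotalSlots), so an invalid
// index lands in an overflow slot rather than requiring `if (index < kValidSlots)`.
inline void Store(SlotBuffer& buf, std::uint32_t index, float value) {
    buf.slots[index & (kTotalSlots - 1)] = value;
}
```

Readers of the buffer would only look at the first `kValidSlots` entries; whatever ends up in the overflow slots is ignored.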
Phantom
7f1aec8fbb Support for textures smaller than 8*8 2017-12-30 07:42:32 +01:00
Phantom
be1d0cee1e Fix viewport to surface rect clamping 2017-12-29 17:07:01 +01:00
Phantom
19672cfee8 CachedSurface: Add microprofile scopes for UploadGLTexture and DownloadGLTexture 2017-12-29 17:01:37 +01:00
Phantom
1591fa8d3d Remove read_framebuffer_handle and draw_framebuffer_handle from CachedSurface 2017-12-29 17:00:09 +01:00
James Rowe
1c4d1d1ace Move transfer_framebuffer to a member of RasterCache. Address review comments 2017-12-23 16:10:32 -07:00
James Rowe
10fb9242ae Fix clang format 2017-12-23 16:10:32 -07:00
James Rowe
4e053220a8 When downloading from a surface into gl_buffer, ignore any x/y offsets in rect and use 0,0 as the origin 2017-12-23 16:10:31 -07:00
James Rowe
7e673af527 Remove the correct intervals from the surface when validating 2017-12-23 16:10:31 -07:00
James Rowe
ac4c589ab5 Workaround for ICE on gcc5 2017-12-23 16:10:31 -07:00
Phantom
9a6a452857 Fix broken surface validation logic since removal of the reinterpret hack 2017-12-23 16:10:30 -07:00
Phantom
f893daa4a2 Perform the same checks on TexCopy params that SW does 2017-12-23 16:10:30 -07:00
James Rowe
91fad7010b Fix compilation on mac and linux 2017-12-23 16:10:30 -07:00
James Rowe
34ff77f5f7 Revert "OpenGL Cache: Ignore format reinterpretation hack"
Testing found a few games that did some crazy things which break the
assumptions made in that commit.
2017-12-23 16:10:29 -07:00
James Rowe
72034b772d Minor style changes 2017-12-23 16:10:29 -07:00
James Rowe
0498d34d18 OpenGL Cache: Ignore format reinterpretation hack
Several games such as Smash will cause some regions that are cached on
the gpu to be revalidated, but (seemingly) we can just ignore these
cases. If the data is already found on the gpu in dirty_regions, then we
validate those, and skip flushing that region from cpu.

It's unknown whether this breaks any games, but it does speed up many games.
Additionally, it removes outlines in the Pokémon games.
2017-12-23 16:10:29 -07:00
James Rowe
5b872c41d8 OpenGL Cache: Reorder methods
The previous commits added the methods where they were located
originally to try to get an easy to read diff between changes. This
commit fixes compilation since the static methods are now declared
before they are used.
2017-12-23 16:10:28 -07:00
James Rowe
24e187891f OpenGL Rasterizer: Update to use the new cache 2017-12-23 16:10:28 -07:00
James Rowe
e5adb6a26b OpenGL Cache: Add the rest of the Cache methods
Fills in the rasterizer cache methods using the helper methods added in
the previous commits.
2017-12-23 16:10:27 -07:00
James Rowe
81ea32d1e0 OpenGL Cache: Refactor Surface Cache interface
Changes the public interface of the surface cache to make it easier to
use. Reintroduces the count of cached pages that was removed in
an earlier commit.
2017-12-23 16:10:27 -07:00
James Rowe
3e1cbb7d14 OpenGL Cache: Split CachedSurface
Breaks CachedSurface into two classes, the parameters used to create or
find a cached surface, and the actual cached surface. This also adds a
few helper methods for getting surfaces from the cache.
2017-12-23 16:10:27 -07:00
James Rowe
0b98b768f5 OpenGL Cache: Add surface utility functions
Separates creating and filling surfaces into static functions that
can be reused from the different RasterizerCache methods.
2017-12-23 16:10:26 -07:00
James Rowe
e9e2d444ef OpenGL Cache: Optimize Morton Copy to copy in tiles
Compiles two lookup arrays of functions for the different
configurations of Morton Copy.
2017-12-23 16:10:26 -07:00
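Note on Morton order, for readers of the commit above: a minimal sketch of how a Morton (Z-order) index inside an 8x8 tile can be computed by interleaving the bits of the x and y coordinates. This illustrates the general ordering only; it is not the lookup-array implementation the commit describes, and the function name `MortonInterleave` is used here purely for illustration.

```cpp
#include <cstdint>

// Interleave the low 3 bits of x and y: x bits go to even positions,
// y bits to odd positions, giving an index in [0, 63] for an 8x8 tile.
constexpr std::uint32_t MortonInterleave(std::uint32_t x, std::uint32_t y) {
    std::uint32_t index = 0;
    for (std::uint32_t bit = 0; bit < 3; ++bit) {
        index |= ((x >> bit) & 1u) << (2 * bit);
        index |= ((y >> bit) & 1u) << (2 * bit + 1);
    }
    return index;
}

// The first four entries trace a "Z": (0,0), (1,0), (0,1), (1,1).
static_assert(MortonInterleave(0, 0) == 0, "origin maps to index 0");
static_assert(MortonInterleave(1, 1) == 3, "end of the first Z");
```

Copying a whole tile at a time means this index only has to be resolved within a fixed 64-entry pattern, which is one reason a per-tile approach can be cheaper than recomputing the order for every pixel.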
James Rowe
160ac25527 OpenGL State: Change setters so they don't directly write to curstate 2017-12-23 16:10:25 -07:00
James Rowe
13606a6d0b Memory: Remove count of cached pages and add InvalidateRegion
In a future commit, the count of cached pages will be reintroduced in
the actual surface cache. Also adds an invalidate-only operation to the cache,
which marks a region as invalid to try to avoid a costly flush
from 3DS memory.
2017-12-23 16:10:25 -07:00
James Rowe
c821c14908 Settings: Change resolution scaling to an integer instead of a float 2017-12-23 16:10:25 -07:00
Subv
3652809408 HLE: Convert GSP_GPU to ServiceFramework.
The only functional change is the error handling of the GSP_GPU::ReadHWRegs function. We previously didn't return error codes (not even for success). The new return values were found by reverse engineering the GSP module.
2017-12-21 10:30:22 -05:00
Tillmann Karras
fd3ec6be30 video_core: fix infinity and NaN conversions 2017-12-14 19:51:58 +00:00
Yuri Kunde Schlesner
aecd2b85fe
Merge pull request #3261 from MerryMage/DPH
shader_jit_x64_compiler: Use haddps for horizontal summation
2017-12-13 09:09:42 -05:00
bunnei
4695f12a08
Merge pull request #3264 from lioncash/cmake-target
CMakeLists: Derive the source directory grouping from targets themselves
2017-12-12 14:34:51 -05:00
MerryMage
6c199e4699 fixup! shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-12 15:37:00 +00:00
Lioncash
ab021d163e CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables,
and then construct the target in most cases.
2017-12-11 21:11:52 -05:00
Yuri Kunde Schlesner
ae7240a2cb
Merge pull request #3097 from ds84182/round-primary-color-swrast
Round primary color in swrast
2017-12-11 20:06:21 -05:00
MerryMage
efec8fe513 shader_jit_x64_compiler: Use haddps for horizontal summation 2017-12-10 22:04:30 +00:00
Yuri Kunde Schlesner
230a7557f1 Shader: Store AttributeBuffers in GS output buffer
This also does the output masking early at EMIT time, instead of when a
triangle is sent to the vertex handler.
2017-12-09 20:33:59 -08:00
Yuri Kunde Schlesner
0184419814 Shader: Refactor output_mask copy loop to function 2017-12-09 20:31:24 -08:00
Tillmann Karras
1c2750d5bd video_core: optimize NaN check 2017-12-05 22:34:22 +00:00
MerryMage
c1aef260af shader_jit_x64_compiler: Remove ABI overhead of LG2 and EX2
This involves reimplementing log2f and exp2f.
2017-11-30 18:17:35 +00:00
MerryMage
235a251d3c tests: Add tests for x64 shader jit
Tests LG2 and EX2 instructions
2017-11-30 18:17:35 +00:00
Dwayne Slater
fcc141a327 Maintain the PICA's 8 bits of color precision when using the interpolated primary color
This matches the software renderer by using round.
The actual hardware rounds the results up instead of flooring.
2017-11-29 16:49:04 -05:00
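Note on rounding versus flooring in the commit above: a minimal sketch of why the choice matters when a floating-point color component in [0, 1] is reduced to 8 bits. The helpers `ToByteFloor` and `ToByteRound` are hypothetical and are not taken from the renderer; they only show that the two conversions can differ by one step for the same input.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale to [0, 255] and discard the fractional part.
inline std::uint8_t ToByteFloor(float c) {
    return static_cast<std::uint8_t>(std::floor(c * 255.0f));
}

// Hypothetical helper: scale to [0, 255] and round to the nearest integer
// (std::round rounds halfway cases away from zero).
inline std::uint8_t ToByteRound(float c) {
    return static_cast<std::uint8_t>(std::round(c * 255.0f));
}
```

For an input of 0.5 the scaled value is 127.5, so flooring gives 127 while rounding gives 128. A one-step difference like this is enough to make two renderers disagree on a pixel.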