using Ryujinx.Common.Logging;
using System;
Periodically Flush Commands for Vulkan (#3689)
* Periodically Flush Commands for Vulkan
NVIDIA's OpenGL driver has a built-in mechanism to automatically flush commands to GPU when a lot have been queued. It's also pretty inconsistent, but we'll ignore that for now.
Our Vulkan implementation only submits a command buffer (flush equivalent) when it needs to. This is typically when another command buffer needs to be sequenced after it, presenting a frame, or an edge case where we flush around GPU queries to get results sooner.
This difference in flush behaviour causes a notable difference between Vulkan and OpenGL when we have to wait for commands. In the worst case, we will wait for a sync point that has just been created. In Vulkan, this sync point is created by flushing the command buffer, and storing a waitable fence that signals its completion. Our command buffer contains _every command that we queued since the last submit_, which could be an entire frame's worth of draws.
This has a huge effect on CPU <-> GPU latency. The more commands in a command buffer, the longer we have to wait for it to complete, which results in wasted time. Because we don't know when the guest will force us to wait, we always want the smallest possible latency.
By periodically flushing, we ensure that each command buffer takes a more consistent, smaller amount of time to execute, and that the back of the GPU queue isn't as far away when we need to wait for something to happen. This also might reduce time that the GPU is left inactive while commands are being built.
The main affected game is Pokemon Sword, which got significantly faster in overworld areas due to reduced waiting time when it flushes a shadow map from the main GPU thread.
Another affected game is BOTW, which gets faster depending on the area. This game flushes textures/buffers from its game thread, which is the bottleneck.
Flush latency and throughput may be improved on other games that are inexplicably slower than OpenGL. It's possible that certain games could have their performance _decreased_ slightly due to flushes not being free, but it is unlikely.
Also, flushing to get query results sooner has been tweaked to improve the number of full draw skips that can be done. (tested in SMO)
* Remove unused variable
* Fix possible issue with early query flush
2022-09-14 17:48:31 +01:00
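The periodic flush idea described in this commit message can be sketched in isolation. The class below is a minimal illustration, not the code from this change: `PeriodicFlushPolicy`, its constructor parameters, and the explicit `nowTicks` arguments are all hypothetical names introduced here so the timing logic can be exercised without a GPU. It assumes a flush is worthwhile only once a minimum number of draws is queued and a fixed interval has elapsed since the last submit.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only; not part of Ryujinx.
public sealed class PeriodicFlushPolicy
{
    private readonly long _intervalTicks;
    private readonly ulong _minDraws;

    private long _lastFlushTicks;
    private ulong _lastDrawCount;

    public PeriodicFlushPolicy(double intervalMs, ulong minDraws)
    {
        // Stopwatch.Frequency is ticks per second, so one millisecond is Frequency / 1000 ticks.
        _intervalTicks = (long)(Stopwatch.Frequency * intervalMs / 1000.0);
        _minDraws = minDraws;
    }

    // Call after every submit so the next interval is measured from this point.
    public void RegisterFlush(long nowTicks, ulong drawCount)
    {
        _lastFlushTicks = nowTicks;
        _lastDrawCount = drawCount;
    }

    // True when enough draws are queued and enough time has passed since the last submit.
    public bool ShouldFlush(long nowTicks, ulong drawCount)
    {
        if (drawCount - _lastDrawCount < _minDraws)
        {
            return false;
        }

        return nowTicks - _lastFlushTicks > _intervalTicks;
    }
}
```

In real use the caller would pass `Stopwatch.GetTimestamp()` as `nowTicks`. Taking the timestamp as a parameter is a choice made for this sketch so the decision can be checked deterministically.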
using System.Diagnostics;
using System.Linq;

namespace Ryujinx.Graphics.Vulkan
{
    internal class AutoFlushCounter
    {
        // How often to flush on framebuffer change.
        private static readonly long FramebufferFlushTimer = Stopwatch.Frequency / 1000; // (1ms)

        // How often to flush on draw when fast flush mode is enabled.
        private static readonly long DrawFlushTimer = Stopwatch.Frequency / 666; // (1.5ms)

        // Average wait time that triggers fast flush mode to be entered.
        private static readonly long FastFlushEnterThreshold = Stopwatch.Frequency / 666; // (1.5ms)

        // Average wait time that triggers fast flush mode to be exited.
        private static readonly long FastFlushExitThreshold = Stopwatch.Frequency / 10000; // (0.1ms)

        // Number of frames to average waiting times over.
        private const int SyncWaitAverageCount = 20;

        private const int MinDrawCountForFlush = 10;
        private const int MinConsecutiveQueryForFlush = 10;
        private const int InitialQueryCountForFlush = 32;

        private readonly VulkanRenderer _gd;

        private long _lastFlush;
        private ulong _lastDrawCount;
        private bool _hasPendingQuery;
        private int _consecutiveQueries;
        private int _queryCount;

        private int[] _queryCountHistory = new int[3];
        private int _queryCountHistoryIndex;
        private int _remainingQueries;

        private long[] _syncWaitHistory = new long[SyncWaitAverageCount];
        private int _syncWaitHistoryIndex;

        private bool _fastFlushMode;

        public AutoFlushCounter(VulkanRenderer gd)
        {
            _gd = gd;
        }

        public void RegisterFlush(ulong drawCount)
        {
            _lastFlush = Stopwatch.GetTimestamp();
            _lastDrawCount = drawCount;

            _hasPendingQuery = false;
            _consecutiveQueries = 0;
        }

        public bool RegisterPendingQuery()
        {
            _hasPendingQuery = true;
            _consecutiveQueries++;
            _remainingQueries--;

            _queryCountHistory[_queryCountHistoryIndex]++;

            // Interrupt render passes to flush queries, so that early results arrive sooner.
            if (++_queryCount == InitialQueryCountForFlush)
            {
                return true;
            }

            return false;
        }

        public int GetRemainingQueries()
        {
            if (_remainingQueries <= 0)
            {
                _remainingQueries = 16;
            }

            if (_queryCount < InitialQueryCountForFlush)
            {
                return Math.Min(InitialQueryCountForFlush - _queryCount, _remainingQueries);
            }

            return _remainingQueries;
        }

        public bool ShouldFlushQuery()
        {
            return _hasPendingQuery;
        }

        public bool ShouldFlushDraw(ulong drawCount)
        {
            if (_fastFlushMode)
            {
                long draws = (long)(drawCount - _lastDrawCount);

                if (draws < MinDrawCountForFlush)
                {
                    if (draws == 0)
                    {
                        _lastFlush = Stopwatch.GetTimestamp();
                    }

                    return false;
                }

                long flushTimeout = DrawFlushTimer;

                long now = Stopwatch.GetTimestamp();

                return now > _lastFlush + flushTimeout;
            }

            return false;
        }

        public bool ShouldFlushAttachmentChange(ulong drawCount)
|
Periodically Flush Commands for Vulkan (#3689)
* Periodically Flush Commands for Vulkan
NVIDIA's OpenGL driver has a built-in mechanism to automatically flush commands to GPU when a lot have been queued. It's also pretty inconsistent, but we'll ignore that for now.
Our Vulkan implementation only submits a command buffer (flush equivalent) when it needs to. This is typically when another command buffer needs to be sequenced after it, presenting a frame, or an edge case where we flush around GPU queries to get results sooner.
This difference in flush behaviour causes a notable difference between Vulkan and OpenGL when we have to wait for commands. In the worst case, we will wait for a sync point that has just been created. In Vulkan, this sync point is created by flushing the command buffer, and storing a waitable fence that signals its completion. Our command buffer contains _every command that we queued since the last submit_, which could be an entire frame's worth of draws.
This has a huge effect on CPU <-> GPU latency. The more commands in a command buffer, the longer we have to wait for it to complete, which results in wasted time. Because we don't know when the guest will force us to wait, we always want the smallest possible latency.
By periodically flushing, we ensure that each command buffer takes a more consistent, smaller amount of time to execute, and that the back of the GPU queue isn't as far away when we need to wait for something to happen. This also might reduce time that the GPU is left inactive while commands are being built.
The main affected game is Pokemon Sword, which got significantly faster in overworld areas due to reduced waiting time when it flushes a shadow map from the main GPU thread.
Another affected game is BOTW, which gets faster depending on the area. This game flushes textures/buffers from its game thread, which is the bottleneck.
Flush latency and throughput may be improved on other games that are inexplicably slower than OpenGL. It's possible that certain games could have their performance _decreased_ slightly due to flushes not being free, but it is unlikely.
Also, flushing to get query results sooner has been tweaked to improve the number of full draw skips that can be done. (tested in SMO)
* Remove unused variable
* Fix possible issue with early query flush
2022-09-14 17:48:31 +01:00
|
|
|
|
{
|
|
|
|
|
_queryCount = 0;
|
|
|
|
|
|
2023-02-09 01:03:41 +00:00
|
|
|
|
// Flush when there's an attachment change out of a large block of queries.
|
|
|
|
|
if (_consecutiveQueries > MinConsecutiveQueryForFlush)
|
2022-09-14 17:48:31 +01:00
            {
                return true;
            }
2023-02-09 01:03:41 +00:00
            _consecutiveQueries = 0;
2022-09-14 17:48:31 +01:00
            long draws = (long)(drawCount - _lastDrawCount);

            if (draws < MinDrawCountForFlush)
            {
                if (draws == 0)
                {
                    _lastFlush = Stopwatch.GetTimestamp();
                }

                return false;
            }

            long flushTimeout = FramebufferFlushTimer;

            long now = Stopwatch.GetTimestamp();

            return now > _lastFlush + flushTimeout;
        }
2023-01-24 16:32:56 +00:00
        public void Present()
        {
2023-04-11 08:23:41 +01:00
            // Query flush prediction.
2023-01-24 16:32:56 +00:00
            _queryCountHistoryIndex = (_queryCountHistoryIndex + 1) % 3;

            _remainingQueries = _queryCountHistory.Max() + 10;

            _queryCountHistory[_queryCountHistoryIndex] = 0;
2023-04-11 08:23:41 +01:00
            // Fast flush mode toggle.

            _syncWaitHistory[_syncWaitHistoryIndex] = _gd.SyncManager.GetAndResetWaitTicks();

            _syncWaitHistoryIndex = (_syncWaitHistoryIndex + 1) % SyncWaitAverageCount;

            long averageWait = (long)_syncWaitHistory.Average();

            if (_fastFlushMode ? averageWait < FastFlushExitThreshold : averageWait > FastFlushEnterThreshold)
            {
                _fastFlushMode = !_fastFlushMode;
                Logger.Debug?.PrintMsg(LogClass.Gpu, $"Switched fast flush mode: ({_fastFlushMode})");
            }
2023-01-24 16:32:56 +00:00
        }
2022-09-14 17:48:31 +01:00
    }
}
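
The fast flush toggle in `Present()` behaves like a hysteresis over a rolling average of sync wait ticks: the mode is entered when the average rises above one threshold and left only when it falls below a second, lower one. The following is a minimal sketch of that pattern in isolation. The class `FastFlushToggleSketch`, its constructor parameters, and `OnPresent` are hypothetical names, and the thresholds are supplied by the caller because the real values of `FastFlushEnterThreshold`, `FastFlushExitThreshold`, and `SyncWaitAverageCount` are not shown in this excerpt.

```csharp
using System;
using System.Linq;

// Hypothetical standalone class for illustration; not part of the Ryujinx source.
public sealed class FastFlushToggleSketch
{
    private readonly long[] _waitHistory;
    private readonly long _enterThreshold;
    private readonly long _exitThreshold;
    private int _index;

    public bool FastFlushMode { get; private set; }

    // historyLength plays the role of SyncWaitAverageCount (assumed value chosen by the caller).
    public FastFlushToggleSketch(int historyLength, long enterThreshold, long exitThreshold)
    {
        _waitHistory = new long[historyLength];
        _enterThreshold = enterThreshold;
        _exitThreshold = exitThreshold;
    }

    // Call once per presented frame with the ticks spent waiting on sync during that frame.
    public void OnPresent(long waitTicks)
    {
        // Overwrite the oldest sample in the ring buffer.
        _waitHistory[_index] = waitTicks;
        _index = (_index + 1) % _waitHistory.Length;

        long averageWait = (long)_waitHistory.Average();

        // Two thresholds keep the mode from flapping when the average hovers near one value.
        bool shouldToggle = FastFlushMode
            ? averageWait < _exitThreshold
            : averageWait > _enterThreshold;

        if (shouldToggle)
        {
            FastFlushMode = !FastFlushMode;
        }
    }
}
```

With an enter threshold well above the exit threshold, an average that sits between the two leaves the current mode unchanged, which is the point of using two thresholds rather than one.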