fix(vulkan-rt): configurable recursion depth + per-shader TLAS push for compute (#21)

Two gaps in the Vulkan RT path fault the device on the NVIDIA
proprietary driver with a non-trivial pipeline (the simple
VulkanTriangle example never hit them):

1. maxPipelineRayRecursionDepth was hardcoded to 1, so any closest-hit
   shader that traces a secondary ray (a shadow ray, a very common
   pattern) recursed past the pipeline limit (UB → device fault).
   PipelineRTVulkan::Init now takes a maxRecursionDepth parameter
   (default 1, clamped to the device's maxRayRecursionDepth).

2. The NVIDIA descriptor-heap AS-read workaround rewrites every shader
   that reads an accelerationStructureEXT from the heap, including
   compute shaders, to read the TLAS device address from a push
   constant, but only RTPass pushed that address. A compute shader that
   ray-queries the TLAS (rayQueryEXT) therefore ran against an unwritten
   push slot → garbage AS handle → VK_ERROR_DEVICE_LOST.
   WorkaroundNvidiaAS::Patch now returns a per-shader PatchResult
   {patched, tlasPushOffset} instead of writing the clobber-prone global
   Device::workaroundTlasPushOffset (removed). VulkanShader stores it;
   ShaderBindingTableVulkan/PipelineRTVulkan carry it for RTPass, and
   ComputeShader tracks its own offset and pushes the caller-supplied
   TLAS address in Dispatch (new defaulted tlasAddress parameter),
   mirroring RTPass::Record.

The PushConstantRewrite regression test now asserts Patch's returned
patched/offset and adds two ray-querying compute-shader cases, proving
the rewrite is stage-agnostic and the per-shader offset is correct.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
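
Illustrative note on item 2: the sketch below shows why a per-shader offset avoids the clobbering that a single global offset invites. Only the field names patched and tlasPushOffset come from the commit message. ShaderSketch, PushWrite and PlanTlasPush are hypothetical stand-ins, not the project's types, and the real PatchResult may carry more than these two fields.

```cpp
#include <cstdint>
#include <optional>

// Shape inferred from the commit message ({patched, tlasPushOffset});
// an assumption, not the project's actual definition.
struct PatchResult {
    bool patched = false;              // true if the shader was rewritten
    std::uint32_t tlasPushOffset = 0;  // byte offset of the TLAS address slot
};

// Hypothetical stand-in for a shader object that keeps its own patch result
// instead of reading a shared global offset.
struct ShaderSketch {
    PatchResult patch;
};

// Hypothetical description of one push-constant write: the offset and value a
// caller would hand to vkCmdPushConstants.
struct PushWrite {
    std::uint32_t offset;
    std::uint64_t tlasAddress;
};

// Plans the TLAS push for one shader. Unpatched shaders get no write at all,
// which matches the workaround being inert when no rewrite happened.
inline std::optional<PushWrite> PlanTlasPush(const ShaderSketch& shader,
                                             std::uint64_t tlasAddress) {
    if (!shader.patch.patched) {
        return std::nullopt;
    }
    return PushWrite{shader.patch.tlasPushOffset, tlasAddress};
}
```

Because each shader carries its own offset, patching a second shader with a different push-constant layout cannot change where the first shader's TLAS address is written.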
parent 2790bbd576
commit 1c310762a7
8 changed files with 248 additions and 75 deletions
@@ -39,7 +39,25 @@ export namespace Crafter {
     VkStridedDeviceAddressRegionKHR hitRegion;
     VkStridedDeviceAddressRegionKHR callableRegion;
 
-    void Init(VkCommandBuffer cmd, std::span<VkRayTracingShaderGroupCreateInfoKHR> raygenGroups, std::span<VkRayTracingShaderGroupCreateInfoKHR> missGroups, std::span<VkRayTracingShaderGroupCreateInfoKHR> hitGroups, ShaderBindingTableVulkan& shaderTable) {
+    // NVIDIA descriptor-heap AS-read workaround (issue #15 / #7): copied
+    // from the shader table at Init so RTPass can push the active TLAS
+    // device address into the patched shaders' push constant. Inert on
+    // every other driver.
+    bool workaroundNeedsTlas = false;
+    std::uint32_t workaroundTlasPushOffset = 0;
+
+    // maxRecursionDepth: the maximum ray-recursion depth the pipeline must
+    // support — i.e. the deepest chain of nested traceRayEXT calls. The
+    // raygen counts as depth 1, so a closest-hit shader that traces a shadow
+    // ray needs 2. Tracing beyond the value the pipeline was created with is
+    // undefined behaviour and faults the device, so a consumer with any
+    // recursion past the raygen must raise this. Defaults to 1 (raygen-only,
+    // matching the simple examples) and is clamped to the device's
+    // maxRayRecursionDepth.
+    void Init(VkCommandBuffer cmd, std::span<VkRayTracingShaderGroupCreateInfoKHR> raygenGroups, std::span<VkRayTracingShaderGroupCreateInfoKHR> missGroups, std::span<VkRayTracingShaderGroupCreateInfoKHR> hitGroups, ShaderBindingTableVulkan& shaderTable, std::uint32_t maxRecursionDepth = 1) {
+        workaroundNeedsTlas = shaderTable.workaroundNeedsTlas;
+        workaroundTlasPushOffset = shaderTable.workaroundTlasPushOffset;
+
         std::vector<VkRayTracingShaderGroupCreateInfoKHR> groups;
         groups.reserve(raygenGroups.size() + missGroups.size() + hitGroups.size());
 
@@ -60,7 +78,7 @@ export namespace Crafter {
             .pStages = shaderTable.shaderStages.data(),
             .groupCount = static_cast<std::uint32_t>(groups.size()),
             .pGroups = groups.data(),
-            .maxPipelineRayRecursionDepth = 1,
+            .maxPipelineRayRecursionDepth = std::min(maxRecursionDepth, Device::rayTracingProperties.maxRayRecursionDepth),
             .layout = VK_NULL_HANDLE
         };
 
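
Illustrative note on the clamp in the hunk above: the sketch below isolates the std::min step so its effect on a few requested depths is easy to see. ClampRecursionDepth is a hypothetical helper, not part of the commit, and the device limit of 31 in the checks is an assumed value chosen only for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring the std::min clamp applied to
// maxPipelineRayRecursionDepth: the pipeline never asks for more than the
// device reports in maxRayRecursionDepth.
constexpr std::uint32_t ClampRecursionDepth(std::uint32_t requested,
                                            std::uint32_t deviceMax) {
    return std::min(requested, deviceMax);
}

// Default of 1 (raygen-only tracing) passes through unchanged.
static_assert(ClampRecursionDepth(1, 31) == 1);
// A closest-hit shader that traces a shadow ray needs 2.
static_assert(ClampRecursionDepth(2, 31) == 2);
// A request above the (assumed) device limit is reduced to that limit.
static_assert(ClampRecursionDepth(64, 31) == 31);
```

Note that the clamp only keeps pipeline creation within the device limit. If a shader still recurses deeper than the clamped value, the undefined behaviour described in the code comment remains, so a caller may want to compare the requested and effective depths.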