fix(vulkan-rt): configurable recursion depth + per-shader TLAS push for compute (#21)

Fix two gaps in the Vulkan RT path that fault the device on the NVIDIA
proprietary driver as soon as the pipeline is non-trivial (the simple
VulkanTriangle never hit them):

1. maxPipelineRayRecursionDepth was hardcoded to 1, so any closest-hit
   shader that traces a secondary ray (a shadow ray, a very common
   pattern) recursed past the pipeline limit (UB → device fault).
   PipelineRTVulkan::Init now takes a maxRecursionDepth parameter
   (default 1, clamped to the device's maxRayRecursionDepth).
2. The NVIDIA descriptor-heap AS-read workaround rewrites every shader
   that reads an accelerationStructureEXT from the heap, compute
   shaders included, to read the TLAS device address from a push
   constant, but only RTPass pushed that address. A compute shader that
   ray-queries the TLAS (rayQueryEXT) therefore ran against an unwritten
   push slot → garbage AS handle → VK_ERROR_DEVICE_LOST.
   WorkaroundNvidiaAS::Patch now returns a per-shader PatchResult
   {patched, tlasPushOffset} instead of writing the clobber-prone global
   Device::workaroundTlasPushOffset (removed). VulkanShader stores the
   result; ShaderBindingTableVulkan/PipelineRTVulkan carry it for RTPass,
   and ComputeShader tracks its own offset and pushes the caller-supplied
   TLAS address in Dispatch (new defaulted tlasAddress parameter),
   mirroring RTPass::Record.
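
The clamp described in item 1 can be sketched as follows. This is a minimal illustration, not the actual PipelineRTVulkan::Init code: the helper name ClampRecursionDepth is hypothetical, and deviceLimit stands in for the maxRayRecursionDepth value the device reports.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of the commit): chooses the value that would
// go into VkRayTracingPipelineCreateInfoKHR::maxPipelineRayRecursionDepth.
// `requested` is what the caller asks for; `deviceLimit` is assumed to be
// VkPhysicalDeviceRayTracingPipelinePropertiesKHR::maxRayRecursionDepth.
constexpr std::uint32_t ClampRecursionDepth(std::uint32_t requested,
                                            std::uint32_t deviceLimit) {
    // Never ask the driver for more recursion than the device supports.
    return std::min(requested, deviceLimit);
}

// A closest-hit shader that traces a shadow ray needs depth 2:
// raygen -> closest hit is level 1, closest hit -> shadow ray is level 2.
static_assert(ClampRecursionDepth(2, 31) == 2, "request within the limit is kept");
static_assert(ClampRecursionDepth(64, 31) == 31, "request above the limit is clamped");
```

With the default of 1, a pipeline whose hit shaders never trace keeps its previous behavior.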

The PushConstantRewrite regression test now asserts the patched flag and
offset returned by Patch and adds two ray-querying compute-shader cases,
proving the rewrite is stage-agnostic and the per-shader offset is correct.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 2790bbd576
commit 1c310762a7
8 changed files with 248 additions and 75 deletions
@@ -42,14 +42,16 @@ export namespace Crafter {
     // block that VulkanShader synthesizes, so the rewritten raygen can
     // reach the acceleration structure by address instead of through
     // the faulting heap descriptor. Inert on every other driver.
-    if (Device::workaroundDescriptorHeapAS) {
+    if (Device::workaroundDescriptorHeapAS && pipeline->workaroundNeedsTlas) {
         VkDeviceAddress tlasAddr = RenderingElement3D::tlases[frameIdx].address;
         VkPushDataInfoEXT pushInfo {
             .sType = VK_STRUCTURE_TYPE_PUSH_DATA_INFO_EXT,
             // Where the rewritten raygen reads the TLAS address: 0 when
             // VulkanShader synthesized a fresh block, or the offset of
             // the member it appended to the shader's existing block.
-            .offset = Device::workaroundTlasPushOffset,
+            // Tracked per-pipeline (copied from the shader table) so a
+            // later-loaded shader can't clobber it.
+            .offset = pipeline->workaroundTlasPushOffset,
             .data = { .address = &tlasAddr, .size = sizeof(tlasAddr) },
         };
         Device::vkCmdPushDataEXT(cmd, &pushInfo);