fix(vulkan-rt): configurable recursion depth + per-shader TLAS push for compute (#21)

Two gaps in the Vulkan RT path that fault the device on the NVIDIA
proprietary driver with a non-trivial pipeline (simple VulkanTriangle
never hit them):

1. maxPipelineRayRecursionDepth was hardcoded to 1, so any closest-hit
   shader that traces a secondary ray (shadow ray — a very common
   pattern) recursed past the pipeline limit (UB → device fault).
   PipelineRTVulkan::Init now takes a maxRecursionDepth parameter
   (default 1, clamped to the device's maxRayRecursionDepth).

2. The NVIDIA descriptor-heap AS-read workaround rewrites every shader
   that reads an accelerationStructureEXT from the heap — including
   compute shaders — to read the TLAS device address from a push
   constant, but only RTPass pushed that address. A compute shader that
   ray-queries the TLAS (rayQueryEXT) therefore ran against an unwritten
   push slot → garbage AS handle → VK_ERROR_DEVICE_LOST.

   WorkaroundNvidiaAS::Patch now returns a per-shader PatchResult
   {patched, tlasPushOffset} instead of writing the clobber-prone global
   Device::workaroundTlasPushOffset (removed). VulkanShader stores it;
   ShaderBindingTableVulkan/PipelineRTVulkan carry it for RTPass, and
   ComputeShader tracks its own offset and pushes the caller-supplied
   TLAS address in Dispatch (new defaulted tlasAddress parameter),
   mirroring RTPass::Record.

The PushConstantRewrite regression test now asserts Patch's returned
patched/offset and adds two ray-querying compute-shader cases, proving
the rewrite is stage-agnostic and the per-shader offset is correct.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
catbot 2026-06-03 18:35:39 +00:00
commit 1c310762a7
8 changed files with 248 additions and 75 deletions

View file

@ -42,14 +42,16 @@ export namespace Crafter {
// block that VulkanShader synthesizes, so the rewritten raygen can
// reach the acceleration structure by address instead of through
// the faulting heap descriptor. Inert on every other driver.
if (Device::workaroundDescriptorHeapAS) {
if (Device::workaroundDescriptorHeapAS && pipeline->workaroundNeedsTlas) {
VkDeviceAddress tlasAddr = RenderingElement3D::tlases[frameIdx].address;
VkPushDataInfoEXT pushInfo {
.sType = VK_STRUCTURE_TYPE_PUSH_DATA_INFO_EXT,
// Where the rewritten raygen reads the TLAS address: 0 when
// VulkanShader synthesized a fresh block, or the offset of
// the member it appended to the shader's existing block.
.offset = Device::workaroundTlasPushOffset,
// Tracked per-pipeline (copied from the shader table) so a
// later-loaded shader can't clobber it.
.offset = pipeline->workaroundTlasPushOffset,
.data = { .address = &tlasAddr, .size = sizeof(tlasAddr) },
};
Device::vkCmdPushDataEXT(cmd, &pushInfo);