fix(vulkan-rt): configurable recursion depth + per-shader TLAS push for compute (#21) #22
No reviewers
Labels
No labels
claude:done
claude:in-progress
claude:ready
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
Catcrafts/Crafter.Graphics!22
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "claude/issue-21"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Fixes the two confirmed Vulkan RT gaps from #21 that fault the device on the NVIDIA proprietary driver once the pipeline gets non-trivial (the simple
VulkanTriangleexample never exercises either).1.
maxPipelineRayRecursionDepthis now configurablePipelineRTVulkan::Inittook a hardcoded.maxPipelineRayRecursionDepth = 1, so any closest-hit shader that traces a secondary ray (a shadow ray, etc.) recursed to depth 2, exceeded the pipeline limit (undefined behaviour), and faulted the device.Initnow takes a trailingstd::uint32_t maxRecursionDepth = 1parameter, clamped to the device'smaxRayRecursionDepth. Existing callers are unchanged (default 1).2. The descriptor-heap AS-read workaround now reaches compute dispatches
WorkaroundNvidiaAS::Patchrewrites every shader that reads anaccelerationStructureEXTfrom the descriptor heap — including compute shaders — to read the TLAS device address from a push constant. But onlyRTPass::Recordever pushed that address;ComputeShader::Dispatchdid not. A compute shader that ray-queries the TLAS (rayQueryEXT) therefore read an unwritten push slot → garbage AS handle →VK_ERROR_DEVICE_LOST.Patchnow returns a per-shaderPatchResult {patched, tlasPushOffset}instead of writing the globalDevice::workaroundTlasPushOffset(removed). The global was clobbered by whichever shader was patched last, so it could not serve several shaders with differing push layouts.VulkanShaderstorespatchedAS/tlasPushOffset;ShaderBindingTableVulkanandPipelineRTVulkancarry them soRTPassreads the offset off the pipeline.ComputeShadertracks its own offset and pushes the caller-supplied TLAS address inDispatch(new defaultedtlasAddressparameter — passRenderingElement3D::tlases[frameIdx].address), mirroringRTPass::Record.Tests
crafter-build testis green. ThePushConstantRewriteregression test now assertsPatch's returnedpatched/tlasPushOffset(previously read off the removed global) and adds two ray-querying compute cases (synthesize + merge), proving the rewrite is stage-agnostic and the per-shader offset is correct:Not addressed (the "possibly related" item in #21)
The single-bounce multi-instance all-miss / silent GPU hang is left untouched — it's marked "needs your eyes" with an offered repro and isn't reproducible here. Worth noting for the full 3DForts path: an RT pipeline where multiple stages (raygen and closesthit) read the AS with different push-constant layouts still resolves to a single pipeline-wide push offset (the last AS-reading stage's). That's a pre-existing limitation of the workaround, not a regression, but it'll matter once recursion is re-enabled. A minimal repro would help.
Resolves #21
Two gaps in the Vulkan RT path that fault the device on the NVIDIA proprietary driver with a non-trivial pipeline (simple VulkanTriangle never hit them): 1. maxPipelineRayRecursionDepth was hardcoded to 1, so any closest-hit shader that traces a secondary ray (shadow ray — a very common pattern) recursed past the pipeline limit (UB → device fault). PipelineRTVulkan::Init now takes a maxRecursionDepth parameter (default 1, clamped to the device's maxRayRecursionDepth). 2. The NVIDIA descriptor-heap AS-read workaround rewrites every shader that reads an accelerationStructureEXT from the heap — including compute shaders — to read the TLAS device address from a push constant, but only RTPass pushed that address. A compute shader that ray-queries the TLAS (rayQueryEXT) therefore ran against an unwritten push slot → garbage AS handle → VK_ERROR_DEVICE_LOST. WorkaroundNvidiaAS::Patch now returns a per-shader PatchResult {patched, tlasPushOffset} instead of writing the clobber-prone global Device::workaroundTlasPushOffset (removed). VulkanShader stores it; ShaderBindingTableVulkan/PipelineRTVulkan carry it for RTPass, and ComputeShader tracks its own offset and pushes the caller-supplied TLAS address in Dispatch (new defaulted tlasAddress parameter), mirroring RTPass::Record. The PushConstantRewrite regression test now asserts Patch's returned patched/offset and adds two ray-querying compute-shader cases, proving the rewrite is stage-agnostic and the per-shader offset is correct. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>