fix(vulkan-rt): merge TLAS push constant into existing block (#18)

The NVIDIA descriptor-heap AS-read workaround (#15) rewrote heap
acceleration-structure reads into a load of the TLAS device address from a
push-constant block. It always *synthesized a new* push-constant block, so
any ray-tracing shader that already declared one ended up with two. SPIR-V
forbids this ("at most one push constant block statically used per entry
point"), and vkCreateShaderModule's spirv-val check rejected the module with:

    Entry point id '4' uses more than one PushConstant interface.

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and,
when present, appends a single ulong member (the TLAS address) to that block
instead of adding a second one, reading the address through the shader's own
push-constant variable. The append offset is the end of the user's block,
computed from the members' explicit Offset/ArrayStride/MatrixStride
decorations (correct under both scalar and std140 layout) and rounded up
to 8. Shaders with no push constant of their own keep getting a freshly
synthesized single-member block at offset 0, exactly as before.

The resulting offset is published via Device::workaroundTlasPushOffset, and
RTPass feeds it to vkCmdPushDataEXT so the address lands where the rewritten
load reads it (0 for the synthesized case, preserving prior behaviour).

Verified on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle
ray-traces correctly and validation-clean both with and without a
user-declared raygen push constant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
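
The append-offset rule described above (end of the user's block, rounded up to 8) can be illustrated with a minimal sketch. This is not the code from `WorkaroundNvidiaAS::Patch`: `MemberLayout`, `AlignUp`, and `TlasAppendOffset` are hypothetical names, and each member's byte size is assumed to have already been derived from its type and its ArrayStride/MatrixStride decorations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical summary of one member of the user's push-constant block.
struct MemberLayout {
    std::uint32_t offset; // value of the member's explicit Offset decoration
    std::uint32_t size;   // byte size, assumed precomputed from type and strides
};

// Round value up to the next multiple of alignment (alignment must be non-zero).
constexpr std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Offset at which an 8-byte TLAS address could be appended: the furthest
// byte covered by any existing member, rounded up to 8. An empty member
// list yields 0, matching the synthesized single-member case.
std::uint32_t TlasAppendOffset(const std::vector<MemberLayout>& members) {
    std::uint32_t end = 0;
    for (const MemberLayout& m : members) {
        end = std::max(end, m.offset + m.size);
    }
    return AlignUp(end, 8);
}
```

For example, a block holding a 16-byte vector at offset 0 and a 4-byte scalar at offset 16 ends at byte 20, so the address would go at offset 24. Taking the maximum over all members rather than the last one avoids assuming that members are declared in offset order.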
2026-06-03 02:28:02 +00:00
commit 45ecc91424
parent f24107264d
4 changed files with 204 additions and 55 deletions
--- a/examples/VulkanTriangle/README.md
+++ b/examples/VulkanTriangle/README.md
@@ -44,12 +44,16 @@ bug (full investigation in #7, summarised below).
 proprietary driver only, `VulkanShader` rewrites the compiled SPIR-V at
 module-load time so that every `OpLoad` of an `accelerationStructureEXT`
 out of the heap becomes a load of the TLAS *device address* (from a
-synthesized push-constant block) followed by
+push-constant block) followed by
 `OpConvertUToAccelerationStructureKHR` — which reads no descriptor and so
 never touches the faulting path. `RTPass` feeds the active frame's TLAS
-address in as push data. `raygen.glsl` and the example code are unchanged;
-acceleration structures still bind into the heap normally. On every other
-driver the workaround is inert. It's gated on
+address in as push data. SPIR-V allows only one push-constant block per
+entry point, so when a shader already declares one the TLAS address is
+appended to *that* block (rather than adding a second, which would fail
+validation — issue #18); shaders without a push constant get a freshly
+synthesized single-member block. `raygen.glsl` and the example code are
+unchanged; acceleration structures still bind into the heap normally. On
+every other driver the workaround is inert. It's gated on
 `Device::workaroundDescriptorHeapAS` and confined to one fenced block in
 `interfaces/Crafter.Graphics-ShaderVulkan.cppm` so it can be deleted wholesale
 once a fixed NVIDIA driver ships.