fix(vulkan-rt): merge TLAS push constant into existing block (#18) #20

Merged
catbot merged 2 commits from claude/issue-18 into master 2026-06-03 04:29:01 +02:00

Problem

The NVIDIA descriptor-heap AS-read workaround (#15) rewrites in-shader heap acceleration-structure reads into a load of the TLAS device address from a push-constant block. It always synthesized a new push-constant block, so any ray-tracing shader that already declared one ended up with two. SPIR-V allows at most one push-constant block statically used per entry point, so vkCreateShaderModule's spirv-val check rejected the module:

Entry point id '4' uses more than one PushConstant interface.

Fix

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and, when present, appends a single ulong member (the TLAS address) to that block instead of adding a second one, so the rewritten load reads the address through the shader's own push-constant variable. The append offset is the end of the user's block, computed from the members' explicit Offset/ArrayStride/MatrixStride decorations (correct under both scalar and std140 layout) and rounded up to a multiple of 8.

Shaders with no push constant of their own keep getting a freshly synthesized single-member block at offset 0, exactly as before.
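The offset rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in WorkaroundNvidiaAS::Patch: the `MemberLayout` struct and the `TlasAppendOffset` name are hypothetical, and the sketch assumes an array's extent is `arrayLength * ArrayStride` and a matrix's size is `columns * MatrixStride`.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one push-constant block member as read
// from its SPIR-V decorations. Not the project's real data structure.
struct MemberLayout {
    std::uint32_t offset;       // explicit Offset decoration
    std::uint32_t elementSize;  // bytes of one element (matrix: columns * MatrixStride)
    std::uint32_t arrayLength;  // 0 when the member is not an array
    std::uint32_t arrayStride;  // explicit ArrayStride, unused when arrayLength == 0
};

// Rounds value up to the next multiple of alignment (alignment > 0).
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Returns the byte offset at which an 8-byte TLAS address could be appended:
// the furthest end of any existing member, rounded up to 8.
// An empty member list (no user block) yields 0, matching the synthesized case.
std::uint32_t TlasAppendOffset(const std::vector<MemberLayout>& members) {
    std::uint64_t end = 0;
    for (const MemberLayout& m : members) {
        const std::uint64_t extent =
            m.arrayLength == 0 ? m.elementSize
                               : std::uint64_t{m.arrayLength} * m.arrayStride;
        end = std::max(end, m.offset + extent);
    }
    return static_cast<std::uint32_t>(AlignUp(end, 8));
}
```

Under these assumptions, a block holding a mat4 at offset 0, a vec3 at 64 and a uint at 76 ends at byte 80, which is already a multiple of 8, and a lone uint ends at byte 4 and rounds up to 8.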

The offset is published via Device::workaroundTlasPushOffset, which RTPass feeds to vkCmdPushDataEXT so the address lands where the rewritten load reads it (0 for the synthesized case → prior behaviour unchanged).

Tests

New tests/PushConstantRewrite compiles representative raygen shaders with glslang, runs the real Patch over them, and asserts with spirv-val (the same invocation vkCreateShaderModule uses) that the result is valid with exactly one push-constant block. It covers the merge path (mat4/vec3/uint, lone uint, array layout), the synthesize path, and a no-op case (push constant but no AS read), and checks the published TLAS offset for each.

[no-push-constant] ok (push-constant vars: 1, tlas offset: 0)
[merge-mat4-vec3-uint] ok (push-constant vars: 1, tlas offset: 80)
[merge-uint] ok (push-constant vars: 1, tlas offset: 8)
[merge-array] ok (push-constant vars: 1, tlas offset: 40)
[push-constant-no-as] ok (push-constant vars: 1, tlas offset: 0)
all push-constant rewrite cases passed

Also verified end-to-end on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle ray-traces correctly and is validation-clean both with and without a user-declared raygen push constant.

Screenshots

merge-path render

Resolves #18

🤖 Generated with Claude Code

The NVIDIA descriptor-heap AS-read workaround (#15) rewrote heap
acceleration-structure reads into a load of the TLAS device address from
a push-constant block. It always *synthesized a new* push-constant block,
so any ray-tracing shader that already declared one ended up with two —
which SPIR-V forbids ("at most one push constant block statically used per
entry point"), and vkCreateShaderModule's spirv-val check rejected:

    Entry point id '4' uses more than one PushConstant interface.

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and,
when present, appends a single ulong member (the TLAS address) to that
block instead of adding a second one, reading the address through the
shader's own push-constant variable. The append offset is the end of the
user's block, computed from the members' explicit Offset/ArrayStride/
MatrixStride decorations (correct under both scalar and std140 layout) and
rounded up to 8. Shaders with no push constant of their own keep getting a
freshly synthesized single-member block at offset 0, exactly as before.

That offset is published via Device::workaroundTlasPushOffset and RTPass
feeds it to vkCmdPushDataEXT so the address lands where the rewritten load
reads it (0 for the synthesized case, preserving prior behaviour).

Verified on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle
ray-traces correctly and validation-clean both with and without a
user-declared raygen push constant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Adds tests/PushConstantRewrite, a host test that compiles representative
ray-generation shaders with glslang, runs the real WorkaroundNvidiaAS::Patch
over them, and asserts with spirv-val (the same invocation vkCreateShaderModule
uses) that the result is valid and contains exactly one push-constant block —
covering both the merge path (shaders that already declare a push constant,
including mat4/vec3/uint, a lone uint, and an array layout) and the synthesize
path, plus a no-op case (push constant but no AS read). It also checks the
published TLAS push offset for each layout.

The workaround namespace is exported so the test can drive Patch directly; both
go away with the rest of the workaround. project.cpp wires the test as an
executable that recompiles the module and requires glslang + spirv-val.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
catbot merged commit 2790bbd576 into master 2026-06-03 04:29:01 +02:00
catbot deleted branch claude/issue-18 2026-06-03 04:29:01 +02:00
Reference
Catcrafts/Crafter.Graphics!20