fix(vulkan-rt): merge TLAS push constant into existing block (#18) #20

Merged

catbot merged 2 commits from claude/issue-18 into master

2026-06-03 04:29:01 +02:00

catbot commented

2026-06-03 04:28:43 +02:00

Member

Problem

The NVIDIA descriptor-heap AS-read workaround (#15) rewrites in-shader heap acceleration-structure reads into a load of the TLAS device address from a push-constant block. It always synthesized a new push-constant block, so any ray-tracing shader that already declared one ended up with two. Vulkan's SPIR-V environment rules allow at most one push-constant block statically used per entry point, so vkCreateShaderModule's spirv-val check rejected the module:

Entry point id '4' uses more than one PushConstant interface.

Fix

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and, when present, appends a single ulong member (the TLAS address) to that block instead of adding a second one, so the rewritten load reads the address through the shader's own push-constant variable. The append offset is the end of the user's block, computed from the members' explicit Offset/ArrayStride/MatrixStride decorations (correct under both scalar and std140 layout) and rounded up to a multiple of 8 bytes.
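The offset rule above can be illustrated with a minimal sketch. This is not the code in WorkaroundNvidiaAS::Patch: the `Member` struct and the `AlignUp` and `TlasAppendOffset` names are hypothetical, and each member's byte size is assumed to be already known (a real pass would derive it from the member's type together with its ArrayStride/MatrixStride decorations).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one member of the user's push-constant block.
struct Member {
    std::uint32_t offset;  // value of the member's explicit Offset decoration
    std::uint32_t size;    // byte extent of the member (assumed precomputed)
};

// Rounds value up to the next multiple of alignment.
std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment) {
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

// End of the existing block (largest offset + size), rounded up to 8 so the
// appended 64-bit TLAS address is naturally aligned. An empty member list
// yields 0, matching the synthesized single-member case.
std::uint32_t TlasAppendOffset(const std::vector<Member>& members) {
    std::uint32_t end = 0;
    for (const Member& m : members) {
        end = std::max(end, m.offset + m.size);
    }
    return AlignUp(end, 8);
}
```

For members at (offset 0, size 64), (64, 12) and (76, 4), the function returns 80; that layout is only one arrangement consistent with the logged offset for the mat4/vec3/uint case and is assumed here rather than taken from the test shaders. A lone 4-byte member at offset 0 yields 8.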

Shaders with no push constant of their own keep getting a freshly synthesized single-member block at offset 0, exactly as before.

The offset is published via Device::workaroundTlasPushOffset, which RTPass feeds to vkCmdPushDataEXT so the address lands where the rewritten load reads it (0 in the synthesized case, so prior behaviour is unchanged).
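To show where the address has to land on the host side, here is a hedged sketch that packs the user's push-constant bytes and the TLAS address into one byte range. `PackPushData` is a hypothetical helper, not part of RTPass, and it deliberately leaves out the Vulkan call itself; RTPass may equally push only the 8 address bytes at the published offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns the user's push-constant bytes followed by the
// 64-bit TLAS device address placed at tlasOffset. Padding bytes are zeroed.
std::vector<unsigned char> PackPushData(const void* userData, std::size_t userSize,
                                        std::uint32_t tlasOffset,
                                        std::uint64_t tlasAddress) {
    assert(tlasOffset % 8 == 0);     // the appended member is 8-byte aligned
    assert(tlasOffset >= userSize);  // the address must not overlap user data

    std::vector<unsigned char> bytes(tlasOffset + sizeof(tlasAddress), 0);
    if (userSize != 0) {
        std::memcpy(bytes.data(), userData, userSize);
    }
    std::memcpy(bytes.data() + tlasOffset, &tlasAddress, sizeof(tlasAddress));
    return bytes;
}
```

With a 4-byte user block and an offset of 8, the result is 16 bytes long with the address in bytes 8 to 15. With no user data and an offset of 0, it is just the 8 address bytes, which corresponds to the synthesized case.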

Tests

New tests/PushConstantRewrite compiles representative raygen shaders with glslang, runs the real Patch over them, and asserts with spirv-val (the same invocation vkCreateShaderModule uses) that the result is valid and contains exactly one push-constant block. It covers the merge path (mat4/vec3/uint, lone uint, array layout), the synthesize path, and a no-op case (push constant but no AS read), and checks the published TLAS offset for each.

[no-push-constant] ok (push-constant vars: 1, tlas offset: 0)
[merge-mat4-vec3-uint] ok (push-constant vars: 1, tlas offset: 80)
[merge-uint] ok (push-constant vars: 1, tlas offset: 8)
[merge-array] ok (push-constant vars: 1, tlas offset: 40)
[push-constant-no-as] ok (push-constant vars: 1, tlas offset: 0)
all push-constant rewrite cases passed
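The "push-constant vars" figure in the log could be obtained in several ways; how the test actually computes it is not stated here. As one possibility, the sketch below walks a SPIR-V word stream and counts OpVariable instructions in the PushConstant storage class. The function name is hypothetical; the opcode (59), storage-class value (9) and magic number come from the SPIR-V specification. It counts declared variables only and does not check static use per entry point, which is spirv-val's job.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kSpirvMagic = 0x07230203u;
constexpr std::uint32_t kOpVariable = 59u;
constexpr std::uint32_t kStorageClassPushConstant = 9u;

// Returns the number of OpVariable instructions whose storage class is
// PushConstant, or -1 if the stream is not a well-formed SPIR-V word stream
// in host endianness.
int CountPushConstantVariables(const std::vector<std::uint32_t>& words) {
    if (words.size() < 5 || words[0] != kSpirvMagic) {
        return -1;  // missing 5-word header or wrong magic/endianness
    }
    int count = 0;
    std::size_t i = 5;  // first instruction follows the header
    while (i < words.size()) {
        const std::uint32_t wordCount = words[i] >> 16;
        const std::uint32_t opcode = words[i] & 0xFFFFu;
        if (wordCount == 0 || i + wordCount > words.size()) {
            return -1;  // truncated or corrupt instruction
        }
        // OpVariable operands: result type, result id, storage class, [initializer]
        if (opcode == kOpVariable && wordCount >= 4 &&
            words[i + 3] == kStorageClassPushConstant) {
            ++count;
        }
        i += wordCount;
    }
    assert(i == words.size());
    return count;
}
```

A patched module that passes the check described above would be expected to return 1 from this function.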

Also verified end-to-end on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle ray-traces correctly and is validation-clean both with and without a user-declared raygen push constant.

Screenshots

Resolves #18

🤖 Generated with Claude Code


result.png

45 KiB

catbot added 2 commits

2026-06-03 04:28:43 +02:00

fix(vulkan-rt): merge TLAS push constant into existing block (#18) 45ecc91424

The NVIDIA descriptor-heap AS-read workaround (#15) rewrote heap
acceleration-structure reads into a load of the TLAS device address from
a push-constant block. It always *synthesized a new* push-constant block,
so any ray-tracing shader that already declared one ended up with two —
which SPIR-V forbids ("at most one push constant block statically used per
entry point"), and vkCreateShaderModule's spirv-val check rejected:

    Entry point id '4' uses more than one PushConstant interface.

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and,
when present, appends a single ulong member (the TLAS address) to that
block instead of adding a second one, reading the address through the
shader's own push-constant variable. The append offset is the end of the
user's block, computed from the members' explicit Offset/ArrayStride/
MatrixStride decorations (correct under both scalar and std140 layout) and
rounded up to 8. Shaders with no push constant of their own keep getting a
freshly synthesized single-member block at offset 0, exactly as before.

That offset is published via Device::workaroundTlasPushOffset and RTPass
feeds it to vkCmdPushDataEXT so the address lands where the rewritten load
reads it (0 for the synthesized case, preserving prior behaviour).

Verified on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle
ray-traces correctly and is validation-clean both with and without a
user-declared raygen push constant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

test(vulkan-rt): spirv-val coverage for the push-constant rewrite (#18) 471f480c5d

Adds tests/PushConstantRewrite, a host test that compiles representative
ray-generation shaders with glslang, runs the real WorkaroundNvidiaAS::Patch
over them, and asserts with spirv-val (the same invocation vkCreateShaderModule
uses) that the result is valid and contains exactly one push-constant block —
covering both the merge path (shaders that already declare a push constant,
including mat4/vec3/uint, a lone uint, and an array layout) and the synthesize
path, plus a no-op case (push constant but no AS read). It also checks the
published TLAS push offset for each layout.

The workaround namespace is exported so the test can drive Patch directly; both
go away with the rest of the workaround. project.cpp wires the test as an
executable that recompiles the module and requires glslang + spirv-val.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

catbot merged commit 2790bbd576 into master

2026-06-03 04:29:01 +02:00

catbot deleted branch claude/issue-18

2026-06-03 04:29:01 +02:00

catbot referenced this pull request from a commit

2026-06-03 04:29:02 +02:00

Merge pull request 'fix(vulkan-rt): merge TLAS push constant into existing block (#18)' (#20) from claude/issue-18 into master