Commit graph

7 commits

Author SHA1 Message Date
catbot
45ecc91424 fix(vulkan-rt): merge TLAS push constant into existing block (#18)
The NVIDIA descriptor-heap AS-read workaround (#15) rewrote heap
acceleration-structure reads into a load of the TLAS device address from
a push-constant block. It always *synthesized a new* push-constant block,
so any ray-tracing shader that already declared one ended up with two —
which SPIR-V forbids ("at most one push constant block statically used per
entry point"), and vkCreateShaderModule's spirv-val check rejected:

    Entry point id '4' uses more than one PushConstant interface.

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and,
when present, appends a single ulong member (the TLAS address) to that
block instead of adding a second one, reading the address through the
shader's own push-constant variable. The append offset is the end of the
user's block, computed from the members' explicit Offset/ArrayStride/
MatrixStride decorations (correct under both scalar and std140 layout) and
rounded up to 8. Shaders with no push constant of their own keep getting a
freshly synthesized single-member block at offset 0, exactly as before.

That offset is published via Device::workaroundTlasPushOffset and RTPass
feeds it to vkCmdPushDataEXT so the address lands where the rewritten load
reads it (0 for the synthesized case, preserving prior behaviour).

Verified on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle
ray-traces correctly and validation-clean both with and without a
user-declared raygen push constant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 02:28:02 +00:00
catbot
950059c86e fix(vulkan-rt): work around NVIDIA descriptor-heap AS-read device-loss (#15)
Reading an acceleration structure through VK_EXT_descriptor_heap aborts
with VK_ERROR_DEVICE_LOST on NVIDIA 610.43.02 — a brand-new-extension
driver fault isolated in #7 (engine setup is correct and validation-clean;
images/buffers through the same heap work, and both traceRayEXT and inline
rayQuery fault identically on the AS read).

An acceleration structure can equally be reached by its device address via
OpConvertUToAccelerationStructureKHR, which reads no descriptor and so never
touches the faulting heap path. glslang has no GLSL spelling for that
conversion, so VulkanShader rewrites the compiled SPIR-V at module-load
time: every `OpLoad %accelStruct <heap-ptr>` becomes a load of the TLAS
device address from a synthesized push-constant block followed by the
convert. RTPass pushes the active frame's TLAS address into that push
constant. User GLSL and example code are unchanged; acceleration structures
still bind into the heap normally.

The workaround is gated on Device::workaroundDescriptorHeapAS (true only on
the NVIDIA proprietary driver) and confined to one fenced block in
Crafter.Graphics-ShaderVulkan.cppm plus the RTPass push and the shaderInt64
feature toggle — delete those once a fixed NVIDIA driver ships and the heap
AS read becomes the direct path again.

Verified: VulkanTriangle ray-traces correctly on native NVIDIA (RTX 4090),
validation-layer-clean, no device loss. The SPIR-V rewrite was independently
validated with spirv-val on both the VulkanTriangle and Sponza raygen
modules.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 01:59:54 +00:00
catbot
4e42d663a6 WebGPU RT: wavefront tracer core (GENERATE/PREP/TRACE/SHADE/RESOLVE)
Replace the megakernel @compute entry with five wavefront kernels sharing
one module, connected by GPU ray/hit/payload buffers and a GPU-driven
indirect bounce loop:

  GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE

- TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only).
- PREP publishes dispatchWorkgroupsIndirect args from the live ray count;
  the indirect-args buffer lives in its own bind group so it is never
  bound read-write in the same dispatch that consumes it as INDIRECT.
- New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate,
  plus an optional user Resolve stage (tonemap hook; identity by default).
- Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary
  between passes within one submit).
- Payload-typed wfPayload binding emitted in the codegen region after the
  user's struct Payload; payload travels with each ray (2*W*H slots).
- Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups
  PerDimension so the W*H-sized work buffers fit past the 128MB baseline.

VulkanTriangle ported to the new API and renders bit-identical to the
megakernel baseline at maxDepth=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:24:41 +00:00
b5d0f52da0 webgpu sponza 2026-05-19 00:27:09 +02:00
5553ded476 webgpu triangle 2026-05-18 18:43:30 +02:00
5352ef69a2 browser DOM support 2026-05-18 02:07:48 +02:00
216972e73a new UI system 2026-05-01 23:35:37 +02:00
Renamed from interfaces/Crafter.Graphics-MouseElement.cppm (Browse further)