Extend the cross-backend RT type surface to cover procedural geometry and
any-hit on the WebGPU path:
- RTShaderGroupType::ProceduralHitGroup + RTShaderGroup::intersectionShader
  (mirrors VK_RAY_TRACING_SHADER_GROUP_TYPE_PROCEDURAL_HIT_GROUP_KHR).
- WebGPURTStage::Intersection for AABB intersection shaders.
- Mesh::BuildProcedural(span<RTAabb>, opaque): the WebGPU analog of a
  VK_GEOMETRY_TYPE_AABBS_KHR geometry.
- wgpuRegisterMeshBLAS gains geomType / opaqueFlag / primCount.
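
A minimal sketch of how the new type surface might fit together, assuming
the Vulkan validity rules carry over unchanged. Only RTShaderGroupType,
ProceduralHitGroup, RTShaderGroup::intersectionShader and RTAabb come from
this change; kShaderUnused, the other field names, the RTAabb layout and the
helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel, playing the role of VK_SHADER_UNUSED_KHR.
constexpr uint32_t kShaderUnused = ~0u;

enum class RTShaderGroupType : uint32_t {
    General,
    TrianglesHitGroup,
    ProceduralHitGroup,  // added by this change
};

// Field names other than intersectionShader are assumptions.
struct RTShaderGroup {
    RTShaderGroupType type = RTShaderGroupType::General;
    uint32_t generalShader = kShaderUnused;
    uint32_t closestHitShader = kShaderUnused;
    uint32_t anyHitShader = kShaderUnused;
    uint32_t intersectionShader = kShaderUnused;
};

// Assumed layout: six floats in the order VkAabbPositionsKHR uses.
struct RTAabb {
    float min[3];
    float max[3];
};

// A box is usable only if min <= max on every axis.
inline bool IsWellFormed(const RTAabb& box) {
    for (int axis = 0; axis < 3; ++axis) {
        if (box.min[axis] > box.max[axis]) return false;
    }
    return true;
}

// Same constraints Vulkan places on VkRayTracingShaderGroupCreateInfoKHR:
// procedural hit groups require an intersection shader, triangle hit
// groups must not have one, and general groups carry only generalShader.
inline bool IsValid(const RTShaderGroup& g) {
    switch (g.type) {
    case RTShaderGroupType::General:
        return g.generalShader != kShaderUnused &&
               g.closestHitShader == kShaderUnused &&
               g.anyHitShader == kShaderUnused &&
               g.intersectionShader == kShaderUnused;
    case RTShaderGroupType::TrianglesHitGroup:
        return g.generalShader == kShaderUnused &&
               g.intersectionShader == kShaderUnused;
    case RTShaderGroupType::ProceduralHitGroup:
        return g.generalShader == kShaderUnused &&
               g.intersectionShader != kShaderUnused;
    }
    return false;
}

// Hypothetical convenience constructor for an AABB hit group.
inline RTShaderGroup MakeProceduralGroup(uint32_t intersection,
                                         uint32_t closestHit,
                                         uint32_t anyHit) {
    assert(intersection != kShaderUnused);
    RTShaderGroup g;
    g.type = RTShaderGroupType::ProceduralHitGroup;
    g.intersectionShader = intersection;
    g.closestHitShader = closestHit;
    g.anyHitShader = anyHit;
    return g;
}
```
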
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Replace the megakernel @compute entry with five wavefront kernels sharing
one module, connected by GPU ray/hit/payload buffers and a GPU-driven
indirect bounce loop:
    GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE
- TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only).
- PREP publishes dispatchWorkgroupsIndirect args from the live ray count;
  the indirect-args buffer lives in its own bind group so it is never
  bound read-write in the same dispatch that consumes it as INDIRECT.
- New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate,
  plus an optional user Resolve stage (tonemap hook; identity by default).
- Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary
  between passes within one submit).
- Payload-typed wfPayload binding emitted in the codegen region after the
  user's struct Payload; payload travels with each ray (2*W*H slots).
- Request maxBufferSize / maxStorageBufferBindingSize /
  maxComputeWorkgroupsPerDimension so the W*H-sized work buffers fit past
  the 128MB baseline.
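
The pass ordering above can be modelled on the CPU roughly as follows. This
is a sketch under stated assumptions, not the shipped host code: the
workgroup size of 64, the 256-byte ring stride (WebGPU's default
minUniformBufferOffsetAlignment), the Stage/Pass/IndirectArgs types and the
"half the rays survive each bounce" rule are all hypothetical. On the GPU
the live ray count is only known to PREP; the CPU never reads it back.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t kWorkgroupSize = 64;  // assumption
constexpr uint32_t kRingStride = 256;    // default minUniformBufferOffsetAlignment

enum class Stage { Generate, Prep, Trace, Shade, Resolve };

// Only curIsA and bounce are named in the commit; the layout is assumed.
struct WfParams { uint32_t curIsA; uint32_t bounce; };

// Matches the three u32 values dispatchWorkgroupsIndirect reads.
struct IndirectArgs { uint32_t x, y, z; };

struct Pass {
    Stage stage;
    WfParams params;
    uint32_t dynamicOffset;  // byte offset into the WfParams uniform ring
    IndirectArgs args;       // workgroup counts this pass would run with
};

// What PREP computes: one 1D workgroup per kWorkgroupSize live rays.
constexpr IndirectArgs PrepIndirectArgs(uint32_t liveRays) {
    return {(liveRays + kWorkgroupSize - 1) / kWorkgroupSize, 1, 1};
}

// Each pass in a submit gets its own aligned slot in the uniform ring.
constexpr uint32_t RingOffset(uint32_t passIndex) {
    return passIndex * kRingStride;
}

std::vector<Pass> RecordFrame(uint32_t width, uint32_t height,
                              uint32_t maxDepth) {
    std::vector<Pass> passes;
    uint32_t slot = 0;
    auto push = [&](Stage s, WfParams p, IndirectArgs a) {
        passes.push_back({s, p, RingOffset(slot++), a});
    };

    uint32_t liveRays = width * height;
    const IndirectArgs full = PrepIndirectArgs(liveRays);
    uint32_t curIsA = 1;

    push(Stage::Generate, {curIsA, 0}, full);
    for (uint32_t bounce = 0; bounce < maxDepth; ++bounce) {
        const WfParams p{curIsA, bounce};
        const IndirectArgs args = PrepIndirectArgs(liveRays);
        push(Stage::Prep, p, {1, 1, 1});  // writes the indirect args
        push(Stage::Trace, p, args);      // consumes them as INDIRECT
        push(Stage::Shade, p, args);
        liveRays /= 2;  // stand-in for the rays SHADE re-emits
        curIsA ^= 1;    // swap the ping-pong ray buffers
    }
    push(Stage::Resolve, {curIsA, maxDepth}, full);
    return passes;
}
```

With these assumed numbers a 3840x2160 frame needs 129600 workgroups in one
dimension, which is above WebGPU's default maxComputeWorkgroupsPerDimension
of 65535.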
VulkanTriangle is ported to the new API and renders bit-identical output to
the megakernel baseline at maxDepth=1.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>