WebGPU RT: wavefront tracer core (GENERATE/PREP/TRACE/SHADE/RESOLVE)

Replace the megakernel @compute entry with five wavefront kernels sharing
one module, connected by GPU ray/hit/payload buffers and a GPU-driven
indirect bounce loop:

  GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE

- TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only).
- PREP publishes dispatchWorkgroupsIndirect args from the live ray count;
  the indirect-args buffer lives in its own bind group so it is never
  bound read-write in the same dispatch that consumes it as INDIRECT.
- New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate,
  plus an optional user Resolve stage (tonemap hook; identity by default).
- Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary
  between passes within one submit).
- Payload-typed wfPayload binding emitted in the codegen region after the
  user's struct Payload; payload travels with each ray (2*W*H slots).
- Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups
  PerDimension so the W*H-sized work buffers fit past the 128MB baseline.

VulkanTriangle ported to the new API and renders bit-identical to the
megakernel baseline at maxDepth=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
catbot 2026-05-31 16:24:41 +00:00
commit 4e42d663a6
9 changed files with 755 additions and 101 deletions

View file

@ -1,11 +1,12 @@
// WebGPU port of raygen.glsl. Mirrors the pinhole camera setup the
// Payload type is declared in closesthit.wgsl (concatenated earlier).
// WebGPU wavefront raygen. Runs in GENERATE: compute the pinhole camera
// ray and emit it as the pixel's primary ray. Shading happens later in
// SHADE (closesthit/miss). The Payload type is declared in closesthit.wgsl.
fn raygen_main(gid: vec3<u32>) {
if (gid.x >= hdr.surfaceW || gid.y >= hdr.surfaceH) { return; }
if (gid.x >= wfParams.surfaceW || gid.y >= wfParams.surfaceH) { return; }
let pixel = vec2<f32>(f32(gid.x), f32(gid.y));
let resolution = vec2<f32>(f32(hdr.surfaceW), f32(hdr.surfaceH));
let resolution = vec2<f32>(f32(wfParams.surfaceW), f32(wfParams.surfaceH));
let uv = (pixel + vec2<f32>(0.5)) / resolution;
let ndc = uv * 2.0 - vec2<f32>(1.0);
@ -23,17 +24,11 @@ fn raygen_main(gid: vec3<u32>) {
var payload: Payload;
payload.color = vec3<f32>(0.0);
traceRay(
0u, // tlasIdx (unused)
0u, // ray flags
0xFFu, // cull mask
0u, 0u, 0u, // sbtRecordOffset, sbtRecordStride, missIndex
rtEmitPrimaryRay(
origin, 0.001,
direction, 10000.0,
&payload,
);
textureStore(outImage,
vec2<i32>(i32(gid.x), i32(gid.y)),
vec4<f32>(payload.color, 1.0));
0u, // ray flags
0xFFu, // cull mask
0u, 0u, // sbtRecordOffset, missIndex
payload);
}