Replace the megakernel @compute entry with five wavefront kernels sharing one module, connected by GPU ray/hit/payload buffers and a GPU-driven indirect bounce loop: GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE - TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only). - PREP publishes dispatchWorkgroupsIndirect args from the live ray count; the indirect-args buffer lives in its own bind group so it is never bound read-write in the same dispatch that consumes it as INDIRECT. - New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate, plus an optional user Resolve stage (tonemap hook; identity by default). - Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary between passes within one submit). - Payload-typed wfPayload binding emitted in the codegen region after the user's struct Payload; payload travels with each ray (2*W*H slots). - Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups PerDimension so the W*H-sized work buffers fit past the 128MB baseline. VulkanTriangle ported to the new API and renders bit-identical to the megakernel baseline at maxDepth=1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
15 lines
546 B
WebGPU Shading Language
15 lines
546 B
WebGPU Shading Language
// Payload declared here so the WGSL assembler sees it before the wfPayload
|
|
// binding, the SHADE dispatch, and the raygen source.
|
|
//
|
|
// Wavefront model: closesthit_main runs in SHADE and accumulates the
|
|
// pixel's color directly (rtAccumulate) instead of writing a payload that
|
|
// raygen reads back.
|
|
|
|
struct Payload {
|
|
color: vec3<f32>,
|
|
};
|
|
|
|
fn closesthit_main(ray: RayDesc, hit: HitInfo, payload: ptr<function, Payload>) {
|
|
let bary = vec3<f32>(1.0 - hit.attribs.x - hit.attribs.y, hit.attribs.x, hit.attribs.y);
|
|
rtAccumulate(bary);
|
|
}
|