Catcrafts/Crafter.Graphics

Fork 0

Commit graph

Author	SHA1	Message	Date
catbot	23780d83a8	fix(webgpu): request adapter's storage-buffer limit, not hardcoded 16 dom-webgpu.js capped maxStorageBuffersPerShaderStage at 16 even when the adapter reports far more (64 in our test env). The wavefront SHADE kernel already binds ~16 storage buffers before any user binding, so any RT pipeline declaring 2+ user storage buffers at @group(3) overflowed the limit and failed to build with "Too many bindings of type StorageBuffers". Request the adapter's reported maxStorageBuffersPerShaderStage / maxStorageBuffersInPipelineLayout instead of a fixed 16. `clamp` already mins against the adapter cap, so baseline-only devices still get a valid request, and the `\|\| 16` fallback + the `typeof cap === "number"` guard handle limit names a browser doesn't expose (Firefox returns null for maxStorageBuffersInPipelineLayout). Verified in-browser: a 17-storage-buffer compute pipeline fails with the exact reported error on a device clamped to 16, and builds cleanly on a device requesting the adapter's 64. RTStress renders correctly. Resolves #8 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 21:55:42 +00:00
catbot	358084185a	docs: wavefront RT in README + design-doc status; add RTStress to examples	2026-05-31 20:29:12 +00:00
catbot	4e42d663a6	WebGPU RT: wavefront tracer core (GENERATE/PREP/TRACE/SHADE/RESOLVE) Replace the megakernel @compute entry with five wavefront kernels sharing one module, connected by GPU ray/hit/payload buffers and a GPU-driven indirect bounce loop: GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE - TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only). - PREP publishes dispatchWorkgroupsIndirect args from the live ray count; the indirect-args buffer lives in its own bind group so it is never bound read-write in the same dispatch that consumes it as INDIRECT. - New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate, plus an optional user Resolve stage (tonemap hook; identity by default). - Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary between passes within one submit). - Payload-typed wfPayload binding emitted in the codegen region after the user's struct Payload; payload travels with each ray (2WH slots). - Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups PerDimension so the W*H-sized work buffers fit past the 128MB baseline. VulkanTriangle ported to the new API and renders bit-identical to the megakernel baseline at maxDepth=1. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 16:24:41 +00:00

Author

SHA1

Message

Date

catbot

23780d83a8

fix(webgpu): request adapter's storage-buffer limit, not hardcoded 16

dom-webgpu.js capped maxStorageBuffersPerShaderStage at 16 even when the
adapter reports far more (64 in our test env). The wavefront SHADE kernel
already binds ~16 storage buffers before any user binding, so any RT
pipeline declaring 2+ user storage buffers at @group(3) overflowed the
limit and failed to build with "Too many bindings of type StorageBuffers".

Request the adapter's reported maxStorageBuffersPerShaderStage /
maxStorageBuffersInPipelineLayout instead of a fixed 16. `clamp` already
mins against the adapter cap, so baseline-only devices still get a valid
request, and the `|| 16` fallback + the `typeof cap === "number"` guard
handle limit names a browser doesn't expose (Firefox returns null for
maxStorageBuffersInPipelineLayout).

Verified in-browser: a 17-storage-buffer compute pipeline fails with the
exact reported error on a device clamped to 16, and builds cleanly on a
device requesting the adapter's 64. RTStress renders correctly.

Resolves #8

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-05-31 21:55:42 +00:00

catbot

358084185a

docs: wavefront RT in README + design-doc status; add RTStress to examples

2026-05-31 20:29:12 +00:00

catbot

4e42d663a6

WebGPU RT: wavefront tracer core (GENERATE/PREP/TRACE/SHADE/RESOLVE)

Replace the megakernel @compute entry with five wavefront kernels sharing
one module, connected by GPU ray/hit/payload buffers and a GPU-driven
indirect bounce loop:

  GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE

- TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only).
- PREP publishes dispatchWorkgroupsIndirect args from the live ray count;
  the indirect-args buffer lives in its own bind group so it is never
  bound read-write in the same dispatch that consumes it as INDIRECT.
- New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate,
  plus an optional user Resolve stage (tonemap hook; identity by default).
- Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary
  between passes within one submit).
- Payload-typed wfPayload binding emitted in the codegen region after the
  user's struct Payload; payload travels with each ray (2*W*H slots).
- Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups
  PerDimension so the W*H-sized work buffers fit past the 128MB baseline.

VulkanTriangle ported to the new API and renders bit-identical to the
megakernel baseline at maxDepth=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-05-31 16:24:41 +00:00

3 commits