fix(webgpu): reshape wavefront TRACE/SHADE to 2-D to survive >4.19M rays #12

Merged
catbot merged 1 commit from claude/issue-11 into master 2026-06-01 13:10:05 +02:00

Summary

The WebGPU wavefront ray tracer renders a black screen (only non-RT passes survive) on large render surfaces, including 4K (3840×2160 ≈ 8.3M pixels), while 1080p (≈2.1M pixels) renders fine.

Root cause: the TRACE/SHADE stages were dispatched as a 1-D indirect dispatch of ceil(W·H/64) workgroups. That count exceeds maxComputeWorkgroupsPerDimension (65535 on Dawn/Firefox) once the surface passes 65535 × 64 = 4,194,240 rays (≈4.19M, roughly 2560×1640). Per the WebGPU spec, an indirect dispatch exceeding the per-dimension limit is silently skipped with no validation error, so the world is never traced.
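
The threshold arithmetic can be checked with a minimal sketch. It assumes one ray per pixel and a workgroup size of 64, as the ceil(W·H/64) figure implies; the function names are illustrative and not taken from the codebase.

```javascript
// Illustrative only: these names are hypothetical, not from dom-webgpu.js.
const WORKGROUP_SIZE = 64;  // implied by ceil(W*H/64) in the description
const MAX_PER_DIM = 65535;  // maxComputeWorkgroupsPerDimension quoted above

// Workgroups a 1-D dispatch needs for a W x H surface (one ray per pixel).
function workgroups1D(width, height) {
  return Math.ceil((width * height) / WORKGROUP_SIZE);
}

// True when the 1-D dispatch exceeds the per-dimension limit and is skipped.
function overflows1D(width, height) {
  return workgroups1D(width, height) > MAX_PER_DIM;
}

// Largest ray count that still fits in a single dimension.
console.log(MAX_PER_DIM * WORKGROUP_SIZE);                       // 4194240
console.log(workgroups1D(1920, 1080), overflows1D(1920, 1080));  // 32400 false
console.log(workgroups1D(2560, 1640), overflows1D(2560, 1640));  // 65600 true
console.log(workgroups1D(3840, 2160), overflows1D(3840, 2160));  // 129600 true
```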

Found while triaging 3DForts issue #35. 3DForts consumes this wavefront path and has no control over the dispatch shape, so the fix belongs here.

Fix

  • _wfPrep() (additional/dom-webgpu.js) now spreads the workgroups across a 2-D grid: gx = min(wg, 65535), gy = ceil(wg/65535). At or below 65535 workgroups gy = 1, so 1080p is byte-identical.
  • The wfTrace/wfShade entry points (implementations/Crafter.Graphics-PipelineRTWebGPU.cpp) rebuild the linear ray index from (global_invocation_id, num_workgroups): i = gid.y · nwg.x · 64 + gid.x. This is a contiguous bijection onto [0, gx·gy·64); the existing i >= _wfCurCount() guard absorbs the grid overshoot.
  • GENERATE/RESOLVE already use a 2-D tile dispatch — unchanged.
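
The grid split and index reconstruction above can be sketched in plain JavaScript. This is a host-side model, not the shipped `_wfPrep()` or shader code: the function names are hypothetical, and it assumes a workgroup size of (64, 1, 1) so that global_invocation_id.y equals the workgroup row.

```javascript
// Illustrative model; names are hypothetical, not the code in _wfPrep().
const WG_SIZE = 64;
const MAX_DIM = 65535;

// Split a linear workgroup count into the 2-D grid described above.
function splitGrid(wg) {
  return { gx: Math.min(wg, MAX_DIM), gy: Math.ceil(wg / MAX_DIM) };
}

// Linear ray index as rebuilt in the entry points:
// gidX = global_invocation_id.x, gidY = global_invocation_id.y,
// nwgX = num_workgroups.x.
function rayIndex(gidX, gidY, nwgX) {
  return gidY * nwgX * WG_SIZE + gidX;
}

// Brute-force check on a small grid that every index in
// [0, nwgX * nwgY * 64) is produced exactly once.
function coversContiguously(nwgX, nwgY) {
  const total = nwgX * nwgY * WG_SIZE;
  const seen = new Uint8Array(total);
  for (let y = 0; y < nwgY; y++) {
    for (let x = 0; x < nwgX * WG_SIZE; x++) {
      const i = rayIndex(x, y, nwgX);
      if (i >= total || seen[i]) return false;  // out of range or duplicate
      seen[i] = 1;
    }
  }
  return true;  // total iterations, no duplicates, none out of range
}

console.log(splitGrid(32400));          // { gx: 32400, gy: 1 }
console.log(splitGrid(129600));         // { gx: 65535, gy: 2 }
console.log(coversContiguously(3, 2));  // true
```

Invocations whose index lands at or beyond the live ray count are the ones the `i >= _wfCurCount()` guard rejects.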

Verification

No automated test suite exists (crafter-build test → "No tests matched"), so verified by exercising the real renderer. Ran examples/RTStress in Firefox/WebGPU and forced the surface to 3449×1739 = 5,997,811 rays → 93,716 workgroups (well over the 65535 cap, the regime that black-screened on master). The full cube grid renders correctly with no validation errors and TRACE time scales with the larger ray count.
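
The numbers for this test surface can be reproduced with a short calculation. It is a sketch using the formulas from the Fix section; `gridFor` is a hypothetical helper, not a function in the repository.

```javascript
// Hypothetical helper: applies the formulas from the Fix section to a surface.
function gridFor(width, height) {
  const rays = width * height;
  const wg = Math.ceil(rays / 64);    // linear workgroup count
  const gx = Math.min(wg, 65535);     // x clamped to the per-dimension limit
  const gy = Math.ceil(wg / 65535);   // rows needed to cover the rest
  const launched = gx * gy * 64;      // invocations actually dispatched
  return { rays, wg, gx, gy, launched, idle: launched - rays };
}

console.log(gridFor(3449, 1739));
// { rays: 5997811, wg: 93716, gx: 65535, gy: 2, launched: 8388480, idle: 2390669 }
```

For this surface the grid is 65535 × 2, and the 2,390,669 invocations past the last ray are the overshoot that exits at the bounds guard.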

Screenshots

RTStress at a 5.99M-ray (93,716-workgroup) surface: the full grid renders where master is black (https://forgejo.catcrafts.net/attachments/d9b81b62-b80a-4aff-a19e-f5f70d04a2a8)

Resolves #11

🤖 Generated with Claude Code

A 1-D indirect dispatch of ceil(W*H/64) workgroups for the wavefront
TRACE/SHADE stages overflows maxComputeWorkgroupsPerDimension (65535 on
Dawn/Firefox) once the surface exceeds ~4.19M rays (~2560x1640). Per the
WebGPU spec such a dispatch is silently dropped — no validation error —
so at 4K the world is never traced and the accumulator stays black while
non-RT passes survive.

_wfPrep now spreads the workgroups across a 2-D grid (x clamped to 65535,
y = ceil(wg/65535)), and the wfTrace/wfShade entry points rebuild the
linear ray index from (global_invocation_id, num_workgroups). The existing
`i >= _wfCurCount()` guard absorbs the grid overshoot. GENERATE/RESOLVE
already use a 2-D tile dispatch and are unchanged.

Verified in Firefox/WebGPU with RTStress at a 3449x1739 surface (5.99M
rays, 93716 workgroups — well over the 65535 cap): renders the full cube
grid where master shows a black screen.

Resolves #11

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
catbot merged commit d7b9a41b4f into master 2026-06-01 13:10:05 +02:00
catbot deleted branch claude/issue-11 2026-06-01 13:10:06 +02:00
Reference
Catcrafts/Crafter.Graphics!12