WebGPU wavefront: 1-D indirect TRACE/SHADE dispatch overflows maxComputeWorkgroupsPerDimension at ~4K → black screen #11
Catcrafts/Crafter.Graphics#11
Summary
The WebGPU wavefront ray tracer renders a black screen (only non-RT passes such as a UI overlay survive) once the render surface reaches roughly 4K (≈ 8.3 M pixels). 1080p renders correctly.
Root cause: the `TRACE`/`SHADE` stages are dispatched as a 1‑D indirect dispatch of `ceil(W·H/64)` workgroups, which overflows `maxComputeWorkgroupsPerDimension` (65535 on Dawn/Firefox). Per the WebGPU spec, an indirect dispatch whose workgroup count exceeds the per‑dimension limit is silently skipped, with no validation error, so the world is never traced and the accumulator stays black.
Found while triaging 3DForts (`Catcrafts/3DForts`) issue #35 "Browser rendering goes black at high resolutions". 3DForts consumes Crafter.Graphics' wavefront WebGPU path and has no control over the dispatch shape, so the fix belongs here.
Version: `master@afb9e320e128abc2ce1e10a9a46db34a3a493f1f`.
Reproduction
width·height ≈ 3840×2160).
`ceil(3840·2160 / 64) = 129600` workgroups; `ceil(1920·1080 / 64) = 32400`. The threshold is `65535·64 = 4 194 240` rays ≈ a 2560×1640 surface.
Empirical confirmation
Standalone WebGPU probe on the same Firefox + GPU, dispatching a trivial `@workgroup_size(64)` atomic-increment kernel via `dispatchWorkgroupsIndirect` with indirect arguments `[x, 1, 1]` for x = 129600, 65535, and 32400.
`adapter.limits.maxComputeWorkgroupsPerDimension === 65535`. Note that the device cap is 65535: it cannot be raised by requesting more in `requiredLimits`, so simply bumping the requested limit does not help; the dispatch must be reshaped.
Root cause (file references)
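The overflow can be checked with a few lines of arithmetic. The sketch below recomputes the 1‑D workgroup counts quoted in this issue; `workgroups1D` and `fitsIn1D` are hypothetical helper names used only for illustration (they are not functions from Crafter.Graphics), and 65535 is the limit reported for this adapter.

```javascript
// Assumed constants, taken from the figures in this issue.
const WORKGROUP_SIZE = 64;                   // @workgroup_size(64)
const MAX_WORKGROUPS_PER_DIMENSION = 65535;  // adapter limit reported above

// Number of workgroups a 1-D dispatch needs for one ray per pixel.
function workgroups1D(width, height) {
  return Math.ceil((width * height) / WORKGROUP_SIZE);
}

// True when the 1-D dispatch stays within the per-dimension limit.
function fitsIn1D(width, height) {
  return workgroups1D(width, height) <= MAX_WORKGROUPS_PER_DIMENSION;
}

console.log(workgroups1D(3840, 2160), fitsIn1D(3840, 2160)); // 129600 false
console.log(workgroups1D(1920, 1080), fitsIn1D(1920, 1080)); // 32400 true
console.log(MAX_WORKGROUPS_PER_DIMENSION * WORKGROUP_SIZE);  // 4194240
```

Any surface with more than 4 194 240 pixels therefore falls into the silently skipped case when dispatched in one dimension.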
`additional/dom-webgpu.js`: `_wfPrep()` publishes a 1‑D indirect dispatch.
`implementations/Crafter.Graphics-PipelineRTWebGPU.cpp`: the `TRACE`/`SHADE` entry points read the ray index straight from `gid.x`, so they assume a 1‑D grid.
The `clamp("maxComputeWorkgroupsPerDimension", 65535)` in `dom-webgpu.js` (its comment says it requests the adapter max "for headroom", but the device cap is 65535) cannot save this; 2‑D reshaping is required.
`GENERATE`/`RESOLVE` are unaffected: they use a 2‑D `(tileX, tileY)` screen-tile dispatch.
Proposed fix
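One possible shape for the reshaping in this section, written as a CPU-side sketch rather than the actual patch. `splitDispatch` and `linearIndex` are hypothetical names, and the particular way of choosing `gx` and `gy` is an assumption: any split with both dimensions at or below 65535 and `gx·gy` covering the required workgroup count would serve. `linearIndex` models in JavaScript the index reconstruction the shader entry points would perform.

```javascript
const WORKGROUP_SIZE = 64;   // @workgroup_size(64) x-extent
const MAX_PER_DIM = 65535;   // maxComputeWorkgroupsPerDimension on this adapter

// Split a ray count into (gx, gy, 1) so neither dimension exceeds the limit.
// Assumes the total workgroup count is at most MAX_PER_DIM * MAX_PER_DIM.
function splitDispatch(rayCount) {
  const total = Math.ceil(rayCount / WORKGROUP_SIZE);
  if (total === 0) return [0, 1, 1];          // nothing to trace
  const gy = Math.ceil(total / MAX_PER_DIM);  // rows needed
  const gx = Math.ceil(total / gy);           // workgroups per row
  return [gx, gy, 1];
}

// Linear ray index from a 2-D invocation id:
// i = gid.y * (nwg.x * 64) + gid.x
function linearIndex(gidX, gidY, nwgX) {
  return gidY * (nwgX * WORKGROUP_SIZE) + gidX;
}

console.log(splitDispatch(3840 * 2160)); // [ 64800, 2, 1 ]
console.log(splitDispatch(1920 * 1080)); // [ 32400, 1, 1 ]
```

The grid may cover slightly more invocations than there are rays, so a bounds check on the reconstructed index is still needed in the kernels.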
Spread the `TRACE`/`SHADE` workgroups across a 2‑D grid in `_wfPrep`, and reconstruct the linear ray index from `(global_invocation_id, num_workgroups)` in the entry points. This touches `_wfPrep()` in `dom-webgpu.js` and the entry points in `Crafter.Graphics-PipelineRTWebGPU.cpp` (the `64u` in the index computation is the `@workgroup_size(64)` x-extent).
For a `(gx, gy)` grid, `gid.x ∈ [0, gx·64)` and `gid.y ∈ [0, gy)`, so `i = gid.y·(nwg.x·64) + gid.x` is a contiguous bijection onto `[0, gx·gy·64)`; the existing `if (i >= _wfCurCount()) { return; }` guard in `_wfTrace`/`_wfShade` absorbs the overshoot. No other stage needs changes.
Secondary, related scaling limit (not the cause of the 4K black screen, but worth flagging)
The per-pixel work buffers in `ensureWavefrontBuffers` scale with `W·H`. At exactly 4K (dpr 1) the largest, `wf.payload = 2·cap·64`, is ≈ 1012 MiB, just under the 1 GiB (`1<<30`) requested for `maxBufferSize`/`maxStorageBufferBindingSize`, so it still allocates and the failure is purely the workgroup-dimension drop. But at 4K with `devicePixelRatio > 1`, or at 5K+, `payload`/`hits` cross 1 GiB and creation will fail with a real validation error. Consider raising those requested limits toward the adapter max and/or capping render scale.
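The buffer arithmetic can be sketched the same way. This assumes `cap` equals the surface's pixel count and that `devicePixelRatio` scales both dimensions before the count is taken; both are assumptions about `ensureWavefrontBuffers`, and `payloadBytes` is a hypothetical helper, not code from the repository.

```javascript
const REQUESTED_MAX_BUFFER_SIZE = 1 << 30; // 1 GiB, the limit quoted above

// Size in bytes of wf.payload = 2 * cap * 64, with cap assumed to be
// the pixel count after devicePixelRatio scaling (an assumption).
function payloadBytes(width, height, dpr = 1) {
  const cap = Math.round(width * dpr) * Math.round(height * dpr);
  return 2 * cap * 64;
}

const MiB = 2 ** 20;
console.log(payloadBytes(3840, 2160) / MiB);                          // 1012.5
console.log(payloadBytes(3840, 2160) <= REQUESTED_MAX_BUFFER_SIZE);   // true
console.log(payloadBytes(5120, 2880) <= REQUESTED_MAX_BUFFER_SIZE);   // false
console.log(payloadBytes(3840, 2160, 1.25) <= REQUESTED_MAX_BUFFER_SIZE); // false
```

Under these assumptions 4K at dpr 1 leaves only about 11.5 MiB of headroom below the requested 1 GiB.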