WebGPU rayQuery TLAS traversal hardcodes leaf-start (16383) — picks always miss below 8193 instances #25

Closed
opened 2026-06-04 15:12:58 +02:00 by catbot · 0 comments
Member

WebGPU rayQuery TLAS traversal uses a hardcoded leaf-start, so picks miss for any realistic instance count

additional/dom-webgpu.js provides a software rayQuery* shim for compute shaders (rayQueryFlag pipelines). Its TLAS traversal _rqTraverseTlas detects BVH leaves with a compile-time constant:

const TLAS_BVH_N_PADDED: u32 = 16384u;
const TLAS_BVH_LEAVES_START: u32 = TLAS_BVH_N_PADDED - 1u;   // 16383
...
fn _rqTraverseTlas(rq) {
    ...
    if (nodeIdx >= TLAS_BVH_LEAVES_START) {            // <-- always 16383
        let leafIdx = nodeIdx - TLAS_BVH_LEAVES_START;
        let i = tlasEntryOrder[leafIdx];
        ...

But the megakernel traversal _rtwTraverseTlas correctly uses the per-frame dynamic value:

let leavesStart = wfParams.tlasNPadded - 1u;            // nextPow2(instanceCount) - 1
if (nodeIdx >= leavesStart) { ... }

tlasNPadded = wfNextPow2(instanceCount) (see the TLAS build, ~line 3220). For any scene with fewer than 8193 instances, tlasNPadded is at most 8192. With tlasNPadded leaves starting at index tlasNPadded - 1, the last node index is 2*tlasNPadded - 2 <= 16382, so in the rayQuery shim no node index ever reaches 16383. Every node is treated as internal, and the descent walks 2*nodeIdx+1 into zeroed/out-of-tree AABBs, which fail the slab test. Result: _rqTraverseTlas reports a permanent miss. The rayQuery path can only ever hit when instanceCount happens to land in (8192, 16384], where tlasNPadded is exactly 16384.
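
The index arithmetic above can be checked with a small CPU-side model. This is only a sketch: nextPow2 is assumed to behave like wfNextPow2 (smallest power of two >= n), the tree is assumed to be an implicit complete binary tree with 2*nPadded - 1 nodes, and treeLayout is a hypothetical helper that does not exist in dom-webgpu.js.

```javascript
// Smallest power of two >= n (assumed behaviour of wfNextPow2).
function nextPow2(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// Mirrors TLAS_BVH_LEAVES_START in the rayQuery shim.
const HARDCODED_LEAVES_START = 16384 - 1;

// Hypothetical helper: index layout of an implicit complete binary tree
// with nPadded leaves. Internal nodes occupy [0, nPadded - 2],
// leaves occupy [nPadded - 1, 2 * nPadded - 2].
function treeLayout(instanceCount) {
  const nPadded = nextPow2(instanceCount);
  const leavesStart = nPadded - 1;
  const lastNode = 2 * nPadded - 2;
  return {
    nPadded,
    leavesStart,
    lastNode,
    // The hardcoded test is only correct when it equals the real leaf start.
    hardcodedMatches: leavesStart === HARDCODED_LEAVES_START,
    // If even the last node is below the constant, no node is ever seen as a leaf.
    leafEverDetected: lastNode >= HARDCODED_LEAVES_START,
  };
}

console.log(treeLayout(2000).lastNode);          // 4094
console.log(treeLayout(2000).leafEverDetected);  // false
console.log(treeLayout(8192).leafEverDetected);  // false (last node is 16382)
console.log(treeLayout(8193).hardcodedMatches);  // true
```

Under the same assumptions, the model also suggests that counts above 16384 would not match the constant either, since the real leaf start would then be 32767.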

This breaks every rayQuery=true compute shader on the WebGPU backend (builder picking, splash queries, …). The hardware-RT path on Vulkan is unaffected because it uses the real driver ray query.

Repro

3DForts builder picking: enter a match (~2000 RT instances), hover any node/brace — the GPU picker's PickerResult.hit stays 0 for the whole screen, while the identical request returns hits on Vulkan.

Suggested fix

Have the rayQuery shim derive leavesStart from the active TLAS the same way the megakernel does: pass tlasNPadded (or the BVH leaf count) to the shader through the rayQuery group(1) bindings or a small uniform, instead of using the hardcoded TLAS_BVH_N_PADDED - 1u. The leaf count is already known at TLAS-build time (the length of currentEntryOrder, or wfNextPow2(instanceCount)).
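
A minimal sketch of what the host side of such a fix might look like, together with a CPU model of the leaf test using the dynamic value. The names packRqTlasParams and classifyNode, the four-u32 layout, and the right-child index 2*nodeIdx+2 are all assumptions for illustration; none of them come from dom-webgpu.js.

```javascript
// Hypothetical helper: packs per-TLAS parameters for the rayQuery shim.
// Layout (assumed): [tlasNPadded, instanceCount, pad, pad], padded to
// 16 bytes as a conservative choice for a uniform-backed struct.
function packRqTlasParams(instanceCount) {
  let nPadded = 1;
  while (nPadded < instanceCount) nPadded *= 2; // assumed to mirror wfNextPow2
  return new Uint32Array([nPadded, instanceCount, 0, 0]);
}

// CPU model of the leaf test the shader would run with the dynamic value,
// following the megakernel's `wfParams.tlasNPadded - 1u`.
function classifyNode(nodeIdx, params) {
  const leavesStart = params[0] - 1; // tlasNPadded - 1
  if (nodeIdx >= leavesStart) {
    // leafIdx would index tlasEntryOrder in the shader.
    return { leaf: true, leafIdx: nodeIdx - leavesStart };
  }
  // Implicit heap layout; the right child at 2*nodeIdx+2 is an assumption.
  return { leaf: false, left: 2 * nodeIdx + 1, right: 2 * nodeIdx + 2 };
}

const params = packRqTlasParams(2000);
console.log(params[0]);                        // 2048
console.log(classifyNode(2047, params).leaf);  // true
console.log(classifyNode(0, params).leaf);     // false
```

In a real build the packed array would presumably be uploaded once per TLAS rebuild, for example with device.queue.writeBuffer(paramsBuffer, 0, params), where paramsBuffer is a hypothetical uniform buffer bound alongside the other rayQuery resources.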

Workaround in the meantime

Downstream (3DForts #92) we fell back to a CPU ray test for builder picking on the WebGPU target so selection/build works in the browser; the GPU rayQuery picker can be re-enabled once this is fixed.
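
For readers unfamiliar with the approach, a CPU ray test of this kind can be as simple as a slab test against each instance's bounding box. The sketch below is illustrative only: how 3DForts #92 actually implements its fallback is not shown here, and rayAabb, pickNearest, and the { min, max } box shape are hypothetical.

```javascript
// Ray vs axis-aligned box using the slab method.
// origin, dir, boxMin, boxMax are [x, y, z] arrays.
// Returns the entry distance along the ray, or null on a miss.
function rayAabb(origin, dir, boxMin, boxMax) {
  let tNear = 0;        // only intersections in front of the origin count
  let tFar = Infinity;
  for (let axis = 0; axis < 3; axis++) {
    if (dir[axis] === 0) {
      // Ray is parallel to this slab: miss unless the origin lies inside it.
      if (origin[axis] < boxMin[axis] || origin[axis] > boxMax[axis]) return null;
      continue;
    }
    const inv = 1 / dir[axis];
    let t0 = (boxMin[axis] - origin[axis]) * inv;
    let t1 = (boxMax[axis] - origin[axis]) * inv;
    if (t0 > t1) [t0, t1] = [t1, t0];
    tNear = Math.max(tNear, t0);
    tFar = Math.min(tFar, t1);
    if (tNear > tFar) return null; // slab intervals do not overlap
  }
  return tNear;
}

// Linear scan over instance bounds; returns the index of the nearest hit or -1.
function pickNearest(origin, dir, boxes) {
  let best = -1;
  let bestT = Infinity;
  boxes.forEach((box, i) => {
    const t = rayAabb(origin, dir, box.min, box.max);
    if (t !== null && t < bestT) {
      bestT = t;
      best = i;
    }
  });
  return best;
}

const boxes = [
  { min: [5, -1, -1], max: [6, 1, 1] },
  { min: [2, -1, -1], max: [3, 1, 1] },
  { min: [2, 5, 5], max: [3, 6, 6] },
];
console.log(pickNearest([0, 0, 0], [1, 0, 0], boxes)); // 1
```

A linear scan is O(instanceCount) per pick, which is usually acceptable for a single cursor ray per frame but is not a substitute for the GPU traversal.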

Reference
Catcrafts/Crafter.Graphics#25