Adds an 8^3 = 512-instance TLAS pick test that shoots one analytically
determined ray through a rayQuery=true PlainComputeShader and checks the
read-back committed hit (customIndex 484, t 40.75). With 512 instances the
test sits in the fewer-than-8193-instance regime that the hardcoded leaf
start (16384 - 1) used to miss, so the example fails fast if the shim
regresses. Verified in Firefox/WebGPU: "[RayQueryPick] PASS".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The software rayQuery shim's _rqTraverseTlas detected BVH leaves with a
compile-time constant TLAS_BVH_LEAVES_START = 16384 - 1, while the actual
TLAS sweep tree is built at depth log2(next_pow2(instanceCount)). For any
scene with fewer than 8193 instances the padded leaf count is far below
16384, so no node index ever reached 16383: every node looked internal,
the descent walked into zeroed out-of-tree AABBs, and the pick reported a
permanent miss. This broke every rayQuery=true compute shader (builder
picking, splash queries) on the WebGPU backend.

Pass the per-build padded leaf count to the shim the same way the
megakernel _rtwTraverseTlas reads wfParams.tlasNPadded: a small uniform
(RqTlasMeta.nPadded) at @group(1) @binding(10), written on every
wgpuBuildTLAS call from wfNextPow2(instanceCount) and bound by both
rayQuery dispatch paths. _rqTraverseTlas now computes
leavesStart = nPadded - 1 at run time instead of using the constant.
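
For reference, the index arithmetic behind the bug and the fix can be
sketched as below. This is an illustration, not the shim code: it assumes
an implicit heap layout (root at index 0, leaves in the last nPadded
slots), and nextPow2, leavesStart, maxNodeIndex and oldLeafTestCanFire are
hypothetical names standing in for wfNextPow2 and the WGSL logic.

```typescript
// Sketch only: assumes a complete binary tree stored as an implicit heap,
// root at index 0, with the nPadded leaves occupying the last nPadded slots.

// Hypothetical stand-in for wfNextPow2: smallest power of two >= n (n >= 1).
function nextPow2(n: number): number {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// The old compile-time constant quoted in the commit message.
const OLD_LEAVES_START = 16384 - 1;

// Per-build leaf start, as _rqTraverseTlas now derives it from nPadded.
function leavesStart(instanceCount: number): number {
  return nextPow2(instanceCount) - 1;
}

// Largest valid node index: a tree with nPadded leaves has 2*nPadded - 1 nodes.
function maxNodeIndex(instanceCount: number): number {
  return 2 * nextPow2(instanceCount) - 2;
}

// True if any in-tree node index could satisfy the old `index >= 16383` test.
function oldLeafTestCanFire(instanceCount: number): boolean {
  return maxNodeIndex(instanceCount) >= OLD_LEAVES_START;
}
```

Under these assumptions, 512 instances give leavesStart = 511 and a
largest node index of 1022, so the old constant can never match; the old
test only starts matching at 8193 instances, where nPadded becomes 16384.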

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>