Commit graph

79 commits

Author SHA1 Message Date
catbot
b645746c8c test(webgpu-rt): RayQueryPick example exercising the rayQuery TLAS shim (#25)
Adds an 8^3 = 512-instance TLAS pick test that shoots one analytically
determined ray through a rayQuery=true PlainComputeShader and checks the
read-back committed hit (customIndex 484, t 40.75). 512 instances sit in
the < 8193 regime that the hardcoded 16384-leaf start used to miss, so the
example fails fast if the shim regresses. Verified in Firefox/WebGPU:
"[RayQueryPick] PASS".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 13:33:04 +00:00
catbot
d08c7cea11 docs(vulkan-rt): document dynamic descriptor_heap-index hit-shader fault (#23)
Indexing a `layout(descriptor_heap)` array with a runtime (non-constant)
index inside a ray-tracing hit shader device-losts on NVIDIA 610.43.02,
for both SSBO and sampled-image descriptors. A constant/spec-constant
index is fine, and the same dynamic pattern works in fragment shaders, so
it's an RT-stage-specific driver fault — the same family as #7/#15
(descriptor-heap AS reads) and #21/#22 (RT recursion + compute TLAS push).

Unlike the AS-read fault, this cannot be worked around transparently: a
sampled image has no device-address escape hatch the way an acceleration
structure does (OpConvertUToAccelerationStructureKHR), and a buffer-only
buffer_reference rewrite would need a whole address-table architecture
while still leaving the texture half broken. So the resolution is the
documented-limitation path (the precedent set by #7).

Records the fault and its isolation in README's Native RT status and in
the Sponza example README (the textured-closest-hit example, which already
reads its albedo through a spec-constant slot for exactly this reason).
Documents the recommended consumer pattern: bind one resource and index
*within* it dynamically (single geometry SSBO / buffer_reference at a
spec-constant slot; one texture2DArray indexed by layer) rather than
selecting a descriptor dynamically — what the WebGPU path already does.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 20:05:12 +00:00
catbot
45ecc91424 fix(vulkan-rt): merge TLAS push constant into existing block (#18)
The NVIDIA descriptor-heap AS-read workaround (#15) rewrote heap
acceleration-structure reads into a load of the TLAS device address from
a push-constant block. It always *synthesized a new* push-constant block,
so any ray-tracing shader that already declared one ended up with two —
which SPIR-V forbids ("at most one push constant block statically used per
entry point"), and vkCreateShaderModule's spirv-val check rejected:

    Entry point id '4' uses more than one PushConstant interface.

WorkaroundNvidiaAS::Patch now detects an existing PushConstant variable and,
when present, appends a single ulong member (the TLAS address) to that
block instead of adding a second one, reading the address through the
shader's own push-constant variable. The append offset is the end of the
user's block, computed from the members' explicit Offset/ArrayStride/
MatrixStride decorations (correct under both scalar and std140 layout) and
rounded up to 8. Shaders with no push constant of their own keep getting a
freshly synthesized single-member block at offset 0, exactly as before.

That offset is published via Device::workaroundTlasPushOffset and RTPass
feeds it to vkCmdPushDataEXT so the address lands where the rewritten load
reads it (0 for the synthesized case, preserving prior behaviour).

Verified on the affected driver (NVIDIA 610.43.02, RTX 4090): VulkanTriangle
ray-traces correctly and validation-clean both with and without a
user-declared raygen push constant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 02:28:02 +00:00
catbot
950059c86e fix(vulkan-rt): work around NVIDIA descriptor-heap AS-read device-loss (#15)
Reading an acceleration structure through VK_EXT_descriptor_heap aborts
with VK_ERROR_DEVICE_LOST on NVIDIA 610.43.02 — a brand-new-extension
driver fault isolated in #7 (engine setup is correct and validation-clean;
images/buffers through the same heap work, and both traceRayEXT and inline
rayQuery fault identically on the AS read).

An acceleration structure can equally be reached by its device address via
OpConvertUToAccelerationStructureKHR, which reads no descriptor and so never
touches the faulting heap path. glslang has no GLSL spelling for that
conversion, so VulkanShader rewrites the compiled SPIR-V at module-load
time: every `OpLoad %accelStruct <heap-ptr>` becomes a load of the TLAS
device address from a synthesized push-constant block followed by the
convert. RTPass pushes the active frame's TLAS address into that push
constant. User GLSL and example code are unchanged; acceleration structures
still bind into the heap normally.

The workaround is gated on Device::workaroundDescriptorHeapAS (true only on
the NVIDIA proprietary driver) and confined to one fenced block in
Crafter.Graphics-ShaderVulkan.cppm plus the RTPass push and the shaderInt64
feature toggle — delete those once a fixed NVIDIA driver ships and the heap
AS read becomes the direct path again.

Verified: VulkanTriangle ray-traces correctly on native NVIDIA (RTX 4090),
validation-layer-clean, no device loss. The SPIR-V rewrite was independently
validated with spirv-val on both the VulkanTriangle and Sponza raygen
modules.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 01:59:54 +00:00
catbot
5dd1086f08 docs(webgpu-rt): add RTVolume example (procedural spheres + any-hit cut-out)
A 3x3x3 grid of AABB-geometry spheres rendered through an analytic
ray-sphere intersection shader, with an any-hit spherical-checkerboard
cut-out so the background shows through. Exercises both features end to
end on the WebGPU wavefront tracer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 22:09:30 +00:00
catbot
464cb66063 docs(vulkan-rt): record native descriptor-heap AS read as a driver fault
Investigated the VK_ERROR_DEVICE_LOST on the native VulkanTriangle (#7).
Verified the engine side is correct and validation-clean: the BLAS/TLAS
build finishes before render (FinishInit waits), the built instance is
well-formed (identity transform, mask=0xFF, correct BLAS ref), and
vkWriteResourceDescriptorsEXT stores the TLAS device address at the
expected heap offset (confirmed by dumping the heap bytes). Khronos
validation 1.4.350 reports zero errors.

The fault is isolated to reading the acceleration structure through
VK_EXT_descriptor_heap:
- images/buffers via the same heap render fine (trace disabled -> the
  raygen imageStore path renders a full gradient);
- both traceRayEXT and inline rayQueryEXT (no SBT) fault identically on
  the AS read;
- reproduces with the AS descriptor at heap byte 0 / shader index 0 (no
  offset/stride ambiguity) and regardless of pAddressRange size.

NVIDIA 610.43.02 is the only descriptor_heap implementation available
(llvmpipe lacks the extension), so there is no second implementation to
cross-check. Conclusion: driver-side fault in NVIDIA's brand-new
VK_EXT_descriptor_heap acceleration-structure path; should be reported to
NVIDIA. The traceRayEXT call is left active so the example stays a
faithful reproducer. Documented in both READMEs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 22:21:57 +00:00
catbot
376e66aeed WebGPU RT: port Sponza to wavefront (shadow ray in SHADE)
Restructure Sponza for the wavefront model: raygen emits the primary ray;
closesthit (in SHADE) gathers albedo/normal, accumulates ambient, and
emits a shadow ray carrying the pending direct term; miss adds the sky
(primary) or the direct term (shadow miss). resolve.wgsl applies the same
Reinhard+gamma the megakernel raygen did inline. User bindings moved to
group 3 (groups 0..2 reserved). RTPass maxDepth=2.

Renders the atrium correctly through the wavefront pipeline (textures,
two-sided shading, sun+ambient, shadows, tonemap).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 20:16:04 +00:00
catbot
f4d6493d91 wip: uncommitted changes from claude run on issue #3 2026-05-31 16:28:38 +00:00
catbot
4e42d663a6 WebGPU RT: wavefront tracer core (GENERATE/PREP/TRACE/SHADE/RESOLVE)
Replace the megakernel @compute entry with five wavefront kernels sharing
one module, connected by GPU ray/hit/payload buffers and a GPU-driven
indirect bounce loop:

  GENERATE -> (PREP -> TRACE -> SHADE) x maxDepth -> RESOLVE

- TRACE contains zero user code (pure _rtwTraverseTlas/Blas, opaque-only).
- PREP publishes dispatchWorkgroupsIndirect args from the live ray count;
  the indirect-args buffer lives in its own bind group so it is never
  bound read-write in the same dispatch that consumes it as INDIRECT.
- New emit/accumulate API: rtEmitPrimaryRay / rtEmitRay / rtAccumulate,
  plus an optional user Resolve stage (tonemap hook; identity by default).
- Per-pass WfParams via a dynamic-offset uniform ring (curIsA/bounce vary
  between passes within one submit).
- Payload-typed wfPayload binding emitted in the codegen region after the
  user's struct Payload; payload travels with each ray (2*W*H slots).
- Request maxBufferSize / maxStorageBufferBindingSize / maxComputeWorkgroups
  PerDimension so the W*H-sized work buffers fit past the 128MB baseline.

VulkanTriangle ported to the new API and renders bit-identical to the
megakernel baseline at maxDepth=1.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 16:24:41 +00:00
b5d0f52da0 webgpu sponza 2026-05-19 00:27:09 +02:00
5553ded476 webgpu triangle 2026-05-18 18:43:30 +02:00
64116cd980 custom shader webgpu 2026-05-18 05:39:17 +02:00
dedf6b0467 webgpu support 2026-05-18 04:58:52 +02:00
5352ef69a2 browser DOM support 2026-05-18 02:07:48 +02:00
3859c43ce3 compression example 2026-05-12 00:27:55 +02:00
ac2eb7fb0a new input system 2026-05-12 00:24:48 +02:00
c054f1e0b3 update 2026-05-03 02:45:38 +02:00
1f5697326c UI rewrite 3rd attempt 2026-05-02 21:08:20 +02:00
c9fd1b1585 animated example 2026-05-02 00:03:24 +02:00
216972e73a new UI system 2026-05-01 23:35:37 +02:00
d840a81448 bugfixes 2026-04-30 23:15:43 +02:00
bc669b5e05 crafter-build V2 2026-04-30 01:30:08 +02:00
8a2fd33efc crafter-build V2 2026-04-30 01:29:17 +02:00
ea18f32300 vulkan2d fixes 2026-04-11 23:18:41 +02:00
1c1a142f52 rendertargetvulkan 2026-04-11 18:48:00 +02:00
3f4ad87746 writing ui descriptors 2026-04-10 22:25:55 +02:00
5427867fff descriptor heap static offset method 2026-04-10 20:30:58 +02:00
177f873639 vulkan UI 2026-04-09 00:15:09 +02:00
f8e142fb06 descriptor heap rewrite 2026-04-05 22:53:59 +02:00
d661c88ee2 cleaned up renderer 2026-03-09 20:10:19 +01:00
72bf6c83e1 windows build fix 2026-03-02 23:53:13 +01:00
b5e69c90fa Win32 vulkan window 2026-02-24 02:32:37 +01:00
415e523a45 runtime pipeline 2026-02-22 00:46:38 +01:00
d1b8e45fb6 new descriptor layout 2 2026-02-03 21:03:11 +01:00
74832c6824 templated pipeline 2026-01-31 21:08:42 +01:00
db1616ff0e vulkan animation 2026-01-30 00:09:37 +01:00
1d0b7a615b descriptor fixed 2026-01-29 23:43:54 +01:00
5e3a7738ed fixed descriptors 2026-01-29 23:31:56 +01:00
83bb8ebd61 instance transform 2026-01-29 20:35:55 +01:00
642cef78b1 buffered TLAS 2026-01-29 19:46:53 +01:00
550590d6ee rgba8 2026-01-29 19:26:02 +01:00
96b5d1a299 vulkan triangle 2026-01-29 19:18:47 +01:00
8b2fd773b1 working RT 2026-01-29 02:05:18 +01:00
e4e7c66808 RT descriptors 2026-01-29 01:31:17 +01:00
7b24f52764 RT pipeline 2026-01-29 00:45:02 +01:00
962cc4b8cb descriptor 2026-01-28 23:45:33 +01:00
9f62233a07 shader 2026-01-28 23:37:12 +01:00
e08d5f104a TLAS 2026-01-28 19:16:28 +01:00
2e11ac6484 improved vulkanbuffer 2026-01-28 18:51:11 +01:00
b49abf3d1a BLAS 2026-01-28 01:06:48 +01:00