docs(vulkan-rt): native descriptor-heap AS read is an NVIDIA driver fault (#7) #10

Merged
catbot merged 1 commit from claude/issue-7 into master 2026-06-01 00:22:52 +02:00

Summary

Resolves #7 — investigation of VK_ERROR_DEVICE_LOST on the native ray-traced triangle.

Conclusion: this is a driver-side fault in NVIDIA's brand-new VK_EXT_descriptor_heap acceleration-structure path, not an engine bug. (The issue noted a verified driver error is an acceptable answer.) No engine fix is possible — you cannot obtain an accelerationStructureEXT in a shader except from a descriptor / the heap. This PR records the finding in the docs; the example is left as a faithful reproducer.

How it was verified

  • Engine setup is correct & validation-clean. Instrumented dumps show the BLAS/TLAS build finishes before render (FinishInit does vkQueueWaitIdle), the built TLAS instance is well-formed (identity transform, mask=0xFF, correct BLAS device address), and vkWriteResourceDescriptorsEXT stores the TLAS device address at the expected heap byte offset (confirmed by dumping the raw heap bytes). Khronos validation 1.4.350 reports zero errors for the whole frame, including the SBT regions.
  • The same descriptor heap works for images/buffers. With traceRayEXT removed, the raygen imageStore path renders a full gradient (see screenshot) — heap binding, image descriptor and present are all sound.
  • Isolated to the AS-via-heap read. Both the RT pipeline (traceRayEXT) and inline ray query (rayQueryEXT, which uses no SBT) fault identically the moment they read the acceleration structure from the heap.
  • Not an offset/stride bug. Reproduces with the AS descriptor written at heap byte 0 and read at shader index 0 (zero offset/stride ambiguity), and is unaffected by the pAddressRange size.
  • VK_EXT_device_fault reports an invalid GPU read (address ~0xffff…) and instruction-pointer faults inside the RT shader.
  • The NVIDIA 610.43.02 driver is the only implementation available here that advertises VK_EXT_descriptor_heap (llvmpipe lacks it), so there is no second conformant implementation to cross-check against.
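
The two data-level checks from the first bullet (a well-formed TLAS instance, and the TLAS device address present in the dumped heap bytes) can be sketched roughly as follows. This is an illustration only, not the engine's actual instrumentation: `InstanceMirror`, `instanceLooksWellFormed` and `heapHoldsAddress` are hypothetical names, the struct merely mirrors the 64-byte `VkAccelerationStructureInstanceKHR` layout so the sketch compiles without the Vulkan headers, and it assumes the descriptor stores the address as a plain 64-bit value in host byte order at the given offset, which is what the dump showed here but is not guaranteed in general.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical mirror of the VkAccelerationStructureInstanceKHR layout,
// declared locally so the sketch does not depend on the Vulkan headers.
struct InstanceMirror {
    float         transform[3][4];                 // row-major 3x4 matrix
    std::uint32_t instanceCustomIndex : 24;
    std::uint32_t mask : 8;                        // visibility mask
    std::uint32_t sbtRecordOffset : 24;
    std::uint32_t flags : 8;
    std::uint64_t accelerationStructureReference;  // BLAS device address
};
static_assert(sizeof(InstanceMirror) == 64, "instance must be 64 bytes");

// True if the instance has an identity transform, mask 0xFF and the
// expected BLAS device address (the three properties the dumps checked).
bool instanceLooksWellFormed(const InstanceMirror& inst,
                             std::uint64_t expectedBlasAddress) {
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 4; ++c)
            if (inst.transform[r][c] != (r == c ? 1.0f : 0.0f))
                return false;
    return inst.mask == 0xFF &&
           inst.accelerationStructureReference == expectedBlasAddress;
}

// True if the 8 bytes at byteOffset in a CPU-side copy of the heap equal
// the expected address. ASSUMPTION: the address is stored as a raw 64-bit
// value in host byte order; the real descriptor encoding is driver-defined.
bool heapHoldsAddress(const std::vector<std::uint8_t>& heapDump,
                      std::size_t byteOffset,
                      std::uint64_t expectedAddress) {
    if (byteOffset > heapDump.size() ||
        heapDump.size() - byteOffset < sizeof(expectedAddress))
        return false;  // read would run past the end of the dump
    std::uint64_t stored = 0;
    std::memcpy(&stored, heapDump.data() + byteOffset, sizeof(stored));
    return stored == expectedAddress;
}
```

In a reproducer, `heapDump` would be a copy of the mapped descriptor-heap memory taken after `vkWriteResourceDescriptorsEXT`, and `byteOffset` would be the offset the AS descriptor was written at (0 in the zero-offset variant described above).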

Next step: report to NVIDIA. WebGPU RT is unaffected.

Changes

  • examples/VulkanTriangle/README.md — replace the stale "trace is commented out" note with the full native-status investigation.
  • README.md — add a native-RT status callout.
  • examples/VulkanTriangle/main.cpp — short comment by the RT pass pointing at the writeup.

Screenshots

Control render with the AS read disabled (proves the descriptor-heap image + present path is sound; only the AS read faults):

[Screenshot "control": https://forgejo.catcrafts.net/attachments/d821ba63-da94-479a-987f-11232f953d36]

Investigated the VK_ERROR_DEVICE_LOST on the native VulkanTriangle (#7).
Verified the engine side is correct and validation-clean: the BLAS/TLAS
build finishes before render (FinishInit waits), the built instance is
well-formed (identity transform, mask=0xFF, correct BLAS ref), and
vkWriteResourceDescriptorsEXT stores the TLAS device address at the
expected heap offset (confirmed by dumping the heap bytes). Khronos
validation 1.4.350 reports zero errors.

The fault is isolated to reading the acceleration structure through
VK_EXT_descriptor_heap:
- images/buffers via the same heap render fine (trace disabled -> the
  raygen imageStore path renders a full gradient);
- both traceRayEXT and inline rayQueryEXT (no SBT) fault identically on
  the AS read;
- reproduces with the AS descriptor at heap byte 0 / shader index 0 (no
  offset/stride ambiguity) and regardless of pAddressRange size.

NVIDIA 610.43.02 is the only descriptor_heap implementation available
(llvmpipe lacks the extension), so there is no second implementation to
cross-check. Conclusion: driver-side fault in NVIDIA's brand-new
VK_EXT_descriptor_heap acceleration-structure path; should be reported to
NVIDIA. The traceRayEXT call is left active so the example stays a
faithful reproducer. Documented in both READMEs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
catbot merged commit afb9e320e1 into master 2026-06-01 00:22:52 +02:00
Reference
Catcrafts/Crafter.Graphics!10