# Sponza example

Loads the Sponza atrium as a `.cmesh` plus one albedo `.ctex` and renders
it via ray tracing on both Vulkan (native) and WebGPU (wasm). Both builds
share the same `main.cpp`; `#ifdef CRAFTER_GRAPHICS_WINDOW_DOM` selects the
backend.

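
As a minimal sketch of that compile-time switch: `BackendName` below is a hypothetical helper, not part of Crafter.Graphics, and only the `CRAFTER_GRAPHICS_WINDOW_DOM` macro comes from this example. It assumes the macro is defined for the wasm build and left undefined for the native one.

```cpp
#include <string_view>

// Hypothetical helper (not a Crafter.Graphics API): reports which backend
// this translation unit was compiled for.
constexpr std::string_view BackendName() {
#ifdef CRAFTER_GRAPHICS_WINDOW_DOM
    return "WebGPU";  // wasm build: the DOM window macro is defined
#else
    return "Vulkan";  // native build: the macro is absent
#endif
}
```

Because the branch is resolved by the preprocessor, each build only ever compiles the code path for its own backend.
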
## What this example proves

- `.cmesh` and `.ctex` decompression round-trip on both backends
  (GPU via `VK_EXT_memory_decompression` on Vulkan, CPU via
  `Compression::DecompressCPU` on WebGPU).
- A single texture binding flowing from `Image2D<RGBA8>` through the
  RT pipeline's closest-hit on both backends. The closest-hit samples
  at the barycentric attribs as UVs: proof-of-binding, not visually
  accurate. Per-vertex UV interpolation is the next step.

The closest-hit reads its texture through a **spec constant**
(`albedo[albedoSlot]`), not a runtime index. That is deliberate; see
the native RT limitation section below.

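
The per-vertex UV interpolation named as the next step is not implemented in this example. The sketch below shows only the arithmetic, on the CPU, under two assumptions: the three UVs of the hit triangle are available, and the hit attributes follow the Vulkan ray-tracing convention of carrying the weights of vertices 1 and 2. `Vec2` and `InterpolateUV` are hypothetical names.

```cpp
#include <array>

struct Vec2 { float x, y; };

// Interpolates per-vertex UVs at a hit point. `bary` holds the weights of
// vertices 1 and 2; the weight of vertex 0 is whatever remains to reach 1.
constexpr Vec2 InterpolateUV(const std::array<Vec2, 3>& uv, Vec2 bary) {
    const float w0 = 1.0f - bary.x - bary.y;
    return { w0 * uv[0].x + bary.x * uv[1].x + bary.y * uv[2].x,
             w0 * uv[0].y + bary.x * uv[1].y + bary.y * uv[2].y };
}
```

Sampling at the raw barycentrics, as the closest-hit does today, amounts to treating every triangle as if its UVs were (0,0), (1,0) and (0,1).
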
## Native RT limitation: dynamic `descriptor_heap` indexing in hit shaders

On NVIDIA driver `610.43.02` (Vulkan 1.4), indexing a
`layout(descriptor_heap)` array with a **runtime (non-constant)** index
inside a ray-tracing **hit** shader aborts the device with
`VK_ERROR_DEVICE_LOST` (an instruction-pointer / `READ_INVALID`
device fault) when validation is off. GPU-Assisted Validation masks it:
the scene runs fine under GPU-AV, which is why a validated run doesn't
catch it. It is a **driver-side fault**, the same family as the
descriptor-heap AS-read fault (#7 / #15) and the RT recursion / compute
TLAS-push issues (#21 / #22), but here for plain **SSBO and
sampled-image** descriptors read with a non-constant heap index
(issue #23).

### What was isolated (NVIDIA RTX 4090, driver `610.43.02`)

Driving a native bindless RT scene headlessly and bisecting the
closest-hit gave the following results:

- A closest-hit that reads only `lightHeap[lightSlot]` where `lightSlot`
  is a **spec constant** survives indefinitely. ✅ (This example's
  `albedo[albedoSlot]` is exactly this case.)
- Reading `indexHeap[assetIndexStart + gl_InstanceCustomIndexEXT]` /
  `vertexHeap[...]`, a heap index offset by a **runtime** value,
  loses the device on the first geometry hit. ❌
- Reading a **texture** dynamically,
  `textureHeap[assetColorStart + gl_InstanceCustomIndexEXT]`, also
  loses the device. ❌ So it is SSBO *and* sampled-image descriptors.
- `nonuniformEXT()` on the dynamic index does **not** help.
- The identical dynamic-heap-index pattern works fine in **fragment**
  shaders (the UI renderer indexes `uiTextures[]` / `ui*Heap[]` by
  per-item runtime slots), so this is **RT-stage-specific**, not a
  general `descriptor_heap` problem.
- Reading a spec-constant-indexed SSBO in **raygen** works; only the
  *dynamic* index in the hit stage faults.

### Why there is no transparent engine workaround

The AS-read fault (#15) is worked around transparently because an
acceleration structure can be reached two ways: through a descriptor, or
through its **device address** via
`OpConvertUToAccelerationStructureKHR` (which reads no descriptor). There
is exactly one TLAS, so the engine rewrites the heap AS read into an
address load and feeds the address in as push data.

Neither half of that applies here:

- **Sampled images have no device-address path.** A texture *must* be
  reached through a descriptor; there is no `OpConvertUToImage`. A
  dynamic heap texture index cannot be rewritten into anything that
  avoids dynamic descriptor selection.
- **There are many buffers, dynamically selected.** SSBOs *can* be
  reached by address (`buffer_reference` / `OpConvertUToPtr`), but a
  per-mesh array selected by `gl_InstanceCustomIndexEXT` would need the
  engine to maintain and bind an address-table buffer and a SPIR-V
  rewrite far larger than the single-TLAS AS case, and it would still
  leave the texture half broken.

So the engine cannot paper over this the way it does the AS read. The
fix is on the **consumer** side: avoid dynamically selecting a
*descriptor* in a hit shader.

### Recommended pattern

The fault is dynamic selection of a **descriptor**. Indexing *within* a
single bound resource (an element offset into one SSBO, a layer into
one array texture) is ordinary memory / layer addressing and is
**unaffected**. So bind one resource and index inside it, rather than
indexing the heap:

- **Geometry**: pack all meshes' vertices/indices into a single SSBO
  bound at a **spec-constant** slot and index it by a runtime element
  offset, or reach each mesh's buffer via `buffer_reference` (a device
  address loaded from one bound table). Either way the *descriptor* is
  constant; only the offset/address is dynamic.

```glsl
// ❌ faults in a hit shader on NVIDIA: dynamic descriptor selection
layout(descriptor_heap) buffer Verts { Vertex v[]; } vertexHeap[];
Vertex vtx = vertexHeap[assetVertexStart + gl_InstanceCustomIndexEXT].v[i];

// ✅ one descriptor (spec-constant slot), dynamic element offset
layout(constant_id = 0) const uint16_t vertexSlot = 0us;
layout(descriptor_heap) buffer Verts { Vertex v[]; } vertexHeap[];
uint base = assetVertexStart[gl_InstanceCustomIndexEXT]; // from a bound SSBO
Vertex vtx = vertexHeap[vertexSlot].v[base + i];
```

- **Materials / textures**: put them in one `texture2DArray` (or a small
  number of arrays bucketed by format/size) bound at a spec-constant
  slot and index by **layer**:

```glsl
// ✅ one array texture (spec-constant slot), dynamic layer index
layout(constant_id = 1) const uint16_t materialArraySlot = 0us;
layout(descriptor_heap) uniform sampler2DArray materials[];
uint layer = materialLayer[gl_InstanceCustomIndexEXT]; // from a bound SSBO
vec3 albedo = texture(materials[materialArraySlot], vec3(uv, layer)).rgb;
```

This is precisely what the WebGPU path already does (bucketed texture
arrays plus a single geometry buffer), so it is a proven, cross-backend
pattern, and it sidesteps the NVIDIA RT fault on the native path.

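
On the CPU side, the single geometry buffer needs a per-mesh base-offset table, which is what the shader snippet above reads as `assetVertexStart[...]`. The following is a minimal sketch of building both. `Vertex`, `PackedGeometry` and `PackMeshes` are hypothetical names with an assumed vertex layout, not the engine's actual types, and the upload to the GPU is left out.

```cpp
#include <cstdint>
#include <vector>

// Assumed vertex layout, for illustration only.
struct Vertex { float position[3]; float uv[2]; };

// One shared vertex array for all meshes, plus the first-vertex offset of
// each mesh (the table a hit shader would index by instance custom index).
struct PackedGeometry {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> vertexStart;
};

PackedGeometry PackMeshes(const std::vector<std::vector<Vertex>>& meshes) {
    PackedGeometry packed;
    packed.vertexStart.reserve(meshes.size());
    for (const auto& mesh : meshes) {
        // This mesh's first vertex lands at the current end of the shared array.
        packed.vertexStart.push_back(
            static_cast<std::uint32_t>(packed.vertices.size()));
        packed.vertices.insert(packed.vertices.end(), mesh.begin(), mesh.end());
    }
    return packed;
}
```

The index buffer can be packed the same way with its own offset table. Both tables then live in ordinary SSBOs bound at spec-constant slots, so no descriptor is selected dynamically.
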
Remove this section once a fixed NVIDIA driver ships and dynamic
`descriptor_heap` indexing in hit shaders stops faulting.

## Asset fetch

`project.cpp` calls `Crafter::GitFetch(...)` on
[https://github.com/jimmiebergmann/Sponza](https://github.com/jimmiebergmann/Sponza)
(pinned to commit `222338979d32f4f4818466291bdbc29f192b86ba`). The
clone lands in the per-user crafter-build cache; the first build pulls
~280 MB once, and subsequent builds reuse it.

`cfg.assets` then picks two files out of that clone:

| Source                                  | Compressed output       |
|-----------------------------------------|-------------------------|
| `sponza.obj`                            | `sponza.cmesh`          |
| `textures/sponza_arch_diff.tga`         | `sponza_arch_diff.ctex` |

Both land flat in the example's bin directory.

## Building

```
crafter build                          # native Vulkan
crafter build --target=wasm32-wasip1   # WebGPU / wasm
```

## License & attribution

Sponza geometry, materials, and textures are licensed under
[CC BY 3.0](https://creativecommons.org/licenses/by/3.0/).

- **Original model:** Frank Meinl, Crytek (2010).
- **OBJ packaging / cleanup:** Morgan McGuire, McGuire Computer
  Graphics Archive (https://casual-effects.com/data).
- **GitHub mirror used here:** Jimmie Bergmann's roof-material fixup
  (https://github.com/jimmiebergmann/Sponza).

When redistributing builds of this example that bundle the compressed
Sponza outputs (`*.cmesh`, `*.ctex`), the CC BY 3.0 attribution
requirement applies. Quoting the original credit somewhere visible to
end users (about-screen, credits page, etc.) is enough.

The Crafter.Graphics library code itself is LGPL-3.0; the two
licenses are compatible for data + code distribution.