Investigated the VK_ERROR_DEVICE_LOST on the native VulkanTriangle (#7).

Verified the engine side is correct and validation-clean: the BLAS/TLAS build finishes before render (FinishInit waits), the built instance is well-formed (identity transform, mask=0xFF, correct BLAS ref), and vkWriteResourceDescriptorsEXT stores the TLAS device address at the expected heap offset (confirmed by dumping the heap bytes). Khronos validation 1.4.350 reports zero errors.

The fault is isolated to reading the acceleration structure through VK_EXT_descriptor_heap:

- images/buffers via the same heap render fine (trace disabled -> the raygen imageStore path renders a full gradient);
- both traceRayEXT and inline rayQueryEXT (no SBT) fault identically on the AS read;
- reproduces with the AS descriptor at heap byte 0 / shader index 0 (no offset/stride ambiguity) and regardless of pAddressRange size.

NVIDIA 610.43.02 is the only descriptor_heap implementation available (llvmpipe lacks the extension), so there is no second implementation to cross-check.

Conclusion: driver-side fault in NVIDIA's brand-new VK_EXT_descriptor_heap acceleration-structure path; should be reported to NVIDIA. The traceRayEXT call is left active so the example stays a faithful reproducer. Documented in both READMEs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

# Crafter.Graphics

Vulkan + WebGPU graphics library built around C++20 modules and
bindless heaps. Provides window management, ray tracing, and a
compute-shader-driven UI on a single, opinionated stack. Native
builds use Vulkan with `VK_EXT_descriptor_heap`; `wasm32-*` builds
target the browser via WebGPU and a DOM window backend.

## Backends

Backends are chosen at build time by the target triple:

| Target           | Window                | Renderer            | Shaders                  |
|------------------|-----------------------|---------------------|--------------------------|
| native Linux     | Wayland               | Vulkan (heap-bound) | GLSL → SPIR-V            |
| native Windows   | Win32                 | Vulkan (heap-bound) | GLSL → SPIR-V            |
| `wasm32-*` (any) | DOM (canvas + JS env) | WebGPU              | WGSL (loaded at runtime) |

The two backends share the same C++ surface for the high-level pieces
(`UIRenderer`, `Mesh`, `RenderingElement3D`, `RTPass`, item structs,
`FontAtlas`, `Image2D`, `ComputeShader`). Backend-typed pieces
(`*Vulkan` vs `*WebGPU`) live behind `#ifdef CRAFTER_GRAPHICS_WINDOW_DOM`.
Vulkan ray tracing is hardware (`VK_KHR_ray_tracing_pipeline`); WebGPU
ray tracing is a library-built software path (BVH + traceRay in a
compute pipeline composed from user-supplied WGSL stages).

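As a rough illustration of the `#ifdef` split described above, a backend-typed piece could be selected like this. Only the `CRAFTER_GRAPHICS_WINDOW_DOM` macro comes from this README; `MeshVulkanSketch`, `MeshWebGPUSketch`, `MeshBackendSketch`, and `ActiveBackendName` are hypothetical placeholders, not the library's real types.

```cpp
#include <string_view>

// Hypothetical stand-ins for a backend-typed pair (*Vulkan vs *WebGPU).
struct MeshVulkanSketch {
    static constexpr std::string_view kBackend = "Vulkan";
};
struct MeshWebGPUSketch {
    static constexpr std::string_view kBackend = "WebGPU";
};

// The macro is assumed to be defined by the build for wasm32-* targets.
#ifdef CRAFTER_GRAPHICS_WINDOW_DOM
using MeshBackendSketch = MeshWebGPUSketch;
#else
using MeshBackendSketch = MeshVulkanSketch;
#endif

// Code written against MeshBackendSketch compiles on both backends.
constexpr std::string_view ActiveBackendName() {
    return MeshBackendSketch::kBackend;
}
```

Without the macro defined, `ActiveBackendName()` resolves to the Vulkan placeholder at compile time; no runtime branch is involved.
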
> **Native RT status:** reading an acceleration structure through
> `VK_EXT_descriptor_heap` currently aborts with `VK_ERROR_DEVICE_LOST` on
> NVIDIA driver `610.43.02`. This is a driver-side fault in the brand-new
> descriptor-heap acceleration-structure path, not an engine bug. The
> engine setup (build, descriptors, SBT) is correct and validation-clean,
> and images/buffers through the same heap work. See
> [examples/VulkanTriangle/README.md](examples/VulkanTriangle/README.md)
> for the full investigation. WebGPU RT is unaffected.

## What's in here

- **Window**: Wayland, Win32, and DOM backends, swapchain ring / canvas
  framing, input events. Pick a backend at build time via the target
  triple. The DOM backend routes every dynamic symbol through
  [additional/dom-env.js](additional/dom-env.js) and
  [additional/dom-webgpu.js](additional/dom-webgpu.js).
- **Device** *(Vulkan only)*: single-instance bring-up targeting
  `VK_EXT_descriptor_heap`; pipelines are created with
  `VK_PIPELINE_CREATE_2_DESCRIPTOR_HEAP_BIT_EXT` so there are no
  descriptor-set layouts and push constants travel via
  `vkCmdPushDataEXT`.
- **DescriptorHeapVulkan / DescriptorHeapWebGPU**: bindless slot
  allocators. Vulkan side allocates image/buffer/sampler slots in a
  `VK_EXT_descriptor_heap`; WebGPU side resolves slots to JS-side
  handle-table cookies that the dispatch bridge binds per pass.
- **VulkanBuffer\<T, Mapped\> / WebGPUBuffer\<T\>**: typed buffer.
  Vulkan variant has optional host mapping and a `FlushDevice` that
  issues the right host-write barrier; WebGPU variant goes through
  `queue.writeBuffer` over the JS bridge.
- **ImageVulkan\<Pixel\> / Image2D\<Pixel\> / Image2DArray\<Pixel\>**:
  image + staging buffer with mip-chain support on Vulkan; on WebGPU,
  `rgba8unorm` 2D / 2D-array textures created and written via the
  bridge. Atlas (`r8unorm`, sub-region writes) is a separate path.
- **PipelineRTVulkan / PipelineRTWebGPU / ShaderBindingTableVulkan /
  ShaderBindingTableWebGPU / RTPass**: ray-tracing pipelines. Vulkan
  uses native RT pipelines + SBTs; WebGPU compiles a **wavefront /
  streaming** software tracer: five `@compute` kernels
  (`GENERATE → PREP → TRACE → SHADE → RESOLVE`) sharing one module,
  connected by GPU ray/hit/payload buffers and a GPU-driven indirect
  bounce loop (`dispatchWorkgroupsIndirect`). TRACE carries zero user
  code (traversal + intersection only); user raygen calls
  `rtEmitPrimaryRay`, and closesthit / miss run in SHADE, where they
  call `rtEmitRay` for continuation/shadow rays and `rtAccumulate` for
  radiance. An optional Resolve shader tonemaps the linear accumulator.
  See [WAVEFRONT-DESIGN.md](WAVEFRONT-DESIGN.md).
- **ComputeShader / WebGPUComputeShader**: Tier 1 wrapper used by the
  UI system. Vulkan loads a `.spv` and dispatches with
  `vkCmdPushDataEXT`; WebGPU loads a user-supplied `.wgsl` blob at
  runtime via `wgpuLoadCustomShader`. Use it directly for any custom
  compute.
- **UI**: three-tier UI system; see below. The standard shaders ship
  as four `.spv` blobs on native and four WGSL strings baked into the
  WebGPU dispatcher.
- **FontAtlas**: single-channel SDF atlas (1024×1024, 32pt base,
  shelf-packed, lazy `Ensure` per codepoint, dirty-flush via `Update`).
  Backend-agnostic.
- **Mesh / RenderingElement3D / Animation**: BLAS/TLAS construction
  and 3D scene plumbing. Vulkan calls `vkCmdBuildAccelerationStructuresKHR`;
  WebGPU registers BLAS data (verts, idx, BVH nodes, primRemap, optional
  per-vertex attribs) into global mesh heaps and builds the TLAS in a
  library compute pass.
- **Clipboard / Input / Gamepad / Router / Dom**: input plumbing.
  Gamepad uses libudev+libevdev on Linux and WGI on Windows; the DOM
  backend exposes the host page DOM (`Dom::HtmlElement`) and a router
  for hash-routed wasm apps.

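To make the "bindless slot allocator" idea from the descriptor-heap bullet concrete, here is a minimal free-list sketch. It is an assumption about one common way to hand out heap slots, not the actual `DescriptorHeapVulkan` implementation; `SlotAllocatorSketch` and its methods are hypothetical names, and no Vulkan or WebGPU calls are involved.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical slot allocator: hands out indices in [0, capacity) and
// recycles freed indices before growing into untouched ones.
class SlotAllocatorSketch {
public:
    explicit SlotAllocatorSketch(std::uint32_t capacity) : capacity_(capacity) {}

    // Returns a slot index, or std::nullopt when the heap is full.
    std::optional<std::uint32_t> Allocate() {
        if (!free_.empty()) {
            const std::uint32_t slot = free_.back();
            free_.pop_back();
            return slot;
        }
        if (next_ < capacity_) {
            return next_++;
        }
        return std::nullopt;
    }

    // Returns a previously allocated slot to the free list.
    void Free(std::uint32_t slot) {
        assert(slot < next_ && "slot was never handed out");
        free_.push_back(slot);
    }

private:
    std::uint32_t capacity_;
    std::uint32_t next_ = 0;            // first never-used slot
    std::vector<std::uint32_t> free_;   // recycled slots, reused LIFO
};
```
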
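
The FontAtlas bullet describes the atlas as shelf-packed. The following is a generic shelf-packing sketch, offered only to illustrate the technique: rectangles are placed left to right on a "shelf", and a new shelf opens below when the current one runs out of width. `ShelfPackerSketch` and `PlacementSketch` are hypothetical names; SDF generation, `Ensure`, and `Update` are not modeled.

```cpp
#include <algorithm>
#include <optional>

struct PlacementSketch {
    int x;
    int y;
};

// Generic shelf packer over a fixed-size atlas (hypothetical type).
class ShelfPackerSketch {
public:
    ShelfPackerSketch(int width, int height) : width_(width), height_(height) {}

    // Reserves a w x h rectangle; returns its top-left corner, or
    // std::nullopt when it does not fit. State changes only on success.
    std::optional<PlacementSketch> Place(int w, int h) {
        if (w <= 0 || h <= 0 || w > width_) {
            return std::nullopt;
        }
        int x = cursorX_;
        int y = shelfY_;
        int shelfHeight = shelfHeight_;
        if (x + w > width_) {
            // Current shelf is full: open a new shelf below it.
            y += shelfHeight;
            x = 0;
            shelfHeight = 0;
        }
        if (y + h > height_) {
            return std::nullopt;
        }
        cursorX_ = x + w;
        shelfY_ = y;
        shelfHeight_ = std::max(shelfHeight, h);
        return PlacementSketch{x, y};
    }

private:
    int width_;
    int height_;
    int cursorX_ = 0;      // next free x on the current shelf
    int shelfY_ = 0;       // top of the current shelf
    int shelfHeight_ = 0;  // tallest rectangle on the current shelf
};
```

For the atlas size quoted above, such a packer would be constructed as `ShelfPackerSketch(1024, 1024)`. Shelf packing trades some wasted space above short glyphs for very cheap, incremental placement, which suits lazy per-codepoint insertion.
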
## UI system (three tiers)

The UI is *deliberately* layered to balance no-boilerplate against
no-lock-in:

- **Tier 1: `ComputeShader`.** Load any `.spv`, dispatch with push
  constants, and the library inserts inter-dispatch barriers. The escape
  hatch: if the standard shaders don't fit, write your own compute and
  dispatch it next to them.
- **Tier 2: `UIRenderer` + standard shaders.** Four shipped compute
  shaders (`drawQuads`, `drawCircles`, `drawImages`, `drawText`), POD
  item structs (`QuadItem`, `CircleItem`, `ImageItem`, `GlyphItem`), a
  shared GLSL contract in [shaders/ui-shared.glsl](shaders/ui-shared.glsl),
  and helpers (`RegisterBuffer`, `RegisterImage`, `RegisterSampler`,
  `FillHeader`, `Dispatch*`, `ShapeText`). You build your own per-shader
  SSBOs (manual batching) and call one `Dispatch*` per shader type per
  frame. Item array order = draw order.
- **Tier 3: stateless presentation functions.** `DrawButton`,
  `DrawCheckbox`, `DrawSlider`, `DrawProgressBar`. Each is a small
  function that *appends* items to your buffers; they don't dispatch.
  Colors come in as small inline `*Colors` aggregates, no library
  `Theme` type. **The source is the customization API**: if a
  component doesn't fit, copy its body and edit it. No virtual hooks,
  no extension points.

What's *not* in the UI: widget tree, layout engine (just a `Rect::SubRect`
carving helper), theming, hit-testing, focus management. State for
interactive components (hover, drag, focus) lives in user-owned POD
structs, not the library.

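The Tier 3 pattern (a stateless function that appends items to a caller-owned buffer and never dispatches) can be sketched as follows. Every type and function here is a hypothetical stand-in with a `Sketch` suffix; the real `QuadItem`, `*Colors` aggregates, and `DrawProgressBar` signatures are not shown in this README and may differ.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical PODs for illustration only.
struct RectSketch { float x, y, w, h; };
struct ColorSketch { float r, g, b, a; };
struct QuadItemSketch { RectSketch rect; ColorSketch color; };
struct ProgressBarColorsSketch { ColorSketch track; ColorSketch fill; };

// Appends a track quad and a fill quad to the caller's buffer.
// No dispatch happens here and no state is kept between calls.
void DrawProgressBarSketch(std::vector<QuadItemSketch>& quads,
                           RectSketch bounds,
                           float fraction,
                           const ProgressBarColorsSketch& colors) {
    const float t = std::clamp(fraction, 0.0f, 1.0f);
    quads.push_back({bounds, colors.track});
    quads.push_back({{bounds.x, bounds.y, bounds.w * t, bounds.h}, colors.fill});
}
```

Because the track is appended before the fill, the fill draws on top under the "item array order = draw order" rule. Customizing the component means copying this body and editing it, which is the model the Tier 3 description advocates.
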
### UI dispatch model

Standard shaders dispatch one workgroup per 8×8 *screen tile*: each
thread iterates every item in the SSBO in array order, accumulating
into a local `dst`, and stores once. Total cost is `O(W·H·N)` for a
`W×H` target and `N` items; this works well up to a few hundred items
at 1080p. Splitting one buffer into multiple dispatches doesn't help:
the total work is the same, plus barrier overhead. If you need to
render thousands of UI items, you want a different shader (tile
binning, per-item-list resolve), not more dispatches.

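The dispatch model above can be mirrored by a small CPU reference that counts the work. This is a sketch under stated assumptions: `QuadSketch`, `RgbaSketch`, and `DispatchStatsSketch` are hypothetical types, items are axis-aligned quads, and blending is a straight-alpha "over" (the README does not specify the shaders' blend).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; not the library's QuadItem layout.
struct RgbaSketch { float r, g, b, a; };
struct QuadSketch { int x, y, w, h; RgbaSketch color; };

struct DispatchStatsSketch {
    std::uint64_t workgroups = 0;  // 8x8 tiles covering the target
    std::uint64_t itemTests = 0;   // per-pixel item visits (W * H * N)
};

// CPU reference of the per-pixel loop: visit every item in array order,
// blend into a local accumulator, then write the pixel exactly once.
std::vector<RgbaSketch> RenderReferenceSketch(int width, int height,
                                              const std::vector<QuadSketch>& items,
                                              DispatchStatsSketch& stats) {
    constexpr int kTile = 8;
    const auto tilesX = static_cast<std::uint64_t>((width + kTile - 1) / kTile);
    const auto tilesY = static_cast<std::uint64_t>((height + kTile - 1) / kTile);
    stats.workgroups = tilesX * tilesY;
    stats.itemTests = 0;

    std::vector<RgbaSketch> target(static_cast<std::size_t>(width) * height,
                                   RgbaSketch{0.0f, 0.0f, 0.0f, 0.0f});
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t index = static_cast<std::size_t>(y) * width + x;
            RgbaSketch dst = target[index];  // local accumulator
            for (const QuadSketch& q : items) {
                ++stats.itemTests;
                const bool inside = x >= q.x && x < q.x + q.w &&
                                    y >= q.y && y < q.y + q.h;
                if (!inside) continue;
                const float keep = 1.0f - q.color.a;
                dst.r = q.color.r * q.color.a + dst.r * keep;
                dst.g = q.color.g * q.color.a + dst.g * keep;
                dst.b = q.color.b * q.color.a + dst.b * keep;
                dst.a = q.color.a + dst.a * keep;
            }
            target[index] = dst;  // single store per pixel
        }
    }
    return target;
}
```

With 3 items on a 16×8 target this records 2 workgroups and 384 item visits. Rendering the same 3 items in two separate calls would still add up to 384 visits, which is why splitting a buffer across dispatches does not reduce the cost.
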
## Build

The repository is built with `crafter-build` (a project-config based
build system; the project description lives in `project.cpp`):

```bash
crafter-build                          # native: Wayland on Linux, Win32 on Windows
crafter-build --target=wasm32-wasip1   # browser: DOM window + WebGPU renderer
crafter-build -r                       # build and run (in an example directory)
```

The build picks the window + renderer pair automatically from the
target triple: any `wasm32-*` triple flips to DOM + WebGPU (no Vulkan
loader linked), everything else stays on the native Vulkan path. Each
example with both backends ships GLSL *and* WGSL copies of its shaders
side-by-side (e.g. [raygen.glsl](examples/Sponza/raygen.glsl) +
[raygen.wgsl](examples/Sponza/raygen.wgsl)); `project.cpp` selects the
right set per target.

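The selection rule in the paragraph above (any `wasm32-*` triple means DOM + WebGPU, everything else means native + Vulkan) amounts to a prefix check. The sketch below is a standalone illustration, not `crafter-build` or `project.cpp` code; all names are hypothetical, and the native window is collapsed to a single `Native` value rather than distinguishing Wayland from Win32.

```cpp
#include <string_view>

enum class WindowBackendSketch { Native, Dom };
enum class RendererBackendSketch { Vulkan, WebGPU };

struct BackendPairSketch {
    WindowBackendSketch window;
    RendererBackendSketch renderer;
};

// Maps a target triple to a window + renderer pair using the
// "wasm32-*" prefix rule; every other triple stays native.
constexpr BackendPairSketch SelectBackendSketch(std::string_view triple) {
    constexpr std::string_view kWasmPrefix = "wasm32-";
    if (triple.substr(0, kWasmPrefix.size()) == kWasmPrefix) {
        return {WindowBackendSketch::Dom, RendererBackendSketch::WebGPU};
    }
    return {WindowBackendSketch::Native, RendererBackendSketch::Vulkan};
}
```

For example, `SelectBackendSketch("wasm32-wasip1")` yields the DOM + WebGPU pair, while `SelectBackendSketch("x86_64-pc-linux-gnu")` yields native + Vulkan.
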
## Examples

See [examples/](examples/). Quick map:

- [HelloWindow](examples/HelloWindow/): minimal native window, no rendering.
- [HelloDom](examples/HelloDom/): wasm-only smoke test of the DOM
  partition: page-level events, `HtmlElement::CreateInBody`, and
  `Router::PushState`-driven SPA navigation. No GPU work.
- [VulkanTriangle](examples/VulkanTriangle/): ray-traced triangle on
  both Vulkan and WebGPU. The smallest test of the bindless + RT path
  on each backend.
- [RTStress](examples/RTStress/): wavefront RT benchmark: an N×N×N grid
  of a cube mesh (instance-count knob `kGrid`, 512 → 8000) shaded with
  primary + shadow rays. Prints a GPU timestamp-query per-pass breakdown
  each second. WebGPU/DOM only.
- [Sponza](examples/Sponza/): ray-traced Sponza atrium on both
  backends. Exercises `.cmesh` / `.ctex` decompression (GPU
  `VK_EXT_memory_decompression` on Vulkan, CPU on WebGPU) and a
  textured closest-hit. See [its README](examples/Sponza/README.md)
  for asset provenance.
- [HelloUI](examples/HelloUI/): UI smoke test using all three tiers
  (background quad, slider, progress bar, button with text label,
  cursor-tracking circle).
- [CustomShader](examples/CustomShader/): Tier 1 demo: a user-authored
  compute shader inverting RGB under a list of item-circles, dispatched
  alongside the standard `drawQuads`. Shipped as both
  [`.comp.glsl`](examples/CustomShader/inverse-circle.comp.glsl) and
  [`.comp.wgsl`](examples/CustomShader/inverse-circle.comp.wgsl).
- [Decompression](examples/Decompression/): `Crafter::Compression`
  CPU round-trip smoke test (used by the WebGPU asset path).
- [InputSystem](examples/InputSystem/): keyboard / mouse / gamepad
  event surface check.

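For the RTStress entry, the quoted instance range lines up with a cubic grid: if the instance count is `kGrid³`, then 512 and 8000 would correspond to `kGrid` values of 8 and 20. That mapping is an inference from the numbers, not something the example's source is quoted as saying here. A hedged sketch of generating such a grid of instance positions, with hypothetical names and an assumed uniform spacing:

```cpp
#include <cstddef>
#include <vector>

struct Vec3Sketch { float x, y, z; };

// Builds grid^3 instance positions on a regular lattice.
// `spacing` is an assumed parameter, not taken from the example.
std::vector<Vec3Sketch> MakeGridPositionsSketch(int grid, float spacing) {
    std::vector<Vec3Sketch> positions;
    if (grid <= 0) {
        return positions;
    }
    const std::size_t n = static_cast<std::size_t>(grid);
    positions.reserve(n * n * n);
    for (int z = 0; z < grid; ++z) {
        for (int y = 0; y < grid; ++y) {
            for (int x = 0; x < grid; ++x) {
                positions.push_back({x * spacing, y * spacing, z * spacing});
            }
        }
    }
    return positions;
}
```
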
## License

LGPL 3.0. See per-file headers and `LICENSE`.