# Crafter.Graphics
Vulkan + WebGPU graphics library built around C++20 modules and
bindless heaps. Provides window management, ray tracing, and a
compute-shader-driven UI on a single, opinionated stack. Native
builds use Vulkan with `VK_EXT_descriptor_heap`; `wasm32-*` builds
target the browser via WebGPU and a DOM window backend.
## Backends
Backends are chosen at build time by the target triple:

| Target           | Window                | Renderer            | Shaders                  |
|------------------|-----------------------|---------------------|--------------------------|
| native Linux     | Wayland               | Vulkan (heap-bound) | GLSL → SPIR-V            |
| native Windows   | Win32                 | Vulkan (heap-bound) | GLSL → SPIR-V            |
| `wasm32-*` (any) | DOM (canvas + JS env) | WebGPU              | WGSL (loaded at runtime) |

The two backends share the same C++ surface for the high-level pieces
(`UIRenderer`, `Mesh`, `RenderingElement3D`, `RTPass`, item structs,
`FontAtlas`, `Image2D`, `ComputeShader`). Backend-typed pieces
(`*Vulkan` vs `*WebGPU`) live behind `#ifdef CRAFTER_GRAPHICS_WINDOW_DOM`.
Vulkan ray tracing is hardware (`VK_KHR_ray_tracing_pipeline`); WebGPU
ray tracing is a library-built software path (BVH + traceRay in a
compute pipeline composed from user-supplied WGSL stages). The WebGPU
path supports triangle and AABB (procedural, `VK_GEOMETRY_TYPE_AABBS_KHR`)
geometry, plus closest-hit / miss / any-hit / intersection shaders. See
[examples/RTVolume](examples/RTVolume/README.md) for procedural spheres
shaded through an intersection shader with an any-hit cut-out.
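The compile-time split between backend-typed pieces can be pictured with a small sketch. `CRAFTER_GRAPHICS_WINDOW_DOM` is the macro named above; the `RendererVulkan` / `RendererWebGPU` structs, the `Renderer` alias, and `ActiveBackend` are hypothetical stand-ins for illustration, not library types.

```cpp
#include <string_view>

// Hypothetical stand-ins for backend-typed pieces (not the library's real types).
struct RendererVulkan { static constexpr std::string_view name = "Vulkan"; };
struct RendererWebGPU { static constexpr std::string_view name = "WebGPU"; };

// Assumption: the build defines CRAFTER_GRAPHICS_WINDOW_DOM for wasm32-* targets.
#ifdef CRAFTER_GRAPHICS_WINDOW_DOM
using Renderer = RendererWebGPU;
#else
using Renderer = RendererVulkan;
#endif

// Shared code refers only to the alias, so it compiles unchanged on both backends.
constexpr std::string_view ActiveBackend() { return Renderer::name; }
```

On a native build, where the macro is not defined, `ActiveBackend()` yields `"Vulkan"`.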
> **Native RT status:** reading an acceleration structure through
> `VK_EXT_descriptor_heap` aborts with `VK_ERROR_DEVICE_LOST` on NVIDIA
> driver `610.43.02` — a driver-side fault in the brand-new descriptor-heap
> acceleration-structure path, not an engine bug (the setup is correct and
> validation-clean; images/buffers through the same heap work). The engine
> **works around it transparently** (issue #15): on the NVIDIA driver only,
> `VulkanShader` rewrites the compiled SPIR-V so heap AS reads become a
> TLAS-device-address + `OpConvertUToAccelerationStructureKHR` path (which
> reads no descriptor), and `RTPass` supplies the address as push data.
> Shaders and example code are unchanged, and the workaround is a single fenced block
> gated on `Device::workaroundDescriptorHeapAS`, removable once a fixed
> driver ships. See
> [examples/VulkanTriangle/README.md](examples/VulkanTriangle/README.md)
> for the full investigation. WebGPU RT is unaffected.
>
> **Native RT limitation — dynamic `descriptor_heap` indexing in hit
> shaders:** on the same NVIDIA driver, indexing a `descriptor_heap`
> array with a **runtime (non-constant)** index inside a ray-tracing
> **hit** shader also causes a device loss (`VK_ERROR_DEVICE_LOST`), for plain
> SSBO **and** sampled-image descriptors. A **constant / spec-constant**
> index is fine (that's why [Sponza](examples/Sponza/README.md)'s
> closest-hit reads `albedo[albedoSlot]` through a spec constant), and
> the identical dynamic pattern works in fragment shaders (the UI
> renderer indexes `uiTextures[]` by per-item runtime slots) — so this
> is **RT-stage-specific**, not a general heap problem. Unlike the
> AS-read fault above this **cannot** be worked around transparently:
> sampled images have no device-address escape hatch the way an
> acceleration structure does (`OpConvertUToAccelerationStructureKHR`).
> The recommended pattern for bindless per-mesh geometry/material is to
> **bind one resource and index *within* it dynamically** rather than
> selecting a descriptor dynamically: pack geometry into a single SSBO
> (or reach it via `buffer_reference`) at a spec-constant slot and index
> by element offset, and put materials in one `texture2DArray` indexed
> by layer. Dynamic addressing *inside* a bound resource is ordinary
> memory/layer addressing and is unaffected; only dynamic selection of a
> *descriptor* faults. This is exactly what the WebGPU path already does
> (bucketed texture arrays + a single buffer). Full investigation and
> GLSL in [examples/Sponza/README.md](examples/Sponza/README.md) (issue
> #23). WebGPU RT is unaffected.
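
The "bind one resource, index within it" pattern can be sketched on the CPU side. This is a minimal illustration under assumptions: `Vertex`, `MeshRange`, and `PackedGeometry` are hypothetical names, and the real engine's buffer layout may differ. The point is that each mesh is identified by an element offset into one shared array rather than by its own descriptor.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout, for illustration only.
struct Vertex { float x, y, z; };

struct MeshRange {
    std::uint32_t firstVertex;  // element offset into the packed array
    std::uint32_t vertexCount;
};

// All meshes share one vertex array, which would be uploaded as a single
// buffer bound at one fixed (constant) slot.
struct PackedGeometry {
    std::vector<Vertex> vertices;
    std::vector<MeshRange> ranges;  // indexed by mesh id

    // Appends a mesh to the shared array and returns its id.
    std::uint32_t Add(const std::vector<Vertex>& mesh) {
        MeshRange r{static_cast<std::uint32_t>(vertices.size()),
                    static_cast<std::uint32_t>(mesh.size())};
        vertices.insert(vertices.end(), mesh.begin(), mesh.end());
        ranges.push_back(r);
        return static_cast<std::uint32_t>(ranges.size() - 1);
    }

    // Mirrors what a hit shader would do: dynamic addressing inside one
    // bound resource instead of dynamic selection of a descriptor.
    const Vertex& Fetch(std::uint32_t meshId, std::uint32_t localIndex) const {
        const MeshRange& r = ranges.at(meshId);
        assert(localIndex < r.vertexCount);
        return vertices[r.firstVertex + localIndex];
    }
};
```

In a shader the same lookup would read the range table by mesh or instance id and then index the single geometry buffer by `firstVertex + localIndex`.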
## What's in here
- **Window** — Wayland, Win32, and DOM backends, swapchain ring / canvas
framing, input events. Pick a backend at build time via the target
triple. The DOM backend routes every dynamic symbol through
[additional/dom-env.js](additional/dom-env.js) and
[additional/dom-webgpu.js](additional/dom-webgpu.js).
- **Device** *(Vulkan only)* — single-instance bring-up targeting
`VK_EXT_descriptor_heap`; pipelines are created with
`VK_PIPELINE_CREATE_2_DESCRIPTOR_HEAP_BIT_EXT` so there are no
descriptor-set layouts and push constants travel via
`vkCmdPushDataEXT`.
- **DescriptorHeapVulkan / DescriptorHeapWebGPU** — bindless slot
allocators. Vulkan side allocates image/buffer/sampler slots in a
`VK_EXT_descriptor_heap`; WebGPU side resolves slots to JS-side
handle-table cookies that the dispatch bridge binds per pass.
- **VulkanBuffer\<T, Mapped\> / WebGPUBuffer\<T\>** — typed buffer.
Vulkan variant has optional host mapping and a `FlushDevice` that
issues the right host-write barrier; WebGPU variant goes through
`queue.writeBuffer` over the JS bridge.
- **ImageVulkan\<Pixel\> / Image2D\<Pixel\> / Image2DArray\<Pixel\>** —
image + staging buffer with mip-chain support on Vulkan; on WebGPU,
`rgba8unorm` 2D / 2D-array textures created and written via the
bridge. Atlas (`r8unorm`, sub-region writes) is a separate path.
- **PipelineRTVulkan / PipelineRTWebGPU / ShaderBindingTableVulkan /
ShaderBindingTableWebGPU / RTPass** — ray-tracing pipelines. Vulkan
uses native RT pipelines + SBTs; WebGPU compiles a **wavefront /
streaming** software tracer — five `@compute` kernels
(`GENERATE → PREP → TRACE → SHADE → RESOLVE`) sharing one module,
connected by GPU ray/hit/payload buffers and a GPU-driven indirect
bounce loop (`dispatchWorkgroupsIndirect`). TRACE carries zero user
code (traversal + intersection only); user raygen calls
`rtEmitPrimaryRay`, and closesthit / miss run in SHADE where they
`rtEmitRay` continuation/shadow rays and `rtAccumulate` radiance. An
optional Resolve shader tonemaps the linear accumulator. See
[WAVEFRONT-DESIGN.md](WAVEFRONT-DESIGN.md).
- **ComputeShader / WebGPUComputeShader** — Tier 1 wrapper used by the
UI system. Vulkan loads a `.spv` and dispatches with
`vkCmdPushDataEXT`; WebGPU loads a user-supplied `.wgsl` blob at
runtime via `wgpuLoadCustomShader`. Use it directly for any custom
compute.
- **UI** — three-tier UI system; see below. The standard shaders ship
as four `.spv` blobs on native and four WGSL strings baked into the
WebGPU dispatcher.
- **FontAtlas** — single-channel SDF atlas (1024×1024, 32pt base,
shelf-packed, lazy `Ensure` per codepoint, dirty-flush via `Update`).
Backend-agnostic.
- **Mesh / RenderingElement3D / Animation** — BLAS/TLAS construction
and 3D scene plumbing. Vulkan calls `vkCmdBuildAccelerationStructuresKHR`;
WebGPU registers BLAS data (verts, idx, BVH nodes, primRemap, optional
per-vertex attribs) into global mesh heaps and builds the TLAS in a
library compute pass.
- **Clipboard / Input / Gamepad / Router / Dom** — input plumbing.
Gamepad uses libudev+libevdev on Linux and WGI (`Windows.Gaming.Input`) on Windows; the DOM
backend exposes the host page DOM (`Dom::HtmlElement`) and a router
for hash-routed wasm apps.
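
As an illustration of the shelf-packed, lazily filled atlas described in the **FontAtlas** entry, here is a minimal sketch. It is not the library's implementation: `ShelfAtlas`, `AtlasRect`, and `ConsumeDirty` are hypothetical names, and this `Ensure` signature (taking the glyph size directly) is an assumption made to keep the example self-contained.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical types; the library's FontAtlas may be organized differently.
struct AtlasRect { std::uint32_t x, y, w, h; };

class ShelfAtlas {
public:
    explicit ShelfAtlas(std::uint32_t size) : size_(size) {}

    // Returns the cached rect for a codepoint, packing it on first use.
    // Yields std::nullopt when the glyph does not fit.
    std::optional<AtlasRect> Ensure(char32_t cp, std::uint32_t w, std::uint32_t h) {
        if (auto it = cache_.find(cp); it != cache_.end()) return it->second;
        if (w > size_) return std::nullopt;
        if (penX_ + w > size_) {  // current shelf is full: open a new one below it
            shelfY_ += shelfH_;
            penX_ = 0;
            shelfH_ = 0;
        }
        if (shelfY_ + h > size_) return std::nullopt;  // atlas exhausted
        AtlasRect r{penX_, shelfY_, w, h};
        penX_ += w;
        if (h > shelfH_) shelfH_ = h;  // a shelf is as tall as its tallest glyph
        dirty_ = true;                 // new pixels are waiting for an upload
        cache_.emplace(cp, r);
        return r;
    }

    // Reports whether anything was packed since the last call, then clears the flag.
    bool ConsumeDirty() { bool d = dirty_; dirty_ = false; return d; }

private:
    std::uint32_t size_;
    std::uint32_t penX_ = 0, shelfY_ = 0, shelfH_ = 0;
    bool dirty_ = false;
    std::unordered_map<char32_t, AtlasRect> cache_;
};
```

A caller would request glyphs as text is shaped and upload only when the dirty flag is set, which is the role the entry above assigns to `Update`.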
## UI system (three tiers)
The UI is *deliberately* layered to balance no-boilerplate against
no-lock-in:
- **Tier 1 — `ComputeShader`.** Load any `.spv`, dispatch with push
constants, library inserts inter-dispatch barriers. The escape hatch:
if the standard shaders don't fit, write your own compute and
dispatch it next to them.
- **Tier 2 — `UIRenderer` + standard shaders.** Four shipped compute
shaders (`drawQuads`, `drawCircles`, `drawImages`, `drawText`), POD
item structs (`QuadItem`, `CircleItem`, `ImageItem`, `GlyphItem`), a
shared GLSL contract in [shaders/ui-shared.glsl](shaders/ui-shared.glsl),
and helpers (`RegisterBuffer`, `RegisterImage`, `RegisterSampler`,
`FillHeader`, `Dispatch*`, `ShapeText`). You build your own per-shader
SSBOs (manual batching) and call one `Dispatch*` per shader type per
frame. Item array order = draw order.
- **Tier 3 — stateless presentation functions.** `DrawButton`,
`DrawCheckbox`, `DrawSlider`, `DrawProgressBar`. Each is a small
function that *appends* items to your buffers — they don't dispatch.
Colors come in as small inline `*Colors` aggregates, no library
`Theme` type. **The source is the customization API**: if a
component doesn't fit, copy its body and edit it. No virtual hooks,
no extension points.

What's *not* in the UI: widget tree, layout engine (just a `Rect::SubRect`
carving helper), theming, hit-testing, focus management. State for
interactive components (hover, drag, focus) lives in user-owned POD
structs, not the library.
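
To make the Tier 3 idea concrete, the following sketch shows a stateless function that only appends items to a user-owned buffer. Every type and signature here is hypothetical (`Rect`, `SubRect`'s parameters, `Color`, `QuadItem`'s fields, `ProgressBarColors`, `AppendProgressBar`); the library's real `DrawProgressBar` and item layouts may look different.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical POD types; the library's real layouts may differ.
struct Rect {
    float x, y, w, h;
    // Carves a sub-rectangle from fractions of the parent (illustrative signature).
    Rect SubRect(float fx, float fy, float fw, float fh) const {
        return {x + fx * w, y + fy * h, fw * w, fh * h};
    }
};
struct Color { float r, g, b, a; };
struct QuadItem { Rect rect; Color color; };
struct ProgressBarColors { Color track, fill; };

// Stateless presentation function: appends items and never dispatches.
// Array order is draw order, so the track goes in before the fill.
void AppendProgressBar(std::vector<QuadItem>& quads, const Rect& bounds,
                       float progress, const ProgressBarColors& colors) {
    const float t = std::clamp(progress, 0.0f, 1.0f);
    quads.push_back({bounds, colors.track});
    quads.push_back({bounds.SubRect(0.0f, 0.0f, t, 1.0f), colors.fill});
}
```

Because the function holds no state, customizing a component amounts to copying a body like this one and editing it, then dispatching the quad buffer once per frame as usual.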
### UI dispatch model
Standard shaders dispatch one workgroup per 8×8 *screen tile* — each
thread iterates every item in the SSBO in array order, accumulating
into a local `dst`, and stores once. Total cost is `O(W·H·N)`; works
well up to a few hundred items at 1080p. Splitting one buffer into
multiple dispatches doesn't help — the same total work plus barrier
overhead. If you need to render thousands of UI items, you want a
different shader (tile binning, per-item-list resolve), not more
dispatches.
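
The cost argument above can be checked with a back-of-envelope model. The 8×8 tile size comes from the text; the function names are illustrative and not part of the library.

```cpp
#include <cstdint>

// Tile edge length, as stated for the standard shaders.
constexpr std::uint64_t kTile = 8;

// One workgroup per 8x8 screen tile, rounding up at the screen edges.
constexpr std::uint64_t Workgroups(std::uint64_t w, std::uint64_t h) {
    return ((w + kTile - 1) / kTile) * ((h + kTile - 1) / kTile);
}

// Every pixel's thread walks all N items, so the work is W * H * N item tests.
constexpr std::uint64_t ItemTests(std::uint64_t w, std::uint64_t h,
                                  std::uint64_t items) {
    return w * h * items;
}

// Splitting N items across k dispatches leaves the total unchanged
// (barrier overhead between dispatches is not modeled here).
constexpr std::uint64_t ItemTestsSplit(std::uint64_t w, std::uint64_t h,
                                       std::uint64_t items, std::uint64_t k) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < k; ++i) {
        const std::uint64_t chunk = items / k + (i < items % k ? 1 : 0);
        total += ItemTests(w, h, chunk);
    }
    return total;
}
```

At 1920×1080 with 300 items this gives 32,400 workgroups and 622,080,000 item tests per frame, and `ItemTestsSplit` returns the same total for any split count `k ≥ 1`.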
## Build
The repository is built with `crafter-build` (a project-config based
build system; the project description lives in `project.cpp`):
```bash
crafter-build # native: Wayland on Linux, Win32 on Windows
crafter-build --target=wasm32-wasip1 # browser: DOM window + WebGPU renderer
crafter-build -r # build and run (in an example directory)
```
The build picks the window + renderer pair automatically from the
target triple: any `wasm32-*` triple flips to DOM + WebGPU (no Vulkan
loader linked), everything else stays on the native Vulkan path. Each
example with both backends ships GLSL *and* WGSL copies of its shaders
side-by-side (e.g. [raygen.glsl](examples/Sponza/raygen.glsl) +
[raygen.wgsl](examples/Sponza/raygen.wgsl)); `project.cpp` selects the
right set per target.
## Examples
See [examples/](examples/). Quick map:
- [HelloWindow](examples/HelloWindow/) — minimal native window, no rendering.
- [HelloDom](examples/HelloDom/) — wasm-only smoke test of the DOM
partition: page-level events, `HtmlElement::CreateInBody`, and
`Router::PushState`-driven SPA navigation. No GPU work.
- [VulkanTriangle](examples/VulkanTriangle/) — ray-traced triangle on
both Vulkan and WebGPU. The smallest test of the bindless + RT path
on each backend.
- [RTStress](examples/RTStress/) — wavefront RT benchmark: an N×N×N grid
of a cube mesh (instance-count knob `kGrid`, 512 → 8000) shaded with
primary + shadow rays. Prints a GPU timestamp-query per-pass breakdown
each second. WebGPU/DOM only.
- [Sponza](examples/Sponza/) — ray-traced Sponza atrium on both
backends. Exercises `.cmesh` / `.ctex` decompression (GPU
`VK_EXT_memory_decompression` on Vulkan, CPU on WebGPU) and a
textured closest-hit. See [its README](examples/Sponza/README.md)
for asset provenance.
- [HelloUI](examples/HelloUI/) — UI smoke test using all three tiers
(background quad, slider, progress bar, button with text label,
cursor-tracking circle).
- [CustomShader](examples/CustomShader/) — Tier 1 demo: a user-authored
compute shader inverting RGB under a list of item-circles, dispatched
alongside the standard `drawQuads`. Shipped as both
[`.comp.glsl`](examples/CustomShader/inverse-circle.comp.glsl) and
[`.comp.wgsl`](examples/CustomShader/inverse-circle.comp.wgsl).
- [Decompression](examples/Decompression/) — `Crafter::Compression`
CPU round-trip smoke test (used by the WebGPU asset path).
- [InputSystem](examples/InputSystem/) — keyboard / mouse / gamepad
event surface check.
## License
LGPL 3.0. See per-file headers and `LICENSE`.