-
Let's say I take a mesh and convert it into meshlets with a max of 64 vertices per meshlet [0, 64), and 64 triangles per meshlet [0, 64). The data looks like this:

pub struct Meshlets {
pub meshlets: Vec<meshopt_Meshlet>,
pub vertices: Vec<u32>,
pub triangles: Vec<u8>,
}
pub struct meshopt_Meshlet {
pub vertex_offset: u32,
pub triangle_offset: u32,
pub vertex_count: u32,
pub triangle_count: u32,
}

Now I have a compute shader where each workgroup renders a single meshlet. Each workgroup has 64 threads. Let's assume the meshlet in this example has a full 64 vertices and triangles.

Each thread [0, 64) will first load a vertex from the meshlet, and then write it to a workgroup-shared memory (LDS) array of 64 vertices. That's easy enough to do:

var<workgroup> workgroup_vertices: array<VertexData, 64>;
let meshlet_vertex_i = vertices[meshlet.vertex_offset + thread_index];
workgroup_vertices[thread_index] = mesh_vertex_data[meshlet_vertex_i];
workgroupBarrier();

Now, we switch modes. With all the mesh vertices needed to render the meshlet loaded into LDS, we can switch to having each thread handle 1 triangle. Each thread will want to figure out its triangle indices, and use that to load the correct vertex data from LDS.

let triangle_indices = meshlet.triangle_offset + (thread_index * 3u) + vec3(0u, 1u, 2u);
let indices = vec3(triangles[triangle_indices.x], triangles[triangle_indices.y], triangles[triangle_indices.z]);
let vertices = vec3(workgroup_vertices[TODO], workgroup_vertices[TODO], workgroup_vertices[TODO]);

Once we have the three u8's from the meshlet triangles buffer (…
Replies: 1 comment 3 replies
-
`indices` in the example above would be suitable to address LDS directly, I believe. The 8-bit indices in meshlet data point to meshlet-local vertex data. https://github.com/zeux/niagara/blob/master/src/shaders/meshlet.mesh.glsl may be helpful as an example.

So in your case the TODO would refer to `indices.x` et al. (and indices would probably use `ivec3`, although maybe the code above magically deduces component types). Note that you would only want triangle corner data for something like culling; if you are not doing triangle-level culling then you probably don't need to index LDS data and can just output corrected index data for rasterization.
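For anyone who wants to sanity-check the indexing on the CPU first, here is a minimal Rust sketch of the two levels of indirection described above. It reuses the struct layout from the question (with the meshlet struct renamed to `Meshlet`); the helper names `local_triangle`, `global_triangle`, and `example_meshlets` are hypothetical, and the sample data is made up for illustration rather than produced by a real meshlet builder.

```rust
// Same fields as the structs in the question; names shortened for the sketch.
pub struct Meshlet {
    pub vertex_offset: u32,
    pub triangle_offset: u32,
    pub vertex_count: u32,
    pub triangle_count: u32,
}

pub struct Meshlets {
    pub meshlets: Vec<Meshlet>,
    pub vertices: Vec<u32>,
    pub triangles: Vec<u8>,
}

/// Hypothetical helper: the three 8-bit corner indices of triangle `tri`.
/// These are meshlet-local, so on the GPU they would index the 64-entry
/// workgroup array directly.
pub fn local_triangle(m: &Meshlets, meshlet: &Meshlet, tri: u32) -> [u8; 3] {
    let base = (meshlet.triangle_offset + tri * 3) as usize;
    [m.triangles[base], m.triangles[base + 1], m.triangles[base + 2]]
}

/// Hypothetical helper: the same triangle expressed as mesh-wide vertex
/// indices, by going through the meshlet's slice of the `vertices` buffer.
pub fn global_triangle(m: &Meshlets, meshlet: &Meshlet, tri: u32) -> [u32; 3] {
    local_triangle(m, meshlet, tri)
        .map(|local| m.vertices[meshlet.vertex_offset as usize + local as usize])
}

/// Made-up data: two tiny meshlets, the second starting at non-zero offsets.
pub fn example_meshlets() -> Meshlets {
    Meshlets {
        meshlets: vec![
            Meshlet { vertex_offset: 0, triangle_offset: 0, vertex_count: 4, triangle_count: 2 },
            Meshlet { vertex_offset: 4, triangle_offset: 8, vertex_count: 3, triangle_count: 1 },
        ],
        // Mesh-wide vertex indices, grouped per meshlet.
        vertices: vec![10, 11, 12, 13, 20, 21, 22],
        // Meshlet-local corner indices; bytes 6..8 and 11 are unused filler.
        triangles: vec![0, 1, 2, 2, 1, 3, 0, 0, 0, 1, 2, 0],
    }
}
```

In shader terms, `local_triangle` corresponds to what `indices` holds in the question, while `global_triangle` is roughly the "corrected index data" you would emit if you skip per-triangle culling and let the rasterizer fetch vertices itself.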