glTF rendering
Some notes on loading and rendering glTF scene files.
glTF loading
If you are using C or C++, cgltf is worth a look: it is a single-header / single-source C library for reading glTF files.
Top-down vs Bottom-up Loading
After having implemented a fair amount of glTF loading and rendering, it seems there are basically two general strategies to go about this:
- Loading the scene top-down, starting from the scenes down to buffers.
- Loading the scene bottom-up, from buffers up to scenes.
I started and stayed with the latter, but the former has some benefits too.
The top-down approach gives you context. For example, and as explained in the next section, you will be able to determine how a texture is used based on the material that references it, and consequently, you will know what colour space to expect the texture in. The downside of the top-down approach is that since resources like textures or buffers can be shared among meshes, you will need to implement some sort of caching to avoid loading the same resource twice. The bottom-up approach has the opposite trade-offs: resources can be visited in a linear fashion and each one is seen exactly once, but resources like textures cannot be fully loaded until the context is known, so their loading has to be deferred.
Textures and Colour Spaces
Albedo textures are given in sRGB space. Normal maps, metallic/roughness, and other non-colour textures should be in linear space. Make sure to use the right format (RGB vs SRGB in OpenGL, for example GL_RGB8 vs GL_SRGB8). It is also best to load the textures in the right format, so that the GPU converts sRGB to linear when sampling, instead of performing the sRGB->linear conversion in shader code. [discussion].
Note that the intended use of a texture is not immediately clear when looking at the textures, images and samplers sections of a glTF file. See the DamagedHelmet sample:
"images" : [
{
"uri" : "Default_albedo.jpg"
},
{
"uri" : "Default_metalRoughness.jpg"
},
{
"uri" : "Default_emissive.jpg"
},
{
"uri" : "Default_AO.jpg"
},
{
"uri" : "Default_normal.jpg"
}
],
...
"samplers" : [
{}
],
...
"textures" : [
{
"sampler" : 0,
"source" : 0
},
{
"sampler" : 0,
"source" : 1
},
{
"sampler" : 0,
"source" : 2
},
{
"sampler" : 0,
"source" : 3
},
{
"sampler" : 0,
"source" : 4
}
]
Instead, the use becomes apparent when parsing the materials of a mesh primitive. Below, each index refers to a texture in the snippet above:
"materials" : [
{
"emissiveFactor" : [
1.0,
1.0,
1.0
],
"emissiveTexture" : {
"index" : 2
},
"name" : "Material_MR",
"normalTexture" : {
"index" : 4
},
"occlusionTexture" : {
"index" : 3
},
"pbrMetallicRoughness" : {
"baseColorTexture" : {
"index" : 0
},
"metallicRoughnessTexture" : {
"index" : 1
}
}
}
],
My implementation loads textures lazily. It first scans all textures up front, but instead of uploading them to GPU memory, which requires knowing the format, it instead returns a list of “load commands”. These commands describe how to read the texture (disk or memory) and what parameters to use, with a default format of sRGB. Later, when loading materials, if a normal texture is detected, the relevant load command is patched to use a linear colour space instead. The textures are then uploaded to GPU memory also during material loading.
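The following is a rough sketch of that idea, not the actual engine code. All type and function names (TextureLoadCmd, scan_textures, mark_linear, upload_texture) are hypothetical, and the GPU upload itself is elided.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not taken from any library. */
typedef enum { COLOUR_SPACE_SRGB, COLOUR_SPACE_LINEAR } ColourSpace;

typedef struct {
    const char* uri;          /* Where to read the image from. */
    ColourSpace colour_space; /* Defaults to sRGB; patched by materials. */
    bool        uploaded;     /* Set once the texture is in GPU memory. */
} TextureLoadCmd;

/* Pass 1: record how to load each texture without uploading anything. */
static void scan_textures(const char* const* uris, size_t count,
                          TextureLoadCmd* cmds) {
    for (size_t i = 0; i < count; ++i) {
        cmds[i].uri          = uris[i];
        cmds[i].colour_space = COLOUR_SPACE_SRGB;
        cmds[i].uploaded     = false;
    }
}

/* Pass 2: called while loading a material that uses texture `index` as
 * non-colour data, such as a normal map. */
static void mark_linear(TextureLoadCmd* cmds, size_t index) {
    cmds[index].colour_space = COLOUR_SPACE_LINEAR;
}

/* Upload once the format is known. Textures shared between materials are
 * uploaded only once. */
static void upload_texture(TextureLoadCmd* cmd) {
    if (cmd->uploaded) {
        return;
    }
    /* ... decode cmd->uri and create the GPU texture using
     * cmd->colour_space to pick the format (elided) ... */
    cmd->uploaded = true;
}
```

Material loading would call mark_linear for each non-colour texture slot it encounters and then upload_texture for every texture the material references.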
Tangents
The DamagedHelmet sample has a normal map but no tangent vectors. In such cases, the glTF spec currently says:
“When tangents are not specified, client implementations SHOULD calculate tangents using default MikkTSpace algorithms with the specified vertex positions, normals, and texture coordinates associated with the normal texture.”
However, the MikkTSpace documentation also says:
// Note that the results are returned unindexed. It is possible to generate a new index list
// But averaging/overwriting tangent spaces by using an already existing index list WILL produce INCRORRECT results.
// DO NOT! use an already existing index list.
In other words, if the model has vertex indices like DamagedHelmet, then we should not compute tangents with the model as is. At the very least, we should unindex the model, compute the tangents, and then re-index it. [discussion]
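The unindexing step is simple. Here is a minimal sketch, assuming a hypothetical Vec3 type and 32-bit indices; the same expansion would be applied to positions, normals and texture coordinates before handing the mesh to MikkTSpace. Re-indexing afterwards means merging vertices whose attributes, including the new tangent, are all identical, and is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vertex attribute type for illustration. */
typedef struct { float x, y, z; } Vec3;

/* Expand an indexed attribute stream so that every triangle corner gets
 * its own copy of the attribute. `out` must have room for `index_count`
 * elements. */
static void unindex_vec3(const Vec3* in, const uint32_t* indices,
                         size_t index_count, Vec3* out) {
    for (size_t i = 0; i < index_count; ++i) {
        out[i] = in[indices[i]];
    }
}
```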
What the glTF Sample Viewer implementation does instead is approximate the tangent in screen space when the model has no tangents: [source]
vec3 uv_dx = dFdx(vec3(UV, 0.0));
vec3 uv_dy = dFdy(vec3(UV, 0.0));
vec3 t_ = (uv_dy.t * dFdx(v_Position) - uv_dx.t * dFdy(v_Position))
/ (uv_dx.s * uv_dy.t - uv_dy.s * uv_dx.t);
Animation Data Structures
Some of the glTF animation terminology can be confusing. glTF also has a very flexible animation specification. I found it easier to implement my engine based on how glTF works than to try to bend glTF into the animation framework I had in my head. My engine only loads glTF anyway.
A glTF skin is what most people call the skeleton. The glTF skeleton is the root node of the hierarchy. And behold, it’s optional; a glTF scene need not specify a root. Also, the skeletons in a glTF scene generally form a forest, not just a single tree. For example, you could have a single-bone skeleton animating a door, a full-blown skeleton for a character, and various other skeletons for other elements of the scene.
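To make the terminology concrete, here is one way an engine-side structure could mirror a glTF skin. This is only a sketch: the Skin type, its field names and NO_NODE are hypothetical, while the three properties it mirrors (joints, inverseBindMatrices and the optional skeleton) come from the glTF skin object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sentinel meaning "the skin does not specify a skeleton root". */
#define NO_NODE ((size_t)-1)

/* Hypothetical engine-side mirror of a glTF skin. */
typedef struct {
    const size_t* joints;                /* Node indices of the joints. */
    size_t        joint_count;
    const float*  inverse_bind_matrices; /* 16 floats per joint, or NULL
                                            when the skin omits them. */
    size_t        skeleton;              /* Root node index, or NO_NODE. */
} Skin;

/* The root is optional, so callers must check before using it. */
static bool skin_has_root(const Skin* skin) {
    return skin->skeleton != NO_NODE;
}
```

A scene then holds an array of such skins, one per tree in the forest: the door, the character, and so on.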
Quaternion Interpolation
Rotations in a glTF skeleton are given using quaternions. You will have to interpolate them. This is my interpolation function:
/// Interpolate two unit quaternions using spherical linear interpolation.
///
/// Note: You might want to normalize the result.
static inline quat qslerp(quat a, quat b, R t) {
  assert(0.0 <= t);
  assert(t <= 1.0);
  const R eps = 1e-5;
  (void)eps;
  assert(R_eq(qnorm2(a), 1.0, eps));
  assert(R_eq(qnorm2(b), 1.0, eps));
  R dot = qdot(a, b);
  // Make the rotation path follow the "short way", i.e., ensure that:
  //   -90 <= angle <= 90
  if (dot < 0.0) {
    dot = -dot;
    b   = qneg(b);
  }
  // For numerical stability, perform linear interpolation when the two
  // quaternions are close to each other.
  R ta, tb;
  if (1.0 - dot > 1e-6) {
    const R theta     = acos(dot);
    const R sin_theta = sqrt(1 - dot * dot);
    ta                = sin(theta * (1.0 - t)) / sin_theta;
    tb                = sin(theta * t) / sin_theta;
  } else { // Linear interpolation.
    ta = 1.0 - t;
    tb = t;
  }
  return qadd(qscale(a, ta), qscale(b, tb));
}
(Functions like qscale and qadd have trivial implementations to scale and add.)
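For completeness, here is one possible set of helpers that would make qslerp compile. These are my guesses at the obvious definitions, not the engine's actual code: the scalar type R, the quat layout, and the reading of qnorm2 as the squared norm are all assumptions.

```c
#include <math.h>
#include <stdbool.h>

typedef float R; /* Assumed scalar type; could equally be double. */

typedef struct { R w, x, y, z; } quat; /* Assumed component layout. */

/* Component-wise sum. */
static inline quat qadd(quat a, quat b) {
    return (quat){ a.w + b.w, a.x + b.x, a.y + b.y, a.z + b.z };
}

/* Multiply every component by a scalar. */
static inline quat qscale(quat a, R s) {
    return (quat){ a.w * s, a.x * s, a.y * s, a.z * s };
}

/* Negate every component; -q represents the same rotation as q. */
static inline quat qneg(quat a) {
    return qscale(a, (R)-1);
}

/* 4D dot product. */
static inline R qdot(quat a, quat b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Squared norm (assumed meaning of the name). */
static inline R qnorm2(quat a) {
    return qdot(a, a);
}

/* Approximate equality within an absolute tolerance. */
static inline bool R_eq(R a, R b, R eps) {
    return fabs(a - b) <= eps;
}
```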
The implementation is based on the Wikipedia article on Slerp and the relevant chapter in Mathematics for 3D Game Programming and Computer Graphics. One addition not mentioned in those references is the check:
if (1.0 - dot > 1e-6) {
The idea is that if the two quaternions are close to each other, we fall back to linear interpolation. This is not just for speed: as dot approaches 1, sin_theta approaches 0 and the weights ta and tb degenerate towards 0/0. I ran into garbage values otherwise.