Joint Error Propagation

17/08/2025

Most people who have worked with animation systems will know that one of the most annoying aspects of working with an articulated skeleton is that small rotational errors propagate down the joint chain:

Joints at the root of the character are much more sensitive to rotational errors compared to joints at the end of the chain, as even a small angular difference can cause a big difference in the final positions of limbs. For Machine Learning systems it means that if you represent your pose using local joint angles, your predictions need to be much more accurate on the joints closer to the root than on the others.

And by much I mean much. On a typical human character, an error of 5 degrees in the estimation of the hip rotation can cause both hands to move by something like 10 centimetres. An error of 5 degrees on a wrist however, only moves one of the hands by a centimetre or two.
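As a rough back-of-the-envelope check of these numbers: a point at distance r from a joint moves along a chord of length 2 r sin(θ/2) when that joint rotates by θ. The distances in this sketch are assumed, illustrative values, not measurements taken from a particular character.

```python
import numpy as np

def displacement(distance_m, angle_deg):
    """Chord length travelled by a point `distance_m` away from a joint
    when that joint rotates by `angle_deg`."""
    return 2.0 * distance_m * np.sin(np.radians(angle_deg) / 2.0)

# Assumed, illustrative distances in metres
hip_to_hand = 1.1     # hips to an outstretched hand
wrist_to_tip = 0.18   # wrist to the fingertips

hip_error = displacement(hip_to_hand, 5.0)
wrist_error = displacement(wrist_to_tip, 5.0)
print(f"hip: {100 * hip_error:.1f} cm, wrist: {100 * wrist_error:.1f} cm")
```

The displacement scales linearly with the distance to the joint, which is why the same angular error matters so much more near the root.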

Forward Kinematics Loss

How can we deal with this? Well, one common method (which I've used in the past) is to include a forward kinematics loss in your model. Put simply - you differentiably compute the global joint positions and rotations from the local ones, and add the difference between the predicted and target global positions and rotations to your loss.
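As a minimal sketch of the idea, assuming rotations are stored as 3x3 matrices and joints are ordered so that parents come before children: the function names and the equal weighting of the terms are illustrative choices, and NumPy is used only to show the computation. In a real training setup the same steps would be written in an autodiff framework so that gradients can flow through them.

```python
import numpy as np

def forward_kinematics(local_rot, local_pos, parents):
    """local_rot: [J,3,3] rotation matrices, local_pos: [J,3] offsets,
    parents: parent index per joint (-1 for the root)."""
    global_rot = np.zeros_like(local_rot)
    global_pos = np.zeros_like(local_pos)
    for j, p in enumerate(parents):
        if p == -1:
            global_rot[j], global_pos[j] = local_rot[j], local_pos[j]
        else:
            # Accumulate the parent's global transform down the chain
            global_rot[j] = global_rot[p] @ local_rot[j]
            global_pos[j] = global_pos[p] + global_rot[p] @ local_pos[j]
    return global_rot, global_pos

def fk_loss(pred_rot, true_rot, local_pos, parents):
    """Local rotation error plus global position and rotation error."""
    pred_grot, pred_gpos = forward_kinematics(pred_rot, local_pos, parents)
    true_grot, true_gpos = forward_kinematics(true_rot, local_pos, parents)
    local_term = np.mean((pred_rot - true_rot) ** 2)
    global_term = (np.mean((pred_gpos - true_gpos) ** 2) +
                   np.mean((pred_grot - true_grot) ** 2))
    return local_term + global_term
```

With a loss like this, the same angular error costs more at the root than at a leaf, because the root error shows up in the global transform of every joint below it.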

We can see the difference this makes if we do a little experiment. Here I've trained a simple two-layer auto-encoder on poses represented via local joint rotations. The latent space has only 150 dimensions, so the auto-encoder has to learn to compress the pose information in a lossy way.

This is what the reproductions look like without any forward kinematics loss:

Ouch. Even though the training loss converged to a small value, visually there are still quite significant issues with the reproductions due to the error propagation.

In comparison, here is the result I get using exactly the same setup, but with a forward kinematics loss added to the training:

Visually the result is, although not perfect, a whole lot better - as the network has been encouraged to use its capacity to reproduce the joint rotations near the root more accurately, at the cost of the reproductions of other joints nearer the leaves.

There is, however, one huge downside to the forward kinematics loss - it slows down training a lot. For a character with many joints like this one, training with the forward kinematics loss can easily be 100x slower than without it.

I've also found that forward kinematics losses can occasionally produce poor gradients, in particular if you need to do anything like converting between rotation representations, or extracting a full rotation matrix from the 2-column representation. Because of that, I've sometimes seen them make training unstable or slow to converge.

Local Joint Weighting

A much simpler idea than the forward kinematics loss is to weight (scale up or down) the error contribution from each joint when computing the loss on the local joint rotations. In this way we could make joints nearer the root contribute more to the error, and joints at the leaves less. This is easy to implement and doesn't affect training time at all - so why isn't it more popular?
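A per-joint weighted loss can be sketched in a couple of lines. This assumes rotations are stored as a `[batch, joints, D]` array in some fixed-size representation and that the weights sum to one; the function name and the example weights are hypothetical.

```python
import numpy as np

def weighted_rotation_loss(pred, target, joint_weights):
    """pred, target: [batch, J, D] local rotations,
    joint_weights: [J] per-joint weights, assumed to sum to one."""
    per_joint = np.mean((pred - target) ** 2, axis=-1)   # [batch, J]
    return np.mean(np.sum(joint_weights * per_joint, axis=-1))

# Hypothetical three-joint example where the root dominates
joint_weights = np.array([0.7, 0.2, 0.1])
target = np.zeros([2, 3, 4])
root_off = target.copy(); root_off[:, 0] += 0.1   # error on the root only
leaf_off = target.copy(); leaf_off[:, 2] += 0.1   # same error on the leaf only
print(weighted_rotation_loss(root_off, target, joint_weights),
      weighted_rotation_loss(leaf_off, target, joint_weights))
```

The extra cost over an unweighted loss is a single multiply per joint, which is why this approach has no meaningful effect on training time.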

Well, the question becomes: how do we actually compute appropriate weights for each joint to compensate for the error propagation? This isn't obvious - as the correct weight for a joint could depend on a bunch of things: the actual pose of the character, the size of the limbs, the skinned mesh being used itself.

While such weights can be set by hand, it would be nice to have at least some kind of heuristic to find initial values for them automatically.

Well here are a couple of simple heuristics I've used in the past which have worked fairly well and you might find useful if you are taking this approach over the forward kinematics loss.

Total Descendants Length

The first idea is simple - we weigh each joint based on the total length of all the descendant joints.

This can be implemented in a few lines of python like this:

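import numpy as np

# Scale applied to each child's contribution (bone length plus accumulated weight)
discount = 0.9

# Base weight for every joint: as if each had a single child joint 1cm away
joint_weights_lengths = 0.01 * np.ones([nbones], dtype=np.float32)

# Visit children before parents (requires parents[i] < i)
for i in range(nbones)[::-1]:
    assert parents[i] < i
    if parents[i] != -1:
        joint_weights_lengths[parents[i]] += (
            discount * (lengths[i] + joint_weights_lengths[i]))

# Normalize so the weights sum to one
joint_weights_lengths = joint_weights_lengths / joint_weights_lengths.sum()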

This is a kind of dynamic programming approach. First we give all joints some kind of base weighting. This is important for end-effectors, which have no descendants. In this case we can give them the weight they would have assuming they had a single child joint 1cm away.

Next we iterate backwards over the joints and add each joint's bone length plus its accumulated weight to its parent's weight, scaled by a discount factor. The discount biases our weights a little bit towards a uniform weighting, which accounts for the fact that this is an imperfect approximation of their importance.
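To get a feel for how the accumulation behaves, here is the same heuristic wrapped in a function and run on a toy chain. The function name, the four-joint skeleton and its bone lengths are all hypothetical, and `lengths[i]` is assumed to be the distance from joint `i` to its parent.

```python
import numpy as np

def descendant_length_weights(parents, lengths, discount=0.9, base=0.01):
    """Weight each joint by the discounted total length of its descendants."""
    nbones = len(parents)
    weights = base * np.ones([nbones], dtype=np.float32)
    for i in reversed(range(nbones)):   # children before parents
        if parents[i] != -1:
            weights[parents[i]] += discount * (lengths[i] + weights[i])
    return weights / weights.sum()

# Hypothetical chain: shoulder -> elbow -> wrist -> fingertip
parents = [-1, 0, 1, 2]
lengths = np.array([0.0, 0.30, 0.25, 0.10])   # metres, distance to parent
weights = descendant_length_weights(parents, lengths)
print(weights)
```

On a single chain like this the weights decrease monotonically from root to tip, and a smaller `discount` flattens them towards uniform.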

For the Geno character this is the kind of weighting it produces:

weights_lengths = {
    "Hips":             0.10380474,
    "Spine":            0.06853730,
    "Spine1":           0.07324211,
    "Spine2":           0.07805575,
    "Spine3":           0.08339483,
    "Neck":             0.00752727,
    "Neck1":            0.00599905,
    "Head":             0.00430102,
    "HeadEnd":          0.00145849,
    "RightShoulder":    0.03873121,
    "RightArm":         0.03989225,
    "RightForeArm":     0.03893000,
    "RightHand":        0.03654566,
    "RightHandThumb1":  0.00623394,
    "RightHandThumb2":  0.00481741,
    "RightHandThumb3":  0.00318426,
    "RightHandThumb4":  0.00145849,
    "RightHandIndex1":  0.00621329,
    "RightHandIndex2":  0.00469081,
    "RightHandIndex3":  0.00311762,
    "RightHandIndex4":  0.00145849,
    "RightHandMiddle1": 0.00633869,
    "RightHandMiddle2": 0.00475611,
    "RightHandMiddle3": 0.00313095,
    "RightHandMiddle4": 0.00145849,
    "RightHandRing1":   0.00627206,
    "RightHandRing2":   0.00475611,
    "RightHandRing3":   0.00313095,
    "RightHandRing4":   0.00145849,
    "RightHandPinky1":  0.00607202,
    "RightHandPinky2":  0.00463750,
    "RightHandPinky3":  0.00311762,
    "RightHandPinky4":  0.00145849,
    "RightForeArmEnd":  0.00145849,
    "RightArmEnd":      0.00145849,
    "LeftShoulder":     0.03885792,
    "LeftArm":          0.03997777,
    "LeftForeArm":      0.03906523,
    "LeftHand":         0.03654114,
    "LeftHandThumb1":   0.00622891,
    "LeftHandThumb2":   0.00481182,
    "LeftHandThumb3":   0.00317804,
    "LeftHandThumb4":   0.00145849,
    "LeftHandIndex1":   0.00621329,
    "LeftHandIndex2":   0.00469081,
    "LeftHandIndex3":   0.00311762,
    "LeftHandIndex4":   0.00145849,
    "LeftHandMiddle1":  0.00633869,
    "LeftHandMiddle2":  0.00475611,
    "LeftHandMiddle3":  0.00313095,
    "LeftHandMiddle4":  0.00145849,
    "LeftHandRing1":    0.00627206,
    "LeftHandRing2":    0.00475611,
    "LeftHandRing3":    0.00313095,
    "LeftHandRing4":    0.00145849,
    "LeftHandPinky1":   0.00607202,
    "LeftHandPinky2":   0.00463750,
    "LeftHandPinky3":   0.00311762,
    "LeftHandPinky4":   0.00145849,
    "LeftForeArmEnd":   0.00145849,
    "LeftArmEnd":       0.00145849,
    "RightUpLeg":       0.02027336,
    "RightLeg":         0.01533394,
    "RightFoot":        0.00813920,
    "RightToeBase":     0.00373863,
    "RightToeBaseEnd":  0.00145849,
    "RightLegEnd":      0.00145849,
    "RightUpLegEnd":    0.00145849,
    "LeftUpLeg":        0.02028956,
    "LeftLeg":          0.01535193,
    "LeftFoot":         0.00815920,
    "LeftToeBase":      0.00376085,
    "LeftToeBaseEnd":   0.00145849,
    "LeftLegEnd":       0.00145849,
    "LeftUpLegEnd":     0.00145849
}

And here is what the reproductions look like if we use this weighting to train the auto-encoder rather than using the forward kinematics loss.

Not bad! We can already see it gives a massive improvement over the uniform weighting in particular around the arms and hands of the character - and generally looks on-par with the forward kinematics loss - but (just like the forward kinematics loss) there are still some errors on the head and legs. Is there anything we can do about that?

Cylinder Surface Area

The previous heuristic works pretty well but has one big issue: it doesn't take into account the fact that different limbs have different thicknesses. It assumes the character is a kind of stick-figure. This means it tends to over-estimate the importance of thin branching sections of the character like fingers, and under-estimate the importance of the feet and head.

Intuitively we would expect the hand to have somewhat similar weighting to the foot. But with our previous scheme the hand will often be assigned a much higher importance due to all of the long, thin joint chains below it used for the fingers.

To get around the stick-figure assumption we can try to approximate the mesh using some basic primitives. For example, if we assume each joint can be represented by a cylinder then we can derive a nice weighting scheme based on how far the surface of each of these cylinders moves in response to the rotation of each joint.

We just need some kind of "radius" for each joint of the character - which isn't too hard to set by hand:

Then, for any point on the cylinder surface, the amount of displacement induced by a small rotation around an ascendant joint will be directly proportional to the distance of that point from the ascendant.

Therefore, the total displacement of the cylinder surface is the integral of all the distances from the ascendant joint to all points on the surface.

This integral isn't easy to solve analytically (at least it wasn't for me!) but we can solve everything numerically fairly easily by sampling a large number of random points on the surface of the cylinders (excluding the caps) and treating these as the kind of "surface" of our character. Once we have these points all we need to do is compute their distance to all ascendant joints, and multiply it by the cylinder surface area.
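Written out, the quantity being approximated looks roughly like this. The notation is introduced here for illustration only: D(j) is the set of descendants of joint j, S_b the cap-less surface of the cylinder attached to joint b, A_b its area, p_j the rest-pose position of joint j, and x_{b,i} one of N uniform samples on S_b.

```latex
w_j \;\propto\; \sum_{b \in D(j)} \int_{S_b} \lVert x - p_j \rVert \, dA
\;\approx\; \sum_{b \in D(j)} \frac{A_b}{N} \sum_{i=1}^{N} \lVert x_{b,i} - p_j \rVert
```

The right-hand side is a plain Monte Carlo estimate: each sample stands in for an equal share A_b / N of its cylinder's surface.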

So the first thing we need is something which will tell us if one joint is a descendant of another. For this we can compute the descendants_matrix:

def is_descendant(b, c):
    while b != -1:
        if parents[b] == c:
            return True
        else:
            b = parents[b]
    return False

descendants = []
for b in range(nbones):
    dec = []
    for c in range(nbones):
        if is_descendant(c, b):
            dec.append(c)
    descendants.append(dec)

# descendants_matrix[a, d] is True when joint d is a descendant of joint a
descendants_matrix = np.zeros([nbones, nbones], dtype=bool)
for b in range(nbones):
    if len(descendants[b]) > 0:
        descendants_matrix[b,np.asarray(descendants[b])] = True

The descendants_matrix is actually really neat. Each row tells us if any of the other joints are a descendant of that joint:

>>> print(descendants_matrix[names.index('RightHand')])
[False False False False False False False False False False False False
 False  True  True  True  True  True  True  True  True  True  True  True
  True  True  True  True  True  True  True  True  True  True False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False]

While each column tells us which joints are an ascendant of that joint:

>>> print(descendants_matrix.T[names.index('RightHand')])
[ True  True  True  True  True False False False False  True  True  True
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False False False False False False False False False False
 False False False]
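The same matrix can also be built by walking up from each joint to the root and marking every joint passed on the way. This is only an alternative sketch; the function name and the five-joint skeleton are hypothetical.

```python
import numpy as np

def build_descendants_matrix(parents):
    """matrix[a, d] is True when joint d is a descendant of joint a."""
    nbones = len(parents)
    matrix = np.zeros([nbones, nbones], dtype=bool)
    for d in range(nbones):
        a = parents[d]
        while a != -1:          # walk up towards the root
            matrix[a, d] = True
            a = parents[a]
    return matrix

# Hypothetical skeleton: a root with a two-joint arm and a two-joint leg
parents = [-1, 0, 1, 0, 3]
matrix = build_descendants_matrix(parents)
print(matrix[0])     # row: descendants of the root
print(matrix[:, 4])  # column: ascendants of the last joint
```

This touches each joint once per ascendant rather than testing every pair of joints, which matters little at this scale but keeps the code short.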

Then, we can take our cylinders, attach them to the character in the rest pose, and sample a fixed number of points on each cylinder. For each of these points we compute a mask saying which joints it is a descendant of, and we compute a weighting for that point based on the total surface area of the cylinder:

points_per_cylinder = 1000

cylinder_points = []
cylinder_weights = []
cylinder_joints = []
for b in range(1, nbones):
    p = parents[b]
    cs, ce = global_positions[p], global_positions[b]
    
    radius = cylinder_radii[b]
    length = np.sqrt(np.sum((ce - cs)**2, axis=-1)) / 2
    
    direction = (ce - cs) / (length + 1e-8)
    position = (cs + ce) / 2
    rotation = quat.normalize(quat.between(np.array([0,0,1]), direction))
    
    points = sample_points_on_cylinder_nocap(length, radius, size=[points_per_cylinder])
    points = quat.mul_vec(rotation, points) + position
    
    weights = (cylinder_surface_area_nocap(length, radius) * 
        np.ones([points_per_cylinder])) / points_per_cylinder
    
    mask = descendants_matrix[:,b][None].repeat(points_per_cylinder, axis=0)
    
    cylinder_points.append(points)
    cylinder_weights.append(weights)
    cylinder_joints.append(mask)
    
cylinder_points = np.concatenate(cylinder_points, axis=0)
cylinder_weights = np.concatenate(cylinder_weights, axis=0)
cylinder_joints = np.concatenate(cylinder_joints, axis=0)
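The two helper functions called above are not listed here. One plausible sketch is given below, under the assumption (suggested by the halved `length` and the midpoint `position` in the code above) that the `length` argument is the half-height of a z-aligned cylinder centred at the origin. The bodies and the optional `rng` parameter are my guess at reasonable behaviour rather than the original implementation.

```python
import numpy as np

def cylinder_surface_area_nocap(half_length, radius):
    """Lateral area of a cylinder with total height 2 * half_length."""
    return 2.0 * np.pi * radius * (2.0 * half_length)

def sample_points_on_cylinder_nocap(half_length, radius, size, rng=None):
    """Uniform samples on the side of a z-aligned cylinder centred at the origin.
    A uniform angle and a uniform height give samples uniform in area."""
    rng = np.random.default_rng() if rng is None else rng
    angle = rng.uniform(0.0, 2.0 * np.pi, size=size)
    height = rng.uniform(-half_length, half_length, size=size)
    return np.stack(
        [radius * np.cos(angle), radius * np.sin(angle), height], axis=-1)

# Example: 1000 points on a cylinder 1m tall with a 10cm radius
points = sample_points_on_cylinder_nocap(
    0.5, 0.1, size=[1000], rng=np.random.default_rng(0))
print(points.shape)
```

If the helpers instead treat `length` as the full height, the sampling range and the area would both change by a factor of two, which only rescales each cylinder's contribution.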

Once we have this we can compute the final weighting by finding the pairwise distances between each joint and each point, and multiplying them by the weights and the mask.

vertex_pairwise_distances = np.sqrt(
    np.sum((global_positions[:,None] - cylinder_points[None,:])**2, axis=-1))

joint_weights_cylinder = (cylinder_joints.T * 
    cylinder_weights * vertex_pairwise_distances).sum(axis=1) + 0.001
joint_weights_cylinder = joint_weights_cylinder / joint_weights_cylinder.sum()

This is what the final weights look like for this method:

weights_cylinders = {
    "Hips":             0.27297633,
    "Spine":            0.13206451,
    "Spine1":           0.11033192,
    "Spine2":           0.08819939,
    "Spine3":           0.07330287,
    "Neck":             0.00442493,
    "Neck1":            0.00326475,
    "Head":             0.00231714,
    "HeadEnd":          0.00057553,
    "RightShoulder":    0.02876344,
    "RightArm":         0.02260639,
    "RightForeArm":     0.00906809,
    "RightHand":        0.00220601,
    "RightHandThumb1":  0.00067190,
    "RightHandThumb2":  0.00062165,
    "RightHandThumb3":  0.00058660,
    "RightHandThumb4":  0.00057553,
    "RightHandIndex1":  0.00066820,
    "RightHandIndex2":  0.00060967,
    "RightHandIndex3":  0.00058383,
    "RightHandIndex4":  0.00057553,
    "RightHandMiddle1": 0.00068704,
    "RightHandMiddle2": 0.00061507,
    "RightHandMiddle3": 0.00058420,
    "RightHandMiddle4": 0.00057553,
    "RightHandRing1":   0.00067742,
    "RightHandRing2":   0.00061514,
    "RightHandRing3":   0.00058436,
    "RightHandRing4":   0.00057553,
    "RightHandPinky1":  0.00064885,
    "RightHandPinky2":  0.00060524,
    "RightHandPinky3":  0.00058378,
    "RightHandPinky4":  0.00057553,
    "RightForeArmEnd":  0.00057553,
    "RightArmEnd":      0.00057553,
    "LeftShoulder":     0.02932036,
    "LeftArm":          0.02301557,
    "LeftForeArm":      0.00944353,
    "LeftHand":         0.00221284,
    "LeftHandThumb1":   0.00067240,
    "LeftHandThumb2":   0.00062124,
    "LeftHandThumb3":   0.00058642,
    "LeftHandThumb4":   0.00057553,
    "LeftHandIndex1":   0.00066835,
    "LeftHandIndex2":   0.00060955,
    "LeftHandIndex3":   0.00058377,
    "LeftHandIndex4":   0.00057553,
    "LeftHandMiddle1":  0.00068761,
    "LeftHandMiddle2":  0.00061551,
    "LeftHandMiddle3":  0.00058439,
    "LeftHandMiddle4":  0.00057553,
    "LeftHandRing1":    0.00067797,
    "LeftHandRing2":    0.00061555,
    "LeftHandRing3":    0.00058441,
    "LeftHandRing4":    0.00057553,
    "LeftHandPinky1":   0.00064840,
    "LeftHandPinky2":   0.00060536,
    "LeftHandPinky3":   0.00058382,
    "LeftHandPinky4":   0.00057553,
    "LeftForeArmEnd":   0.00057553,
    "LeftArmEnd":       0.00057553,
    "RightUpLeg":       0.05388537,
    "RightLeg":         0.01976189,
    "RightFoot":        0.00317685,
    "RightToeBase":     0.00108218,
    "RightToeBaseEnd":  0.00057553,
    "RightLegEnd":      0.00057553,
    "RightUpLegEnd":    0.00057553,
    "LeftUpLeg":        0.05387436,
    "LeftLeg":          0.01964283,
    "LeftFoot":         0.00318803,
    "LeftToeBase":      0.00109653,
    "LeftToeBaseEnd":   0.00057553,
    "LeftLegEnd":       0.00057553,
    "LeftUpLegEnd":     0.00057553
}

And here is how the reproductions look with this weighting scheme:

Nice! As we hoped, this method weights the feet, spine and head more heavily relative to the hands and fingers, and therefore produces much better results there. This makes a huge difference, not just for this character, but also for characters with extreme proportions such as large heads (as we can scale up the radius of the cylinder to make this importance clear).

Skinned Mesh Weighting

If we have the actual skinned mesh of the character we don't need to rely on the cylinder approximation. Instead we can directly compute a weighting based on how much each joint moves each vertex on the surface.

First we need to compute a mask saying how much each vertex is assigned to each joint using the skinning weights:

mesh_joints = np.zeros([vert_num, nbones])
for vi in range(vert_num):
    for bi, bw in zip(vert_bind[vi], vert_bwei[vi]):
        mesh_joints[vi, bi] += bw
        mesh_joints[vi, descendants_matrix[:,bi].astype(bool)] += bw

Then we can compute the weighting using the pairwise distances again, multiplied by the area of the mesh represented by each vertex and our mask.

vertex_pairwise_distances = np.sqrt(
    np.sum((global_positions[:,None] - vert_pos[None,:])**2, axis=-1))

joint_weights_mesh = (mesh_joints.T * 
    vert_areas * vertex_pairwise_distances).sum(axis=1) + 0.001
joint_weights_mesh = joint_weights_mesh / joint_weights_mesh.sum()
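The `vert_areas` array used above is the area of the mesh represented by each vertex. One common way to estimate it, shown here as a sketch and not necessarily how it was computed for these results, is to give each vertex one third of the area of every triangle it belongs to. The function name and the `tris` index array are hypothetical.

```python
import numpy as np

def vertex_areas(vert_pos, tris):
    """vert_pos: [V,3] vertex positions, tris: [T,3] vertex indices per triangle.
    Returns [V] areas, each vertex getting a third of its triangles' areas."""
    a, b, c = vert_pos[tris[:, 0]], vert_pos[tris[:, 1]], vert_pos[tris[:, 2]]
    tri_areas = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=-1)
    areas = np.zeros(len(vert_pos))
    for corner in range(3):
        # np.add.at accumulates correctly when a vertex index repeats
        np.add.at(areas, tris[:, corner], tri_areas / 3.0)
    return areas

# Unit square made of two triangles sharing the diagonal 0-2
square = np.array([[0.0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
square_tris = np.array([[0, 1, 2], [0, 2, 3]])
square_areas = vertex_areas(square, square_tris)
print(square_areas, square_areas.sum())
```

The per-vertex areas sum to the total surface area of the mesh, so densely tessellated regions such as the hands do not get over-counted.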

This is what the weights produced by this method look like:

weights_mesh = {
    "Hips":             0.27088639,
    "Spine":            0.12776886,
    "Spine1":           0.10730254,
    "Spine2":           0.08733685,
    "Spine3":           0.07508411,
    "Neck":             0.00838600,
    "Neck1":            0.00639638,
    "Head":             0.00515253,
    "HeadEnd":          0.00063045,
    "RightShoulder":    0.02654437,
    "RightArm":         0.02060832,
    "RightForeArm":     0.00825604,
    "RightHand":        0.00213240,
    "RightHandThumb1":  0.00073802,
    "RightHandThumb2":  0.00066565,
    "RightHandThumb3":  0.00063558,
    "RightHandThumb4":  0.00063045,
    "RightHandIndex1":  0.00070377,
    "RightHandIndex2":  0.00064898,
    "RightHandIndex3":  0.00063289,
    "RightHandIndex4":  0.00063045,
    "RightHandMiddle1": 0.00072178,
    "RightHandMiddle2": 0.00065547,
    "RightHandMiddle3": 0.00063321,
    "RightHandMiddle4": 0.00063045,
    "RightHandRing1":   0.00070793,
    "RightHandRing2":   0.00065231,
    "RightHandRing3":   0.00063322,
    "RightHandRing4":   0.00063045,
    "RightHandPinky1":  0.00067184,
    "RightHandPinky2":  0.00063829,
    "RightHandPinky3":  0.00063110,
    "RightHandPinky4":  0.00063045,
    "RightForeArmEnd":  0.00063045,
    "RightArmEnd":      0.00063045,
    "LeftShoulder":     0.02739252,
    "LeftArm":          0.02113067,
    "LeftForeArm":      0.00849728,
    "LeftHand":         0.00210641,
    "LeftHandThumb1":   0.00071845,
    "LeftHandThumb2":   0.00065790,
    "LeftHandThumb3":   0.00063489,
    "LeftHandThumb4":   0.00063045,
    "LeftHandIndex1":   0.00069211,
    "LeftHandIndex2":   0.00064446,
    "LeftHandIndex3":   0.00063293,
    "LeftHandIndex4":   0.00063045,
    "LeftHandMiddle1":  0.00071069,
    "LeftHandMiddle2":  0.00065042,
    "LeftHandMiddle3":  0.00063314,
    "LeftHandMiddle4":  0.00063045,
    "LeftHandRing1":    0.00070524,
    "LeftHandRing2":    0.00065236,
    "LeftHandRing3":    0.00063302,
    "LeftHandRing4":    0.00063045,
    "LeftHandPinky1":   0.00067250,
    "LeftHandPinky2":   0.00064092,
    "LeftHandPinky3":   0.00063160,
    "LeftHandPinky4":   0.00063045,
    "LeftForeArmEnd":   0.00063045,
    "LeftArmEnd":       0.00063045,
    "RightUpLeg":       0.05690333,
    "RightLeg":         0.02043630,
    "RightFoot":        0.00305942,
    "RightToeBase":     0.00080056,
    "RightToeBaseEnd":  0.00063045,
    "RightLegEnd":      0.00063045,
    "RightUpLegEnd":    0.00063045,
    "LeftUpLeg":        0.05668447,
    "LeftLeg":          0.02033588,
    "LeftFoot":         0.00289429,
    "LeftToeBase":      0.00078392,
    "LeftToeBaseEnd":   0.00063045,
    "LeftLegEnd":       0.00063045,
    "LeftUpLegEnd":     0.00063045
}

The fact that these weights are similar to our cylinder weights is a good sanity check for both methods.

And here is how this weighting method looks with our reproductions:

Even better - now the reproductions are relatively close to perfect and certainly even better than the forward kinematics loss. It really is night and day compared to our uniformly weighted auto-encoder.

If you're using the Geno character for your research the weights above are probably close to the optimal per-joint weighting you can use if you want to avoid joint error propagation. If you're not using the Geno character I'm pretty sure that you could still take these weights and map them to the joints on your own character and it would work equally well.
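Mapping the weights onto another skeleton could be as simple as a name lookup followed by a renormalization. The sketch below assumes you provide the name mapping yourself; the function name, the target joint names and the `floor` value given to unmapped joints are all hypothetical.

```python
def remap_joint_weights(source_weights, name_map, floor=1e-3):
    """source_weights: dict of source joint name -> weight.
    name_map: dict of target joint name -> source joint name, or None
    when the target joint has no counterpart (it then gets `floor`)."""
    raw = {}
    for target, source in name_map.items():
        raw[target] = source_weights.get(source, floor) if source else floor
    total = sum(raw.values())
    # Renormalize so the remapped weights sum to one again
    return {name: w / total for name, w in raw.items()}

# A few of the mesh weights from above, mapped to a hypothetical skeleton
source = {"Hips": 0.27088639, "Spine": 0.12776886, "RightUpLeg": 0.05690333}
name_map = {"pelvis": "Hips", "spine_01": "Spine",
            "thigh_r": "RightUpLeg", "tail": None}
target_weights = remap_joint_weights(source, name_map)
print(target_weights)
```

If several joints on your character correspond to one Geno joint (or the other way round), you will need to decide how to split or merge the weights by hand.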

But mainly I hope this little study has been insightful and convinced you of the importance of dealing with error propagation when training any models that work in the local joint space.

The full source code for this article can be found here.