Inertialization Transition Cost
Created on June 2, 2022, 3:21 p.m.
Something which is often required by animation systems (Motion Matching being a key example) is a way to compute a "cost" associated with a particular transition between two frames of animation.
In Motion Matching this is typically done by taking the difference between "features" of both the source and destination animations: usually the positions of a few key joints (such as the feet), as well as their velocities. The magnitudes (or squared magnitudes) of the differences of these feature values are computed, and then added together using some user-specified or automatically computed weights.
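As a minimal sketch of this conventional cost (the function name, the feature layout, and the numbers below are all made up for illustration and do not come from any particular Motion Matching implementation), the weighted sum of squared feature differences might look like this:

```python
import numpy as np

def feature_transition_cost(src_features, dst_features, weights):
    """Weighted sum of squared differences between two feature vectors."""
    src = np.asarray(src_features, dtype=float)
    dst = np.asarray(dst_features, dtype=float)
    w = np.asarray(weights, dtype=float)
    diff = src - dst
    return float(np.sum(w * diff * diff))

# Hypothetical 1D features: [foot position, foot velocity]
src = [0.10, 1.0]
dst = [0.30, 0.5]
weights = [1.0, 0.25]  # hand-picked weights balancing position against velocity

cost = feature_transition_cost(src, dst, weights)
print(cost)
```

Note how the result depends entirely on the choice of `weights`, which is the first of the two problems discussed next.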
Although this is both simple and fast, there are two limitations to this notion of a "transition cost". The first problem is that we're mixing units (such as positions and velocities). This is why we need some kind of weighting terms to balance the contribution from each of these things. Setting good weights by hand is quite difficult, so we often resort to some kind of statistical normalization instead - something which can be a little brittle to changes in our data.
The second problem is that this does not account for the relationships between these features. Some particular combinations of joint position and velocity can result in an unexpectedly good transition, which is not accounted for when we just add up those differences separately (as we will see later).
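To make the "statistical normalization" idea concrete, here is one common way it could be done: scaling each feature column by its mean and standard deviation over the database. This is only a sketch under my own assumptions (the function name and the tiny example database are hypothetical), and other normalization schemes are equally plausible:

```python
import numpy as np

def normalize_features(features, eps=1e-8):
    """Standardize each feature column to roughly zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0) + eps  # eps avoids dividing by zero for constant columns
    return (features - mean) / std, mean, std

# Hypothetical database: one row per frame, columns are [position, velocity]
database = np.array([[0.0, 10.0],
                     [1.0, 30.0],
                     [2.0, 50.0]])

normalized, mean, std = normalize_features(database)
print(normalized)
```

Because `mean` and `std` are computed from the data itself, adding or removing animations changes the effective weights, which is where the brittleness comes from.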
While I was thinking about this problem, I wondered if we could do something better by assuming our transitions are performed using inertialization. The idea I came up with was this: let's define our transition cost as the total displacement caused by an inertialized transition, i.e. the area of the graph between the source and destination animations:
When we're using a critically damped spring to produce this offset between source and destination, something like this area can be computed directly using the integral of the spring equation.
For example, here's the integral of a critically damped spring, with target position zero and target velocity zero (as used to decay the offset of the inertializer), where the initial position is given as \( x \), the initial velocity as \( v \) and half-damping \( y \) is as defined here:
\begin{align*} \int (e^{-y \cdot t} \cdot (x + (v + x \cdot y) \cdot t))\, dt = -e^{-y \cdot t} \cdot \tfrac{1}{y^2} \cdot (t \cdot v \cdot y + x \cdot y \cdot (t \cdot y+2) + v) \end{align*}
Then, by taking the definite integral of this equation from zero to infinity, we can get an equation which tells us the total area under the graph:
\begin{align*} \int_0^\infty (e^{-y \cdot t} \cdot (x + (v + x \cdot y) \cdot t))\, dt &= \\ -e^{-y \cdot \infty} \cdot \tfrac{1}{y^2} \cdot (\infty \cdot v \cdot y + x \cdot y \cdot (\infty \cdot y+2) + v) &- \\ \left( -e^{-y \cdot 0} \cdot \tfrac{1}{y^2} \cdot (0 \cdot v \cdot y + x \cdot y \cdot (0 \cdot y+2) + v) \right) &= \\ 0 + \tfrac{1}{y^2} \cdot (x \cdot y \cdot 2 + v) &= \frac{2 \cdot x \cdot y + v}{y^2} \end{align*}
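As a quick sanity check of this closed form, we can compare it against a brute-force numerical integration of the spring equation. This is just a sketch: the body of `halflife_to_damping` below is my assumption about the helper from the spring article referenced above, and the integration range and step count are arbitrary choices.

```python
import numpy as np

def halflife_to_damping(halflife, eps=1e-5):
    # Assumed definition of the helper from the referenced spring article.
    return (4.0 * 0.69314718056) / (halflife + eps)

def decay_offset(x, v, y, t):
    """Critically damped spring decaying towards zero from offset x, velocity v."""
    return np.exp(-y * t) * (x + (v + x * y) * t)

def signed_area_closed_form(x, v, y):
    """Definite integral of the decaying offset from zero to infinity."""
    return (2.0 * x * y + v) / (y * y)

def signed_area_numeric(x, v, y, t_max=10.0, n=200001):
    """Trapezoidal approximation of the same integral on [0, t_max]."""
    t = np.linspace(0.0, t_max, n)
    f = decay_offset(x, v, y, t)
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

y = halflife_to_damping(0.15) / 2.0
closed = signed_area_closed_form(0.5, -1.0, y)
numeric = signed_area_numeric(0.5, -1.0, y)
print(closed, numeric)
```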
However, this isn't exactly what we're looking for - and it won't exactly give us the area of the graph between source and destination animations in all cases. The first reason for this is that this number can be negative: springs which start with a negative displacement will produce a graph primarily below zero, resulting in a negative integral:
The second, worse problem, is that some graphs can even cross zero - producing an area both above and below zero, leading to an integral which underestimates the total area since it subtracts one from the other:
In these cases we need to split the integral into two parts - first computing the absolute value of the area before the graph crosses zero, and then the part after.
To find the time at which the graph crosses zero we can set the spring equation to equal zero and solve for time:
\begin{align*} e^{-y \cdot t} \cdot (x + (v + x \cdot y) \cdot t) = 0 \\ x + (v + x \cdot y) \cdot t = 0 \\ (v + x \cdot y) \cdot t = -x \\ t = \frac{-x}{v + x \cdot y} \end{align*}
The trick here is to remove the \( e^{-y \cdot t} \) term as for any value of \( t \) it simply scales the whole of the left hand side of the equation and so does not affect the point at which 0 is crossed.
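Here is a small sketch of that zero-crossing computation. The function names are my own, and the half-damping value `9.0` is just an arbitrary number chosen to keep the arithmetic easy to follow. A crossing only matters to us if it happens at a positive time, so the sketch returns `None` otherwise:

```python
import numpy as np

def decay_offset(x, v, y, t):
    """Critically damped spring decaying towards zero from offset x, velocity v."""
    return np.exp(-y * t) * (x + (v + x * y) * t)

def zero_crossing_time(x, v, y):
    """Time t > 0 at which the decaying offset crosses zero, or None if it never does."""
    denom = v + x * y
    if denom == 0.0:
        return None  # the offset is a pure exponential decay and never crosses zero
    t = -x / denom
    return t if t > 0.0 else None

# Positive offset with a large negative velocity: overshoots through zero.
t_cross = zero_crossing_time(1.0, -20.0, 9.0)
print(t_cross, decay_offset(1.0, -20.0, 9.0, t_cross))

# Positive offset with zero velocity: decays without ever crossing.
print(zero_crossing_time(1.0, 0.0, 9.0))
```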
Then, we need three different definite integrals, first from \( 0 \) to \( t \), from \( t \) to \( \infty \), and finally the one we already computed from \( 0 \) to \( \infty \):
\begin{align*} I(x, v, y, t) &= (e^{-y \cdot t} \cdot (x + (v + x \cdot y) \cdot t)) \\ \int_0^t I(x, v, y, t)\, dt &= \frac{2 \cdot x \cdot y + v}{y^2} - e^{-y \cdot t} \cdot \tfrac{1}{y^2} \cdot (t \cdot v \cdot y + x \cdot y \cdot (t \cdot y+2) + v) \\ \int_t^\infty I(x, v, y, t)\, dt &= e^{-y \cdot t} \cdot \tfrac{1}{y^2} \cdot (t \cdot v \cdot y + x \cdot y \cdot (t \cdot y+2) + v) \\ \int_0^\infty I(x, v, y, t)\, dt &= \frac{2 \cdot x \cdot y + v}{y^2} \\ \end{align*}
With these three integrals we can get an accurate estimate of the area of our transition. Given an initial positional difference x, velocity difference v, and a halflife, we can compute the total displacement as follows:
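import numpy as np

# halflife_to_damping is the helper from the spring article referenced above.

def decay_spring_damper_intersection(x, v, halflife, eps=1e-8):
    # Time at which the decaying offset crosses zero (negative if it never does).
    y = halflife_to_damping(halflife) / 2.0
    # eps guards against dividing by zero when v + x*y vanishes.
    return -x / ((v + x * y) + eps)

def decay_spring_damper_displacement(x, v, halflife):
    y = halflife_to_damping(halflife) / 2.0
    t = decay_spring_damper_intersection(x, v, halflife)
    # Signed area from 0 to infinity, and signed area from t to infinity.
    int_0 = (x*y*2 + v) / (y*y)
    int_t = (np.exp(-y*t) / (y*y)) * (t*v*y + x*y*(t*y+2) + v)
    # If the graph crosses zero at t > 0, add up the two unsigned areas.
    return np.where(t > 0.0,
        abs(int_0 - int_t) + abs(int_t),
        abs(int_0))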
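One way to gain some confidence in this is to compare it against a brute-force numerical integral of the absolute value of the offset. The sketch below is a self-contained scalar variant written for that purpose (the names are my own, the body of `halflife_to_damping` is my assumption about the helper from the referenced spring article, and the test values are arbitrary):

```python
import numpy as np

def halflife_to_damping(halflife, eps=1e-5):
    # Assumed definition of the helper from the referenced spring article.
    return (4.0 * 0.69314718056) / (halflife + eps)

def displacement_closed_form(x, v, halflife, eps=1e-8):
    """Total unsigned area under the decaying offset (scalar inputs)."""
    y = halflife_to_damping(halflife) / 2.0
    t = -x / ((v + x * y) + eps)             # zero-crossing time
    int_0 = (2.0 * x * y + v) / (y * y)      # signed area on [0, inf)
    if t <= 0.0:
        return abs(int_0)                    # no crossing after the start
    int_t = (np.exp(-y * t) / (y * y)) * (t * v * y + x * y * (t * y + 2.0) + v)
    return abs(int_0 - int_t) + abs(int_t)   # area before plus area after crossing

def displacement_numeric(x, v, halflife, t_max=10.0, n=400001):
    """Trapezoidal integral of |offset| on [0, t_max]."""
    y = halflife_to_damping(halflife) / 2.0
    t = np.linspace(0.0, t_max, n)
    f = np.abs(np.exp(-y * t) * (x + (v + x * y) * t))
    dt = t[1] - t[0]
    return float(np.sum(0.5 * (f[1:] + f[:-1])) * dt)

# A case that crosses zero (large opposing velocity) and one that does not.
for x, v in [(0.2, -5.0), (0.2, 1.0)]:
    print(x, v, displacement_closed_form(x, v, 0.15), displacement_numeric(x, v, 0.15))
```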
But given we want to use this as a cost function, what does it actually look like? Well here's a 2D plot, with difference in position on the x-axis and difference in velocity on the y-axis (the halflife here is set to 0.15):
We can compare this to the cost function we would get if we just summed the absolute differences in position and velocity (here the velocity difference is scaled by 0.5):
The main difference we can see is that our new transition cost resembles more of a skewed valley than an inverted pyramid - the cost of a positive offset can actually be low - as long as it's combined with a negative velocity offset of the right magnitude. This makes sense - a large velocity offset can initialize the spring in a way such that it corrects itself back to zero more quickly than an initial velocity offset of zero:
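The valley can also be seen numerically. In the sketch below (scalar code with my own function names, an assumed body for `halflife_to_damping`, and an arbitrary positional offset of `0.2`), we scan over negative velocity offsets and find the one with the lowest displacement cost, then compare against the summed absolute difference cost at the same point:

```python
import numpy as np

def halflife_to_damping(halflife, eps=1e-5):
    # Assumed definition of the helper from the referenced spring article.
    return (4.0 * 0.69314718056) / (halflife + eps)

def displacement_cost(x, v, halflife, eps=1e-8):
    """Total unsigned area under the decaying offset (scalar inputs)."""
    y = halflife_to_damping(halflife) / 2.0
    t = -x / ((v + x * y) + eps)
    int_0 = (2.0 * x * y + v) / (y * y)
    if t <= 0.0:
        return abs(int_0)
    int_t = (np.exp(-y * t) / (y * y)) * (t * v * y + x * y * (t * y + 2.0) + v)
    return abs(int_0 - int_t) + abs(int_t)

def summed_abs_cost(x, v, vel_scale=0.5):
    """Baseline: absolute position difference plus scaled absolute velocity difference."""
    return abs(x) + vel_scale * abs(v)

x, halflife = 0.2, 0.15
velocities = np.linspace(-8.0, 0.0, 81)
costs = np.array([displacement_cost(x, v, halflife) for v in velocities])

best_v = float(velocities[np.argmin(costs)])
best_cost = float(costs.min())
cost_at_zero_v = displacement_cost(x, 0.0, halflife)

print("displacement cost:", cost_at_zero_v, "at v=0 versus", best_cost, "at v =", best_v)
print("summed abs cost:  ", summed_abs_cost(x, 0.0), "at v=0 versus", summed_abs_cost(x, best_v))
```

The displacement cost is lowest at a non-zero negative velocity offset, while the baseline can only get worse as the velocity difference grows.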
But the problem with this cost function in practice is that since it's not a normal euclidean distance, it's not totally clear how we might integrate it with the rest of our system and exploit many of the common acceleration structures used in Motion Matching.
It would be nice if instead there were a "feature" we could compute which emulated the behavior of this cost function.
Well, as long as we're willing to accept the approximation \( \left| \tfrac{2 \cdot x \cdot y + v}{y^2} \right| \) which somewhat under-estimates the cost for oscillations around zero, there is!
The first step is to expand out the position difference \( x \) and velocity difference \( v \) from the source \( s \) and destination animations \( d \) into \( s_{pos} \), \( d_{pos} \), \( s_{vel} \), and \( d_{vel} \), giving:
\begin{align*} \left| \frac{2 \cdot (s_{pos} - d_{pos}) \cdot y + (s_{vel} - d_{vel})}{y^2} \right| \end{align*}
Then, we just need to re-arrange it until we've got all the \( s \) variables on one side of a difference, and all the \( d \) on the other:
\begin{align*} &= \left| \frac{2 \cdot y \cdot s_{pos} - 2 \cdot y \cdot d_{pos} + s_{vel} - d_{vel}}{y^2} \right| \\ &= \left| \frac{(2 \cdot y \cdot s_{pos} + s_{vel}) - (2 \cdot y \cdot d_{pos} + d_{vel})}{y^2} \right| \\ &= \left| \frac{(2 \cdot y \cdot s_{pos} + s_{vel})}{y^2} - \frac{(2 \cdot y \cdot d_{pos} + d_{vel})}{y^2} \right| \\ &= \left| \left( \frac{2 \cdot s_{pos}}{y} + \frac{s_{vel}}{y^2} \right) - \left( \frac{2 \cdot d_{pos}}{y} + \frac{d_{vel}}{y^2} \right) \right| \end{align*}
Which tells us that to emulate this approximate cost function the feature we need to put into our database is exactly the following:
\begin{align*} \frac{2 \cdot pos}{y} + \frac{vel}{y^2} \end{align*}
where \( pos \) is the bone position, \( vel \) is the bone velocity, and \( y \) is the half-damping which we compute from the half-life.
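As a small sketch of what computing this feature could look like (the function names and example values are my own, the body of `halflife_to_damping` is my assumption about the helper from the referenced spring article, and I am assuming the feature is applied independently to each component of a position and velocity):

```python
import numpy as np

def halflife_to_damping(halflife, eps=1e-5):
    # Assumed definition of the helper from the referenced spring article.
    return (4.0 * 0.69314718056) / (halflife + eps)

def inertialization_feature(pos, vel, halflife):
    """Feature whose absolute difference emulates the approximate transition cost."""
    y = halflife_to_damping(halflife) / 2.0
    return 2.0 * np.asarray(pos, dtype=float) / y + np.asarray(vel, dtype=float) / (y * y)

def approximate_cost(x, v, halflife):
    """|(2*x*y + v) / y^2| for position difference x and velocity difference v."""
    y = halflife_to_damping(halflife) / 2.0
    return np.abs((2.0 * x * y + v) / (y * y))

halflife = 0.15
s_pos, s_vel = np.array([0.3, 0.0, -0.1]), np.array([1.0, 0.2, 0.0])   # source
d_pos, d_vel = np.array([0.1, 0.1, -0.1]), np.array([-0.5, 0.2, 0.4])  # destination

feature_cost = np.abs(inertialization_feature(s_pos, s_vel, halflife)
                      - inertialization_feature(d_pos, d_vel, halflife))
direct_cost = approximate_cost(s_pos - d_pos, s_vel - d_vel, halflife)
print(feature_cost, direct_cost)
```

Since the feature is stored per frame, the difference between two stored features gives the approximate cost without needing the position and velocity features separately.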
And although I'm certain there are some pathological cases which assign a small cost to transitions with large oscillations, in practice, if you use a relatively small half-life it seems to work. Here I've visualized the error of this function in comparison to our more exact cost function given previously.
You can see that at least the error is limited to a fairly small part of the space where we have a large velocity offset and small, exact positional offset.
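A rough way to reproduce this kind of error map is to evaluate both cost functions on a grid and subtract them. The sketch below does that under my own assumptions (the grid ranges, function names, and the body of `halflife_to_damping` are not from the original demo):

```python
import numpy as np

def halflife_to_damping(halflife, eps=1e-5):
    # Assumed definition of the helper from the referenced spring article.
    return (4.0 * 0.69314718056) / (halflife + eps)

def exact_cost(x, v, halflife, eps=1e-8):
    """Unsigned area under the decaying offset, plus a mask of where it crosses zero."""
    y = halflife_to_damping(halflife) / 2.0
    t = -x / ((v + x * y) + eps)
    crosses = t > 0.0
    t_safe = np.where(crosses, t, 0.0)  # keeps exp() from overflowing for negative t
    int_0 = (2.0 * x * y + v) / (y * y)
    int_t = (np.exp(-y * t_safe) / (y * y)) * (
        t_safe * v * y + x * y * (t_safe * y + 2.0) + v)
    cost = np.where(crosses, np.abs(int_0 - int_t) + np.abs(int_t), np.abs(int_0))
    return cost, crosses

def approx_cost(x, v, halflife):
    """Approximation that ignores the zero crossing."""
    y = halflife_to_damping(halflife) / 2.0
    return np.abs((2.0 * x * y + v) / (y * y))

# Grid of position differences (columns) and velocity differences (rows).
pos_diff, vel_diff = np.meshgrid(np.linspace(-1.0, 1.0, 101),
                                 np.linspace(-10.0, 10.0, 101))
exact, crosses = exact_cost(pos_diff, vel_diff, 0.15)
error = exact - approx_cost(pos_diff, vel_diff, 0.15)

print("max error:", error.max())
print("fraction of grid with non-zero error:", np.mean(error > 1e-12))
```

The error is zero wherever the offset never crosses zero, and the approximation only ever under-estimates the exact cost.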
To test this feature, I added it to my previous Motion Matching demo, using the original setup and acceleration structure, but replacing the existing bone position and bone velocity features with this feature:
Not bad! And although the difference is not hugely significant, to me it looks like an overall improvement, and I was surprised at how well this idea worked out-of-the-box with no tweaking of velocity and positional weights required. So while perhaps not a silver bullet, to me this still seems like a fairly good way to potentially reduce the database size and remove one more weight for users to tweak!
As a disclaimer I should say that while I thought this was an interesting idea worth sharing, it's not something I've tested myself extensively, so in practice your mileage may vary.
Also worth noting is that although I've provided the derivation here for a spring-based inertializer, I think a similar derivation should be perfectly possible for inertializers that blend out the offset using a polynomial (and in fact it may well be easier).
Finally, it's worth mentioning that when we compute the total displacement here we are making an assumption that the inertialized movement of a joint's position in character space is going to be similar to what we would get if we individually inertialized all of the local joint rotations down the chain - something that may well not often be true.