Control of complex, physically simulated robot groups

David C. Brogan

University of Virginia Charlottesville, Virginia

4. BICYCLE SIMULATION NAVIGATION CONTROLLER

In the previous section, we described simulated characters that generate physically accurate movements, capture subtleties in bicycling actions, and convey important dynamic information to the viewer. The physical simulation algorithms that generate the accurate movements of these characters not only produce convincing animations, but they also constrain each character's ability to avoid collisions, follow a path, and ride as a group. These mobility constraints limit such capabilities as how quickly a bicyclist can turn, how far it can lean when cornering, and how fast it can ride. The navigation controller, which specifies a character's desired velocity, must operate within these constraints to generate character trajectories that satisfy high-level goals while avoiding undesirable collisions and crashes. Because the human and alien bicyclists have different mass distributions and limb leverage, they also have dissimilar mobility constraints. In this section we develop a navigation controller that easily adjusts to operate for characters with these different physical abilities.

The navigation controller uses data from a character's state, neighbors, and the graphical world to compute a desired velocity, It has three parts: a perceptual algorithm, a desired position algorithm, and a desired velocity

Figure 6. In the left half of the figure, one character is visible to another if it is one of the n closest characters (n is 6 for this example). The black circles represent the set of visible characters, N , and the white those that cannot be seen by A. In the right half of the figure, the locations of the visible characters are used to compute a global desired position for the individual under consideration. The algorithm computes a desired position with respect to each visible character, xd(i) , by finding the point on the line between the individual and the visible character that is a constant distance D away from the visible character. These desired positions are averaged with a weighting equal to 1/d2 where di is the distance between the two characters.

algorithm. The perception algorithm determines the set of characters and components from the physical world that are visible to each individual in the group. For this navigation controller, each individual in the group can perceive the relative locations and velocities of the set of visible characters, N (figure 6). The desired position algorithm uses the locations and velocities of the N visible characters and obstacles to calculate a desired position for each individual. The desired velocity algorithm uses the output of the desired position algorithm and the nominal userspecified velocity to compute a desired velocity for each character. The locomotion controller for each character uses the desired velocity to compute joint torques that adjust speed and direction of travel, thereby reducing velocity errors.

4.1. Computing Desired Positions

In this stage of the navigation control algorithm, a set of desired positions is computed for each individual in the group. These desired positions correspond to the herding and path following goals of the navigation controller. Each character computes a desired position relative to its visible neighbors that preserves a close grouping while avoiding collisions. Each individual would be in the perfect position according to the grouping goals of the navigation controller if it could move to this position instantaneously. However, each character must also follow the path, and a desired position is computed that satisfies this goal. In this subsection we describe the algorithms that specify these two desired positions.

4.1.1. Desired position relative to neighbors

Each character's desired position algorithm uses the positions of the n visible neighbors (figure 6) to compute a desired position, (xd(i) , yd(i) ), relative to each of these characters. This position is a distance D away from visible character, i, on the line between the two characters (figure 6):

y= yi - y A x, xi - x A (5)

where (xi , yi ) is the current position of character i, and (xA , yA ) is the current position of character A. When xi = xA , the two characters are on a line parallel to the y-axis and the slope of the line is assumed to be a large, but not infinite, number. The desired separation distance, D, specifies that (x d(i) , yd(i) ) is D meters away from character i:

D = xd(i) 2 + yd(i) 2 . (6)

Using equations 5 and 6, we can solve for (xd(i) , yd(i) ):

xd(i) = D
yi -yA xi -xA 2 +1 (7)

yd(i) is computed in a similar fashion. After computing the desired position relative to each character in N , the set of desired positions is weighted by the actual distance between each pair of characters, d i , and averaged to compute a global desired position:

xd(i) sgn(xA - xi ) di 2 herdx = xA +iN 1 di 2 iN (8)

sgn(p) returns -1 when p is negative and 1 otherwise. The desired y-position, herd y , is computed similarly.

4.1.2. Desired position relative to path

In addition to avoiding collisions with other individuals in the group, the characters must also ride along a path. The path is constrained to be on a plane, but it is free to twist left and right. The center of the path is modeled as a Catmull-Rom spline and the path edges are 5.0 meters to either side of the center.

The bicyclist path following algorithm seeks to follow the curves of the path while providing bicyclists with the flexibility to ride anywhere within the path edges. By projecting a vector from the bicyclist's current position in its direction of travel, we are able to compute the position of a look-ahead point, (look x , looky ) that is t seconds away, based on the current speed:

lookx == x + speed · t cos(yaw) (9)

looky = x + speed · t sin(yaw) (10)

We compute (pathx , pathy ), the closest point on the path's center spline to the look-ahead point, (look x , looky ) (figure 7 (a)). By steering towards (pathx , pathy ), each bicyclist attempts to track the path. However, the value of t greatly affects the performance of the navigation controller. For this reason, t is tuned through a series of offline trials. In these trials, the bicyclist must follow five paths of varying curvature. By using these trials to iteratively tune the value of t, a value is determined that minimizes the accumulated distance between the bicyclist and the path. In the experiments reported in this paper, a look-ahead time of 4.1 seconds is used for the human bicyclist and 1.1 seconds is used for the alien.

The bicyclist should ride towards (pathx , pathy ) to prevent straying from the center of the path. However, the bicyclists are unable to avoid collisions with neighbors when they all desire to be at the path's center. Instead, the path following algorithm attempts to position the group centroid at the center of the path. The algorithm computes the shortest distance between the bicyclist and the center of the path spline. It also computes the average position of all the visible neighbors and the shortest distance between this group centroid and the center of the path. The desired position for the bicyclist is defined to lie on a line that intersects, and is normal to, the path at (path x , pathy ). The desired position along this line is cast from the center of the path by an offset distance equal to the bicyclist's current distance from the path center minus the group centroid's distance. This offset distance is capped to be no greater than half the path's width. By aiming towards this point, the relative positions of the group members are preserved and the bicyclist can follow the path without collisions (figure 7 (b)).

4.2. Computing Desired Velocities

The desired velocity algorithm must combine a character's desired positions, current state, neighbor velocities, and the user-specified nominal velocity to compute an appropriate desired velocity. This desired velocity must preserve the character's balance while effectively moving towards the desired positions. Therefore, the navigation control algorithm must take into account each character's physical qualities and locomotion controller. For example, human and alien bicyclists possess different mass distributions, which in turn affect how these characters must compute velocities to eliminate position errors. The navigation controller requires that the desired velocity algorithms be tuned to account for unique attributes of each character.

Figure 7. In frame (a), a line is projected from the bicyclist's current position along its direction of travel. A look-ahead point is computed t · speed meters along this line to evaluate the future position on the path. Frame (b) demonstrates how a distance g is computed to represent the distance between the bicyclist's current distance from the group centroid. The navigation controller aims towards a future position that places the group centroid on the path center and preserves the character's offset distance, g, from the center of the path.

4.2.1. Desired velocity for herding

The desired velocity algorithm first determines a desired riding direction and speed that will eliminate the error in position, (ex , ey ), between the character's current position and the desired position relative to its neighbors, (herdx , herdy ):

ex = herdx - x (11)

ey = herdy - y (11) (12)

The character's desired riding direction is:

herdyaw = arctan y + k p ey + k d e y x + k p ex + k d e x , (13)

where (x, y) is the character's current velocity and the spring gain, kp , and damping gain, kd , are multiplied by the character's error in position and the velocity of the position error, respectively.

The spring and damping gains are tuned by an offline optimization process using simulated annealing. In this process, a bicyclist is commanded to shift its position 4.0 m to the right. The bicyclist must steer to the right and then to the left to complete this ess-maneuver, but the bicyclist's desired riding direction, yaw nom , remains unchanged during the experiment, so the initial and terminal riding directions should be the same. The experiment terminates in a successful state when the character has arrived within ±0.1 m of the desired position, yaw is within ±0.1 radians of yawnom , and yaw is within ±0.1 radians/s. The simulated annealing evaluation function penalizes the fork velocities that accumulate during the experiment with a weight of 0.0001. The time to complete the experiment and the terminal errors in velocity, riding direction (yaw), and lateral position are penalized with weights 1.0, 10.0, 1.0, and 1.0 respectively. Although these error terms are measured with different units, most of their values range between ±0.1 upon successful termination. The error weights of the cumulative fork velocity error and the terminal bicycle velocity error are scaled to produce similar contributions to the evaluation metric. A successful termination

Table 1. Simulated annealing experiments were used to automatically select spring and damping gains for the bicyclists' velocity computation algorithms.

completes the evaluation without any additional penalty, but a bicyclist crash, or a lack of success after 15 s results in a large penalty. Results of the simulated annealing experiments are summarized in table 1.

The system specifies a constant nominal speed, speednom , for all the bicyclists, but this speed must be adjusted to improve each character's ability to reach a desired position. The character's desired speed is increased when it is behind the desired position and it is decreased when it is ahead. The angle, orient measures the angle between the character's yaw and the line between the character and the desired group position, (herd x , herdy ). If orient is between 0.0 and , the character is behind the desired position and must speed up, otherwise, the character must slow down:

ey . (14) orient = yaw - arctan ex

The degree to which the character changes speed depends on the value of orient and the distance between the character and the desired position:

speederror = kv · cos(orient) e2 + e 2 , x y (15)

where kv is a scaling gain set to 0.5. The error in speed, speederror , will be the largest when the character is traveling directly towards or away from the desired position. If the character is traveling directly alongside the desired position, the error in speed will be 0.0. The error in speed is added to the nominal speed to determine the speed, herd speed , that moves the character towards its desired group position:herdspeed = speednom + speed error .

4.2.2. Desired velocity for path following

To follow a path, a desired position ahead of the character was computed. The desired velocity that accomplishes the path following behavior must control the character towards this desired position. First, we compute the angle, pathyaw , between the character's current position (x, y) and the desired position, (path x , pathy ):

pathyaw = arctan pathy - y pathx - x . (16)

This direction will keep the bicyclist on the path. The desired path following speed, path speed is set to be the nominal speed:

pathspeed = speednom . (17)

4.2.3. Computing the unified velocity

Finally, the two desired angles are averaged, as are the two desired speeds, to generate desired x and y velocities that move the character towards the two desired positions:

xd yd = = cos sin 1 1 (herdyaw + pathyaw ) · (herdspeed + pathspeed ) 2 2 1 1 (herdyaw + pathyaw ) · (herdspeed + pathspeed ) 2 2 xd yd = = cos sin 1 1 (herdyaw + pathyaw ) · (herdspeed + pathspeed ) 2 2 1 1 (herdyaw + pathyaw ) · (herdspeed + pathspeed ) 2 2

Figure 8. In this figure, the effects of using incorrect values for the bicyclist's look-ahead value, t, are demonstrated. Bicyclist (a) has a large value for t and as a result aims towards points on the center of the path that cause it to turn into corners early, perhaps riding off the road. Bicyclist (b) has a value of t that is too small and as a result, it has not begun to steer into the turn, despite its proximity to the apex. Bicyclist (b) may ride beyond the outside edge of the road as a result of this delayed reaction.