Technical Report No. VSR-04.02


End-of-Year Technical Report

For Project

Digital Human Modeling and Virtual Reality for FCS

By

The Virtual Soldier Research (VSR) Program
Center for Computer-Aided Design
College of Engineering
The University of Iowa
116 Engineering Research Facility
Iowa City, IA 52242-1000

VSR TEAM


K. Abdel-Malek, J. Arora, S. Beck, M. Bhatti, J. Carroll (Clarkson University), T. Cook, S. Dasgupta, N. Grosland, R. Han, H. Kim, J. Lu, C. Swan, A. Williams, J. Yang


K. Farrell, R. Vignes, T. Sinokrot, A. Mathai, T. Marler, J. Muhs, Q. Wang, X. Zhou, J. Lee, J. Kim, X. Man, S. Rahmatala, S. Dandach, R. Fetter, E. Horn, A. Patrick, Z. Mi


Dated: October 25, 2004

CONTRACT/PR NO. DAAE07-03-D-L003/0001


a.     Current technologies

An intelligent avatar that exists in an interactive, real-time environment requires a visual persona on par with the sophistication of the processes that predict its posture, motion, and balance.  This means not only a photorealistic human appearance but also realistic movement of the skin, muscles, and skeletal system.


Fortunately, the entertainment industry has been working diligently on this problem for over 15 years.  It has expended a great deal of time, effort, and money to develop the technology and craft necessary to create digital content that is indistinguishable from the real world, at least as seen on 75mm film.  Today, the hardware and software tools required to build, animate, and render photorealistic digital humans are very mature and can be obtained for a few thousand dollars.  State-of-the-art methodologies in digital human modeling are also under constant development.  Results are freely and regularly published within internet communities interested in digital content creation and often include detailed tutorials for application-specific tool sets.  Most importantly, many of the concepts, techniques, and technologies used to create photorealistic digital humans can be leveraged for real-time use.


b.     Current gaming engines (available development engines, what is the difference between them, what are their capabilities)


As the power of commodity-level computers continues to increase, high-quality, real-time, interactive 3D visualization becomes less of a computer graphics research goal and more of a consumer-level expectation.  With this in mind, it is not surprising how sophisticated and plentiful the software tools used to create interactive, real-time 3D environments have become.  In fact, most 3D computer games now come with very powerful level editors[1] and state-of-the-art real-time 3D rendering capabilities.  In addition, many traditional 3D modeling and animation tools (Maya, 3D Studio Max, Lightwave, etc.) now have plug-in modules that take advantage of the real-time 3D display capabilities of commodity-level 3D graphics cards.  It therefore seemed prudent to explore as many real-time environment development options as time allowed before making a decision at the outset of the Virtual Soldier Project.


Real-time environment development tools considered include:


3D Studio Max                  http://www.discreet.com/3Dsmax/

CG                                    http://www.nvidia.com/object/cg_toolkit.html

EON Reality                      http://www.eonreality.com/

HalfLife2[2]                         http://www.valvesoftware.com/

Maya                                 http://www.alias.com/eng/products-services/maya/index.shtml

UnrealTournament2003      http://www.unrealtournament.com/index.php

Unrealty                             http://www.unrealty.net/

Real-Time Simulation code                              http://www.Real-Time Simulation code.com/

Others[3]                                         


Although all tools explored had the capability to render 3D geometry in real time, some of the tools with the highest quality rendering capabilities were not designed to communicate with virtual reality (VR) hardware.  Developing our own application-specific drivers to address this issue was considered, as it would allow us to take advantage of these environments’ higher quality visual capabilities.  However, the time required to develop these drivers was unknown, and several development environments were designed specifically with the ability to communicate with VR hardware.  Real-Time Simulation code, through its modular approach to the various component engines (VR, Physics, AI, Rendering), appeared to provide the greatest flexibility in the foreseeable future.


 

c.     Developing a human model for a real-time environment

The development of any 3-dimensional model can be broken down into three parts:

(1)    Creating the Polygonal Mesh: Current limitations in real time technology require all 3-dimensional objects to be defined by a polygonal mesh.  (see beck-figure-1)

 

beck-figure-1


Although the polygonal mesh defines the overall shape of the subject, it lacks the visual cues required to differentiate the material properties of, for example, skin from cloth. (See beck-figure-2)


beck-figure-2
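As a minimal sketch of what a polygonal mesh holds, the example below assumes an indexed triangle list. The layout, the names `vertices` and `triangles`, and the tetrahedron itself are illustrative and are not the project's data format. Note that the structure carries geometry only: nothing in it says whether a surface is skin or cloth.

```python
# Hypothetical indexed triangle mesh: a tetrahedron.
vertices = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]

# Each triangle lists three indices into `vertices`,
# wound counter-clockwise when seen from outside the solid.
triangles = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]


def face_normal(tri):
    """Unnormalized outward normal of one triangle (cross product of two edges)."""
    a, b, c = (vertices[i] for i in tri)
    u = tuple(b[k] - a[k] for k in range(3))
    v = tuple(c[k] - a[k] for k in range(3))
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
```

The face normals are the only surface information that can be derived from the mesh alone; material appearance has to come from elsewhere.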


(2)    Define Shaders: A shader contains data that determines what happens when simulated light strikes a surface.  For example, if an object is expected to appear as if it is made of glass, one of the attributes in that object’s shader would have to allow light to pass through its surfaces. If that object were made of solid glass, another attribute in that shader would be required to address its refractive index. If the glass object is not completely transparent, then the object’s shader must also provide some level of translucence and perhaps even color. The visual cues required to convey convincing skin textures pose a different type of challenge, but the use of shaders to define skin characteristics is the same. (See beck-figure-3 and beck-figure-4)


beck-figure-3



beck-figure-4
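The glass example above can be sketched as a small record of surface attributes. The class name `Shader`, its fields, and the helper `refraction_angle_deg` are hypothetical and chosen only for illustration; real shading systems carry many more attributes. The refraction helper applies Snell's law, and the index of 1.5 is a typical textbook value for glass rather than a figure from this project.

```python
import math
from dataclasses import dataclass


@dataclass
class Shader:
    """Hypothetical container for a few surface attributes."""
    color: tuple = (1.0, 1.0, 1.0)   # RGB, each component in 0..1
    transparency: float = 0.0        # 0 = opaque, 1 = fully transparent
    refractive_index: float = 1.0    # 1.0 = light is not bent


def refraction_angle_deg(shader, incidence_deg, outside_index=1.0):
    """Angle of the refracted ray from Snell's law: n1*sin(t1) = n2*sin(t2)."""
    s = outside_index * math.sin(math.radians(incidence_deg)) / shader.refractive_index
    return math.degrees(math.asin(s))


# A slightly tinted, mostly transparent solid glass object.
glass = Shader(color=(0.9, 1.0, 0.95), transparency=0.9, refractive_index=1.5)
```

A skin shader would use the same kind of record with different attributes and values, which is the point made in the text: the mechanism is shared even though the visual challenge differs.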


(3)    Create Joints: By themselves, three-dimensional models, shaders, and textures can be used to create photorealistic renderings. But realistic movement of these models requires a hierarchical joint structure. (See beck-figure-5)



A hierarchical joint structure – commonly referred to as an inverse kinematics (IK) skeleton – is a series of interdependent local coordinate axes strategically positioned at locations within the three-dimensional model to provide rotational pivot points at the shoulders, elbows, knees, etc.  (See beck-figure-6)


beck-figure-6
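The interdependence of the local coordinate axes can be illustrated with a small planar sketch, in which each joint inherits the accumulated rotation of its parent. The joint names, offsets, and the restriction to two dimensions are assumptions made to keep the example short; the avatar's skeleton is three-dimensional.

```python
import math

# Hypothetical planar chain: (name, parent, offset from parent in the parent's frame).
JOINTS = [
    ("shoulder", None, (0.0, 0.0)),
    ("elbow", "shoulder", (0.3, 0.0)),
    ("wrist", "elbow", (0.25, 0.0)),
]


def world_positions(joints, rotations_deg):
    """Walk the hierarchy parent-first and return {name: (x, y, accumulated angle)}."""
    world = {}
    for name, parent, (ox, oy) in joints:
        px, py, pa = world[parent] if parent else (0.0, 0.0, 0.0)
        # The offset is expressed in the parent's rotated frame.
        x = px + ox * math.cos(pa) - oy * math.sin(pa)
        y = py + ox * math.sin(pa) + oy * math.cos(pa)
        # This joint's own rotation affects only its descendants.
        world[name] = (x, y, pa + math.radians(rotations_deg.get(name, 0.0)))
    return world
```

Calling `world_positions(JOINTS, {"shoulder": 90.0})` swings both the elbow and the wrist, while `{"elbow": 90.0}` moves only the wrist, which is the behavior expected of a hierarchical structure.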




To visually simulate the elasticity of real human skin as the joints are exercised, the amount of movement any region of the avatar’s skin has during a given joint’s rotation must be defined. This is a well-known animation technique called “skin weighting” and addresses the aesthetic issue that would otherwise cause a three-dimensional model to tear or break at the joints when rotated. (See beck-figure-7 and beck-figure-8)



beck-figure-7                                                      beck-figure-8


Typically, this technique is accomplished subjectively through interactive tools (much like using a can of spray paint) that allow 8-bit gray level values to be applied to joint-specific regions of the skin. The higher the gray level value, the greater the effect a given joint has on that region.


beck-figure-9


Beck-figure-9 shows the skin weight associated with an immovable joint below the first spine joint. Note the high gray level values around the groin area, which cause the geometry in that region to be completely immovable. Note also the decreasing gray level values above the groin area, which allow the skin geometry increasing elasticity the further it is from the groin region.
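One common way to turn painted gray levels into vertex motion is linear blending: each joint proposes a position for the vertex, and the proposals are averaged using the normalized weights. The report does not state which blending scheme the tools use, so the sketch below is an assumption, as are the function names and the planar two-joint setup.

```python
import math


def rotate_about(point, pivot, angle_deg):
    """Rotate a 2D point about a pivot by the given angle."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    return (
        pivot[0] + dx * math.cos(a) - dy * math.sin(a),
        pivot[1] + dx * math.sin(a) + dy * math.cos(a),
    )


def skin_vertex(vertex, gray_weights, joint_poses):
    """Blend the positions each joint would give the vertex, weighted by gray level.

    gray_weights: {joint: 0..255}, with at least one non-zero entry.
    joint_poses:  {joint: (pivot, angle_deg)}.
    """
    total = sum(gray_weights.values())
    x = y = 0.0
    for joint, gray in gray_weights.items():
        pivot, angle = joint_poses[joint]
        px, py = rotate_about(vertex, pivot, angle)
        x += gray / total * px
        y += gray / total * py
    return (x, y)


# Hypothetical pose: the forearm joint is bent 90 degrees, the upper arm is at rest.
poses = {"upper_arm": ((0.0, 0.0), 0.0), "forearm": ((1.0, 0.0), 90.0)}
```

A vertex painted fully for the forearm follows that joint rigidly, while a vertex painted equally for both joints lands halfway between the two proposals, which is what keeps the mesh from tearing at the elbow.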


d.     Animating versus real-time


Calculating every movement made by an avatar in a real-time environment is not realistic.  The project therefore differentiates between critical movements (CM’s) and non-critical movements (NCM’s).  For example, asking the avatar to move a 20 lb object from point A to point B would represent a CM.  Waiting for the avatar to walk from its current location to where the 20 lb object is located prior to picking it up *might* represent an NCM.  If the avatar’s interaction with the 20 lb object is the subject of interest to the user, the avatar could simply disappear from its current location and then reappear in front of the object.  One of the goals of this project, however, is to provide the user with a certain amount of immersion in the avatar’s environment, and toward this end it makes sense to have the avatar walk to the object before the CM begins.  Using NCM’s (a walk cycle, for example, or something as simple as making the avatar appear to breathe) requires a series of routines in the real-time environment that track the avatar’s state.

 

Current issues regarding the use of NCM’s involve the practicality of creating human-like movement using traditional animation techniques versus the requirements of the mathematical models used to predict posture, motion, and path. Specifically, the mathematical model requires joints with multiple degrees of rotational freedom to be represented by component joints, each with a single degree of rotational freedom. Since skeletal joints are represented in three-dimensional space by a hierarchical structure of strategically located local coordinate axes, there is no such thing as a joint with only one degree of freedom. The mathematical model simply treats them that way by applying rotations to one, and only one, axis for any given component joint. As long as the mathematical models for posture, motion, and path prediction are always used to move the avatar, this is not a problem. However, computing all movements, all the time, for any given avatar is not only impractical but also unnecessary.


Non-critical movements are created using traditional animation techniques outside the real-time environment. These techniques include motion capture, key frame animation, and inverse kinematics, all of which make it difficult to maintain the “single degree of freedom per joint” restrictions required by the mathematical models in the real-time environment. In other words, once an NCM is implemented to move an avatar, the rotation of a component joint about the two axes not represented by its assigned single degree of freedom may no longer be equal to zero. As a result, the predicted posture, motion, or path will be inaccurate by an amount that reflects the cascading effect of all the rotations on all the axes not represented in the mathematical model of the avatar’s skeleton. This error can be significant, and we are currently pursuing three approaches to the problem:


            1. Compute the rotations required by the component joints for any given multiple degree of freedom joint such that an instantaneous dispersion of accumulated rotations to the single degree of freedom component joints allows the avatar to remain in its current position. This prepares the component joints for accurate posture, motion, and path prediction. For example, let’s assume that after a NCM is applied (making the avatar appear to scratch its head and look around the room) the component joints representing the right shoulder’s three degrees of freedom have rotations in x, y, and z as follows:


Shoulder-right-1 (-55, 2, -15)

Shoulder-right-2 (-9, 18, 2)

Shoulder-right-3 (-77, 5, -25)


Any computations for CM’s would now be inaccurate because the right shoulder has effectively been given nine degrees of freedom.  In order for the computations for CM’s to be accurate, the component joints for the shoulder must appear as follows:


Shoulder-right-1 (X, 0, 0)

Shoulder-right-2 (0, Y, 0)

Shoulder-right-3 (0, 0, Z)


...where X, Y, and Z allow the shoulder to maintain a posture identical to the nine degree of freedom posture resulting from the use of the NCM that caused the avatar to scratch its head.


This would be the best solution, but current attempts to implement it in the real-time environment have not been successful.
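The arithmetic behind approach (1) can be sketched under strong simplifying assumptions: the three component joints share a single pivot, each joint's local rotation is composed as Rx*Ry*Rz, and the joints chain parent-first. None of these conventions is stated in this report, and the function names are hypothetical. Under those assumptions, the accumulated rotations collapse into one matrix from which a single X, Y, and Z can be extracted. This is not the project's implementation, which remains unsolved in the real-time environment.

```python
import math


def rot(axis, deg):
    """3x3 rotation matrix about the x, y, or z axis (right-handed, degrees)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def joint_matrix(rx, ry, rz):
    """Assumed convention: a joint's local rotation is Rx * Ry * Rz."""
    return matmul(rot("x", rx), matmul(rot("y", ry), rot("z", rz)))


def collapse(component_rotations):
    """Return (X, Y, Z) such that Rx(X)*Ry(Y)*Rz(Z) equals the chained rotations."""
    total = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for rx, ry, rz in component_rotations:  # parent first
        total = matmul(total, joint_matrix(rx, ry, rz))
    y = math.asin(max(-1.0, min(1.0, total[0][2])))
    x = math.atan2(-total[1][2], total[2][2])
    z = math.atan2(-total[0][1], total[0][0])
    return tuple(math.degrees(v) for v in (x, y, z))


# Rotations of the three right-shoulder component joints from the example above.
shoulder = [(-55, 2, -15), (-9, 18, 2), (-77, 5, -25)]
X, Y, Z = collapse(shoulder)
```

Setting the component joints to (X, 0, 0), (0, Y, 0), and (0, 0, Z) then reproduces the same end orientation under the stated conventions. The extraction is not unique when Y approaches plus or minus 90 degrees, and joints with separate pivots would need the offsets accounted for as well.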


            2.      Apply rotations to the component joints to force them into the single degree of freedom requirements of the mathematical models used to predict posture, motion, and path prior to computing CM’s.


This works but is visually confusing to the user as there is no reason for the avatar to make this (often strange looking) adjustment to posture prior to performing a given task.


            3.      Apply rotations to the component joints to force them into the single degree of freedom requirements of the mathematical models used to predict posture, motion, and path during implementation of a CM.


This works nicely visually and allows the avatar to accurately reach the destination posture but changes the predicted motion and path and is therefore not an acceptable solution for the long term.  However, this is currently the short-term solution that has been implemented, as the inaccuracies of motion and path are not as critical in the short-term if the end posture is correct.  We continue to work on solving approach (1) and expect to come up with the solution in the near future.
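Approach (3) can be pictured as fading the leftover off-axis rotations to zero while the critical movement plays. The sketch below assumes a linear fade over the duration of the CM and a hypothetical function name; the report does not say how the adjustment is distributed over time.

```python
def blend_off_axis(start_rotation, free_axis, predicted_angle, t):
    """Rotation of one component joint at fraction t (0..1) of a critical movement.

    start_rotation:  (x, y, z) left behind by the preceding NCM, in degrees.
    free_axis:       index (0, 1, or 2) of the joint's single modeled axis.
    predicted_angle: angle the prediction model assigns to the free axis at time t.
    The two unmodeled axes are faded linearly toward zero.
    """
    t = max(0.0, min(1.0, t))
    return tuple(
        predicted_angle if axis == free_axis else (1.0 - t) * value
        for axis, value in enumerate(start_rotation)
    )
```

At `t = 1.0` the joint satisfies the single degree of freedom requirement, so the end posture is correct, but for every `t` before that the off-axis terms still perturb the predicted motion and path, which is why this remains a short-term measure.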

 

e.       Kinesiology for Santos™


There are two issues that require further study of the mechanics and anatomy in relation to human movement (kinesiology).  These issues involve 1) motion in the shoulder and 2) topology of the polygonal mesh that represents our current avatar’s skin.

 

1.) Motion in the Shoulder: While the current mathematical model can predict postures that accurately place an end-effector (a finger for example) at a specified target location, the results can cause the shoulder to appear dislocated at the extremes of the joint limits.  (See beck-image-10 and beck-image-11)

 

beck-image-10                                                         beck-image-11

 

Research indicates that the system of joints in the shoulder region of our current skeletal model does not contain enough detail to address this issue.  Details follow:

 

The arrangement of joints in the current skeletal model is based on a feasibility study for a 15-degree-of-freedom joint structure.  In the current skeletal model (see beck-figure-14 and beck-figure-15 below), the shoulder system is approximated as follows:

-         The humerus (upper arm) is anchored to, and pivots at, a point at the top of the shoulder.

-         The shoulder joint is attached to, and pivots at, a point on the clavicle (collar bone).

-         And the clavicle is simply attached to the spine.


beck-figure-14                                                         beck-figure-15


Close examination of the shoulder region in an actual human skeleton shows that:


-         The humerus (upper arm) is anchored to, and pivots at, a point on the scapula (shoulder blade).


-         The scapula is anchored to,  and pivots at, a point on the clavicle (collar bone).


-         The clavicle  is anchored to,  and pivots at, a point on the sternum (breast bone).


-         The sternum is attached to the spine via the rib cage.


This arrangement allows the scapula to slide over the back of the rib cage and is critical in providing the shoulder system with a range of motion typical of physically normal humans as shown in the series of images below. (Figures 16 through 29 shown below are from the following web site http://staff.ci.qut.edu.au/~barkerc/Final%20PAN%20website/panindex.htm)
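The difference between the two arrangements can be made concrete by writing each as a parent map and walking from the humerus to the root. The segment names follow the text above; the dictionary layout and the function name are illustrative only.

```python
# Parent of each segment; None marks the root of the chain.
CURRENT_MODEL = {
    "humerus": "shoulder joint",
    "shoulder joint": "clavicle",
    "clavicle": "spine",
    "spine": None,
}

ANATOMICAL = {
    "humerus": "scapula",
    "scapula": "clavicle",
    "clavicle": "sternum",
    "sternum": "rib cage",
    "rib cage": "spine",
    "spine": None,
}


def chain_to_root(parents, segment):
    """List the segments from the given one up to the root of the hierarchy."""
    chain = [segment]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain
```

`chain_to_root(ANATOMICAL, "humerus")` passes through the scapula and sternum, neither of which exists in `CURRENT_MODEL`, so the current model has no pivot that could represent the scapula sliding over the rib cage.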

 

Beck-figure-16 and beck-figure-17 (below) show a human skeleton in a relaxed and standing state.


beck-figure-16			      beck-figure-17

Beck-figure-18 and beck-figure-19 (below) show the resulting movement in the system as the shoulders are pulled back. Compare the proximity of the scapula (shoulder blade) to the spine in this posture with the scapula in beck-figure-16 and beck-figure-17 (above).


beck-figure-18			      beck-figure-19

Beck-figure-20 and beck-figure-21 (below) show the resulting movement in the system when the shoulders are pulled forward. Note again the change in proximity of the scapula to the spine.


 

beck-figure-20			      beck-figure-21


Beck-figure-22 and beck-figure-23 (below) show the resulting movement in the system when the shoulders are pulled upwards. Note the coupled motion of the clavicle and scapula. The orientation of the scapula relative to the rib cage remains unchanged as the clavicle rotates upwards.


beck-figure-22			      beck-figure-23

Beck-figure-24 and beck-figure-25 (below) show the resulting movement in the system when the arms are extended overhead. Note the coupled motion between the humerus and scapula.


beck-figure-24			     beck-figure-25

Beck-figure-26 and beck-figure-27 show the resulting movement in the system when the arms are extended behind the back.


beck-figure-26			      beck-figure-27

Beck-figure-28 and beck-figure-29 show the resulting movement in the system when the arms reach forward.


beck-figure-28			      beck-figure-29

Each of the images in the series shown above (beck-figure-16 through beck-figure-29) demonstrates a range of motion in the shoulder that is not possible to replicate in our current skeletal model. A more detailed and sophisticated series of joints that more accurately replicates human shoulder movement is clearly indicated.


A more kinesiologically correct system of joints for the shoulder is shown in the skeletal model below.

beck-figure-30


2.) Topology of the Polygonal Mesh: Human skin moves in accordance with the stretching and contraction of the underlying musculature.  Research shows that in order for the skin of a digital human model to behave correctly and in concert with the movement of the joints, the topology of the polygonal skin mesh must reflect the shape and function of the underlying muscles.  Details follow:

 

Our current avatar’s polygonal skin mesh was created using full body scanning technology wherein the surface information of an actual human subject is recorded as a collection of coordinate points. This collection of points is then processed to create an organized polygonal mesh that can be used in 3-dimensional computer applications.


beck-figure-31

While the polygonal mesh shown in beck-figure-31 (above) is an accurate replication of the scanned human subject, the topology of the mesh is not able to stretch and compress satisfactorily at the extreme ranges of motion.



A polygonal topology optimized for a full range of motion will more closely reflect the arrangements of the underlying muscle groups. (Figures 32 through 39 shown below are from the following web site http://staff.ci.qut.edu.au/~barkerc/Final%20PAN%20website/panindex.htm )


Note the way each muscle in the anatomical drawings in beck-figure-32 and beck-figure-34 can be described as a shape that begins at a point, widens at the muscle belly, and then ends at a point. This shape is consistent throughout the entire musculature system and must be reflected in the topology of the polygonal mesh in order to replicate the change in shape of the muscles as they stretch and contract.

 

beck-figure-32

beck-figure-33

beck-figure-34


beck-figure-35


A more optimized polygonal mesh topology allows a more natural replication of muscle stretch and contraction as the avatar demonstrates a range of shoulder motion typical of physically normal humans. (See beck-figure-36 through beck-figure-39)

 

beck-figure-36



beck-figure-37



beck-figure-38


beck-figure-39




f.       Interacting with Santos™ (keyboard, mouse, clicking)

As VSR researchers modified and increased functionality, it became clear that the user interface would have to be designed specifically to minimize the delay between advances in functionality and the time it would take the application developers to implement those advances. At the same time, the interface should maximize the interactive on-screen real estate and provide an intuitive means of interacting with a sophisticated digital human.

The current interface provides the user with a moveable PDA-like window that can be shown or hidden quickly through the use of a single keystroke. Making the PDA window moveable gives the user control over the best position of the window while working. Providing the user with the ability to hide or show the window quickly maximizes the useable on-screen space. The PDA window was also designed to be transparent or translucent wherever possible to ensure minimum occlusion of the user’s field of view when the PDA window is deployed.

The PDA window design was chosen for its ability to reduce on-screen clutter, showing only the functionality relevant at the current level of the menu. It also provides the user with access to existing functionality through an intuitively obvious hierarchy of menu choices.

Beck-figure-40 shows the main level of the function interface window. Use of this interface should be apparent to any user familiar with a modern computer. Help is available if necessary, and application developers need a minimum of time to update the interface.

beck-figure-40

 

Beck-figure-41 (below) shows what the user would see after selecting the SETTINGS button (beck-figure-40) and moving to the next level of the interface window.  Here the user could select one of five different options.

beck-figure-41

Beck-figure-42 (below) shows what the user would see after selecting the CM (critical movement) button shown in beck-figure-41 (above).  All buttons at this level are relevant to computations involved in critical movement.

beck-figure-42

 

Beck-figure-43 shows what the user would see had the user elected to predict the MOTION required to use the avatar’s RIGHT arm to touch a preselected target while minimizing a cost function called DISCOMFORT.

 

beck-figure-43

 

Beck-figure-44 shows a more detailed view of the available cost functions.

 

beck-figure-44

 

Again, this interface design’s primary purpose was to ensure that VSR’s researchers could add to and modify existing functionality with a minimum of effort (and therefore minimum delay) from the application developers. As can be seen in the previous series of images, implementation of new functionality is simply a matter of adding the appropriate buttons and a new screen. While this has provided VSR with the ability to increase functionality quickly and easily, it is clear that a more detailed investigation into how users will work with a sophisticated avatar is required. Towards this end, VSR is actively recruiting industry partners with real-world needs for this technology.
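The idea that new functionality amounts to new buttons and a new screen can be sketched as a nested menu tree. The button labels SETTINGS, CM, MOTION, RIGHT, and DISCOMFORT come from the figures described above, but their exact grouping on each screen, the label "NEW FEATURE", and the helper names are assumptions for illustration, not the actual interface code.

```python
# Hypothetical menu tree: each screen maps a button label to the next screen
# (a dict), or to None for a button that triggers an action directly.
MENU = {
    "SETTINGS": {
        "CM": {"MOTION": None, "RIGHT": None, "DISCOMFORT": None},
    },
}


def screen_at(menu, path):
    """Follow a sequence of button presses and return the screen reached."""
    node = menu
    for step in path:
        node = node[step]
    return node


def add_button(menu, path, label, screen=None):
    """Add a button, and optionally the new screen behind it, at the given path."""
    screen_at(menu, path)[label] = screen


def buttons_at(menu, path):
    """Labels shown on the screen reached by `path`, in alphabetical order."""
    return sorted(screen_at(menu, path))
```

With this layout, exposing a new capability is one call such as `add_button(MENU, ["SETTINGS"], "NEW FEATURE", {"RUN": None})`, and the existing screens are left untouched.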




[1] Level Editors provide artists and designers with interactive tools to create content that will appear in a given real-time game environment.


[2] HalfLife2 is a game that is expected to use a very sophisticated rendering, physics, and AI engine. It is not currently available, but much has been written about its capabilities. This game’s capabilities remain an attractive option and will be evaluated when the game is released.


[3] Many well-known environments like VRML and ViewPoint Platform were considered briefly, but quickly proved to be inadequate for VSR’s requirements.