
Synthesising photoreal humans has now been made easier than ever before

Written by Adrian Pennington | Jul 31, 2019

VFX character work, especially photoreal digital humans, involves painstaking scanning, modelling, texturing and lighting, but the latest technologies might soon allow anyone to generate a high-fidelity 3D avatar from a single picture and animate it in real time.

Motion capture, for example, has been a highly expensive exercise requiring specialised hardware, suits, trackers, controlled studio environments and an army of experts to make it all work.

New artificial intelligence and machine learning tools are changing all of that. Among the latest developments is an innovation in facial tracking that is claimed not just to speed production but to deliver even greater fidelity.

Unveiled at the computer graphics show Siggraph, the proof of concept is a joint venture between French software developer Dynamixyz and Pixelgun Studios, a Californian high-end 3D scanning agency.

Dynamixyz's core technology is Performer, markerless tracking software that uses machine learning to annotate key poses of an actor simply from video of them performing. An animator then takes over to tweak the data to fit a new CG actor or creature.
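For readers curious what markerless tracking looks like in practice, here is a minimal sketch using the open-source MediaPipe Face Mesh tracker. It is purely illustrative and is not Dynamixyz's Performer, whose internals are proprietary, but it shows the basic idea: recovering dense facial landmarks from ordinary video with no markers or head rig.

# Minimal sketch: markerless facial landmark tracking from an ordinary video,
# using MediaPipe Face Mesh (an open-source tracker, not Dynamixyz Performer).
import cv2
import mediapipe as mp

def track_landmarks(video_path):
    """Yield per-frame 3D facial landmarks detected without any markers."""
    face_mesh = mp.solutions.face_mesh.FaceMesh(
        static_image_mode=False,   # treat the input as a video stream
        refine_landmarks=True,     # adds iris and finer lip landmarks
        max_num_faces=1)
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_face_landmarks:
            # 478 (x, y, z) points normalised to the image; downstream tools
            # would map these onto rig controls or blendshape weights.
            yield [(p.x, p.y, p.z) for p in result.multi_face_landmarks[0].landmark]
    cap.release()
    face_mesh.close()

# Example: count the tracked frames in a captured performance video.
if __name__ == "__main__":
    n = sum(1 for _ in track_landmarks("performance.mp4"))
    print(f"tracked {n} frames")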

VFX facility Framestore used Performer to animate Smart Hulk shots in Avengers: Endgame.

More accuracy

Adding Pixelgun's scanning process replaces the annotation step and is claimed to be far more accurate, since the training data is drawn from the actor's specific facial information captured in the scans.

Until relatively recently, scanned facial data, which provides information on geometry and on textures such as wrinkles, was very hard to collect. Even the larger studios weren't equipped, and the budgets involved were very high. Scanning technology has since become more accessible and affordable, and many studios have developed pipelines that include photogrammetry or 3D scanning.

The Dynamixyz/Pixelgun solution uses scanned data with textures covering 80 expressions, captured by 63 cameras trained on the head for expression capture and 145 cameras trained on the subject for body capture. The data is then extracted automatically and applied to a digital model, eliminating the need for an animator.

“With these images generated as if they were taken with a Head-Mounted Camera and geometry information, we were able to build a tracking profile as if it had been annotated manually,” explained R&D engineer Vincent Barrielle. “It brings higher precision as it has been generated with very high-quality scans. It also gives the opportunity to have a high volume of expressions.”

As lighting conditions are key when training the tracker, manual annotation is still required for two or three frames extracted from the production shots to retrieve the correct illumination pattern.
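To make the idea concrete, here is a rough, hypothetical sketch (not the Dynamixyz/Pixelgun code, whose details have not been published) of how a library of scanned expressions can stand in for manual annotation: each tracked frame is expressed as a non-negative weighted mix of the actor's scanned expression shapes by solving a small constrained least-squares problem.

# Illustrative sketch: once an actor's scanned expressions exist as a blendshape
# basis, each tracked frame can be decomposed into weights on those shapes
# automatically, instead of being annotated by hand.
import numpy as np
from scipy.optimize import nnls

def fit_expression_weights(frame_verts, neutral_verts, expression_verts):
    """Solve frame ~= neutral + sum_i w_i * (expression_i - neutral), w_i >= 0.

    frame_verts:       (V, 3) tracked geometry for one frame
    neutral_verts:     (V, 3) scanned neutral pose
    expression_verts:  (E, V, 3) the scanned expressions (e.g. 80 of them)
    """
    deltas = (expression_verts - neutral_verts).reshape(len(expression_verts), -1).T  # (3V, E)
    target = (frame_verts - neutral_verts).ravel()                                    # (3V,)
    weights, _residual = nnls(deltas, target)  # non-negative blendshape weights
    return weights

# Toy usage with random stand-in data: 80 expressions, 5,000 vertices.
rng = np.random.default_rng(0)
neutral = rng.normal(size=(5000, 3))
expressions = neutral + rng.normal(scale=0.01, size=(80, 5000, 3))
frame = neutral + 0.3 * (expressions[4] - neutral) + 0.7 * (expressions[12] - neutral)
w = fit_expression_weights(frame, neutral, expressions)
print(np.argsort(w)[-2:])  # recovers expressions 4 and 12 as the dominant shapes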

The technology is still in R&D, but the companies plan on making it available next year.

“We think other companies may have already developed such workflows in-house, as a tailor-made solution, not as packaged software,” says Nicolas Stoiber, head of R&D and CTO of Dynamixyz. “We make technologies accessible and usable for the whole industry.”

Also at Siggraph, the LA-based studio ICVR is showing its digital human project: a photorealistic 3D human created from hundreds of 3D scans in partnership with The Scan Truck. One of the most impressive applications being shown is the ability for an actor to drive the model in real time in Unreal Engine through a combination of Dynamixyz facial tracking software and an Xsens motion-capture suit.

Seattle-based motion capture facility Mocap Now demonstrated full-body (OptiTrack), face (Dynamixyz again) and finger (StretchSense) capture streamed live, in real time, into Unreal Engine.

Data, of course, is the foundational element. Whether in a character simulation and animation workflow, a render pipeline or project planning, innovations like these make it possible to deploy ML systems that improve both the quality of work and the predictability of output.

RADiCAL, for example, lets users record video of an actor, even on a smartphone, and upload it to the cloud, where the firm's AI sends back motion-captured animation of the movements. The latest version promises 20x faster processing and a dramatic increase in the range of motion it can handle, from athletic moves to combat.
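The workflow follows a familiar upload-then-poll pattern. The sketch below is purely hypothetical: the service URL, endpoints and field names are invented for illustration and are not RADiCAL's actual API.

# Hypothetical sketch of the upload-then-download pattern such cloud mocap
# services follow. The URL, endpoints and field names are invented; they are
# NOT RADiCAL's real API.
import time
import requests

BASE = "https://api.example-mocap-cloud.com"   # placeholder service

def cloud_mocap(video_path, poll_seconds=10):
    # 1. Upload the smartphone footage of the performance.
    with open(video_path, "rb") as f:
        job = requests.post(f"{BASE}/jobs", files={"video": f}).json()

    # 2. Poll until the service's AI has finished solving the motion.
    while True:
        status = requests.get(f"{BASE}/jobs/{job['id']}").json()
        if status["state"] == "done":
            break
        time.sleep(poll_seconds)

    # 3. Download the returned animation (e.g. an FBX of the solved skeleton).
    anim = requests.get(status["result_url"])
    with open("solved_motion.fbx", "wb") as out:
        out.write(anim.content)

cloud_mocap("actor_take_01.mp4")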

San Francisco’s DeepMotion also uses AI to re-target and post-process motion-capture data. Its cloud application, Neuron, allows developers to upload and train their own 3D characters — choosing from hundreds of interactive motions available via an online library. The service is also claimed to free up time for artists to focus on the more expressive details of an animation.

Pinscreen is working on algorithms capable of building a photorealistic, animatable 3D avatar from just a single still image. Its facial simulation AI tool is based on Generative Adversarial Networks, a technique for creating new, believable 2D and 3D imagery from a dataset of millions of real 2D photo inputs. One striking example of synthesising photoreal human faces can be seen at thispersondoesnotexist.com.
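For those unfamiliar with the technique, a bare-bones GAN training step looks something like the sketch below (written here in PyTorch). It is a generic toy example, not Pinscreen's production model, which is far larger and 3D-aware.

# Minimal GAN sketch: a generator learns to turn random noise into images that
# a discriminator cannot tell apart from real photos.
import torch
import torch.nn as nn

latent_dim, img_dim = 100, 64 * 64 * 3

generator = nn.Sequential(
    nn.Linear(latent_dim, 1024), nn.ReLU(),
    nn.Linear(1024, img_dim), nn.Tanh())         # outputs a fake image in [-1, 1]

discriminator = nn.Sequential(
    nn.Linear(img_dim, 1024), nn.LeakyReLU(0.2),
    nn.Linear(1024, 1))                          # real/fake logit

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):                     # real_images: (batch, img_dim)
    batch = real_images.size(0)

    # Discriminator: push real photos towards 1, generated images towards 0.
    fake = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(discriminator(real_images), torch.ones(batch, 1)) + \
             bce(discriminator(fake), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator label its output as real.
    fake = generator(torch.randn(batch, latent_dim))
    g_loss = bce(discriminator(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# One step on a dummy batch of 8 "photos" standing in for a real face dataset.
print(train_step(torch.rand(8, img_dim) * 2 - 1))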