A Brief History of Motion Tracking Technology and How It Is Used Today

Rachel Lum
Jan 9, 2019
Benedict Cumberbatch in his role as Smaug, the dragon in two of “The Hobbit” films.

Motion capture is the process of tracking the intricacies of live movement and processing this information either in real time or as a recording. It is used in many different fields, ranging from military implementation to medical applications to entertainment.

The first sketch of Max Fleischer’s idea for a rotoscope.

Rotoscoping in the early 1900s

For some context, let’s dive into a bit of history.

Early on in the film industry, motion capture was accomplished via rotoscoping, invented by animator Max Fleischer around 1915 and patented in 1917. Rotoscoping is the process of recording actual footage and tracing over it frame by frame. It played a key role in many of the animated films we know and love to this day, including “Alice in Wonderland” and “Snow White and the Seven Dwarfs.” Even as animation techniques evolved from the 1940s through the 1960s, keyframing and rotoscoping were still used to animate characters.

Beginnings of Motion Capture in the 1980s

In the 1980s, there was a growth in the study of human motion in biomechanics labs. Tom Calvert, a professor of kinesiology and computer science at Simon Fraser University, was one of the first to use potentiometers to track knee flexion and extension. A potentiometer is “a manually adjustable variable resistor with three terminals” (Rui Santos, Random Nerd Tutorials). Soon after, MIT developed its “graphical marionette”: LED lights were attached to a bodysuit worn by an actor, and his motion was captured with Op-eye, an optical motion capture system with two tracking cameras, whose footage was used to render the 2D motion of the actor. Shortly after, the advancement of motion capture technology slowed due to expense and limited computer processing power.
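
To make the potentiometer idea concrete, here is a minimal sketch (not Calvert’s actual setup) of how a potentiometer strapped across a joint could be turned into a knee angle, assuming a hypothetical 10-bit analog-to-digital converter and a pot with 270 degrees of mechanical travel.

```python
# Hypothetical sketch: converting a potentiometer reading to a knee angle.
# Assumes a 10-bit ADC (0-1023) and a pot whose full travel spans 270 degrees,
# mounted so that 0 corresponds to a fully extended knee.

ADC_MAX = 1023          # 10-bit analog-to-digital converter
POT_TRAVEL_DEG = 270.0  # mechanical range of the potentiometer


def adc_to_knee_angle(adc_value: int) -> float:
    """Map a raw ADC reading to an approximate knee flexion angle in degrees."""
    fraction = adc_value / ADC_MAX        # 0.0 (extended) .. 1.0 (full travel)
    return fraction * POT_TRAVEL_DEG


if __name__ == "__main__":
    for raw in (0, 256, 512, 1023):
        print(f"ADC {raw:4d} -> {adc_to_knee_angle(raw):6.1f} degrees of flexion")
```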

Virtual cinematography used in pre- and post-production.

Motion Capture Today

Today, there are many different kinds of motion capture. We have access to both optical and non-optical systems (the latter including inertial, magnetic, and mechanical systems). Marker-based optical systems operate by tracking physical markers, such as LED lights, reflectors, ping-pong-ball-like adhesives, or even just face paint.

Markerless systems do not use any sort of physical marker. Instead, they use match-moving software that still tracks the movement of an actor, but by identifying key features such as a nose or a piece of clothing. Cinematographers create a quick computer-graphics sketch of whatever character they want to bring to life and map the character’s skeleton onto the live-action footage, accounting for position, scale, orientation, and motion.
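
At its core, match moving boils down to finding distinctive features in one frame and following them into the next. Below is a minimal sketch of that tracking step using OpenCV’s corner detector and Lucas–Kanade optical flow; a real matchmove solver goes much further and recovers position, scale, and orientation from these tracks. The video path is a placeholder.

```python
# Minimal sketch of the feature-tracking step behind match moving.
# Requires: pip install opencv-python numpy
import cv2

cap = cv2.VideoCapture("footage.mp4")  # placeholder path to live-action footage
ok, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)

# Pick distinctive corners to follow (a nose, a button, a pattern on clothing).
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.3, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok or points is None or len(points) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Follow each feature into the new frame with Lucas-Kanade optical flow.
    new_points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, points, None)
    tracked = new_points[status.flatten() == 1]

    # A full matchmove solver would now estimate camera/character position,
    # scale, and orientation from these 2D tracks; here we just report them.
    print(f"tracking {len(tracked)} features")

    prev_gray, points = gray, tracked.reshape(-1, 1, 2)

cap.release()
```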

This is much more affordable, since it is mostly software-based and requires much less equipment. On a movie set, the digitally animated character is created first, and as the camera records, the character can be mapped over the actor using motion capture, so the filmmakers can see instantly how the movement translates to the character, along with preferred lighting and angles. This is known as virtual cinematography.
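
As a rough illustration of the “mapping” step (a toy sketch, not any studio’s actual pipeline), retargeting tracked actor joints onto a character rig can be thought of as applying a scale, a rotation, and an offset, i.e. accounting for position, scale, and orientation.

```python
# Toy sketch: retargeting tracked actor joints onto a digital character.
# Assumes joint positions are already available as 3D coordinates.
import numpy as np


def retarget(actor_joints: np.ndarray, scale: float,
             rotation: np.ndarray, offset: np.ndarray) -> np.ndarray:
    """Map actor-space joints to character space: scale, rotate, then translate."""
    return (scale * actor_joints) @ rotation.T + offset


# Made-up example: a larger character standing two metres to the actor's left.
actor = np.array([[0.0, 1.0, 0.0],    # hip
                  [0.0, 1.5, 0.0],    # chest
                  [0.0, 1.8, 0.0]])   # head
no_rotation = np.eye(3)               # keep the actor's orientation in this toy case
character = retarget(actor, scale=1.7, rotation=no_rotation,
                     offset=np.array([-2.0, 0.0, 0.0]))
print(character)
```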

A comparison table of different motion capture (mocap) systems.

The Kinect

We are familiar with some modern-day motion capture equipment, such as Microsoft’s Kinect for the Xbox. It uses a natural user interface, which aimed to broaden its target audience beyond the typical gaming community. The Kinect was ahead of its time, and so not as widely accepted upon its release in 2010. But because of its low cost and ease of use, Kinect technology is now used in a diverse range of fields, including medicine, tech, business management, sports therapy, robotics, and research.

Kinect sensor

The Kinect sensor consists of an RGB camera, a depth sensor (the powerful duo of an infrared projector and a monochrome CMOS, or complementary metal-oxide-semiconductor, sensor), and a multi-array microphone. Microsoft released a software development kit for the Kinect that lets developers read the camera’s input, smooth out any bumps in the data, and fine-tune animation.
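
That “smoothing out bumps” step is essentially a noise filter over the per-frame joint positions. Here is a hedged sketch of one common approach, an exponential moving average; the SDK’s own joint filtering is more sophisticated, and the joint data below is made up for illustration.

```python
# Sketch: exponential smoothing of noisy per-frame joint positions,
# the kind of filtering applied to skeleton data before animating with it.
import numpy as np


def smooth_joints(frames: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """Blend each frame with the running estimate; lower alpha = smoother but laggier."""
    smoothed = np.empty_like(frames)
    smoothed[0] = frames[0]
    for t in range(1, len(frames)):
        smoothed[t] = alpha * frames[t] + (1 - alpha) * smoothed[t - 1]
    return smoothed


# Made-up example: a single joint (x, y, z) jittering around a fixed point.
rng = np.random.default_rng(0)
raw = np.array([0.5, 1.0, 2.0]) + 0.02 * rng.standard_normal((30, 3))
print(smooth_joints(raw)[-1])  # close to (0.5, 1.0, 2.0), with the jitter damped
```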

Headset and controllers for the Oculus Rift.

Virtual Reality

A newer, more advanced motion tracking system has been used in virtual reality since 2016. Yes, the Oculus Rift made its grand entry into the gaming industry not long ago, and it remains one of the most popular implementations of motion tracking and near-instant rendering.

The hardware of the Oculus Rift consists of the following:

  • Cable: an HDMI cable and optional adapters send video to the Oculus Rift headset.
  • Headset: contains the motherboard, including an ARM processor, control chips for the infrared LEDs, the Adjacent Reality Tracker (polled 1,000 times per second), a gyroscope, a magnetometer, and ports for controllers and headphones. It also contains the most important feature: the screen.
  • Constellation tracking system: a small positional tracker placed in front of the player that monitors the infrared LEDs projected 360 degrees from the headset.
  • Feedback loop: continuous communication between the positional tracker, the headset, and the software (included in the free SDK); a toy sketch of this loop follows the list.
  • 3D audio: uses head-related transfer functions combined with head tracking to produce 3D audio spatialization.
  • Oculus Touch (the controllers): controllers that enable “hand presence” and also emit infrared LEDs; they can detect thumb touch as well.
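
To make that feedback loop less abstract, here is a toy sketch of the basic idea: integrate the gyroscope at a high rate to estimate head orientation, and continually nudge the estimate toward an absolute reference (standing in here for the magnetometer and positional tracker) so drift does not accumulate. This is a generic complementary-filter sketch with made-up numbers, not Oculus’s actual tracking code.

```python
# Toy sketch of a head-tracking feedback loop (yaw only):
# fast gyroscope integration corrected by a slower absolute reference.
DT = 0.001          # 1,000 samples per second, matching the tracker polling rate
CORRECTION = 0.02   # how strongly the absolute reference pulls the estimate back


def update_yaw(yaw: float, gyro_rate: float, reference_yaw: float) -> float:
    """One step of a complementary filter: integrate the gyro, then correct drift."""
    yaw += gyro_rate * DT                      # dead-reckon from angular velocity
    yaw += CORRECTION * (reference_yaw - yaw)  # nudge toward the tracker/magnetometer
    return yaw


# Made-up data: the head turns at 90 deg/s while the gyro reads slightly high.
yaw_estimate, true_yaw = 0.0, 0.0
for _ in range(1000):                          # one simulated second
    true_yaw += 90.0 * DT
    biased_gyro = 92.0                         # biased gyroscope reading (deg/s)
    yaw_estimate = update_yaw(yaw_estimate, biased_gyro, reference_yaw=true_yaw)

print(f"true yaw: {true_yaw:.1f} deg, estimated yaw: {yaw_estimate:.1f} deg")
```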

Augmented Reality

Augmented reality is the integration of virtual content and the real world. Think of Snapchat and Instagram filters, or the up-and-coming Microsoft HoloLens. Augmented reality applications like Snapchat use technology from a Ukrainian startup called Looksery, and they do not release many details about how it works. In addition, these apps rely on a much more advanced combination of software to function, including face detection and image processing. The HoloLens is even more next-level: it is a wearable holographic computer that lets you project and interact with holograms.
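
Snapchat’s exact pipeline is proprietary, but the first step of any face filter, finding the face, can be sketched with an off-the-shelf detector. Below is a minimal example using the Haar cascade bundled with OpenCV; real AR filters go on to locate facial landmarks and anchor 3D content to them. The image path is a placeholder.

```python
# Minimal sketch of the face-detection step behind AR face filters.
# Requires: pip install opencv-python
import cv2

image = cv2.imread("selfie.jpg")                 # placeholder input image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Haar cascade shipped with OpenCV; production filters use far more advanced models.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    # A filter would anchor graphics to landmarks inside this box;
    # here we just draw the box itself.
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces.jpg", image)
print(f"found {len(faces)} face(s)")
```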

Conclusion

Virtual and augmented reality are not done expanding. Beyond entertainment, they can be used to host business meetings, replace travel and real estate showings, save material in manufacturing, and more. When it comes down to it, humans are efficient, adaptive beings. If there is a way to condense the process of a certain event, we will find a way to do it. But where does this lead us? Well, if the technology becomes so advanced, so realistic, then at some point we may forget we are in a virtual reality in the first place…

Or are we already in it?

No, I don’t think so just yet.

Resources:
  • https://en.wikipedia.org/wiki/Motion_capture
  • https://www.xsens.com/fascination-motion-capture/
  • https://www.pcmag.com/article/342537/the-best-virtual-reality-vr-headsets
  • https://www.engadget.com/2014/07/14/motion-capture-explainer/
  • https://www.xbox.com/en-us/Search?q=kinect
  • https://www.siggraph.org//education/materials/HyperGraph/animation/character_animation/motion_capture/history1.htm
  • https://randomnerdtutorials.com/electronics-basics-how-a-potentiometer-works/
  • https://en.wikipedia.org/wiki/Match_moving
  • https://www.prolificinteractive.com/2017/05/24/motion-capture-system-using-xbox-kinect/
  • https://developer.microsoft.com/en-us/windows/kinect
  • https://www.newgenapps.com/blog/how-vr-works-technology-behind-virtual-reality
  • https://www.wareable.com/vr/how-oculus-rift-works
  • https://www.microsoft.com/en-us/hololens
  • https://www.quora.com/What-are-some-future-applications-of-virtual-reality
  • https://www.youtube.com/watch?v=5yfRPbH1dh8
  • https://medium.com/cracking-the-data-science-interview/snapchats-filters-how-computer-vision-recognizes-your-face-9907d6904b91
