Over the past few years, myriad new interface devices have sprung up promising to deliver on 3D gesture control of our technology. However, the applications we use every day have had 50 years to optimize for the mouse and keyboard. We’re still at the very beginning of understanding how to use these new input devices productively and efficiently.
After working at Leap Motion for a year as an interaction designer, I have more than a few opinions about gestures. Though I think that 3D interactions are often the wrong answer, I’ve seen many people and companies dismiss them without considering whether they might be the right answer. This post will shed some light on the strengths and weaknesses of 3D input and when it actually might be a good idea.
Hands as Input
In order to understand what 3D interactions are really good at, we first need to understand what hands are really good at.
In his brief rant on the future of interaction design, Bret Victor correctly points out that hands are really good at sensing things and manipulating things. To be more precise, hands are very good at precisely sensing tangential forces and precisely applying normal forces.
However, there is another thing hands are really good at that Bret Victor doesn’t explicitly mention: movement. Hands move when we speak, when we think and, of course, when we interact. We have 20 degrees of freedom within each hand (plus the 6 of moving it through space), for 26 in total. That is a huge amount of data.
The most amazing thing about humans and our hands is our ability to learn and control analog interfaces.
A beginner picking up a guitar will sound pretty bad because the dynamic system that is the guitar is hard to control. It’s hard to precisely control volume, timing, pitch and attack when you’re starting out. However, given time, a guitarist can learn to use these controls to express their soul in sound.
Jimi Hendrix is actually harnessing all the elements of the guitar which are uncontrollable for a beginner to create such unique and powerful sounds—the very things that make a beginner sound bad make him sound good. In painting, beginners have a very hard time controlling the color and stroke width, but masters use these variations to great artistic effect.
Compare this to Guitar Hero. It’s true you can do some amazing things in the game, but you’ll never have the same level of control that a real guitar gives you.
“A child can’t understand Hamlet, but can understand Cat In The Hat. Yet, it’s Shakespeare, not Dr. Seuss, who is the centerpiece of our culture’s literature.”
— Responses to A Brief Rant on Interaction Design
Dimensional Collapse
In days of yore, when you wanted to mail a letter, you’d grab a pen and some paper, write it down, fold it up, grab an envelope, stick the letter inside, lick the envelope, seal it, put a stamp on it and take it to the post office. Now when you want to accomplish roughly the same task, you type it out and click ‘send’.
We have sacrificed the extreme control over the personality of the handwriting, the weight and feel of the paper, and the envelope-licking-thoroughness for the extreme convenience of a few key presses and the click of a mouse.
I call this tradeoff dimensional collapse. When you reach out and click a mouse, you collapse the 26 dimensions in your hand down to a binary value. When we digitize everyday tasks, we trade degrees of freedom and personalization for convenience and simplicity.
Here’s an original excerpt from the patent for a typewriter:
“An artificial machine or method for impressing or transcribing of letters, one after another, as in writing, whereby all writing whatsoever may be engrossed in paper or parchment so neat and exact as not to be distinguished from print.”
— Henry Mill’s patent for a Machine for Transcribing Letters
That’s right! Typewriters were explicitly invented to uniformize handwriting.
Dimensional collapse is driven by the available input devices. From a computer’s perspective, there’s no difference between a hard and soft mouse click, and there’s no tone of voice in typing.
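To make the idea concrete, here is a minimal sketch of what collapsing a hand down to a click looks like from the computer's side. Everything in it is an assumption for illustration: the hand is modeled as a flat list of 26 normalized values, and the names `INDEX_FLEXION`, `CLICK_THRESHOLD` and `collapse_to_click` are hypothetical, not part of any real tracking API.

```python
from typing import Sequence

HAND_DIMENSIONS = 26   # the per-hand total used in this post
INDEX_FLEXION = 6      # hypothetical slot holding index-finger flexion (0.0 to 1.0)
CLICK_THRESHOLD = 0.5  # hypothetical cutoff above which we report a click


def collapse_to_click(pose: Sequence[float]) -> bool:
    """Reduce a 26-value hand pose to a single binary 'clicked' value."""
    if len(pose) != HAND_DIMENSIONS:
        raise ValueError(f"expected {HAND_DIMENSIONS} values, got {len(pose)}")
    # Only one value is consulted; the other 25 are thrown away.
    return pose[INDEX_FLEXION] > CLICK_THRESHOLD


# Two very different hands: a relaxed one pressing gently...
gentle = [0.0] * HAND_DIMENSIONS
gentle[INDEX_FLEXION] = 0.6

# ...and a tense one pressing hard.
forceful = [0.9] * HAND_DIMENSIONS
forceful[INDEX_FLEXION] = 1.0

# Both collapse to the same result, so the difference between them is lost.
print(collapse_to_click(gentle), collapse_to_click(forceful))
```

The two poses differ in every one of their 26 values, yet the application receives the same answer for both. That lost difference is exactly the expressiveness a dimensionally collapsed interface gives up.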
Modern rich text editors have given us back some of our selves with various font and color choices. Further explorations of augmenting typed text have been done, but they’re still a far cry from being able to change letter shapes on the fly. And, of course, this uniformization of typewritten text means that our communication necessarily suffers.
The fact that most computers use just a keyboard and mouse (and maybe a touchscreen) means that all modern applications are designed with dimensionally collapsed interfaces. There are some augmentation devices available to people who need them, but they still only add a few dimensions.
The Point of 3D Interactions
So, enter 3D interactions. 3D interactions will never be as good at dimensionally collapsed tasks as the mouse and keyboard (the input devices for which those tasks were designed). But we now have the ability to consume input from all 26 degrees of freedom in each hand, so let’s use it!
Instead of trying to use a mouse and keyboard to do creative sculpting, let’s create a direct interaction that utilizes hand motion. Or, even better, instead of collapsing 26D motion down to 2D, create a paint simulator that you control with hand motion. Instead of simulating a DJ turntable on a touchscreen (collapsing to 1D), make a 26-dimensional vocoder.
These new input devices have given us the freedom to be expressive again. We need to figure out how to use it.