At the recent Interactive Technology Summit (erstwhile Touch Gesture Motion), most of the gesture content was featured on the day I was off checking out the TSensors summit. But I did get a chance to talk to both PointGrab and eyeSight to see what has transpired over the last year.
These two companies both aim at similar spaces, gunning for supremacy in laptops, phones, and other household electronics (HVAC, white goods, etc.). Part of the game right now is design wins, and frankly, their design win reports sound very similar. So there seems to be plenty of business to go around – even to the point that it seems that in some cases, a given company is using them both. I don’t know if that’s to check them both out over time or to make them both happy or to use them as negotiation fodder against each other. To hear them tell it, business is good for everyone.
Development continues apace as well. One key change that’s happened in the last year is a move away from using gestures simply to control a mouse. Using the mouse model, for example, if you want to shut off your Windows laptop, then you gesture the mouse to go down to the Start button and do the required clicks to shut down the machine*. The new model is simply to have a “shut down” gesture – the mouse is irrelevant.
PointGrab has already released this; eyeSight has it in the wings.
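To make the difference between the two models concrete, here is a minimal sketch. It isn't either company's code; the gesture name `shut_down`, the `COMMANDS` table, and the `handle_gesture` function are all names I made up for illustration, and the "shutdown" only writes to a log rather than touching the OS.

```python
from typing import Callable, Dict, List


def shut_down(log: List[str]) -> None:
    """Hypothetical stand-in for a real OS call; it only records the request."""
    log.append("shutdown requested")


# Mouse model: gestures are translated into pointer events, so the user has
# to walk through every step of the on-screen UI by hand.
MOUSE_MODEL_STEPS: List[str] = [
    "move pointer to Start button",
    "click",
    "move pointer to power menu",
    "click",
    "move pointer to 'Shut down'",
    "click",
]

# Command model: each recognized gesture maps straight to an action.
# The gesture name used as the key is an assumption for this sketch.
COMMANDS: Dict[str, Callable[[List[str]], None]] = {
    "shut_down": shut_down,
}


def handle_gesture(name: str, log: List[str]) -> bool:
    """Run the action bound to a gesture; return False if nothing is bound."""
    action = COMMANDS.get(name)
    if action is None:
        return False
    action(log)
    return True
```

In the command model, one recognized gesture replaces the whole list of pointer steps; an unrecognized gesture simply does nothing.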
I discussed the issue of universal gestures with PointGrab. There is an ongoing challenge of developing gestures that are intuitive across cultures (there aren’t many – some say one, some say two…). PointGrab doesn’t actually see this as a big issue; there’s room for everyone to acquire a simple, well-thought-out gesture “lexicon” even if it means acquiring some new gestures that weren’t already used in that culture. Their bigger worry is that different companies will use different lexicons, rather than everyone settling on one set of gestures.
PointGrab has also announced what they call Hybrid Action Recognition. This is a way of making gesture recognition smarter, and it consists of three elements (not to be confused with three sequential steps):
- Watching for movement that suggests that a gesture is coming
- Looking for specific shapes, like a finger in front of the face
- Disambiguating look-alike objects
This almost feels to me a bit like yet another form of context awareness: these three tasks establish a context that says, “Hey, this is a gesture; that last thing wasn’t.” At present, this is a static system; in the future, they will be able to make it learn in real time.
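Here is one way the "three elements, not three steps" idea could look in code. To be clear, this is my own guess at the shape of such a thing, not PointGrab's algorithm: the `FrameCues` fields, the weights, and the threshold are all assumptions, and the scores are taken as given rather than computed from a camera image.

```python
from dataclasses import dataclass


@dataclass
class FrameCues:
    """Hypothetical per-frame scores, each assumed to lie in the range 0..1."""
    motion_score: float     # movement suggesting that a gesture is coming
    shape_score: float      # match against a known shape, e.g. a raised finger
    lookalike_score: float  # likelihood the shape is really some other object


def is_gesture(cues: FrameCues, threshold: float = 0.6) -> bool:
    """Combine the three cues at once rather than as sequential stages.

    The weights and threshold are fixed (a static system); they are
    illustrative values, not tuned or taken from any product.
    """
    evidence = 0.4 * cues.motion_score + 0.6 * cues.shape_score
    # Discount the evidence when the shape could be a look-alike object.
    evidence *= 1.0 - cues.lookalike_score
    return evidence >= threshold
```

A learning version would adjust the weights and threshold from experience instead of hard-coding them.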
Meanwhile, eyeSight noted that, in the future, you may have several devices in a given room that are gesture-enabled. Perhaps a laptop, a TV, and a thermostat. If you gesture, which one are you talking to? Well, as humans, our primary indicator is looking at the person we’re talking to. EyeSight is looking at providing this capability as well: a device would react to a gesture only if you’re looking at it.
They’re also looking farther down the road at more holistic approaches, including gaze, face recognition, and even speech. (As humans, we can talk to someone we’re not looking at, but we use speech to alert them that they’re who we’re talking to.) But this is a ways out…
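The "react only if you're looking at it" rule is easy to sketch if you assume you already have a head position and a gaze direction. That's a big assumption, and none of this is eyeSight's implementation: the `target_device` function, the 3D coordinates, and the 10-degree tolerance are all made up for illustration.

```python
import math
from typing import Dict, Optional, Tuple

Vec = Tuple[float, float, float]


def _angle_deg(a: Vec, b: Vec) -> float:
    """Angle between two 3D vectors in degrees; 180 if either has zero length."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 180.0
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (na * nb)))
    return math.degrees(math.acos(cos))


def target_device(
    head: Vec,
    gaze: Vec,
    devices: Dict[str, Vec],
    max_angle_deg: float = 10.0,
) -> Optional[str]:
    """Return the device closest to the gaze direction, or None.

    A device qualifies only if the angle between the gaze direction and the
    line from the head to the device is within max_angle_deg (an assumed
    tolerance, not a measured one).
    """
    best_name: Optional[str] = None
    best_angle = max_angle_deg
    for name, pos in devices.items():
        to_device = (pos[0] - head[0], pos[1] - head[1], pos[2] - head[2])
        angle = _angle_deg(gaze, to_device)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name
```

A gesture would then be delivered only to whatever `target_device` returns; if it returns `None`, nobody in the room reacts.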
As an aside, it was noted in a presentation that gaze in particular is good for broad-level use, but doesn’t work well for fine tracking since our eyes actually flit around at high speeds (saccadic movement) – activity that our brain smooths out so that we don’t notice it. A computer could tell that we’re looking at the computer easily enough, but it would have to do a similar smoothing thing in order to be able to identify, for example, which word we’re reading on the screen.
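For a sense of what that "smoothing thing" might involve, here is a minimal sketch using a sliding median over recent gaze samples. This is a generic filtering technique I've picked as a stand-in; it's not what the brain does and not something the presentation described. The `GazeSmoother` class and the window size are my own assumptions.

```python
from collections import deque
from statistics import median
from typing import Deque, Tuple


class GazeSmoother:
    """Sliding-median filter over the most recent gaze samples.

    A median discards brief jumps (a crude stand-in for saccades) as long as
    they occupy fewer than half the samples in the window.
    """

    def __init__(self, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._xs: Deque[float] = deque(maxlen=window)
        self._ys: Deque[float] = deque(maxlen=window)

    def update(self, x: float, y: float) -> Tuple[float, float]:
        """Add one raw gaze sample and return the smoothed position."""
        self._xs.append(x)
        self._ys.append(y)
        return median(self._xs), median(self._ys)
```

Feed it raw screen coordinates one sample at a time; a single sample that flits off to the far corner barely moves the smoothed point, though the price is a lag of a few samples when the eye really does move on.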
This whole gesture space seems to be moving extraordinarily quickly; there has been significant change in only one year. This is but one reason that it’s all done in software instead of hardware; updates can be anything but minor. The other reason, of course, is that this capability is going onto mainstream consumer devices. Requiring specific hardware would introduce a much higher barrier to inclusion.
This tension between hardware and software is actually going to be playing out in related spaces, but that’s a topic for another time.
*Unless, heaven help you, you’re on the original Windows 8, in which case you’ll gesture to move the mouse all over the place in a vain attempt to find where to shut things down; then you’ll give up and gesture to bring up your favorite browser to search for “How the #@$(&* do I shut down my @(#$&(# Windows 8 machine???” and find that you go into Settings (???????) and a few more mouse clicks (really??) done by gestures and Bingo! In only 15 minutes, you’ve managed to shut it off, with only a 50-point rise in your blood pressure! I think that, by this whole Windows 8 fiasco, Microsoft is earning itself its own specific gesture. One that I won’t repeat here, this being a family newspaper and all.