New Sound in Town

Microphones are not for the faint of heart. There is a sordid history of MEMS microphones, replete with big companies crying “Uncle!” and with legal vitriol.

Unlike something as “simple” as an accelerometer (with apologies to anyone who’s worked damned hard on a fine accelerometer), there’s been less rush to compete once everyone figured out how hard microphones can be.

And so we have a few deeply entrenched incumbents manning the sound.

But microphones still look interesting as an opportunity. We saw some time ago that multiple microphones are becoming a thing. Why? For the same reason that high-quality sound recording uses them. By recording an orchestra and the audience with two mikes, for example, you now have two tracks, and you can subtract the audience track from the orchestra track to get a cleaner version of the orchestra.
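That subtraction can be sketched in a few lines. This is only an illustration under stated assumptions: the two tracks are already time-aligned lists of samples, the audience leaks into the orchestra track at a single fixed level, and the function name `subtract_bleed` and the least-squares gain estimate are hypothetical choices made for this example, not anyone's production algorithm.

```python
def subtract_bleed(main, ambient):
    """Remove the part of `main` that is correlated with `ambient`.

    Assumes both tracks are equal-rate, time-aligned sample lists
    (an idealization; real recordings need alignment first).
    """
    energy = sum(a * a for a in ambient)
    if energy == 0:
        return list(main)  # silent ambient track: nothing to subtract
    # Least-squares estimate of how strongly the ambient track leaks into main.
    gain = sum(m * a for m, a in zip(main, ambient)) / energy
    return [m - gain * a for m, a in zip(main, ambient)]


# Toy data: the "orchestra" and "audience" signals are uncorrelated,
# and the audience bleeds into the orchestra mike at half level.
orchestra = [1.0, -1.0, 1.0, -1.0]
audience = [1.0, 1.0, -1.0, -1.0]
orchestra_mike = [o + 0.5 * a for o, a in zip(orchestra, audience)]
cleaned = subtract_bleed(orchestra_mike, audience)
```

With these toy signals, `cleaned` comes back equal to `orchestra`. Real tracks are never this tidy, which is part of why the processing discussed later matters.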

Then there’s “audio zoom,” which is more or less a kind of reverse beam-forming. I’ve always found the concept of beam-forming of any kind to sound rather mysterious and cryptic. But then it occurred to me that I experienced what might be one of the first general-public uses of beam-forming at a music festival in the early ’80s. They had this cool, novel setup that made it easier for folks in the back grass to hear: they put speakers way out there, and they delayed the signal to those speakers ever so slightly to give the sound from the near-in speakers time to arrive.

Without that delay, the outer speakers would “play” first, and their sound would then get muddied by the same sound arriving later through the air from the near-in speakers. By time-aligning the speaker path and the over-the-air path, the sound was much clearer.
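The size of that delay is simple arithmetic: distance divided by the speed of sound. Here is a rough sketch, assuming about 343 m/s for sound in air at room temperature; the function name and the 34.3-meter example distance are made up for illustration and have nothing to do with the actual festival setup.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at roughly 20 degrees C


def speaker_delay_ms(distance_m):
    """Delay (in ms) to apply to a remote speaker so that it lines up with
    sound that has traveled `distance_m` through the air from the stage."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S


# Hypothetical example: a delay speaker placed 34.3 m behind the main stack.
delay_ms = speaker_delay_ms(34.3)  # about 100 ms
```

So roughly 3 ms per meter, which is why a speaker even a modest distance out needs a noticeable delay.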

The same thing works in reverse for microphones, where you can adjust the relative timing of the mikes to “focus” attention on a particular spot. The more mikes you have, the more precise you can be and the stronger the signal-to-noise ratio (SNR).
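The simplest textbook form of this idea is usually called delay-and-sum. The sketch below is a toy version under loud assumptions: delays are whole numbers of samples, the mikes are perfectly matched, and the noise on each mike is independent. Under those idealized conditions, averaging N mikes improves SNR by about 10·log10(N) dB. The function names are hypothetical, and nothing here reflects how any particular phone does it.

```python
import math


def delay_and_sum(channels, delays):
    """Average several mike signals after skipping each one's arrival delay.

    `delays[i]` is how many samples later the target sound reaches mike i
    (assumed to be a non-negative integer; real systems need fractional delays).
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]


def ideal_array_gain_db(num_mics):
    """SNR improvement for matched mikes with independent noise (idealized)."""
    return 10.0 * math.log10(num_mics)


# A click reaches mike A at sample 2 and mike B one sample later.
mike_a = [0, 0, 1, 0, 0, 0]
mike_b = [0, 0, 0, 1, 0, 0]
focused = delay_and_sum([mike_a, mike_b], [0, 1])    # click adds up in phase
unfocused = delay_and_sum([mike_a, mike_b], [0, 0])  # click is smeared
```

With the right delays, the click comes through at full height in `focused`; with no delays, it is spread across two samples at half height in `unfocused`. Going from one mike to four gives roughly 6 dB in this idealized model.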

So how many mikes are we talking about here? The iPhone 5 has four; new phone designs are moving to five.

But it’s not just a matter of throwing mikes onto a phone willy-nilly. It takes processing to manipulate the multiple mike signals into a single audio stream. That processing has to take into account not only the timing of the various mike signals, but also the noise on each one, as well as any intrinsic mismatches between the installed mikes.

In other words, high-quality mikes that are well matched make the computation task easier and deliver a higher-fidelity result.

All of which could make this a tempting target for newcomers. But what about all of those bugaboos of the past?

I think the biggest bit of learning from the past is that, with microphones, you can’t separate the package from the MEMS die. The cavity inside the package and the porting and positioning all matter. That’s one of the intricacies that have kept some folks from diving in.

Image courtesy Vesper

But a relative newcomer, Vesper (the MEMS guys previously known as Baker Calling), has announced that it is making that dive. And they’re doing it by focusing on the MEMS and ASIC chips and then selling those dice to folks who already know how to do the packaging. So they’re making the bet that they won’t have to start wrestling with package issues.

That gets rid of one problem and lets them focus on the MEMS part. Their MEMS element is not just a tweak to existing microphone technology; they’re changing the way they detect sound as compared to what folks are doing today.

Current MEMS microphones are made of a diaphragm and a “backplate.” The diaphragm vibrates under the influence of sound, and its distance from the backplate is measured as a capacitance. This time-varying capacitance is translated into the audio signal.
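To get a feel for the capacitance involved, you can treat the diaphragm and backplate as an ideal parallel-plate capacitor, C = ε0·A/d. That is a big simplification (it ignores the vent holes, fringing fields, and diaphragm bending), and the dimensions below are made-up round numbers, not figures from Vesper or any real part.

```python
VACUUM_PERMITTIVITY_F_PER_M = 8.854e-12  # air is within about 0.1% of vacuum


def plate_capacitance_f(area_m2, gap_m):
    """Ideal parallel-plate capacitance in farads (no fringing, flat plates)."""
    return VACUUM_PERMITTIVITY_F_PER_M * area_m2 / gap_m


# Hypothetical geometry: 1 mm^2 diaphragm, 4 um gap at rest.
AREA_M2 = 1e-6
rest = plate_capacitance_f(AREA_M2, 4.0e-6)      # roughly 2.2 pF
squeezed = plate_capacitance_f(AREA_M2, 3.9e-6)  # sound pushes diaphragm 0.1 um closer
change_percent = 100.0 * (squeezed - rest) / rest
```

In this idealized model, a 2.5% change in the gap produces a capacitance change of about 2.6%, which is the kind of small swing the readout electronics have to turn into an audio signal.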

Image courtesy Vesper

This arrangement has its challenges, according to Vesper. For one thing, there’s an air gap between the diaphragm and the backplate. Because air is what transports the sound in the first place, you can’t eliminate it. But, even though there are vent holes in the backplate, you still get some slight squishing (alternating with rarefying) of the air in the gap, and that damping affects the quality of the recorded sound.

And it’s not just the damping per se; this is also a big source of variation because this phenomenon is hard to match between microphones.

Vesper cites other challenges having to do with dust, water, debris, and even solder flux wandering in and landing on the diaphragm. Obviously that would change the diaphragm’s response.

So Vesper is exploiting a different physical property: the piezoelectric effect, as made manifest by a cantilever. Or make that a pair of cantilevers. Or better yet, an array of cantilever-like structures that have a shape different from that of your basic cantilever, but which we can talk about as if they were simple cantilevers. This is where we enter the realm of what Vesper considers their “secret sauce.” (Although the top view of the following figure may be telling…)

Image courtesy Vesper

The bottom part of the figure above shows two simple cantilevers facing each other. As they vibrate in response to sound, the field created by the deflection of the piezoelectric material becomes the signal that’s transported out for further processing.

Now, you might notice that there’s a gap between the two cantilevers. Why wouldn’t dust and debris factor here as well? The reason, according to Vesper, is that this gap is far smaller than that in a standard diaphragm-based mike; nothing’s going to lodge in there. It does seem like something could land on the cantilevers themselves – after all, cantilevers can be used to detect substances because their resonance changes when something lands on them and sticks. This gets me speculating as to whether the special sauce is partly about having some cantilever redundancy to reject any such anomalies.

However it is they do it, Vesper says that they do not suffer from dust and such the way diaphragm mikes do.

The result of all of this is what they claim to be the highest-SNR compact-for-phones microphone at 70 dB. Sensitivity is also well matched. They also claim power low enough to allow always-on use of the microphone, which would enable voice control of the phone even when it’s sleeping. And, if that weren’t enough, they’re available in a small, ruggedized, waterproof package.

They’re positioning this as a game-changer in the microphone world. We’ll have to watch to see whether the game does in fact change.

UPDATE: There was one loose end that I hadn’t been able to tie up by post time – and I now have my answers. The issue deals with the SNR spec and the fact that Vesper isn’t doing its own packaging. The SNR depends on the package, so I had this nagging question that said, if Vesper is selling without a package and others are selling with a package, then how do you compare SNR numbers?

Vesper CEO Matt Crowley agreed that the package makes a big difference – in particular, the volume of the package. The smaller the volume, the more it impinges on the SNR performance. The actual manufacturer of the package doesn’t matter so much: As long as you’re comparing equal volumes, then it can be considered apples-to-apples.

In this specific case, they’re touting a 3.35-mm x 2.5-mm package for their 70 dB SNR number. That’s the smallest package that can achieve that 70 dB rating.

More info:

Vesper