Hearables. A Lust for Translation
Published in Hearables
And lo, they spake in many tongues. And verily, some of them even began to ship.
So beginneth the first chapter in the Book of Hearables. We have reached the point where you can buy a device that you put in your ear, which can translate what someone is saying to you in another language in real time. After several years of marketing and hype, consumers are suddenly spoilt for choice, with Waverly’s Pilot, Mymanu’s Clik and Google’s Pixel Buds all appearing within a few weeks of each other. It’s Google’s recent announcement which has caught the public imagination, but that’s mainly a result of the scale of their marketing machine. With less media attention, other startup companies have been quietly beavering away, mostly in the crowdfunded arena.
Anyone who’s been following the evolution of earbuds over the last few years will have been aware of the trend. After Bragi invented the hearable category with their Dash earbuds, others started to experiment with different features and applications, looking for ways to make the things we stick in our ears do more exciting things than just play music. A startup called Waverly Labs was the first to concentrate on translation, back in June 2016, when they launched a campaign on Indiegogo for their Pilot earpieces, which promised to translate between five languages – English, Spanish, French, Italian and Portuguese. Others weren’t far behind, with Doppler (sadly deceased), Mymanu, Human, Inspear, Bragi, Lingmo, TimeKettle and a host of others joining in the race to wean us off Spotify and Pandora and start us talking to our fellow mortals. (Although if we’d rather listen to music than talk to people who speak the same language, it’s questionable whether there’s a massive market in wanting to talk to those who don’t.)
From Waverly’s initial pitch for their Pilot, with just five languages, we’ve seen something of a multilingual arms race. Lingmo invoked the power of IBM’s Watson to offer 9, whereupon Waverly upped the ante to 15, Inspear trumped them with 25, Mymanu promised 37 and Travis proudly announced 80, along with a rather patronising statement that because of that “you can now travel”, suggesting that the travel industry has been deluding itself for years. Google, of course, points out that their translation algorithms can cope with 10,000 language pairs, but given there are currently around 8,000 languages spoken, giving 64 million language pairs, even they can’t afford to rest on their laurels quite yet.
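The arithmetic behind those pair counts is worth spelling out: with n languages there are n × (n − 1) ordered source-to-target directions, so the numbers explode quadratically. A quick sketch (the function name is mine, purely for illustration):

```python
def language_pairs(n):
    """Number of ordered source->target translation directions
    among n languages (each language paired with every other)."""
    return n * (n - 1)

# Waverly's original five languages: a modest 20 directions.
print(language_pairs(5))     # 20
# All ~8,000 spoken languages: roughly 64 million directions.
print(language_pairs(8000))  # 63,992,000
```

Which is why even Google’s 10,000 pairs, impressive as the figure sounds, covers only a sliver of the theoretical total.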
Not that that puts people off. In one of the most bizarre crowdfunding campaigns I’ve ever seen on Indiegogo, a project from Kazakhstan offered an “INSTANT mobile online translator from any language. Just put the headset in any country of the world you will understand each other. You no longer need to learn the language of other countries. Automatic recognition of women’s and men’s voices. Just put yourself and your interlocutor earpiece and wherever you went you understand any language. Translator will work as via Wi-Fi or via Bluetooth” (sic). To achieve that ultimate goal, they asked for a single funder to give them $30,000. Had they pitched it at $300 they’d probably have acquired several hundred backers, all of whom would almost certainly have lost their money. But that’s the joy of crowdfunding.
That’s not the only nod to fiction. A lot of what we’re seeing in the wearable space can be traced back to science fiction. The recent blossoming and subsequent withering of digital health analytics in the form of the Tricorder prize is perhaps the most obvious, having been specifically named after the Star Trek precursor. In the case of automatic translators, the epitome is Douglas Adams’ Babelfish from The Hitchhiker’s Guide to the Galaxy, although many others have independently invented similar technology to allow them to converse effortlessly with aliens. Which raises the interesting question of why it is always aliens we want to talk to and not fellow human beings? Maybe it’s just that the people who promote these ideas aren’t particularly good at talking to each other. Or don’t heed the health warning on the Babelfish packaging that “by effectively removing all barriers to communication between different races and cultures, it has caused more and bloodier wars than anything else in the history of creation”. Although most developers like to quote the Babelfish as part of their inspiration, none seem to remember the satirical nature of the original.
But back to reality and the recent entry of the 800 pound gorilla. Google was very much forced to bring out their Pixel Buds, once they made the decision to remove the physical 3.5mm headphone socket from the Pixel 2, just as Apple did when they designed the iPhone 7. As I explained when I first looked at the AirPods, that decision to remove the socket is driven by the fact that the 3.5mm socket takes up too much space and is difficult to waterproof. It’s not disappearing because of any fundamental belief from either manufacturer that wireless is better or can do more things. It’s happening now because wireless has evolved to the “good enough” point, both in terms of audio quality and the battery life of Bluetooth earbuds and headsets. Apple has played the “brave” card, Google the “me too” one, and over the course of the next few years, other phone manufacturers will go with the herd.
Google’s problem is that with the AirPods, Apple did something unexpected. Historically, Apple are a follower. They don’t invent product categories – they wait for other major companies to create the market, then come in with a slicker product which delights, because they’ve concentrated on integrating everything which is needed to make the customer believe that they invented the experience. Having claimed the high end of the market, they then create clear water between themselves and their competitors by constantly increasing the level of delight. The AirPod is arguably the first product where they’ve made the market by themselves. Before AirPods, there were a few earbuds, mostly from small, crowdfunded startups targeting niche applications. Total shipments were probably around 100,000 units. The success of the AirPod has probably surpassed Apple’s wildest sales expectations – it’s likely that they currently have over 90% of the consumer earbud market. With Apple as the leader, it raises the question of who the followers will be and how they will differentiate themselves.
Hence Google’s problem. How do they make a hearable device which isn’t just a me-too? Apple has done a very competent job with the AirPods. They’ve not been too clever; unlike most of the crowdfunded hearables, AirPods aren’t overloaded with tech. They concentrate on being good at the basics of playing music with a decent battery life. That gives Google two options – either make something that’s cheap, which isn’t really an option if you’re trying to make money in a new hardware business, or add a snazzy feature.
I doubt it took them long to decide on the second path, and not much longer to jump on the translation bandwagon. There’s little doubt that Google have a significant lead in translation, not least with their new Neural Machine Translation systems, which they introduced last year. If you’re interested, there’s a good overview of that on their Research Blog. Suffice it to say that Google knows as much about machine translation as anyone. But will that knowledge make their Pixel Buds a winner?
I’ve played with prototypes of a number of translation hearables (not yet with the Pixel Buds) and they all bring a smile to the face when you try them and discover they work. They are not instant and probably never will be, as even with neural machine translation you need to listen to enough of a sentence to get context before you can start any reliable translation, so there is a time-lag. That doesn’t stop them being impressive. In my review of the Hearables market, translation comes up as an important sector, but is it an independent market, or just a sales differentiator?
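One way to picture that unavoidable lag is the “wait-k” policy used in streaming translation research: the system deliberately stays k words behind the speaker so that it always has some context before committing to an output. The toy sketch below doesn’t translate anything – it simply echoes words back with the characteristic delay, purely to illustrate the timing:

```python
def wait_k_stream(source_words, k=3):
    """Toy 'wait-k' streaming policy: hold back output until k source
    words have arrived, so the output always lags the input by k words.
    A real system would emit translated tokens; here we just echo the
    buffered source word to show the lag pattern."""
    emitted = []
    for i in range(len(source_words)):
        if i >= k:  # only start emitting once k words are buffered
            emitted.append(source_words[i - k])
    # flush whatever is still buffered once the utterance ends
    emitted.extend(source_words[len(emitted):])
    return emitted

# The first two words produce silence; output then trails by two words.
print(wait_k_stream(["where", "is", "the", "station", "please"], k=2))
```

The point is simply that nothing useful can be emitted for the first k words, which is why even the best neural systems will always feel a beat behind the speaker.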
It will probably be a considerable time before translating earpieces affect the professional market, where a human translator can spot a potentially important mistake in translation and correct it. For some good examples on why we need professional translators, Richard Brooks’ blog from 2005 is still an excellent read. The other limitation is that unless the person you’re conversing with is bilingual, then both parties to the conversation need to wear a translating earpiece. (If one of them is bilingual, then neither needs one.) That removes another large chunk of market. Most of these devices allow you to use your phone to translate your voice back to the other speaker, but that’s a clunky solution. It’s fine for short conversations and transactions, such as asking the way or buying a charger for your hearable, but it’s an occasional usage.
A further issue is that a number of these devices require a good mobile connection, so that translation can be performed in the cloud. Over time, that will move down to the handset, particularly as smartphones incorporate better neural processing hardware. But it’s a case of caveat emptor for early purchasers – you may need the right phone and network.
That’s not the only challenge. To work well, these devices need to block external sound, as you want to hear the translation rather than the speaker’s native voice. So you need to wear a pair of these devices or stick a finger in your other ear. Hence they need to be a snug fit, as well as being comfortable if you plan to wear them for any length of time. Although that may be self-limiting. If you’re listening to someone for most of the time, you probably ought to try taking out your earpiece and learning their language.
Some of the offerings have picked up on the need for comfort; others haven’t. At one end of the scale, Human’s headphones offer long term comfort by proposing a totally new form factor. It will be interesting to try them. At the other end of the scale, early reviews of the Pixel Buds report that they are neither a great fit, nor overly comfortable, suggesting that Google may not have paid enough attention to the basics.
And therein lies the problem. Very few people are likely to buy translating hearables for that purpose alone. Translation will be the differentiating factor which persuades consumers to buy one brand over another, but what will keep the product being used is their comfort, their battery life and most importantly, how well they play music. These are features which Human and Mymanu emphasise for their products, suggesting that they understand what is needed for success in this market. After all, although translation in hearables is both attractive and clever, there are plenty of phone based translation apps, which may be almost as effective for occasional use. A product that only concentrated on translation would find itself competing with these.
We should also consider the social aspects of using a translating earpiece. As I’ve noted above, it needs both parties to wear one to be effective, otherwise it may be perceived as just a high tech equivalent of offering beads to the natives, probably reinforcing the stereotype of monolingual English speakers. As Willy Brandt, the former West German chancellor, allegedly said: “If I am selling to you, I speak your language. If I am buying, dann müssen Sie Deutsch sprechen [then you must speak German].” There is also the question of having that other person in the way, even if they are just an AI. There’s a nice episode in Luke Rhinehart’s “The Search for the Diceman”, where a character goes to bed with a Japanese businessman. She quite enjoys the experience, but can’t help feeling disconcerted by the fact that his translator is lying beside him, constantly whispering in his ear. Whilst language may be a barrier to some who venture outside their country today, automatic translation may well carry some of the social issues foreseen by Douglas Adams.
Finally, why stop with people? Dr Dolittle could talk to the animals. So, I thought, could another startup – Felcana – when they (unsuccessfully) marketed their product as “Listen to your pet”. Sadly, it can’t – they’re just talking about activity monitors. Anyone want to fund hearables for pets? I suspect it won’t be long before we see someone trying.
Read more of my articles on hearables at https://www.nickhunn.com
I think two people communicating via translation devices is more likely to be comical than useful. Maybe ‘where is the toilet?’ might be doable though (with added gesturing).
As someone with a hearing loss, I am most interested in communicating with people in my own language. I’ve ordered a crowd-funded product called a SnowOwl. The added wrinkle here is that the wearable is a small plastic box (which you can wear on your wrist or pin to your clothing) containing highly directional microphones and some industrial-grade noise reduction. You point the device at the person you are trying to communicate with (or maybe the table in the far corner of the restaurant) and listen via wired or Bluetooth earbuds.
Which brings me to my question. Is there anything that you feel able to share about your work with the Bluetooth SIG Hearing Aid Working Group?
Personally, my old aids are being held together by positive thoughts only, and I’d love to know if and when I’ll be able to buy aids that can communicate with a smart device via an open standard. If you can’t comment on that, I fully understand. Is there anything else that you can say about what hearing aid users will be able to expect in the future?
Apologies for the delay in approving this – it got caught in a junk folder I didn’t even know existed.
As far as the Bluetooth work is concerned, we’re trying to enable a decent quality audio stream for both voice and music which will not have a major impact on battery life. I suspect that many users will employ that for listening to TVs, as well as voice for phone calls. There’s very little delay, so if someone else is listening to the TV normally and you’re receiving the Bluetooth stream, you won’t experience any lag between the two. We’re also supporting broadcast audio, which will probably replace telecoils to provide good quality over larger areas, as well as making it much cheaper to provide public access.
The usage models have been driven by hearing aid manufacturers as well as consumer audio companies, so we’re focusing on a spectrum of qualities. At the end of the day, the specification is essentially a toolbox of capabilities. I don’t know which ones will be adopted first by hearing aid, phone and TV manufacturers, nor what other bells and whistles they may add. What I hope is that it provides an opportunity for innovation over the coming ten to twenty years. Nor can I predict when these products will come to market, as it needs manufacturers at both ends of the link – i.e. hearing aids and TVs or phones – to support it. What I can say is that there is massive enthusiasm within the different industries, although getting standards into new products always takes longer than we would hope.