September 24, 2014

Captions is aiming big in its hopes to bring real-time language translation to Google Glass and other devices. Such an app could have a huge impact in education, travel, medicine and emergency services.

Best yet, they’ve caught the eye of Google’s Glass team, who recently posted about Captions on the +GoogleGlass page.

In a conversation with lead developer Joel Lu, I learned much more about the app’s potential, its obstacles, and the small group working feverishly to make it a reality.

The company behind Captions is a Pittsburgh-based LLC of the same name formed by three Carnegie Mellon students. The idea came to Joel while talking to an American professor during a train ride from Mount Fuji to Tokyo. A misunderstanding arose between the professor and the train conductor during the ticket inspection. The conductor kept shaking his head and pointing to the ticket. Joel and the professor were perplexed. After several minutes of continued, helpless gesturing, they came to understand that the professor had unknowingly purchased the wrong ticket.

Frustrated, the professor turned to Joel and said “I love Japan but there are two things I hate: no trashcans and no subtitles.”

Joel’s response birthed the inspiration for Captions: “What if reality had subtitles?”

Less than two years later, the app is in alpha testing and the company is targeting 4-6 months for a private beta (January – March 2015). A sign-up is currently posted at and will form the public testing pool. First come, first served.

The alpha testing has experimented with German, Spanish, French and Italian translations. While admitting that the sensitivity of Glass’ microphone is a limiting factor for the app, the team has demonstrated Captions’ ability to function in public places like cafes and pubs without interference from ambient noise.


The current goal is to focus on one-on-one conversations and later experiment in group settings. There are plenty of immediately recognizable uses for translating a single voice. Traveling abroad would be a breeze as you can read the words of everyone you interact with. Academic lectures could be given in any language, attended and enjoyed by anyone with a Captions-enabled device. Hospital admission procedures and clinical visits could eliminate the need for human translators.

It’s not only language translation where this app could be useful. Sometimes hearing words in your own language can be challenging enough. Imagine sitting in a medical or engineering lecture where new, complex words are constantly being used but not defined. No need to scratch your head when the presenter mentions “seborrheic dermatitis.” Just tap on Glass and see a definition. You can stay focused on the lecture without diverting your attention to typing a search for words you’re not sure you heard correctly, let alone know how to spell.

This could also benefit the hearing impaired. People with hearing loss, and those around them, are often frustrated because the speaker must always be in front of the listener, who is often unaware when someone behind them is speaking. Captions to the rescue! As text starts appearing on the display, the user is immediately notified that someone is speaking and can already see what was said. No repeats necessary.

It’s not just Glass where translation can be useful. Wearable devices like Glass are simply a starting point. Joel stated that the team has already demonstrated that Captions can work on mobile devices and one can imagine many other devices, industries and applications that could benefit from real-time translation.


There are some steep challenges in a project dealing with programming machines to hear, interpret, understand, translate and present the spoken word – and do it all in real-time. Automatic speech recognition, pattern matching, natural language processing, machine learning: corporations and academic research have been tackling these issues for decades with suboptimal results.

Then there is the business question: how do you monetize such an app to attract investors and later keep the company alive and growing? Do you make it a paid app? Charge a subscription fee for its use? Joel stresses that function is the focus, not finance. However, the team is aware of the inevitable need for the app to generate revenue and has ideas about how to “game-ify” the experience, while assuring that the app will be free to use. Tight-lipped on specifics, the Captions team is saving those details for hungry investors.


The team is hopeful that “the success of apps that aim to solve obvious, practical problems like translation will open the public’s eye to the immediate benefits of augmented reality.”

But this small, devoted group isn’t just about zeroes and ones. They’ve had the foresight to commission an impressive short film promoting the app’s potential application in emergency situations. The company asserts that the video is a faithful representation of the app’s current capabilities. A Glass community eager for a break-out app is anxiously hoping so.

I know, right?
