The Journal The Authority on Global Business in Japan

Print is something I hold dear. I’ve always been an avid reader and have built a career on writing, but in 2010—after four years of listening to podcasts—I decided to get behind the microphone myself. Since then I have built a network that publishes daily content that receives half a million downloads per month. Listeners love the convenience of audio, but one question keeps coming up: Can we get a transcript of this?

SPEECH TO TEXT
As engaging as audio content is, it doesn’t replace the need for the written word. This I know, but my response has always been that, while I would love to offer transcripts of every show, to do so is simply not practical. I did the math. It would cost about $80 per episode, which would total more than $5,000 per month.

Despite understanding the clear benefits of converting our audio content to text (accessibility for those with hearing impairment, better search engine optimization, and a happier audience), I had to shelve the idea.

Recently, however, I have been reconsidering. Last summer, I discovered a tool called Temi that amazed even me, a renowned tech nerd. The service takes an uploaded audio file and returns a machine-generated transcript in a matter of minutes. In most cases, the accuracy has been impressive, and converting an hour of audio costs me just $6. I use it to transcribe interviews for The ACCJ Journal, but I am thinking of unleashing it on my back catalog of more than 10,000 hours of podcast episodes.

Impressed by the relief machine transcription has brought to my work as an editor and writer, I began researching the technology. You can find the results of that here.

SMART SOLUTION
Temi, I would learn, is just one of several players in the space. Spext and Happy Scribe are also doing amazing things to make truly reliable speech-to-text a reality, and all this is being driven by artificial intelligence (AI), machine learning, and Big Data, three buzzwords whose true meaning is often vague. But here is an example of how smart machines are changing the relationship between the two most common forms of communication, text and voice, in a way that could change almost every industry.

So, if you are used to saying, “Hey Siri,” “OK Google,” or “Alexa” and being unimpressed with the results, take a look at what is happening in the world beyond those digital assistants. They are the most visible example of how AI is being applied to voice, but their capabilities can be deceiving. I see them as the first piece of a bridge that will simplify our interaction with technology while leading our multimedia culture back to a place where text thrives. Although I’m a strong proponent of audio, in our information-rich, search-driven world, text is needed more than ever.

Christopher Bryan Jones is Editor-in-Chief of The ACCJ Journal. Originally from Birmingham, Alabama, he has lived in Japan since 1997.