[UPDATE: Unfortunately, Google has disabled the ability described below to transcribe audio from a web page. Now you get an alert asking you to attach a microphone and record from that. Playing audio over your computer's speakers, or playing a recorded message from your phone, doesn’t work. It's unclear whether this is a deliberate move to limit the program or just a glitch. Either way, I’d like to know if anyone is using another transcription tool successfully.]
As a copywriter, I often need to capture speech to text from a web page. My traditional method has been to play the video and start and stop it while I type what I hear into Word. Very tedious. Faced with a new project, I decided to find out if there is a better way. And it turns out there is, but it’s not as easy to find as you might think.
Google “capture speech to text” and most of the results will be for the opposite, text-to-speech, an important accessibility feature but not what I was looking for. Word for Mac still has a dictation feature built in (it’s gone from Word for Windows) but it’s only for YOUR dictation; toggle to another application, like a web browser, and the capture stops.
Finally, I found this page for Google’s beta of its Cloud Speech API. You can sign up for a trial of the Google Cloud Platform (they require your credit card, but won’t charge it without your permission) or simply use the widget on the page to transcribe in 15-second increments. The interface reported 94% confidence in the accuracy of my video's transcript, which I’d say was about right. The transcript required a bit of cleanup, but the process was certainly faster than typing it all. Check it out.
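If you work through a longer clip in 15-second increments, it helps to know in advance where each chunk starts and stops. Here is a minimal Python sketch of that bookkeeping. It does not call Google's API at all; the function names `plan_segments` and `fmt` are my own inventions for illustration, and the 15-second window is simply the widget limit mentioned above.

```python
def plan_segments(total_seconds, window=15):
    """Return (start, end) pairs, in seconds, covering a clip in fixed-size windows.

    The last segment is shortened so it never runs past the end of the clip.
    """
    if total_seconds < 0 or window <= 0:
        raise ValueError("total_seconds must be >= 0 and window must be > 0")
    segments = []
    start = 0
    while start < total_seconds:
        end = min(start + window, total_seconds)
        segments.append((start, end))
        start = end
    return segments


def fmt(seconds):
    """Format a whole number of seconds as m:ss for reading off a video player."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    # Hypothetical example: a 70-second clip, split into 15-second chunks.
    for start, end in plan_segments(70):
        print(fmt(start), "to", fmt(end))
```

Run it with the length of your own video in place of 70, and you get a list of start and stop times to follow as you play and pause.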