Audio Editor, meet Audio Word Processor
Mike Cooper
The Trusted British Voice for Documentaries, Audiobooks & eLearning | 100+ Audiobooks | 100+ Documentary Credits | Tired of the joy being sucked out of the creative process? Let's talk…
I'm a "punch-and-roll" kinda guy.
Those of us who spend hours recording the human voice – and then listening back and editing the results – invariably find our own, personal, best ways of working. For years, whenever I made a mistake, I'd just leave the recording running, back up and take it from the top of the sentence. Then I'd worry about taking out the mistakes later. I tried using a "clicker" for a while, so I could spot pickups coming up in the timeline and maybe speed things up a little. But in the end, because I use a program like Pro Tools, I realised it fitted my workflow better – and saved me more time overall – to just punch back in over the mistake and simplify the editing part in the process.
If Descript had existed a few years ago, perhaps things might have been different.
I started out editing on tape back in the late 1980s, then made the transition to digital editing some years later. Being able to see the waveforms of my words on the screen in front of me made things a lot easier than having to spin backwards and forwards while juggling a chinagraph pencil, a roll of editing tape and a razor blade. And, after years of practice, I can now spot certain sounds just by their shape on the monitor. But what if – instead of seeing waveforms on our screens – we could see the words themselves, and then edit the audio… by editing the text?
Arguably, that makes a lot more sense, and that's exactly what Descript is offering. It's a complete rethink of our relationship with editing speech, focused on the words and not the audio. If speech-to-text had existed a few decades ago, there's a very good chance that's how we'd have been editing all along. Check out this very short example:
Descript sends your audio through the Google Speech API to turn it into text (with a premium option to send it for personal treatment by a human instead), then places tabs above the timeline, showing which words correspond to which parts of the waveform. Editing is then as simple as deleting words, just as you would in a word processor.
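For the technically curious, here is a minimal Python sketch of the general idea: if every transcribed word carries a start and end time, deleting words from the text tells you which stretches of audio to keep. This is not Descript's actual implementation, which isn't public. The `Word` class, the `kept_segments` and `remaining_text` functions, and the timings are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with its position in the recording (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the recording
    end: float


def kept_segments(words, deleted, total_duration):
    """Return the (start, end) audio ranges that survive once the words
    at the given indices are deleted. Assumes words are in time order."""
    segments = []
    cursor = 0.0  # start of the next stretch of audio to keep
    for index, word in enumerate(words):
        if index in deleted:
            # Keep everything between the last cut and this word...
            if word.start > cursor:
                segments.append((cursor, word.start))
            # ...then skip past the deleted word itself.
            cursor = max(cursor, word.end)
    if cursor < total_duration:
        segments.append((cursor, total_duration))
    return segments


def remaining_text(words, deleted):
    """The transcript as it reads after the deletions."""
    return " ".join(w.text for i, w in enumerate(words) if i not in deleted)


# Illustrative timings only: a short line with a stray "um" in the middle.
WORDS = [
    Word("I'm", 0.0, 0.3),
    Word("a", 0.3, 0.4),
    Word("um", 0.4, 0.9),
    Word("punch-and-roll", 0.9, 1.7),
    Word("guy", 1.7, 2.1),
]

print(remaining_text(WORDS, {2}))
print(kept_segments(WORDS, {2}, total_duration=2.1))
```

Deleting the "um" leaves two stretches of audio to be joined. A real editor also has to make that join sound clean, with crossfades and sensible handling of breaths, and this sketch ignores all of that.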
There is a caveat here, in that you're sending what might be sensitive or proprietary script material to a third party (i.e. Google) for processing. A contract I signed just this week, with a very large e-Learning producer, made a direct stipulation that this wasn't allowed, so be sure to check whatever agreements you might have in place with your own clients before launching in. There's also some language in section 8.2 of the T&Cs that – although it seems designed to give Descript the rights to process your recordings in order to provide and improve the service – has given one or two of my voiceover colleagues pause for thought. That section could certainly use some improved wording.
But for reporters who work with long interviews, or those who voice things infrequently, and who might need multiple takes to get what they need, this could well be a game-changer. Imagine being able to see a transcript of your whole recording before you edit, and then remove the flubs, the retakes and the extraneous material as easily as you would in Microsoft Word…
Of course, anyone who edits audio to professional studio standards will already be way ahead here, and will probably be asking: a) whether it's really good enough for studio use, b) how neat a job it makes of joining the resulting audio back together, and c) whether it leaves audible artefacts in the process. And then there's the Google Speech engine to factor in, which may or may not be able to recognise your material properly. Still, there's only one way to find out, so I've decided to avail myself of Descript's free 30-minute trial and test it out for myself.
The biggest drawback to using Descript in the studio seems to be that – at this point in time – you can't record directly into the software, and you need to import existing audio to begin working on it. But recording is planned for a later release, and is already at the alpha testing stage. In fact, the developers have already invited me to download this and give it a spin, which shows they're actively working on improving things in response to feedback.
This isn't a full review, and there's a good reason for that. If my experience with audio software has taught me anything, it's that everyone's mileage varies. What works for me might not work for you, or for your own situation. That I've stuck with Pro Tools for the best part of ten years is due, in no small part, to how well its "Strip Silence" function works for me. I've tried other software that claims to offer the same function and never found anything that fits me just right. Similarly, some people do well with plugins designed to remove breaths from audio, but I've found they don't handle mine well enough. It then becomes a protracted exercise in putting things right, which takes longer than if I hadn't used the plugin in the first place.
And for the kind of work I do, and the way I already work? It's true that, for me at least, Descript might be a solution without a problem.
But then again, I'm a punch-and-roll kinda guy…
Descript is available at descript.com, but I'd really appreciate it if you'd use this link, which gets us both an extra 100 minutes of transcription for free when you create your account. After your trial you can use it on a pay-per-use basis at $0.15 per minute of transcription, or sign up for $10 and get transcriptions for $0.07 per minute with their Early Adopter Discount. I hope it works for you, but please comment below and let's find out!
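If you're weighing up the two pricing options, a rough break-even calculation might help. This Python sketch assumes the $10 is a single sign-up fee covering the period you're comparing, and it ignores any free trial or referral minutes. Neither assumption is confirmed by the pricing as quoted above, and the function names are purely illustrative. Prices are kept in whole cents to avoid rounding surprises.

```python
PAY_PER_USE_CENTS_PER_MIN = 15     # $0.15 per minute, as quoted above
EARLY_ADOPTER_FEE_CENTS = 1000     # $10 sign-up (assumed to be a single fee)
EARLY_ADOPTER_CENTS_PER_MIN = 7    # $0.07 per minute, as quoted above


def pay_per_use_cost(minutes):
    """Cost in cents of transcribing `minutes` on the pay-per-use plan."""
    return PAY_PER_USE_CENTS_PER_MIN * minutes


def early_adopter_cost(minutes):
    """Cost in cents on the Early Adopter plan, including the sign-up fee."""
    return EARLY_ADOPTER_FEE_CENTS + EARLY_ADOPTER_CENTS_PER_MIN * minutes


def break_even_minutes():
    """Minutes at which both plans cost the same:
    15 * m = 1000 + 7 * m, so m = 1000 / (15 - 7)."""
    saving_per_minute = PAY_PER_USE_CENTS_PER_MIN - EARLY_ADOPTER_CENTS_PER_MIN
    return EARLY_ADOPTER_FEE_CENTS / saving_per_minute


print(break_even_minutes())
```

Under those assumptions, the discounted plan only pays for itself once you've transcribed more audio than the break-even figure. Below that, pay-per-use is cheaper.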