#65 - Phonic Tonic - Proof of Concept [1/3]

About Episode - Duration: 25 minutes, Published: 2019-04-05

In this three-part episode series, we will be building an end-to-end, state-of-the-art online audio transcription service. This episode focuses on the high-level product idea: we will sketch out a customer workflow and then work through the product proof of concept.



Links, Code, and Transcript

Just a heads up, this is only a preview. If you want to watch the extended version of this episode, you will need a subscription.

I thought it might be fun to mix things up a little by creating an episode series where we build a real-life, end-to-end online audio-to-text transcription service. The product idea is pretty simple: you upload your audio files to this website as input, then the backend infrastructure, shown here as a black box, processes and transcribes those files and sends you an email with the transcribed text files as output.

[Figure: Transcription service architecture, input and output]

The goal of this episode series is to architect and build out what's happening behind the scenes, on the backend in this black box, that enables this simple customer workflow. By the time we are done, we should have a simple application that delivers state-of-the-art automated audio-to-text transcription with high accuracy and lightning speed. As you can see, this will involve a bunch of different containerized services, a database, blob storage, a worker queue, and lots of remote transcription API calls. My thinking here is that I want to replicate what a real company might be dealing with, and hopefully learn lots as we work through this. I am positive we will make mistakes and not get everything right on our first try, but that is actually a good thing, as we can build on these ideas in future episodes.

[Figure: Transcription service architecture, behind the scenes]
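To make the black box a little more concrete, here is a minimal sketch of how a worker might pull jobs off a queue and push each one through the backend stages. This is only an illustration, not the code from the episode: the `Job` fields, the stage functions, and `run_worker` are all hypothetical names, the in-memory `queue.Queue` stands in for a real worker queue, and every stage is a placeholder that just records that it ran.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Job:
    """One uploaded audio file moving through the pipeline (hypothetical shape)."""
    job_id: int
    email: str
    audio_path: str
    transcript: str = ""
    history: list = field(default_factory=list)


def transcode(job: Job) -> None:
    # Placeholder: a real transcoder would convert the upload into a format
    # the transcription API accepts.
    job.history.append("transcoded")


def transcribe(job: Job) -> None:
    # Placeholder: a real transcriber would call a remote speech-to-text API.
    job.transcript = f"<transcript of {job.audio_path}>"
    job.history.append("transcribed")


def notify(job: Job) -> None:
    # Placeholder: a real notifier would email the transcript to job.email.
    job.history.append("notified")


# Stages run in this order for every job.
STAGES = [transcode, transcribe, notify]


def run_worker(jobs: "queue.Queue[Job]") -> list:
    """Drain the queue, pushing each job through every stage in order."""
    done = []
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break
        for stage in STAGES:
            stage(job)
        jobs.task_done()
        done.append(job)
    return done


if __name__ == "__main__":
    q = queue.Queue()
    q.put(Job(1, "user@example.com", "interview.mp3"))
    for finished in run_worker(q):
        print(finished.job_id, finished.history)
```

In a real deployment each stage would likely be its own containerized service reading from a shared queue, so a slow transcription call does not block uploads.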

For example, I could see us looking at lots of topics around logging, monitoring, cost modeling, and probably infrastructure optimization, just to name a few. Additionally, when working on these types of things, there are often pros and cons to each path you choose, and we can chat about that as we work through this. It should be pretty fun, as it will be a mix of business ideas, development work, and lots of backend operations stuff. Almost like a Venn diagram, as we will be briefly touching on many topics in these areas. I find it useful to work on these types of projects from time to time, as it gives you a point of reference for what other groups in your company likely care about. It sort of puts you in their shoes.

[Figure: Venn diagram of development, business, and operations]

Alright, so I have broken the episode series into three parts. I picked the name Phonic Tonic since it sounds sort of funny, it rhymes, "phonic" relates to the sound of speech, and we are going to be adding some ML tonic into the mix. This is mostly just BS though. I was searching around for names that were not already taken, and this is what I came up with after looking through a thesaurus.

The first episode, which you are watching now, is where we will chat about the general idea, walk through the customer workflow step by step, and sketch out the backend workflow. We will also chat about the competitive landscape and product pricing, then look at a demo where we actually transcribe something at the command line via a remote API. This is sort of the core idea that we are going to build everything on top of.

[Figure: Transcription service proof of concept]
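The proof of concept boils down to sending audio to a remote API and reading text back. As a rough sketch of that shape, and only that: this preview does not name the API, so the endpoint URL, the JSON request fields, the bearer-token header, and the `{"text": ...}` response format below are all assumptions made up for illustration. The example builds the request but deliberately does not send it.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint; this preview does not name the actual API.
API_URL = "https://api.example.com/v1/transcribe"


def build_request(audio_bytes: bytes, api_key: str,
                  language: str = "en-US") -> urllib.request.Request:
    """Package raw audio as a JSON POST for a hypothetical speech-to-text API."""
    payload = {
        "language": language,
        # Binary audio is base64-encoded so it can travel inside JSON.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_text(response_body: bytes) -> str:
    """Pull the transcript out of an assumed response shape: {"text": "..."}."""
    return json.loads(response_body)["text"]


if __name__ == "__main__":
    req = build_request(b"fake audio bytes", api_key="demo-key")
    print(req.get_method(), req.full_url)
    # Sending is left out on purpose. Against a real endpoint it would be:
    # with urllib.request.urlopen(req) as resp:
    #     print(extract_text(resp.read()))
```

Whichever provider ends up behind this, keeping the request building and response parsing in small functions like these makes it easier to swap APIs later.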

In the second episode, we will design, code, and containerize our website, media transcoder, audio transcriber, and email notifier services. There will also be a bit where we chat about what a data model looks like for these things (and how I typically approach that). We will also wire all this up to the backend database, storage, and worker queue systems (mostly sitting on Kubernetes). This will be super technical and fast paced. You will also get all the code as we work through this together.
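As a taste of what a data model for this could look like, here is a small sketch of a job record with an explicit status that moves forward one step per service. The field names, the status values, and the `advance`/`fail` methods are hypothetical and not taken from the episode; the real model is covered in part two.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UPLOADED = "uploaded"
    TRANSCODED = "transcoded"
    TRANSCRIBED = "transcribed"
    NOTIFIED = "notified"
    FAILED = "failed"


# Allowed forward moves, one per backend service.
NEXT = {
    Status.UPLOADED: Status.TRANSCODED,
    Status.TRANSCODED: Status.TRANSCRIBED,
    Status.TRANSCRIBED: Status.NOTIFIED,
}


@dataclass
class TranscriptionJob:
    job_id: int
    customer_email: str
    blob_key: str  # where the uploaded audio lives in blob storage
    status: Status = Status.UPLOADED

    def advance(self) -> None:
        """Move to the next status, refusing to leave a terminal state."""
        if self.status not in NEXT:
            raise ValueError(f"cannot advance from {self.status.value}")
        self.status = NEXT[self.status]

    def fail(self) -> None:
        """Mark an in-flight job as failed."""
        if self.status in (Status.NOTIFIED, Status.FAILED):
            raise ValueError(f"cannot fail from {self.status.value}")
        self.status = Status.FAILED


if __name__ == "__main__":
    job = TranscriptionJob(1, "user@example.com", "uploads/interview.mp3")
    job.advance()
    print(job.status.value)
```

Storing the status on the record means any service, or a person debugging at 2am, can see exactly where a given upload is stuck.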

Finally, in the third episode, we will round things out by checking out what a Product Launch Checklist looks like, chatting about payment processing options, and mostly just trying to sand off any rough spots on our Minimum Viable Transcription product. The end result should be something that regular people out on the internet can actually use for quickly transcribing their audio files to text, on a self-serve basis.

Alright, that concludes the preview of this episode. In the rest of this episode we will be chatting about the competitive landscape, looking at pricing options, and walking through the backend architecture in more detail. I will also throw in a few career tips that worked well for me when I started thinking about the larger picture like this. So, if you want to watch the full version, you will need a subscription.
