Have you ever wanted to take a YouTube video and convert it into text (a transcription)? This can be useful when you have made videos and want to create blog posts from them. I have made many YouTube computer tutorials, and I would like to get them transcribed so I can use them in my blog. In this tutorial I’m going to show you how to transcribe those YouTube videos. The only requirement is that the video has (CC) captions available. If the video has captions available, that means the text is available within the HTML page. What we are going to do in this tutorial is extract this text from the HTML page. To do that, we have to get to the HTML code using the “Web Developer Tools” in Chrome. So let’s get started.
1 Open the video that has captions in Chrome so you can extract the text.
3 Enter the following code.
4 The resulting output contains the text transcription of the video. Save it as an XML file: right-click, choose Save as, and then save the file.
5 Extract the text from the file. This XML file contains many HTML tags. The following command removes these tags and leaves only the text within the file. I’m doing this part of the tutorial on Linux using bash. If you’re using Windows, you will need to find an alternative way to remove the HTML tags.
sed -e 's/<[^>]*>//g' file.xml
Don’t worry about this command and its weird characters; that’s just a regular expression. The resulting output will contain only the text from the video.
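The sed command prints its result to the terminal, so to end up with a text file you need to redirect the output. Here is a minimal sketch of that, with a few assumptions: sample.xml is a made-up stand-in for the file you saved (the real caption file may be laid out differently), transcript.txt is just a name I picked for the output, and the second sed expression, which drops the blank lines left behind once the tags are removed, is my own addition.

```shell
# Create a small sample file standing in for the saved caption XML.
# The tags and text here are invented for illustration only.
cat > sample.xml <<'EOF'
<transcript>
<text start="0.5" dur="2.1">Hello and welcome</text>
<text start="2.6" dur="3.0">to this tutorial</text>
</transcript>
EOF

# First expression: remove anything between < and > (the tags).
# Second expression: delete lines that are empty after the tags are gone.
# The > redirect saves the result to transcript.txt instead of printing it.
sed -e 's/<[^>]*>//g' -e '/^[[:space:]]*$/d' sample.xml > transcript.txt

# Show the cleaned-up text.
cat transcript.txt
```

In the regular expression, `<` matches the start of a tag, `[^>]*` matches any run of characters that are not `>`, and the final `>` matches the end of the tag. If your saved file puts all the captions on a single line, replacing each tag with a space (`s/<[^>]*>/ /g`) instead of nothing keeps the words from running together.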
If you need help writing your post, or often get stuck, check out this post: What Is the Anatomy of a Blog Post.