Guide · 6 min read · March 16, 2026

How to Create Subtitles for a Podcast (SRT & VTT Guide)

Learn how to generate SRT and VTT subtitle files for your podcast or video, then format and upload captions to YouTube, LinkedIn, TikTok, and more.

Podtyper Team

Podcast Tools & AI

Adding subtitles to your podcast video increases watch time, improves accessibility, and helps your content perform better on every platform that auto-plays video without sound.

This guide explains the difference between SRT and VTT formats, how to generate them, and how to upload them to every major platform.


SRT vs. VTT: What's the Difference?

Both formats are plain text files that contain timestamped captions. They're functionally very similar — most platforms accept both.

SRT (SubRip Subtitle)

SRT is the older, more universally supported format. It uses a simple structure:

1
00:00:01,000 --> 00:00:04,500
Welcome back to the show. Today we're talking

2
00:00:04,500 --> 00:00:08,000
about podcast transcription and why it matters.

Each caption block has:

  1. A sequence number
  2. A timestamp range (start → end) in HH:MM:SS,mmm format (note the comma before milliseconds)
  3. The caption text
  4. A blank line separator
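
The four-part block structure above is simple enough to read with a few lines of Python. This is a minimal sketch rather than a full SRT parser: the names `Cue`, `to_ms`, and `parse_srt` are invented for this example, and it assumes well-formed input with blank lines between blocks.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: int      # sequence number
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    text: str       # caption text (may span several lines)


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert the four timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> list[Cue]:
    """Split SRT text on blank lines and parse each caption block."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip blocks missing a number, timing, or text
        match = TIMING.fullmatch(lines[1].strip())
        if not match:
            continue  # skip blocks whose timing line is malformed
        g = match.groups()
        cues.append(
            Cue(int(lines[0].strip()), to_ms(*g[:4]), to_ms(*g[4:]), "\n".join(lines[2:]))
        )
    return cues
```

Working in milliseconds keeps later arithmetic (shifting, measuring cue duration) free of string handling.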

VTT (WebVTT — Web Video Text Tracks)

VTT is a newer format developed for HTML5 web video. It starts with a WEBVTT header and uses slightly different timestamp formatting (period instead of comma before milliseconds):

WEBVTT

00:00:01.000 --> 00:00:04.500
Welcome back to the show. Today we're talking

00:00:04.500 --> 00:00:08.000
about podcast transcription and why it matters.

VTT also supports additional features like positioning, styling, and speaker labels — though most basic use cases don't need them.
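
Because the two formats differ mainly in the header and the millisecond separator, converting a basic SRT file to VTT can be sketched in a few lines of Python. The function name `srt_to_vtt` is an assumption for this example, and the sketch ignores positioning, styling, and speaker labels.

```python
import re

# SRT timestamps look like 00:00:01,000; VTT uses 00:00:01.000.
SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to a basic WebVTT document."""
    out = ["WEBVTT", ""]
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        # Drop the numeric sequence line; the VTT example above has none.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        # Rewrite only the timing line, so commas in caption text survive.
        lines[0] = SRT_TIMESTAMP.sub(r"\1.\2", lines[0])
        out.extend(lines)
        out.append("")  # blank line between cues
    return "\n".join(out)
```

Going the other way (VTT to SRT) would need the reverse substitution plus fresh sequence numbers.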

Which should you use?

| Platform | Preferred Format |
|----------|------------------|
| YouTube | SRT or VTT |
| LinkedIn | SRT |
| TikTok | SRT |
| Instagram | SRT |
| Facebook | SRT |
| Web video (HTML5) | VTT |
| Editing software | SRT (most common) |

When in doubt, use SRT — it works everywhere.


How to Generate SRT and VTT Files for Your Podcast

Option 1: Use Podtyper (Fastest)

If your podcast is on Spotify, Apple Podcasts, or YouTube, Podtyper generates SRT and VTT files directly from the episode URL.

Steps:

  1. Copy your episode URL from Spotify, Apple Podcasts, or YouTube
  2. Paste it into podtyper.com
  3. Click Transcribe — done in 2–4 minutes
  4. On the transcript page, click Export and choose SRT or VTT

The file downloads immediately with accurate timestamps synced to the audio. Speaker labels are included in the TXT export; SRT and VTT follow standard caption formatting.

Free tier: 30 minutes/month, no credit card required.


Option 2: YouTube Studio Auto-Captions Export

If your podcast is already on YouTube, you can export YouTube's auto-generated captions as SRT:

  1. Open studio.youtube.com
  2. Go to Subtitles in the left sidebar
  3. Click on the video
  4. Under the auto-generated captions, click the three-dot menu → Download
  5. Choose SRT

Caveat: YouTube's auto-captions are less accurate than dedicated AI transcription tools, especially on technical vocabulary or accents. Treat the export as a starting draft and edit it before publishing.


Option 3: OpenAI Whisper (Free, Technical)

Whisper can output SRT and VTT files directly from an audio file:

whisper audio.mp3 --model medium --output_format srt
whisper audio.mp3 --model medium --output_format vtt

This produces well-formatted subtitle files. Whisper runs locally, so it requires Python, ffmpeg, and a one-time model download.
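
Whisper can also be called from Python. The following is a minimal sketch, assuming the openai-whisper package, whose `transcribe()` returns a dictionary with a "segments" list carrying `start` and `end` times in seconds plus the `text`. The helper names `format_timestamp` and `segments_to_srt` are made up for this example, and the Whisper calls are left as comments so the snippet runs without the model installed.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from dicts with 'start', 'end', and 'text' keys."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)  # joining adds the blank line between blocks


# With Whisper installed, the segments would come from the model:
#   import whisper
#   model = whisper.load_model("medium")
#   result = model.transcribe("audio.mp3")
#   srt_text = segments_to_srt(result["segments"])

# Hand-written stand-in segments, shaped like Whisper's output:
demo = [
    {"start": 1.0, "end": 4.5, "text": " Welcome back to the show."},
    {"start": 4.5, "end": 8.0, "text": " Today we're talking about captions."},
]
print(segments_to_srt(demo))
```

Building the file yourself is mostly useful when you want to post-process segments (for example, re-wrapping long lines) before writing them out; otherwise the command-line flags above are simpler.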


How to Upload Captions to Each Platform

YouTube

  1. Go to studio.youtube.com
  2. Open the video → Subtitles
  3. Click Add → Upload file
  4. Select your SRT or VTT file
  5. Choose With timing (not "Without timing")
  6. Save

YouTube will display your captions as the default for viewers who have captions enabled, and they take priority over the auto-generated ones.


LinkedIn

  1. Upload your video post as normal
  2. Before publishing, click Add captions
  3. Upload the SRT file
  4. LinkedIn displays captions for auto-play in the feed (which plays silently by default)

LinkedIn captions are especially valuable because most LinkedIn video is watched without sound.


TikTok

Auto-captions: TikTok has built-in auto-captioning. Go to Post → Captions → Auto. The quality is acceptable for clear speech.

Manual SRT upload:

  1. When uploading a video, tap Captions
  2. Select Upload instead of Auto
  3. Upload your SRT file

Manual upload gives better accuracy on technical content or accents.


Instagram (Reels)

Instagram Reels have built-in auto-captions via a sticker:

  1. After recording or uploading your clip, tap the sticker icon
  2. Search for "Captions"
  3. Tap the Captions sticker — Instagram transcribes automatically

For custom SRT upload, Instagram currently doesn't support external SRT files for Reels in most regions. Auto-captions are the practical option.


Facebook

  1. Upload your video
  2. Click Edit on the video
  3. Go to the Captions tab
  4. Upload your SRT file or use auto-generate

Podcast Players (Web Embed)

If you embed your podcast player on a website and use an HTML5 <video> element, VTT is the native format:

<video controls>
  <source src="episode.mp3" type="audio/mpeg">
  <track kind="subtitles" src="episode.vtt" srclang="en" label="English" default>
</video>

Most podcast hosting platforms (Buzzsprout, Transistor, etc.) don't support custom caption files directly — they use their own players. Check your host's documentation.


Tips for Better Captions

Keep lines short. Aim for no more than 42 characters per line, 2 lines max per caption block. Long lines are hard to read quickly.
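
As a rough illustration of the 42-character, 2-line guideline, Python's standard `textwrap` module can split a long sentence into caption-sized chunks. This is only a sketch: `wrap_caption` is a hypothetical helper, it breaks at whitespace rather than at clause boundaries, and it does not divide the original time range between the resulting blocks.

```python
import textwrap

MAX_CHARS = 42  # characters per line, per the guideline above
MAX_LINES = 2   # lines per caption block


def wrap_caption(text: str) -> list[str]:
    """Split text into caption blocks of at most MAX_LINES short lines."""
    lines = textwrap.wrap(text, width=MAX_CHARS)
    # Group the wrapped lines two at a time into caption blocks.
    return [
        "\n".join(lines[i:i + MAX_LINES])
        for i in range(0, len(lines), MAX_LINES)
    ]


sentence = (
    "Welcome back to the show. Today we're talking about podcast "
    "transcription and why it matters for accessibility and reach."
)
for block in wrap_caption(sentence):
    print(block)
    print()
```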

Match natural speech breaks. Don't split a sentence mid-clause. Break at commas, conjunctions, or the end of a thought.

Review technical terms. AI transcription handles common vocabulary well but may struggle with brand names, acronyms, or highly technical terms. A quick scan for these saves embarrassment.

Leave music and sound effects out of speech-only captions. If you're writing accessibility captions (not just speech), indicate music and relevant sounds in brackets: [upbeat intro music].


Frequently Asked Questions

What's the maximum caption length per segment?

For readability, aim for 2 lines maximum, around 42 characters per line. Most tools and platforms display 2-line captions. Very long captions get cut off or overflow the screen.

Can I edit the SRT file manually?

Yes — SRT is a plain text file. Open it in any text editor (TextEdit, Notepad, VS Code) and edit the text directly. Be careful not to change the timestamp format or numbering.
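
After a manual edit, a quick sanity check can catch a broken timestamp or a skipped number before you upload. The following is a minimal sketch; `check_srt` is a hypothetical helper, and it only verifies block layout, consecutive numbering, and the `HH:MM:SS,mmm` timing pattern.

```python
import re

# A valid SRT timing line: two comma-separated-millisecond timestamps.
TIMING_LINE = re.compile(
    r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}"
)


def check_srt(srt_text: str) -> list[str]:
    """Return a list of human-readable problems found in SRT text."""
    problems = []
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    for expected, block in enumerate(blocks, start=1):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            problems.append(f"block {expected}: expected number, timing, and text")
            continue
        if lines[0].strip() != str(expected):
            problems.append(f"block {expected}: sequence number is {lines[0].strip()!r}")
        if not TIMING_LINE.fullmatch(lines[1].strip()):
            problems.append(f"block {expected}: bad timing line {lines[1].strip()!r}")
    return problems
```

An empty list means the file passed these checks; it does not guarantee that every platform will accept it.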

Does adding captions affect SEO?

YouTube can use your caption file to understand what a video covers, which may help it surface in search. Uploading an accurate SRT (instead of relying on YouTube's auto-captions) gives you more control over the text associated with your video. LinkedIn and other platforms don't use captions for SEO, but captions can significantly affect engagement metrics.

How do I sync captions if the timing is off?

If timestamps are consistently off by a fixed amount (e.g., everything is 2 seconds late), you can adjust the SRT file by adding or subtracting time from all timestamps. Tools like Subtitle Edit (free, Windows) or online SRT editors make this easy.
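
A fixed offset like the one described above can also be applied with a short script. This is a minimal sketch, not a replacement for a subtitle editor: `shift_srt` is a name invented for this example, it assumes standard `HH:MM:SS,mmm` timestamps, and it clamps anything that would fall before zero.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(srt_text: str, offset_ms: int) -> str:
    """Shift every timestamp by offset_ms (negative moves captions earlier)."""

    def shift(match: re.Match) -> str:
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so no timestamp goes below zero
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    shifted = []
    for line in srt_text.splitlines():
        # Only touch timing lines so caption text is never rewritten.
        if "-->" in line:
            line = TIMESTAMP.sub(shift, line)
        shifted.append(line)
    return "\n".join(shifted)


# Captions are 2 seconds late, so move everything 2000 ms earlier.
late = "1\n00:00:03,000 --> 00:00:06,500\nWelcome back to the show."
print(shift_srt(late, -2000))
```

This only helps with a constant offset. If the drift grows over the episode, the timestamps need rescaling or re-transcription instead.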


Summary

Generating SRT and VTT subtitle files for your podcast is straightforward:

  1. Paste your episode URL into Podtyper
  2. Download the SRT or VTT export
  3. Upload to YouTube, LinkedIn, TikTok, or wherever you publish video

The whole process takes under 10 minutes per episode. Captions increase watch time, improve accessibility, and help your content perform better on platforms where video auto-plays silently.
