How to extract hardsubs from a video
Hard subtitles (hardsubs) are burned into the video image and cannot be toggled off. Because no separate subtitle stream exists, working with them is different from extracting soft subtitles (subtitle files or embedded tracks). This post explains options, trade-offs, and step-by-step methods for two common goals: (A) extract subtitle text from hardsubs into an editable subtitle file using OCR, and (B) remove hardsubs to produce a "clean" video. I cover tools, workflows, and practical tips.
What Are Hardsubs vs. Softsubs?
| Feature | Hardsubs (Burned-in) | Softsubs (Separate) |
|---------|----------------------|----------------------|
| How stored | Part of video frames | External file (.srt, .ass) |
| Can disable | ❌ No | ✅ Yes |
| Can extract easily | ❌ No (requires OCR) | ✅ Yes (copy/paste) |
| Quality loss | Yes (pixelated text) | No (vector/text format) |
Example: Some TV broadcasts and streaming rips use hardsubs. DVD subtitles are a related case: they are stored as bitmap images in a separate stream, so they can be toggled off but still need OCR to become text.
Workflow A — Extract text from hardsubs (OCR -> .srt)
- Inspect and sample
  - Open the video and note the region where subtitles appear (bottom/middle), the usual font color, and whether the text has outlines or shadows.
- Export frames or a frame strip
  - Use ffmpeg to extract frames covering the entire video, sampling at 1–2 FPS if the video is long (the frames/ output directory must already exist):
    - ffmpeg -i input.mp4 -vf fps=1 frames/frame_%06d.png
  - For higher accuracy around cuts and dialogue, sample at 3–5 FPS, or extract frames only where subtitles exist by scanning for changes in the subtitle area.
- Preprocess images to improve OCR
  - Crop to the subtitle region to reduce noise.
  - Convert to grayscale, increase contrast, remove the background using morphological operations, and binarize.
  - If subtitles are colored (e.g., yellow), convert to HSV and isolate the color range.
  - Apply denoising and sharpening filters.
  - Example using OpenCV (conceptual): crop -> convert to HSV -> mask color range -> morphological open -> adaptive threshold.
- Run OCR
  - Use Tesseract with language packs:
    - tesseract cropped.png out -l eng --oem 1 --psm 6
  - For better results, tune PSM (page segmentation mode) and OEM (OCR engine mode). In the command above, --oem 1 selects the LSTM engine and --psm 6 assumes a single uniform block of text; --psm 7 treats the image as a single text line.
  - For non-Latin languages, install the appropriate language data.
- Group recognized text into subtitle cues
  - Use frame timestamps to generate cue start/end times. If you sampled at N fps, frame k (counting from 0) corresponds to roughly k/N seconds.
  - Merge identical or near-identical OCR outputs across consecutive frames to form continuous subtitle lines and determine their durations.
- Clean and proofread
  - Use a subtitle editor (Subtitle Edit or Aegisub) to fix OCR errors and timing.
- Export to .srt or .ass
  - Save the final file; you can then mux it into the video as a soft subtitle track or keep it as a separate file.
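
The frame-to-timestamp mapping and cue-merging steps above can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool: the function names (`srt_time`, `frames_to_cues`, `to_srt`), the 0.85 similarity threshold, and the assumption that the first sampled frame sits at roughly t = 0 are all choices made for this example. It expects one OCR string per sampled frame, with an empty string where no subtitle was detected.

```python
from difflib import SequenceMatcher


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two OCR strings as the same line if they are nearly identical.

    The threshold is an assumption; tune it on your own material.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


def frames_to_cues(texts, fps=1.0):
    """Merge per-frame OCR text into (start_seconds, end_seconds, text) cues.

    texts[i] is the OCR result for the i-th sampled frame (0-based),
    assumed to be taken at about i / fps seconds.
    """
    cues = []
    current = None  # [first_index, last_index, text]
    for i, raw in enumerate(texts):
        text = raw.strip()
        if current and text and similar(current[2], text):
            current[1] = i  # same line still on screen: extend the cue
            continue
        if current:
            cues.append(current)
            current = None
        if text:
            current = [i, i, text]  # keep the first reading as the cue text
    if current:
        cues.append(current)
    # Frame i is taken to cover the interval [i / fps, (i + 1) / fps).
    return [(s / fps, (e + 1) / fps, t) for s, e, t in cues]


def to_srt(cues) -> str:
    """Render cues as SRT text: index, time range, text, blank line."""
    blocks = []
    for n, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{n}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    ocr = ["", "Hello there", "Hello there", "Hel1o there", "", "Goodbye"]
    print(to_srt(frames_to_cues(ocr, fps=1.0)))
```

Timing precision is limited by the sampling rate: at 1 FPS a cue boundary can be off by up to a second, which is one reason to sample at 3–5 FPS before proofreading in a subtitle editor.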
Tips to improve OCR:
- If subtitles have strong outlines, remove the stroke first (mask out the thin outline color) and run OCR on the inner fill only.
- Use an ensemble: run multiple OCR engines and merge outputs.
- For animated or karaoke subtitles, OCR per-frame and post-process aggressively.
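
One simple way to merge outputs from several OCR engines, or several readings of the same line across frames, is to keep the reading that agrees best with all the others. The sketch below is only one possible merging strategy, since the tips above do not prescribe one; the function name `consensus` is hypothetical, and the inputs are assumed to be plain strings already collected for a single subtitle line.

```python
from difflib import SequenceMatcher


def consensus(candidates):
    """Return the candidate that is most similar, on average, to the others.

    Empty readings are ignored. Ties go to the earliest candidate.
    """
    cleaned = [c.strip() for c in candidates if c.strip()]
    if not cleaned:
        return ""
    if len(cleaned) == 1:
        return cleaned[0]

    def agreement(i):
        # Sum of similarity ratios between candidate i and every other one.
        return sum(
            SequenceMatcher(None, cleaned[i], other).ratio()
            for j, other in enumerate(cleaned)
            if j != i
        )

    best = max(range(len(cleaned)), key=agreement)
    return cleaned[best]


if __name__ == "__main__":
    readings = [
        "Where are you going?",
        "Where are you going?",
        "Wnere are y0u going?",
    ]
    print(consensus(readings))
```

This picks a whole reading rather than voting character by character, so it cannot repair a line that every engine got wrong; those still need manual proofreading.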