srt-parse

Segments an audio file into several smaller audio clips using an accompanying .srt closed captioning file.

Usage

usage: srt-parse [-h] [--output-dir OUTPUT_DIR]
             [--audio-out-file-pattern AUDIO_OUT_FILE_PATTERN]
             [--text-out-file-pattern TEXT_OUT_FILE_PATTERN]
             [--output-type {txt,csv}] [--csv-seperator CSV_SEPERATOR]
             [--csv-filename CSV_FILENAME]
             [--update-increment UPDATE_INCREMENT]
             [--in-encoding IN_ENCODING] [--out-encoding OUT_ENCODING]
             audio_input srt_input

Segment audio files according to a provided .srt closed caption file

positional arguments:
  audio_input           Location of audio file to be processed
  srt_input             Location of .srt file to be processed

optional arguments:
  -h, --help            show this help message and exit
  --output-dir OUTPUT_DIR
                        Directory for processed files to be saved to
  --audio-out-file-pattern AUDIO_OUT_FILE_PATTERN
                        A python-style f-string for saving audio files
  --text-out-file-pattern TEXT_OUT_FILE_PATTERN
                        A python-style f-string for saving text files
  --output-type {txt,csv}
                        Output filetype
  --csv-seperator CSV_SEPERATOR
                        Character sequence used to seperate values in csv
  --csv-filename CSV_FILENAME
                        Name of file to write as csv
  --update-increment UPDATE_INCREMENT
                        Print progress after every specified amount of
                        segments.
  --in-encoding IN_ENCODING
                        Encoding used to read the .srt file
  --out-encoding OUT_ENCODING
                        Encoding to use when writing text data to file

Example

Using srt-parse:

python3 srt-parse.py foo.mp3 foo.srt

This produces the following files in the output directory (by default .\out\):

0-audio.mp3
1-audio.mp3
2-audio.mp3
3-audio.mp3
...
out.csv

One audio file is created per subtitle in the .srt file, and out.csv maps each audio file to its transcript.
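The subtitle-to-clip mapping can be sketched roughly as follows. This is a minimal illustration under assumptions, not the script's actual implementation: `Cue`, `to_ms`, and `parse_srt` are hypothetical names, and the `{i}-audio.mp3` naming simply mirrors the example output listed above. The sketch only parses the timings and text; cutting the audio itself is left out.

```python
import re
from dataclasses import dataclass

# Matches an .srt timing line such as "00:00:03,830 --> 00:00:12,910".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

@dataclass
class Cue:  # hypothetical container, not part of srt-parse
    start_ms: int
    end_ms: int
    text: str

def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)

def parse_srt(srt_text):
    """Split .srt text into cues; blocks are separated by blank lines."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the caption text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(to_ms(*g[:4]), to_ms(*g[4:]), text))
                break
    return cues

SAMPLE = """1
00:00:03,830 --> 00:00:12,910
I'll say to you it's a hot wonderful night

3
00:00:10,840 --> 00:00:14,889
I want to thank the people who came up to me
"""

cues = parse_srt(SAMPLE)
for i, cue in enumerate(cues):
    # Assumed naming, copied from the example output: 0-audio.mp3, 1-audio.mp3, ...
    print(f"{i}-audio.mp3", cue.start_ms, cue.end_ms, cue.text)
```

Each cue's start and end time would then define the slice of the input audio saved under the matching file name.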

Notes

YouTube subtitles can have a duration that spans two lines of text, so the time ranges of consecutive entries may overlap.

Example:

1
00:00:03,830 --> 00:00:12,910 // this duration spans both written lines below
I'll say to you it's a hot wonderful night

3
00:00:10,840 --> 00:00:14,889
I want to thank the people who came up to me
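
The overlap in the example above can be measured, and optionally trimmed, with a few lines of Python. This is only a sketch: `parse_timestamp` and `trim_overlaps` are hypothetical helpers, and cutting each entry at the start of the next one is an assumed workaround, not something srt-parse is documented to do.

```python
def parse_timestamp(stamp):
    """Convert an .srt timestamp 'HH:MM:SS,mmm' to milliseconds."""
    clock, millis = stamp.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)

def trim_overlaps(spans):
    """Given ordered (start_ms, end_ms) spans, cut each end at the next start."""
    trimmed = []
    for i, (start, end) in enumerate(spans):
        if i + 1 < len(spans):
            end = min(end, spans[i + 1][0])
        trimmed.append((start, end))
    return trimmed

# Timings copied from the two entries shown above.
timings = [
    ("00:00:03,830", "00:00:12,910"),
    ("00:00:10,840", "00:00:14,889"),
]
spans = [(parse_timestamp(a), parse_timestamp(b)) for a, b in timings]

# How long the first entry runs into the second: 12910 - 10840 milliseconds.
overlap_ms = spans[0][1] - spans[1][0]
print(overlap_ms)
print(trim_overlaps(spans))
```

With the timings above, the first entry runs 2070 ms into the second, so an untrimmed clip for the first entry would also contain the beginning of the second line of speech.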


AlanLiu96/srt-parse
