Speech Recognition for Hebrew (using wav2vec2 models)

imvladikon/wav2vec2-hebrew

Hebrew Speech Recognition with Wav2Vec2

Usage

Without package installation (using the transformers library)

from transformers import (
    AutoFeatureExtractor,
    AutoTokenizer,
    AutomaticSpeechRecognitionPipeline,
    Wav2Vec2ForCTC,
)

pretrained_model_name_or_path = "imvladikon/wav2vec2-xls-r-300m-hebrew"

# Build the pipeline from the checkpoint's feature extractor, CTC model, and tokenizer.
asr = AutomaticSpeechRecognitionPipeline(
    feature_extractor=AutoFeatureExtractor.from_pretrained(
        pretrained_model_name_or_path
    ),
    model=Wav2Vec2ForCTC.from_pretrained(
        pretrained_model_name_or_path
    ),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path
    ),
)

# Transcribe a local audio file; the result is a dict with a "text" key.
filename = "audio.wav"
print(asr(filename))

Splitting a long audio file into smaller chunks is not implemented yet.
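
Until chunking is available, a long recording can be split by hand before transcription. The following is a minimal sketch, not part of this repository: the helper name `chunk_bounds` and its default chunk and overlap lengths are assumptions chosen for illustration. It only computes the sample ranges of overlapping fixed-length chunks.

```python
from typing import List, Tuple


def chunk_bounds(
    num_samples: int,
    sample_rate: int = 16000,
    chunk_seconds: float = 20.0,
    overlap_seconds: float = 1.0,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping fixed-length chunks.

    Hypothetical helper for illustration; not part of wav2vec2_hebrew.
    """
    chunk = int(chunk_seconds * sample_rate)
    overlap = int(overlap_seconds * sample_rate)
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk_seconds > 0 and 0 <= overlap_seconds < chunk_seconds")
    step = chunk - overlap  # each chunk starts this many samples after the previous one
    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        bounds.append((start, end))
        if end == num_samples:  # the last chunk reached the end of the audio
            break
        start += step
    return bounds


# A 45-second recording at 16 kHz, cut into 20 s chunks with 1 s of overlap.
for start, end in chunk_bounds(45 * 16000):
    print(f"{start / 16000:.1f}s - {end / 16000:.1f}s")
```

Each slice `waveform[start:end]` of a mono 16 kHz NumPy array could then be passed to the pipeline as `{"raw": chunk, "sampling_rate": 16000}`, an input form the transformers ASR pipeline accepts. Joining the per-chunk texts naively may repeat or cut words inside the overlap, so the result should be treated as approximate.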

With package installation

pip install git+https://github.com/imvladikon/wav2vec2-hebrew

Speech recognition

from wav2vec2_hebrew import HebrewSpeechRecognitionPipeline

asr = HebrewSpeechRecognitionPipeline()
filename = "./samples/bereshit011.wav"
output = asr(filename)
print(output)
# [{'text': 'בראשית ברא אלוהים את השמייים ואת הארץ'}]

Alignment

import torch
import torchaudio
from wav2vec2_hebrew import HebrewWav2Vec2Aligner

filename = "./samples/bereshit011.wav"
text = "בראשית ברא אלוהים את השמיים ואת הארץ"
# Request the GPU only when one is available, so the example also runs on CPU-only machines.
aligner = HebrewWav2Vec2Aligner(
    input_sample_rate=16000,
    use_cuda=torch.cuda.is_available(),
)
# Align the audio to the text; the result holds one entry per sentence.
first_sentence = aligner.align_data(filename, text)[0]
# {'sentence': 'בראשית ברא אלוהים את השמיים ואת הארץ',
#  'segments': [Segment(label='בראשית', start=6750.516853932584, end=18644.284644194755, score=0.16025335497152965)...]}

# Play back the aligned segments in an IPython notebook
# (show_segments uses IPython.display.Audio).
waveform, sample_rate = torchaudio.load(filename)
aligner.show_segments(waveform, first_sentence)
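
The `start` and `end` values in the sample output above are not labeled with a unit. Assuming they are sample offsets at the 16000 Hz input rate (an assumption, not confirmed by the README), they can be converted to seconds as in the sketch below. The `Segment` dataclass here is a hypothetical stand-in that mirrors the printed fields so the example runs on its own; the real class comes from `wav2vec2_hebrew`. The helper `to_seconds` is also illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Segment:
    # Hypothetical stand-in mirroring the fields printed in the README output.
    label: str
    start: float
    end: float
    score: float


def to_seconds(segment: Segment, sample_rate: int = 16000) -> Tuple[float, float]:
    """Convert start/end to seconds, ASSUMING they are sample offsets."""
    return (
        round(segment.start / sample_rate, 3),
        round(segment.end / sample_rate, 3),
    )


# Values copied from the sample output above.
seg = Segment("בראשית", 6750.516853932584, 18644.284644194755, 0.16025335497152965)
start_s, end_s = to_seconds(seg)
print(f"{seg.label}: {start_s}s - {end_s}s")
```

If the offsets turn out to be expressed in another unit, such as model frames, only the divisor in `to_seconds` would need to change.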

Training process

Training logs and details are available in the train folder.

Datasets

Weights
