Loading older versions (v3.1 and v4) on non-clean installations via torch.hub #474

Closed
rvryan67 opened this issue Jun 28, 2024 · 19 comments
Labels: bug, documentation, v5


@rvryan67

❓ Questions and Help

I'm having issues with the latest version, v5.0.

Until I have time to investigate and fix the issue, I want to use the previous version:

vad, utils = torch.hub.load( repo_or_dir="snakers4/silero-vad:v4.0", model="silero_vad", onnx=False )

This results in the following error:

The provided filename /root/.cache/torch/hub/snakers4_silero-vad_master/files/silero_vad.jit does not exist

@rvryan67 added the "help wanted" label on Jun 28, 2024
@snakers4
Owner

Hi, which issue are you having with v5?
Can you post some reproducible code which causes an error?

@rvryan67
Author

import logging
import os
import urllib.request
import uuid

import ffmpeg
import torch
import torchaudio

logger = logging.getLogger(__name__)

vad, utils = torch.hub.load( repo_or_dir="snakers4/silero-vad", model="silero_vad", onnx=False )

def speechonly(wavfile, utils, vad):
    
    (get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks) = utils
    
    VAD_SR = 16000
    vad_threshold = 0.4

    tmpAudioFile = "/tmp/" + str(uuid.uuid4()) + ".wav" # create wav file from audio_string

    wav = read_audio(wavfile, sampling_rate=VAD_SR)

    t = get_speech_timestamps(wav, vad, sampling_rate=VAD_SR, threshold=vad_threshold, min_speech_duration_ms=250) # Returns list with segments of audio timestamps (start and end)
    
    print(t)

    chunks = []
    chunk_probs = []

    for i in range(len(t)):
        t[i]["start"] = max(0, t[i]["start"] - 3200) # 0.2s head
        t[i]["end"] = min(wav.shape[0] - 16, t[i]["end"] + 20800) # 1.3s tail
        if i > 0 and t[i]["start"] < t[i - 1]["end"]:
            t[i]["start"] = t[i - 1]["end"] # Remove overlap

        chunk_duration = t[i]["end"]-t[i]["start"]

        if chunk_duration >= 512: # 512 is minimum size to pass through model at 16000 Hz sample rate
            speech_probability = vad(wav[t[i]["start"]:t[i]["end"]], VAD_SR).item()

            chunk = wav[t[i]["start"]:t[i]["end"]]
            chunks.append(chunk)
            chunk_probs.append(speech_probability)
    
    max_prob = max(chunk_probs, default=0.0) # default avoids ValueError when no chunk was kept
    logger.info("speechonly len(chunks): " + str(len(chunks)) + ", max(chunk_probs): " + str(max_prob) + ", vad_threshold: " + str(vad_threshold))

    if len(chunks) == 0 or max_prob < vad_threshold: # No speech segments detected or maximum segment probability is below threshold
        return wavfile, t
    else:
        combined_chunks = torch.cat(chunks) # Combine audio segments into one tensor
        save_audio(tmpAudioFile, combined_chunks, sampling_rate=VAD_SR) # Save combined tensor to audio file with non-speech removed
        return tmpAudioFile, t


def urlToWav(inputUrl, outputfile):
    # Download inputUrl and convert it to a 16 kHz mono 16-bit PCM WAV at outputfile
    downloadfile = '/tmp/' + os.path.basename(inputUrl)
    try:
        if os.path.isfile(outputfile):
            os.remove(outputfile)

        urllib.request.urlretrieve(inputUrl, downloadfile)

        (
            ffmpeg.input(downloadfile)
            .output(outputfile, acodec='pcm_s16le', ac=1, ar=16000)
            .run(capture_stdout=True, capture_stderr=True)
        )
    except Exception as e:
        print("failed to convert to WAV - ERROR: " + str(e))
        return ""
    finally:
        if os.path.exists(downloadfile):
            os.remove(downloadfile)

    return outputfile


audioUrl = 'https://file-examples.com/storage/fe0ebbce85667e496a17872/2017/11/file_example_MP3_2MG.mp3'
tmpAudioFile = "/tmp/" + str(uuid.uuid4()) + ".wav" # create wav file from s3 bucket
urlToWav(audioUrl, tmpAudioFile)
speechOnlyFile = tmpAudioFile
speechOnlyFile, voicetimestamps =  speechonly(tmpAudioFile, utils, vad)

ERROR: Provided number of samples is 27936 (Supported values: 256 for 8000 sample rate, 512 for 16000)

@snakers4
Owner

Hi, this is correct behavior, the VAD always had limitations regarding the chunk size, and now the chunk size is fixed as noted in the error message.

Also, a cleaner way to get at the probabilities would probably be to extend the get_speech_timestamps function.
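To illustrate the fixed chunk size, here is a minimal sketch that feeds a model one window at a time and collects a probability per window. The window sizes are taken from the error message above; window_probabilities is a hypothetical helper, not part of silero-vad, and it assumes the model is a callable taking (chunk, sampling_rate) and returning a single probability. Skipping the trailing partial window is a simplification.

```python
from typing import Callable, List, Sequence

# Window sizes quoted in the error message; treated as fixed here.
WINDOW_SAMPLES = {8000: 256, 16000: 512}


def window_probabilities(wav: Sequence, model: Callable, sampling_rate: int = 16000) -> List[float]:
    """Run `model` on consecutive fixed-size windows of `wav`.

    `model(chunk, sampling_rate)` is assumed to return one speech
    probability (a float or a one-element tensor). A trailing partial
    window is skipped rather than padded.
    """
    window = WINDOW_SAMPLES[sampling_rate]
    probs = []
    for start in range(0, len(wav) - window + 1, window):
        out = model(wav[start:start + window], sampling_rate)
        # Tensors expose .item(); plain numbers are converted directly.
        probs.append(out.item() if hasattr(out, "item") else float(out))
    return probs
```

A per-segment score could then be something like max(window_probabilities(wav[start:end], vad, VAD_SR)). The model may keep internal state between calls, so scores on an isolated slice might differ from those of a continuous pass over the whole file.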

@rvryan67
Author

The code worked until recently; it has been broken since v5 was released yesterday.

Is there a way I can load the previous version as a quick workaround until I have time to fix it properly?

@snakers4
Owner

snakers4 commented Jun 28, 2024

Your code ran, but it produced incorrect results since vad never worked with such large chunks.

In your case v4.0 does not load because PyTorch appears to cache the hubconf file or the full repo.

From a fresh environment any version loads.

We removed old unused utils in 5.0, so after clearing the cache everything should work.
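A minimal sketch of clearing the cached copies, assuming the folder naming seen in the error paths in this thread (snakers4_silero-vad_<ref>). clear_cached_repo is a hypothetical helper, not part of torch or silero-vad; the hub directory is passed in explicitly.

```python
import shutil
from pathlib import Path
from typing import List


def clear_cached_repo(hub_dir: str, prefix: str = "snakers4_silero-vad") -> List[str]:
    """Delete every cached checkout under `hub_dir` whose folder name starts with `prefix`.

    Returns the names of the folders that were removed.
    """
    removed: List[str] = []
    root = Path(hub_dir)
    if not root.is_dir():
        return removed
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and entry.name.startswith(prefix):
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

With torch installed, the hub directory can be obtained from torch.hub.get_dir(), e.g. clear_cached_repo(torch.hub.get_dir()). The next torch.hub.load call then downloads the requested version again.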

@ggoedde

ggoedde commented Jun 28, 2024

This issue (The provided filename /root/.cache/torch/hub/snakers4_silero-vad_master/files/silero_vad.jit does not exist) is likely caused by this line:
https://github.com/snakers4/silero-vad/blob/v4.0/hubconf.py#L38

According to that line, v4.0 tries to load the model from snakers4_silero-vad_master. However, running torch.hub.load(repo_or_dir="snakers4/silero-vad:v4.0", model="silero_vad", onnx=False), i.e. specifying a version number, puts the model and code in /root/.cache/torch/hub/snakers4_silero-vad_v4.0 instead.

I see that in the latest release (v5.0), snakers4_silero-vad_master is no longer hard-coded in the model loading step.
https://github.com/snakers4/silero-vad/blob/v5.0/hubconf.py#L43

@snakers4 could the JIT model loading code in v4.0 be updated to match what is in v5.0? Otherwise I think loading v4.0 will continue to hit this issue.

i.e. update hubconf.py for v4.0 to this:
image

instead of this:
image
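The directory mismatch described above can be sketched as follows. hub_cache_dir is a hypothetical helper that only mimics the folder naming visible in the paths in this thread (owner_repo_ref); the "master" default for an unspecified ref is an assumption, and torch.hub's own logic may differ in details.

```python
import os


def hub_cache_dir(repo_or_dir: str, hub_dir: str, default_ref: str = "master") -> str:
    """Approximate the folder where a GitHub repo at a given ref ends up in the hub cache."""
    repo, _, ref = repo_or_dir.partition(":")
    owner, name = repo.split("/")
    ref = ref or default_ref
    # Slashes in a ref are replaced so it can be used as a folder name.
    folder = "_".join([owner, name, ref.replace("/", "_")])
    return os.path.join(hub_dir, folder)
```

For "snakers4/silero-vad:v4.0" this gives a folder ending in snakers4_silero-vad_v4.0, while a hubconf.py that hard-codes snakers4_silero-vad_master looks for the model file in a different folder, which exists only if master was loaded earlier.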

@snakers4
Owner

> This issue (The provided filename /root/.cache/torch/hub/snakers4_silero-vad_master/files/silero_vad.jit does not exist) is likely caused by this line:

Many thanks, we arrived at the same conclusion. Hence the issue with "non-clean" initialization, when the init is "tainted" by loading several versions at once.

We are now thinking about how to fix the git history properly. There are 3 versions that people remember: v5.0, v4.0 and v3.1.

Ideally, of course, we would deprecate the old ones, but being able to load an earlier model easily in a non-clean environment is a nice feature, e.g. for benchmarking.

@dgoryeo

dgoryeo commented Jun 29, 2024

@snakers4, if possible please don't deprecate the old versions yet. I use version 3.1 for transcribing long-form anime movies and so far it works best. Thanks.

@snakers4
Owner

snakers4 commented Jul 1, 2024

A possible solution would be to create historic branches for v4 and v3.1 and try to re-point the tags to commits on these branches.
If this works, it will be an easy fix.

@snakers4 changed the title from "silero_vad.jit does not exist" to "Properly loading v3.1 and v4 on a non-clean installation" on Jul 1, 2024
@snakers4 mentioned this issue on Jul 1, 2024
@snakers4 added the "bug" label and removed the "help wanted" label on Jul 1, 2024
@adamnsandle
Collaborator

fixed v3.1 and v4.0 tags
they should work properly now

@snakers4
Owner

snakers4 commented Jul 1, 2024

image

Tags for v3.1 and v4 are updated so that older versions load predictably on non-clean installations.
The only downside is that this solution may not work properly on Windows.

If so, a PR would be appreciated for this line:
https://github.com/snakers4/silero-vad/blob/master/hubconf.py#L39

@snakers4
Owner

snakers4 commented Jul 1, 2024

Can someone please verify that this now works?

@GaetanBaert

GaetanBaert commented Jul 1, 2024

Hello

I just tried, I got
ImportError: cannot import name 'get_number_ts' from 'utils_vad' (C:\Users\gaeta/.cache\torch\hub\snakers4_silero-vad_master\utils_vad.py)

When using model, _ = torch.hub.load( repo_or_dir='snakers4/silero-vad:v4.0', model='silero_vad', force_reload=True )

I'm on Windows

@adamnsandle
Collaborator

> Hello
>
> I just tried, I got ImportError: cannot import name 'get_number_ts' from 'utils_vad' (/root/.cache/torch/hub/snakers4_silero-vad_master/utils_vad.py)
>
> When using model, _ = torch.hub.load( repo_or_dir='snakers4/silero-vad:v4.0', model='silero_vad', force_reload=True )

Hi
Try running this code before loading the VAD model to avoid the module collision:

import sys
# Drop the cached utils_vad module so the next torch.hub.load imports it from the requested version
sys.modules.pop('utils_vad', None)

@GaetanBaert

It worked! I was loading both v4 and v5 in the same Jupyter notebook (I wanted to benchmark the two versions), which is why I got this conflict.
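For loading several versions in one session, the workaround can be wrapped in a small helper. This is only a sketch: load_isolated is a hypothetical name, the set of modules to evict is limited to utils_vad as named in the ImportError above, and the loader argument exists so the function can be exercised without network access.

```python
import sys
from typing import Callable, Iterable, Optional


def load_isolated(ref: str,
                  loader: Optional[Callable] = None,
                  stale_modules: Iterable[str] = ("utils_vad",)):
    """Load snakers4/silero-vad at `ref` after evicting modules left over from another version."""
    for name in stale_modules:
        # Removing the entry forces a fresh import from the requested checkout.
        sys.modules.pop(name, None)
    if loader is None:
        import torch  # imported lazily so the helper can be tested without torch
        loader = torch.hub.load
    return loader(repo_or_dir=f"snakers4/silero-vad:{ref}", model="silero_vad", onnx=False)
```

Usage would look like vad_v4, utils_v4 = load_isolated("v4.0") followed by vad_v5, utils_v5 = load_isolated("v5.0"), keeping each returned pair for the benchmark.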

@snakers4
Owner

snakers4 commented Jul 1, 2024

@dgoryeo @rvryan67 @ggoedde @helloWorld199 @hungiito

please verify that these fixes work for you

@dgoryeo

dgoryeo commented Jul 1, 2024

I can report that the fixes work. I tried v4 in a Colab environment and v3.1 in a Windows environment (after cleaning the local cache).
Thanks @snakers4!

@ggoedde

ggoedde commented Jul 1, 2024

Confirmed that v4.0 works in a Databricks environment. Thanks!

@snakers4
Owner

snakers4 commented Jul 2, 2024

Looks like we have 3 confirmations.
If the issue persists for someone, please open a new ticket.

@snakers4 changed the title from "Properly loading v3.1 and v4 on a non-clean installation" to "Loading older versions (v3.1 and v4) on non-clean installations via torch.hub" on Nov 14, 2024
@snakers4 pinned this issue on Nov 14, 2024
@snakers4 added the "documentation" and "v5" labels on Nov 14, 2024