Third Party Models

@aatmanvaidya aatmanvaidya released this 28 Feb 11:59
· 492 commits to main since this release
3f41df9
  • This release uploads all the third-party models that Feluda uses for its operators.
  1. PANNs inference - https://github.com/qiuqiangkong/panns_inference
    PANN is a CNN that is pre-trained on a large number of audio files. PANNs have been used for audio tagging and sound event detection. They have also been fine-tuned for several audio pattern recognition tasks, where they have outperformed several state-of-the-art systems. Feluda uses PANNs to extract a 2048-dimensional vector from any given audio file.
    The .pth file of the CNN model will be used in the audio_vec_embedding operator.
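
    As a rough illustration of how a 2048-dimensional vector might be obtained from the uploaded .pth file, here is a minimal sketch. It assumes the panns_inference package's AudioTagging class, whose inference call returns a (clipwise_output, embedding) pair, and it assumes the input is a batch of mono clips shaped (n_clips, n_samples) at 32 kHz as in that project's README. The helper names load_tagger and embed_batch are hypothetical and are not taken from Feluda's audio_vec_embedding operator.

```python
EMBEDDING_DIM = 2048  # vector size stated in the release notes


def load_tagger(checkpoint_path, device="cpu"):
    """Build a PANNs tagging model from a local .pth checkpoint.

    Assumption: the panns_inference package is installed. It is imported
    lazily so the rest of this sketch can be used without it.
    """
    from panns_inference import AudioTagging

    return AudioTagging(checkpoint_path=checkpoint_path, device=device)


def embed_batch(tagger, batch):
    """Return one embedding per clip in `batch`.

    `tagger` is any object whose inference(batch) returns a
    (clipwise_output, embedding) pair. `batch` is assumed to be shaped
    (n_clips, n_samples), for example a NumPy array of 32 kHz mono audio.
    """
    _clipwise_output, embeddings = tagger.inference(batch)
    for vec in embeddings:
        # Guard against loading a checkpoint with a different embedding size.
        if len(vec) != EMBEDDING_DIM:
            raise ValueError(
                f"expected {EMBEDDING_DIM}-dimensional embedding, got {len(vec)}"
            )
    return embeddings
```

    With a real checkpoint, usage would look roughly like embed_batch(load_tagger("path/to/model.pth"), audio[None, :]), where the path is a placeholder and audio is a 1-D waveform loaded at 32 kHz. The dimension check is a design choice of this sketch, meant to surface a mismatched checkpoint early.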