Spoken Language Identification Using Deep Learning

Bachelor Thesis Project

A Deep Learning-Based Approach for Spoken Language Identification

Dataset

  • Kaggle's Spoken Language Identification dataset, with 73,080 samples in English, Spanish, and German.
  • ShEMO, a large-scale validated database for Persian speech emotion detection.

Feature Extraction

Mel spectrograms are used for feature extraction, and the results are saved to .npy files. The model reads them through a custom data generator.
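The two steps above (extract once, then stream from disk) can be sketched as follows. This is a minimal NumPy-only illustration, not the project's actual code: the function names (`mel_spectrogram`, `npy_batch_generator`) and all parameters (16 kHz sample rate, 512-point FFT, hop of 256, 64 mel bands) are assumptions chosen for the example, and a real pipeline would more likely rely on an audio library for the spectrogram itself.

```python
import os
import tempfile
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i:i + 3]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=64):
    """Log-mel spectrogram of a 1-D signal, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop:t * hop + n_fft] * window for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                # small offset avoids log(0)


def npy_batch_generator(paths, labels, batch_size):
    """Yield (features, labels) batches, loading each .npy file only when it is needed."""
    for start in range(0, len(paths), batch_size):
        batch = [np.load(p) for p in paths[start:start + batch_size]]
        yield np.stack(batch), np.asarray(labels[start:start + batch_size])


if __name__ == "__main__":
    # Synthetic one-second tones stand in for real recordings (illustration only).
    t = np.arange(16000) / 16000.0
    with tempfile.TemporaryDirectory() as out_dir:
        paths = []
        for i, freq in enumerate([300.0, 600.0, 1200.0]):
            path = os.path.join(out_dir, f"clip_{i}.npy")
            np.save(path, mel_spectrogram(np.sin(2 * np.pi * freq * t)))
            paths.append(path)
        for features, targets in npy_batch_generator(paths, [0, 1, 2], batch_size=2):
            print(features.shape, targets)
```

Saving features ahead of time means the spectrograms are computed only once, and loading them batch by batch keeps memory use bounded regardless of how many samples the dataset contains. The sketch assumes every saved array has the same shape, so clips would need to be cut or padded to a fixed length beforehand.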

mel-spectrogram

Architecture

This project implements two architectures: a CNN and a CRNN.

Models Architecture

Website

There is also a website!

web-image

Presentation video

You can watch the presentation (in Persian) here.