WeldonWangwang/esp-skainet

 
 

ESP-Skainet

ESP-Skainet is Espressif's intelligent voice assistant, which currently supports the Wake Word Engine and Speech Commands Recognition.

Overview

ESP-Skainet makes it easy to build wake word detection and speech command recognition applications based on Espressif Systems' ESP32 chip.

The features supported by ESP-Skainet are shown below:

[Figure: ESP-Skainet overview]

Input Voice Stream

The input audio stream can come from any voice source, such as a microphone or WAV/PCM files stored in flash or on a TF card.

Wake Word Engine

Espressif's wake word engine, WakeNet, is designed to provide a high-performance wake word detection algorithm with a low memory footprint. It enables devices to listen continuously for wake words, such as "Alexa", “天猫精灵” (Tian Mao Jing Ling), and “小爱同学” (Xiao Ai Tong Xue).

Currently, Espressif not only provides an official wake word, "Hi, Lexin", to the public for free but also supports customized wake words. For details on how to customize your own wake words, please see Espressif Speech Wake Words Customization Process.

Speech Commands Recognition

Espressif's speech command recognition model, MultiNet, is designed to provide flexible offline speech command recognition. With this model, you can easily add your own speech commands without retraining the model.

Currently, Espressif MultiNet supports up to 100 Chinese speech commands, such as “打开空调” (Turn on the air conditioner) and “打开卧室灯” (Turn on the bedroom light).

Support for English commands will be added in the next release.

Action Function

After the voice stream has passed through wake word detection and speech command recognition, the system returns a command ID. Users can then customize the Action Function according to the returned command ID, for example to play music or control lights.
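As an illustration of how an Action Function might dispatch on the returned command ID, here is a minimal sketch in plain C. The command ID values, the action_t type, and the function names are all hypothetical; the real IDs depend on how the speech commands are configured in your project, and the printf stands in for actual device control.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical command IDs (assumption): the real values depend on
 * how the speech commands are configured in your project. */
enum {
    CMD_ID_AC_ON = 0,            /* "打开空调" (Turn on the air conditioner) */
    CMD_ID_BEDROOM_LIGHT_ON = 1  /* "打开卧室灯" (Turn on the bedroom light) */
};

/* Hypothetical set of actions the application knows how to perform. */
typedef enum {
    ACTION_NONE = 0,
    ACTION_AC_ON,
    ACTION_BEDROOM_LIGHT_ON,
    ACTION_COUNT
} action_t;

/* Map a recognized command ID to an action; unknown IDs map to ACTION_NONE. */
static action_t action_for_command(int command_id)
{
    switch (command_id) {
    case CMD_ID_AC_ON:            return ACTION_AC_ON;
    case CMD_ID_BEDROOM_LIGHT_ON: return ACTION_BEDROOM_LIGHT_ON;
    default:                      return ACTION_NONE;
    }
}

/* Human-readable action name, useful for logging. */
static const char *action_name(action_t action)
{
    static const char *const names[ACTION_COUNT] = {
        "none", "air conditioner on", "bedroom light on"
    };
    assert((int)action >= 0 && (int)action < ACTION_COUNT);
    return names[action];
}

/* Action Function sketch: replace the printf with real device control. */
static action_t run_action(int command_id)
{
    action_t action = action_for_command(command_id);
    printf("command %d -> %s\n", command_id, action_name(action));
    return action;
}
```

Calling run_action() with the ID returned by the recognizer keeps the mapping from IDs to behavior in one place, so adding a new speech command only requires a new case in the switch.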

Quick Start with ESP-Skainet

Hardware Preparation

To run ESP-Skainet, you need to have an ESP32 development board which integrates an audio input module and at least 4 MB of external SPI RAM. We use ESP32-LyraT-Mini in examples.

For details on how to configure the ESP32 module for your application, please refer to the README.md of each example.

Software Preparation

Audio Config

During wake word detection and speech command recognition, the board picks up audio data with the on-board microphone and feeds it to the WakeNet/MultiNet model frame by frame (30 ms, 16 kHz, 16 bit, mono).
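To make the frame format concrete: a 30 ms frame at 16 kHz holds 16000 × 30 / 1000 = 480 samples, which is 960 bytes of 16-bit mono audio. The sketch below works through that arithmetic and splits a PCM buffer into frames. The names feed_frames, frame_handler_t, and count_samples are hypothetical and are not part of the esp-sr API; the handler stands in for whatever call passes a frame to the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_RATE_HZ     16000
#define FRAME_LEN_MS       30
/* 16000 samples/s * 30 ms / 1000 = 480 samples per frame */
#define SAMPLES_PER_FRAME  (SAMPLE_RATE_HZ * FRAME_LEN_MS / 1000)
/* 480 samples * 2 bytes (16 bit, mono) = 960 bytes per frame */
#define BYTES_PER_FRAME    (SAMPLES_PER_FRAME * sizeof(int16_t))

/* Hypothetical callback type: receives one complete frame. In a real
 * application this is where the frame would be handed to the model. */
typedef void (*frame_handler_t)(const int16_t *frame, size_t samples, void *ctx);

/* Split a PCM buffer into whole frames and pass each one to the handler.
 * Trailing samples that do not fill a frame are left unprocessed.
 * Returns the number of frames delivered. */
static size_t feed_frames(const int16_t *pcm, size_t total_samples,
                          frame_handler_t handler, void *ctx)
{
    assert(pcm != NULL && handler != NULL);
    size_t frames = total_samples / SAMPLES_PER_FRAME;
    for (size_t i = 0; i < frames; i++) {
        handler(pcm + i * SAMPLES_PER_FRAME, SAMPLES_PER_FRAME, ctx);
    }
    return frames;
}

/* Example handler: accumulates the number of samples seen in *ctx. */
static void count_samples(const int16_t *frame, size_t samples, void *ctx)
{
    (void)frame;
    *(size_t *)ctx += samples;
}
```

For example, passing a buffer of 1000 samples with count_samples as the handler delivers two full frames (960 samples) and leaves the remaining 40 samples for the next read.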

ESP-Skainet

Make sure you have cloned this project with the --recursive option, shown as follows:

git clone --recursive https://github.com/espressif/esp-skainet.git 

If you have cloned this project without the --recursive option, please go to the esp-skainet directory and run the git submodule update --init command before anything else.

ESP-IDF

For details on how to set up ESP-IDF, please refer to the Getting Started Guide for the stable ESP-IDF version.

ESP-IDF v3.2 is used as the test version here. If you have already set up ESP-IDF and do not want to change your existing installation, you can point the IDF_PATH environment variable to the path of that installation.

Components

The components form the main framework of the SDK and contain the drivers and algorithms.

hardware_driver

The hardware_driver component contains drivers for the ESP32-LyraT-Mini board.

esp-sr

The esp-sr component contains the APIs of ESP-Skainet neural networks, including the wake word detection and speech commands recognition framework.

Examples

The examples folder contains sample applications that demonstrate the API features of ESP-Skainet.

Take the garbage classification example as an illustration.

  1. Navigate to the example folder esp-skainet/examples/garbage_classification.
     cd esp-skainet/examples/garbage_classification
  2. Compile and flash the project.
     make
     make flash monitor
  3. Advanced users can add or modify speech commands by using the make menuconfig command.

For details, please read the README file in each example.

Resources
