Talking Face Avatar: a single portrait image from the Leonardo.ai API 🙎♂️ + audio from the ElevenLabs TTS API 🎤 = a talking head video 🎞.
Go to Leonardo.Ai and enter your prompt and negative prompt to generate artistic images.
Here are some resources: Leonardo.ai YouTube video tutorial
Or you can use the API: Leonardo.Ai API Guide
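If you go the API route, here is a minimal Python sketch of generating a portrait through the Leonardo.ai REST API. The endpoint and response fields follow the public API guide; the API key, prompt, and image size are placeholders you must supply yourself:

```python
import time
import requests

API_KEY = "YOUR_LEONARDO_API_KEY"  # placeholder: create a key in your Leonardo.ai account
BASE = "https://cloud.leonardo.ai/api/rest/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# Start a generation job with a prompt and a negative prompt.
resp = requests.post(f"{BASE}/generations", headers=HEADERS, json={
    "prompt": "hyperrealistic portrait of a woman, soft natural lighting",
    "negative_prompt": "blurry, deformed, cartoon, extra limbs",
    "num_images": 1,
    "width": 512,
    "height": 512,
})
resp.raise_for_status()
generation_id = resp.json()["sdGenerationJob"]["generationId"]

# Poll until the job completes, then download the first image.
while True:
    job = requests.get(f"{BASE}/generations/{generation_id}", headers=HEADERS).json()
    result = job["generations_by_pk"]
    if result["status"] == "COMPLETE":
        break
    time.sleep(3)

image_url = result["generated_images"][0]["url"]
open("portrait.png", "wb").write(requests.get(image_url).content)
```

The saved `portrait.png` is the source image you will later pass to `inference.py`.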
*(Gallery: sample images generated with Leonardo.ai)*
Go to ElevenLabs and enter your text to generate high-quality audio with different pitches and speakers. ElevenLabs is also multilingual.
Here are some resources: ElevenLabs YouTube video
Or you can use the API: ElevenLabs API Guide
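Likewise, a minimal Python sketch of the ElevenLabs TTS API. The endpoint, `xi-api-key` header, and `eleven_multilingual_v2` model ID follow the public API guide; the key and voice ID are placeholders:

```python
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"  # placeholder
VOICE_ID = "21m00Tcm4TlvDq8ikWAM"    # placeholder: one of the default ElevenLabs voices

resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Hello! This audio will drive the talking head.",
        "model_id": "eleven_multilingual_v2",  # multilingual TTS model
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
resp.raise_for_status()

# The response body is the raw MP3 audio.
open("output.mp3", "wb").write(resp.content)
```

The inference CLI below expects a `.wav` file, so convert first with `ffmpeg -i output.mp3 output.wav`.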
Eleven Labs TTS | Eleven Labs TTS | Eleven Labs TTS |
---|---|---|
output-_6_.mp4 | output.5.mp4 | output.1.mp4 |
- 🔥 Scroll left and right to see all videos.
video 1 + enhancer (GFPGAN) | video 2 | video 3 |
---|---|---|
RPG_40_Female_Astronaut_model_soft_natural_lighting_forest_win_0.output5_enhanced.mp4 | Deliberate_11_hyperrealistic_portrait_of_a_beautiful_white_wom_0.output-_6__enhanced.mp4 | DreamShaper_32_Clara_Crawford_photorealistic_beautiful_woman_l_1.saba_enhanced.mp4 |
video 4 | video 5 | video 6 |
---|---|---|
RPG_40_Portrait_of_beautiful_lady_little_pojatti_realistic_stu_0.output-_7__enhanced.mp4 | Deliberate_11_a_hyper_realistic_ultra_detailed_photograph_of_a_1.output3_enhanced.mp4 | RPG_40_hyperrealistic_photo_of_a_beautiful_white_woman_upper_b_0.output_enhanced.mp4 |
- 🔥 Several new modes, e.g., still mode, reference mode, and resize mode, are available for better and more customizable applications.
- Install Anaconda, Python, and Git.
- Create the environment and install the requirements:
```bash
git clone https://github.com/saba99/Talking_Face_Avatar.git
cd Talking_Face_Avatar
conda create -n sadtalker python=3.8
conda activate sadtalker
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
conda install ffmpeg
pip install -r requirements.txt

### TTS is optional, for the gradio demo only.
### pip install TTS
```
Take a look at `index.html`.
You can run the following script to download all the models and put them in the right place:

```bash
bash scripts/download_models.sh
```
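As a quick sanity check, a small hypothetical snippet (not part of the repo) that verifies the checkpoints from the table below actually landed in place:

```python
from pathlib import Path

# File names taken from the model table below.
expected = [
    "checkpoints/auido2exp_00300-model.pth",
    "checkpoints/auido2pose_00140-model.pth",
    "checkpoints/mapping_00229-model.pth.tar",
    "checkpoints/mapping_00109-model.pth.tar",
    "checkpoints/facevid2vid_00189-model.pth.tar",
    "checkpoints/epoch_20.pth",
    "checkpoints/wav2lip.pth",
    "checkpoints/shape_predictor_68_face_landmarks.dat",
]
missing = [p for p in expected if not Path(p).exists()]
print("All models in place." if not missing else f"Missing: {missing}")
```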
Model Details
Each downloaded model is described below:
Model | Description |
---|---|
checkpoints/auido2exp_00300-model.pth | Pre-trained ExpNet from SadTalker. |
checkpoints/auido2pose_00140-model.pth | Pre-trained PoseVAE from SadTalker. |
checkpoints/mapping_00229-model.pth.tar | Pre-trained MappingNet from SadTalker. |
checkpoints/mapping_00109-model.pth.tar | Pre-trained MappingNet from SadTalker. |
checkpoints/facevid2vid_00189-model.pth.tar | Pre-trained face-vid2vid model from an unofficial face-vid2vid reimplementation. |
checkpoints/epoch_20.pth | Pre-trained 3DMM extractor from Deep3DFaceReconstruction. |
checkpoints/wav2lip.pth | Highly accurate lip-sync model from Wav2Lip. |
checkpoints/shape_predictor_68_face_landmarks.dat | Face landmark model used by dlib. |
checkpoints/BFM | 3DMM library files. |
checkpoints/hub | Face detection models used in face-alignment. |
gfpgan/weights | Face detection and enhancement models used by facexlib and GFPGAN. |
🔮 3. Quick Start (Best Practice)

```bash
## You need to manually install TTS (https://github.com/coqui-ai/TTS) via `pip install TTS` in advance.
python app.py
```
```bash
python inference.py --driven_audio <audio.wav> \
                    --source_image <video.mp4 or picture.png> \
                    --enhancer gfpgan
```
The results will be saved in `results/$SOME_TIMESTAMP/*.mp4`.
Use `--still` to generate a natural full-body video, and add `--enhancer gfpgan` to improve the quality of the generated video:
```bash
python inference.py --driven_audio <audio.wav> \
                    --source_image <video.mp4 or picture.png> \
                    --result_dir <a folder to store results> \
                    --still \
                    --preprocess full \
                    --enhancer gfpgan
```
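Putting the pieces together, a hedged end-to-end sketch: it assumes `portrait.png` (Leonardo.ai) and `output.wav` (ElevenLabs) were produced as in the snippets above, and that you run it from the repository root:

```python
import subprocess

# Drive SadTalker inference with the generated assets; the flags mirror the CLI above.
subprocess.run([
    "python", "inference.py",
    "--driven_audio", "output.wav",
    "--source_image", "portrait.png",
    "--result_dir", "results",
    "--still",                  # natural full-body mode
    "--preprocess", "full",
    "--enhancer", "gfpgan",     # GFPGAN face enhancement
], check=True)
```

The finished talking head video then appears under `results/` as described above.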