Fish Speech Howto

Fish Speech is an open-source, multilingual text-to-speech model capable of generating high-quality, natural-sounding speech in Chinese, Japanese, and English with customizable voices and emotions.

How to Use Fish Speech

Create virtual environment: Create a Python 3.10 environment with conda: conda create -n fish-speech python=3.10
Activate environment: Activate it before installing anything, so that the packages below are installed into this environment: conda activate fish-speech
Install dependencies: Install PyTorch and its companion packages: pip3 install torch torchvision torchaudio
Install Fish Speech: From the root of a local copy of the Fish Speech repository, install the package in editable mode: pip3 install -e .
Download models: Download required models from Hugging Face: huggingface-cli download fishaudio/fish-speech-1.2-sft --local-dir checkpoints/fish-speech-1.2-sft
Run inference: Generate speech tokens from the input text (the next step decodes them to audio): python tools/llama/generate.py --text "Your text here" --checkpoint-path "checkpoints/fish-speech-1.2-sft"
Decode audio: Decode the generated tokens to audio using VQGAN: python tools/vqgan/inference.py -i "codes_0.npy" --checkpoint-path "checkpoints/fish-speech-1.2-sft/firefly-gan-vq-fsq-4x1024-42hz-generator.pth"
Start web UI (optional): Launch the web interface by running: python -m tools.webui --llama-checkpoint-path "checkpoints/fish-speech-1.2-sft" --decoder-checkpoint-path "checkpoints/fish-speech-1.2-sft/firefly-gan-vq-fsq-4x1024-42hz-generator.pth"
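
The two inference steps above (token generation, then decoding) can be chained from a small Python script. The following is a minimal sketch, not part of Fish Speech itself: the helper names build_generate_cmd, build_decode_cmd, and synthesize are hypothetical, and it assumes the script is run from the repository root with the fish-speech environment active and that the generation step leaves its tokens in codes_0.npy, as the decode command above implies. The script paths, flags, and checkpoint names are copied from the steps above.

```python
import shlex
import subprocess
from pathlib import Path

# Paths taken from the steps above; adjust them if your checkpoints live elsewhere.
CHECKPOINT_DIR = Path("checkpoints/fish-speech-1.2-sft")
DECODER_FILE = "firefly-gan-vq-fsq-4x1024-42hz-generator.pth"


def build_generate_cmd(text, checkpoint_dir=CHECKPOINT_DIR):
    """Build the argument list for the token-generation step."""
    return [
        "python", "tools/llama/generate.py",
        "--text", text,
        "--checkpoint-path", checkpoint_dir.as_posix(),
    ]


def build_decode_cmd(codes_file="codes_0.npy", checkpoint_dir=CHECKPOINT_DIR):
    """Build the argument list for the VQGAN decoding step."""
    return [
        "python", "tools/vqgan/inference.py",
        "-i", codes_file,
        "--checkpoint-path", (checkpoint_dir / DECODER_FILE).as_posix(),
    ]


def synthesize(text, dry_run=True):
    """Print both commands in order; execute them only when dry_run is False."""
    commands = [build_generate_cmd(text), build_decode_cmd()]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default: shows the commands without running them.
    synthesize("Your text here")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the input text contains spaces or quotation marks. Call synthesize(text, dry_run=False) to actually run both steps.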

Fish Speech FAQs

Fish Speech is an open-source text-to-speech (TTS) model developed by Fish Audio. It is trained on 150,000 hours of multilingual audio data and can generate high-quality speech in Chinese, Japanese, and English.

Latest AI Tools Similar to Fish Speech

Voisi
Voisi is a comprehensive AI-powered language toolkit that enables users to create conversations, narrations, translations and more using hundreds of voices across multiple languages.
Podcraftr
Podcraftr is an AI-powered platform that automatically converts text content into studio-quality podcasts with monetization and distribution capabilities.
TextPixie AI Translator
TextPixie AI Translator is a free online tool that instantly translates text, images, and audio across 100+ languages with high accuracy using advanced AI algorithms.
Dubbing, Inc.
Dubbing, Inc. is an AI-powered video dubbing platform that allows users to translate and localize video content into multiple languages quickly and affordably.

Popular AI Tools Like Fish Speech

ElevenLabs
ElevenLabs is an AI audio research and deployment company that offers advanced text-to-speech, voice cloning, and dubbing capabilities across 32 languages with over 100 realistic AI voices.
Vidnoz
Vidnoz is an AI-powered video creation platform that enables users to quickly generate professional-quality videos with lifelike avatars, natural voices, and customizable templates.
Clipchamp
Clipchamp is an easy-to-use online video editor with professional features, AI-powered tools, and templates that allows anyone to create high-quality videos without expertise.
Speechify
Speechify is the leading AI text-to-speech app that converts written text into natural-sounding audio across multiple platforms and devices.