What problem does Hush solve for Voice AI systems?

Hush improves the quality of live call audio so downstream systems (ASR, voice agents, call-center bots, transcription pipelines) can understand the primary speaker more reliably, especially in noisy environments and with overlapping voices.

Does Hush run in real time, and does it need a GPU?

Yes. Hush is designed to run fully on CPU in real time (typically under about 1 ms of processing per 10 ms audio frame) and does not require a GPU.
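
As a rough illustration of what that budget means, the quoted figures can be turned into a real-time factor (processing time divided by audio duration). This is a minimal sketch using only the numbers in the answer above. The function names are hypothetical, and the per-core stream count is a theoretical upper bound that ignores I/O, scheduling, and every other stage in the pipeline.

```python
def real_time_factor(processing_ms: float, frame_ms: float) -> float:
    """Processing time per unit of audio; below 1.0 is faster than real time."""
    return processing_ms / frame_ms


def max_streams_per_core(processing_ms: float, frame_ms: float) -> int:
    """Theoretical upper bound on concurrent streams one core could serve."""
    return int(frame_ms // processing_ms)


# Figures quoted above: about 1 ms of compute per 10 ms audio frame.
rtf = real_time_factor(1.0, 10.0)
streams = max_streams_per_core(1.0, 10.0)
```

A real-time factor well below 1.0 is what leaves headroom for ASR and the voice agent on the same machine.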

How large is the Hush model?

The model is approximately 8 MB.

What training data characteristics are mentioned for Hush?

Hush was trained on 10,000+ hours of mixed noisy audio, with competing human voices present in about 60% of the dataset at signal-to-interference ratios (SIR) of 12–24 dB.
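
To make the SIR range concrete, the sketch below rescales a competing signal so that a mixture reaches a chosen signal-to-interference ratio. It illustrates the standard SIR definition only and is not weya AI's data pipeline. The sine waves are synthetic stand-ins for speech, and all function names are hypothetical.

```python
import math


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def sir_db(primary, interferer):
    """Signal-to-interference ratio in decibels."""
    return 10.0 * math.log10(mean_power(primary) / mean_power(interferer))


def scale_to_sir(primary, interferer, target_sir_db):
    """Return the interferer rescaled so the pair has the requested SIR."""
    gain = math.sqrt(
        mean_power(primary)
        / (mean_power(interferer) * 10.0 ** (target_sir_db / 10.0))
    )
    return [gain * s for s in interferer]


# Synthetic stand-ins for a primary talker and a background talker (assumption).
rate = 16000
primary = [math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
background = [0.8 * math.sin(2 * math.pi * 330 * n / rate) for n in range(rate)]

# Mix at the low end of the quoted range (12 dB).
quiet_background = scale_to_sir(primary, background, 12.0)
mixture = [p + b for p, b in zip(primary, quiet_background)]
```

At 12 dB the competing voice carries roughly 1/16 of the primary speaker's power, and at 24 dB roughly 1/250, which is why the range is described as moderate.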

What architecture is Hush based on?

Hush is built on the DeepFilterNet3 architecture and adds an Auxiliary Separation Head to better suppress background speakers.

How can Hush be deployed in production?

Hush can be deployed via ONNX, using the prebuilt ONNX production bundle, for CPU-only deployment on Linux, macOS (Apple Silicon), and Windows. The repository also references a prebuilt Weya NC Standalone library for production deployment without PyTorch.

Is Hush open source, and what license does it use?

Yes. The model weights and source code are available publicly (e.g., on Hugging Face and GitHub) under the Apache 2.0 license.

How did Hush perform on public benchmarks at launch?

At launch, Hush ranked #5 on Hugging Face’s Audio-to-Audio leaderboard, placing it among the top open-source models in its category.

Hush

Website | Freemium | Voice & Audio Editing

Hush is an 8 MB open-source speech enhancement model that runs in real time on CPU, suppressing background noise and competing speakers for production Voice AI calls in under about 1 ms per 10 ms frame.

https://www.weya.ai/hush?ref=producthunt

Product Information

Updated: Jul 8, 2026

What is Hush

Hush is weya AI’s in-house open-source noise suppression and speech enhancement model built specifically for production Voice AI systems such as phone agents, call-center bots, voice assistants, and real-time transcription pipelines. Unlike many enhancement models optimized mainly for generic noise benchmarks, Hush is designed for real-world calls where overlapping human speech is a frequent failure point for ASR and downstream conversational AI. It is lightweight (~1.8M parameters, ~8 MB), runs fully on CPU in real time, and is distributed with practical deployment artifacts (PyTorch checkpoint and an ONNX production bundle) under the Apache 2.0 license.

Key Features of Hush

Hush is an open-source, real-time speech enhancement and noise suppression model from weya AI built specifically for production Voice AI. It runs fully on CPU with very low latency (under about 1 ms of processing per 10 ms audio frame), is lightweight (~8 MB, ~1.8M parameters), and is trained on 10,000+ hours of mixed noisy audio with a strong emphasis on suppressing competing background speakers (overlapping speech) in addition to typical ambient noise. It is language-agnostic (it operates on acoustic features) and causal, so it suits streaming use. It can be deployed via an ONNX production bundle or prebuilt standalone binaries for Linux, macOS (Apple Silicon), and Windows, which makes it easy to integrate into voice pipelines.

Background speaker suppression: Designed to isolate the primary caller and reduce competing human voices (a common failure mode for voice agents and ASR), not just stationary noise.

Real-time CPU performance: Processes audio frames fast enough for live calls (reported under ~1 ms per 10 ms of audio) without requiring a GPU.

Lightweight footprint: Small model size (~8 MB; ~1.8M parameters) makes it practical for on-prem and edge deployments with limited resources.

Production-oriented deployment options: Ships with an ONNX production bundle and a standalone library for direct integration in C/C++/Python, with prebuilt binaries for Linux, macOS (Apple Silicon), and Windows.

Trained on large-scale real-world noisy data: Trained on 10,000+ hours of mixed audio; a large portion includes overlapping speakers at moderate SIR levels, improving robustness in real calls.

Language-agnostic enhancement: Works across languages because it enhances acoustic signal quality rather than relying on linguistic content.
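
The footprint figures quoted above (about 1.8M parameters, about 8 MB) can be sanity-checked with simple arithmetic. The listing does not state the numeric precision of the weights, so the 32-bit and 16-bit cases below are assumptions used only for comparison, and the helper name is hypothetical.

```python
def weights_size_mb(num_params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


params = 1.8e6  # approximate parameter count quoted in the listing

fp32_mb = weights_size_mb(params, 4)  # assumption: 32-bit floats
fp16_mb = weights_size_mb(params, 2)  # assumption: 16-bit floats, for comparison
```

Under the 32-bit assumption the raw weights come to about 7.2 MB, which is in line with the quoted size of roughly 8 MB once file-format overhead is allowed for.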

Use Cases of Hush

Call center voice agents & IVR: Cleans noisy phone audio and suppresses background talk/TV to improve agent understanding, reduce reprompts, and stabilize end-to-end voice bot performance.

Real-time transcription pipelines: Improves ASR accuracy on live or recorded conversations by enhancing speech clarity and reducing interference from noise and overlapping speakers.

BFSI customer onboarding, sales, and collections calls: Boosts intelligibility in regulated, high-stakes calls (e.g., KYC, loan/collections conversations) where noisy environments and speaker overlap are common.

Voice assistants in noisy environments: Helps assistants function in cafes, streets, offices, and other real-world settings by reducing ambient noise and focusing on the main speaker.

Compliance and QA call review: Enhances recorded call audio for clearer audits, quality monitoring, and downstream analytics (summarization, intent detection) by improving the source signal.

Pros

Open-source (Apache 2.0) and designed for enterprise/on-prem deployment.

Real-time, CPU-only operation with very low latency and small model size.

Explicit focus on suppressing competing background speakers, a common production Voice AI pain point.

Cons

Optimized for 16 kHz streaming/call audio; may require resampling and careful pipeline integration for other formats.

As a speech enhancement model, it can introduce artifacts or over-suppress in extreme noise/overlap conditions depending on the input domain.

Best results may depend on proper frame-based streaming integration (session state, frame sizing) rather than simple offline batch processing.
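
For the resampling and frame-sizing points above, a small helper can work out what a given input format needs before it reaches a 16 kHz model. This is a sketch under stated assumptions: the function names are hypothetical, and the choice of resampler is left open. The integer factors it returns are the kind of `up` and `down` arguments a polyphase resampler such as SciPy's `scipy.signal.resample_poly(x, up, down)` accepts.

```python
from fractions import Fraction

TARGET_RATE_HZ = 16000  # rate the listing says Hush is optimized for


def resample_ratio(source_rate_hz: int, target_rate_hz: int = TARGET_RATE_HZ):
    """Return (up, down) integer factors for rational-rate resampling."""
    ratio = Fraction(target_rate_hz, source_rate_hz)
    return ratio.numerator, ratio.denominator


def samples_per_frame(rate_hz: int, frame_ms: int = 10) -> int:
    """Number of samples in one frame at the given sample rate."""
    return rate_hz * frame_ms // 1000


# Narrowband telephony (8 kHz) must be upsampled by 2 before enhancement.
telephony = resample_ratio(8000)
# 48 kHz capture must be downsampled by 3.
wideband = resample_ratio(48000)
# One 10 ms frame at 16 kHz.
frame_len = samples_per_frame(TARGET_RATE_HZ)
```

Input that is already at 16 kHz yields `(1, 1)`, in which case the resampling step can be skipped entirely.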

How to Use Hush

1) Open the Hush model page: Go to the official Hugging Face repository for the model: https://huggingface.co/weya-ai/hush

2) Choose your integration path (quick demo vs. production): Decide whether you want to (a) try Hush via the hosted Hugging Face interface for a quick test, or (b) integrate it into your own Voice AI stack for real-time call processing.

3) Try Hush in the browser (quick test): On the Hugging Face model page, use the available demo/widget (if shown) to run an example and compare noisy input vs. enhanced output.

4) Download the model assets for local use: From the Hugging Face repo files, download the checkpoint and/or the ONNX production bundle (the ONNX tarball under the onnx/ directory) depending on your runtime needs.

5) Use ONNX for CPU real-time deployment: For production use without PyTorch, use the prebuilt ONNX bundle so Hush can run fully on CPU in real time (the model is designed to process ~10 ms frames with sub-ms compute on typical CPUs).

6) Integrate into your audio pipeline at the ‘front’: Place Hush before ASR/transcription or your voice agent so the call audio is enhanced first; this improves intelligibility and reduces background noise and competing speech reaching downstream components.

7) Feed audio as a real-time stream: Run Hush continuously on live audio frames (e.g., 10 ms chunks) to keep latency low and maintain real-time behavior for calls and conversational systems.

8) Validate on your target environments: Test with your real call conditions (cafes, streets, office noise, overlapping speakers). Note that Hush is trained with background speakers at moderate SIR (about 12–24 dB), so extremely loud competing speakers may not be fully suppressed.

9) Understand what not to use as an output: If you see references to a ‘separation head’ or background-speaker mask, treat it as a training-time auxiliary regularizer (ERB-domain soft mask), not a standalone source-separation output for production.

10) Deploy on your target OS: Deploy the CPU runtime where you need it (Linux, macOS including Apple Silicon, or Windows) using the ONNX approach to avoid heavy production dependencies.
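
Steps 6 and 7 can be sketched as a small streaming loop: incoming audio arrives in chunks of arbitrary size, is re-cut into fixed 10 ms frames, enhanced, and only then passed downstream. This is a minimal illustration, not the Hush API. `passthrough_enhancer` is a hypothetical placeholder for the real model call, and the frame length assumes 16 kHz audio (160 samples per 10 ms).

```python
import time

FRAME_SAMPLES = 160  # 10 ms at 16 kHz (assumed input format)


class FrameBuffer:
    """Collects arbitrary-size chunks and emits fixed-size frames."""

    def __init__(self, frame_samples=FRAME_SAMPLES):
        self.frame_samples = frame_samples
        self._pending = []

    def push(self, chunk):
        """Add samples and return every complete frame now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_samples:
            frames.append(self._pending[: self.frame_samples])
            del self._pending[: self.frame_samples]
        return frames

    @property
    def leftover(self):
        """Samples waiting for the next chunk to complete a frame."""
        return len(self._pending)


def passthrough_enhancer(frame):
    """Hypothetical stand-in for the model call; returns the frame unchanged."""
    return list(frame)


def run_stream(chunks, enhance_frame=passthrough_enhancer, downstream=None):
    """Enhance each frame first, then hand it to the downstream consumer."""
    buffer = FrameBuffer()
    enhanced, worst_ms = [], 0.0
    for chunk in chunks:
        for frame in buffer.push(chunk):
            start = time.perf_counter()
            out = enhance_frame(frame)
            worst_ms = max(worst_ms, (time.perf_counter() - start) * 1000.0)
            if downstream is not None:
                downstream(out)  # e.g. an ASR or voice-agent input queue
            enhanced.append(out)
    return enhanced, buffer.leftover, worst_ms
```

A real integration would also keep one model state per call session, since a streaming model carries context from frame to frame. The `worst_ms` value gives a quick check against the 10 ms frame budget on the target hardware.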

Hush FAQs

What is Hush by weya AI?

Hush is an open-source speech enhancement and noise suppression model built for Voice AI that removes background noise and suppresses competing background speakers in real-world call audio.

Latest AI Tools Similar to Hush

EchoWave

Freemium | AI Video Editing | Voice & Audio Editing | AI Social Media Assistant

EchoWave is an online video and audio editing platform that enables creators to convert audio content into engaging videos with waveform visualizations, subtitles, and effects for social media sharing.

AIdeaflow Podcast

Free | AI Podcast Assistant | Text to Speech | Voice & Audio Editing

AIdeaflow Podcast is an AI-powered platform that transforms text into engaging podcast content with natural conversations across 120+ voices and multiple languages.

TranscribeToText.AI

Freemium | Transcription | AI Speech Recognition | Voice & Audio Editing

TranscribeToText.AI is a powerful online transcription service that converts audio and video files to text in over 120 languages with 99.9% accuracy, offering unlimited transcription access and flexible output options.

Rift Podcast

Free Trial | AI Podcast Assistant | Text to Speech | Voice & Audio Editing

Rift Podcast is an AI-powered application that transforms web content into personalized audio podcasts, offering exclusive insights curated from various tech platforms and delivered in just 15 minutes daily.

Popular AI Tools Like Hush

W-Okada Voice Changer

Freemium | AI Voice Changer | Voice & Audio Editing | AI Voice Chat Generator

W-Okada Voice Changer is an open-source real-time voice conversion software that uses AI to transform voices with high quality and low latency.

FnKey

Free | Text to Speech | Voice & Audio Editing

FnKey is a lightweight macOS menu bar application that enables quick voice-to-text transcription by holding the Fn key to speak and automatically pastes the transcribed text when released.

Background noise removal

Free | AI Noise Cancellation | Voice & Audio Editing

A powerful Chrome extension that uses advanced AI technology to remove unwanted background noise from audio and video files, offering real-time noise cancellation for crystal-clear sound quality.

Audio player for ChatGPT

Free | Text to Speech | Voice & Audio Editing

A Chrome extension that enhances ChatGPT's Read Aloud feature by adding a user-friendly audio player with basic controls like play/pause, seek bar, and duration display.
