Hush

Website | Freemium | Voice & Audio Editing
Hush is an 8 MB open-source speech enhancement model that runs in real time on CPU and suppresses background noise and competing speakers for production Voice AI calls, using under about 1 ms of compute per 10 ms frame.
https://www.weya.ai/hush?ref=producthunt

Product Information

Updated: Jun 24, 2026

What is Hush

Hush is weya AI’s in-house open-source noise suppression and speech enhancement model built specifically for production Voice AI systems such as phone agents, call-center bots, voice assistants, and real-time transcription pipelines. Unlike many enhancement models optimized mainly for generic noise benchmarks, Hush is designed for real-world calls where overlapping human speech is a frequent failure point for ASR and downstream conversational AI. It is lightweight (~1.8M parameters, ~8 MB), runs fully on CPU in real time, and is distributed with practical deployment artifacts (PyTorch checkpoint and an ONNX production bundle) under the Apache 2.0 license.
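As a quick sanity check on the quoted figures, the arithmetic can be sketched as follows. The 4 bytes per parameter (32-bit float weights) is an assumption, not something the listing states, and the stream count is a theoretical ceiling that ignores I/O, scheduling, and any other work on the same core. All function names here are illustrative.

```python
import math

PARAMS = 1_800_000          # ~1.8M parameters, as quoted
BYTES_PER_PARAM = 4         # assumption: weights stored as 32-bit floats
FRAME_MS = 10               # audio covered by one frame, as quoted
COMPUTE_MS_PER_FRAME = 1.0  # quoted upper bound ("under ~1 ms")


def model_size_mb(params: int, bytes_per_param: int) -> float:
    """Raw weight storage in megabytes (10^6 bytes), excluding file overhead."""
    return params * bytes_per_param / 1_000_000


def real_time_factor(compute_ms: float, frame_ms: float) -> float:
    """Compute time divided by audio duration; below 1.0 means faster than real time."""
    return compute_ms / frame_ms


def max_streams_per_core(compute_ms: float, frame_ms: float) -> int:
    """Theoretical ceiling on concurrent streams per core, ignoring all overhead."""
    return math.floor(frame_ms / compute_ms)


if __name__ == "__main__":
    print("weights (MB):", model_size_mb(PARAMS, BYTES_PER_PARAM))
    print("real-time factor:", real_time_factor(COMPUTE_MS_PER_FRAME, FRAME_MS))
    print("stream ceiling:", max_streams_per_core(COMPUTE_MS_PER_FRAME, FRAME_MS))
```

Under the float32 assumption the weights alone come to about 7.2 MB, which is in the same range as the quoted ~8 MB, and a real-time factor of at most 0.1 leaves most of each 10 ms frame budget free for the rest of the pipeline.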

Key Features of Hush

Hush is an open-source, real-time speech enhancement and noise suppression model from weya AI built specifically for production Voice AI. It runs fully on CPU with very low latency (under about 1 ms of processing per 10 ms audio frame), is lightweight (~8 MB, ~1.8M parameters), and is trained on 10,000+ hours of mixed noisy audio with a strong emphasis on suppressing competing background speakers (overlapping speech) in addition to typical ambient noise. It is language-agnostic (it operates on acoustic features), causal and streaming-friendly, and can be deployed via an ONNX production bundle or prebuilt standalone binaries for common operating systems, which simplifies integration into voice pipelines.
Background speaker suppression: Designed to isolate the primary caller and reduce competing human voices (a common failure mode for voice agents and ASR), not just stationary noise.
Real-time CPU performance: Processes audio frames fast enough for live calls (reported under ~1 ms per 10 ms of audio) without requiring a GPU.
Lightweight footprint: Small model size (~8 MB; ~1.8M parameters) makes it practical for on-prem and edge deployments with limited resources.
Production-oriented deployment options: Ships with an ONNX production bundle and a standalone library for direct integration in C/C++/Python, with prebuilt binaries for Linux, macOS (Apple Silicon), and Windows.
Trained on large-scale real-world noisy data: Trained on 10,000+ hours of mixed audio; a large portion includes overlapping speakers at moderate SIR levels, improving robustness in real calls.
Language-agnostic enhancement: Works across languages because it enhances acoustic signal quality rather than relying on linguistic content.
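To make "moderate SIR" concrete: signal-to-interference ratio is the power of the primary speaker relative to the competing speaker, in decibels. The sketch below uses the 12 to 24 dB range quoted in the usage notes for Hush's training data. The helper names are illustrative and not part of any Hush API.

```python
import math


def sir_db(target, interferer):
    """SIR in dB from two sample sequences, using mean power of each."""
    p_target = sum(x * x for x in target) / len(target)
    p_interf = sum(x * x for x in interferer) / len(interferer)
    if p_target <= 0 or p_interf <= 0:
        raise ValueError("both signals must have non-zero power")
    return 10.0 * math.log10(p_target / p_interf)


def db_to_amplitude_ratio(db: float) -> float:
    """How many times larger the target amplitude is than the interferer."""
    return 10.0 ** (db / 20.0)


def in_trained_range(db: float, low: float = 12.0, high: float = 24.0) -> bool:
    """True if an SIR falls inside the range quoted for the training data."""
    return low <= db <= high


if __name__ == "__main__":
    for db in (12.0, 24.0):
        print(db, "dB -> amplitude ratio", round(db_to_amplitude_ratio(db), 2))
```

At 12 dB the primary speaker is roughly 4 times the interferer's amplitude, and at 24 dB roughly 16 times, so a competing voice as loud as the caller (0 dB) sits well outside the range described.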

Use Cases of Hush

Call center voice agents & IVR: Cleans noisy phone audio and suppresses background talk/TV to improve agent understanding, reduce reprompts, and stabilize end-to-end voice bot performance.
Real-time transcription pipelines: Improves ASR accuracy on live or recorded conversations by enhancing speech clarity and reducing interference from noise and overlapping speakers.
BFSI customer onboarding, sales, and collections calls: Boosts intelligibility in regulated, high-stakes calls (e.g., KYC, loan/collections conversations) where noisy environments and speaker overlap are common.
Voice assistants in noisy environments: Helps assistants function in cafes, streets, offices, and other real-world settings by reducing ambient noise and focusing on the main speaker.
Compliance and QA call review: Enhances recorded call audio for clearer audits, quality monitoring, and downstream analytics (summarization, intent detection) by improving the source signal.

Pros

Open-source (Apache 2.0) and designed for enterprise/on-prem deployment.
Real-time, CPU-only operation with very low latency and small model size.
Explicit focus on suppressing competing background speakers, a common production Voice AI pain point.

Cons

Optimized for 16 kHz streaming/call audio; may require resampling and careful pipeline integration for other formats.
As a speech enhancement model, it can introduce artifacts or over-suppress in extreme noise/overlap conditions depending on the input domain.
Best results may depend on proper frame-based streaming integration (session state, frame sizing) rather than simple offline batch processing.
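For audio that is not already at 16 kHz, a resampling step is needed before enhancement. The following is a deliberately naive sketch using linear interpolation in pure Python, only to show where the step sits and how sample counts change. It applies no anti-aliasing filter, so for downsampling in production a proper resampler from an audio library would be the better choice. The function name is illustrative.

```python
def resample_linear(samples, src_rate: int, dst_rate: int):
    """Naive linear-interpolation resampler (no anti-aliasing low-pass filter)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return [float(s) for s in samples]

    n_out = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        if lo >= last:
            # Past the final interval: hold the last input sample.
            out.append(float(samples[last]))
            continue
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[lo + 1] * frac)
    return out


if __name__ == "__main__":
    one_frame_48k = [0.0] * 480  # 10 ms of audio at 48 kHz
    print(len(resample_linear(one_frame_48k, 48_000, 16_000)))
```

A 10 ms frame holds 480 samples at 48 kHz and 160 samples at 16 kHz, so frame sizing should be computed after resampling rather than before.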

How to Use Hush

1) Open the Hush model page: Go to the official Hugging Face repository for the model: https://huggingface.co/weya-ai/hush
2) Choose your integration path (quick demo vs. production): Decide whether you want to (a) try Hush via the hosted Hugging Face interface for a quick test, or (b) integrate it into your own Voice AI stack for real-time call processing.
3) Try Hush in the browser (quick test): On the Hugging Face model page, use the available demo/widget (if shown) to run an example and compare noisy input vs. enhanced output.
4) Download the model assets for local use: From the Hugging Face repo files, download the checkpoint and/or the ONNX production bundle (the ONNX tarball under the onnx/ directory) depending on your runtime needs.
5) Use ONNX for CPU real-time deployment: For production use without PyTorch, use the prebuilt ONNX bundle so Hush can run fully on CPU in real time (the model is designed to process ~10 ms frames with sub-ms compute on typical CPUs).
6) Integrate into your audio pipeline at the ‘front’: Place Hush before ASR/transcription or your voice agent so the call audio is enhanced first; this improves intelligibility and reduces background noise and competing speech reaching downstream components.
7) Feed audio as a real-time stream: Run Hush continuously on live audio frames (e.g., 10 ms chunks) to keep latency low and maintain real-time behavior for calls and conversational systems.
8) Validate on your target environments: Test with your real call conditions (cafes, streets, office noise, overlapping speakers). Note that Hush is trained with background speakers at moderate SIR (about 12–24 dB), so extremely loud competing speakers may not be fully suppressed.
9) Understand what not to use as an output: If you see references to a ‘separation head’ or background-speaker mask, treat it as a training-time auxiliary regularizer (ERB-domain soft mask), not a standalone source-separation output for production.
10) Deploy on your target OS: Deploy the CPU runtime where you need it (Linux, macOS including Apple Silicon, or Windows) using the ONNX approach to avoid heavy production dependencies.
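The streaming step above can be sketched as a small frame buffer: network or telephony chunks rarely arrive in exact 10 ms pieces, so incoming bytes are accumulated and handed to the enhancer one fixed-size frame at a time. This assumes 16 kHz, 16-bit mono PCM input, and `enhance_frame` is a hypothetical placeholder for whatever call wraps the model in your runtime. The actual Hush input and output format and any per-session state handling should be taken from the model repository, not from this sketch.

```python
SAMPLE_RATE_HZ = 16_000
FRAME_MS = 10
BYTES_PER_SAMPLE = 2  # assumption: 16-bit mono PCM
FRAME_BYTES = SAMPLE_RATE_HZ * FRAME_MS // 1000 * BYTES_PER_SAMPLE  # 320 bytes


class FrameStreamer:
    """Buffers arbitrary-size chunks and emits enhanced fixed-size frames."""

    def __init__(self, enhance_frame, frame_bytes: int = FRAME_BYTES):
        # enhance_frame: hypothetical callable, bytes in -> bytes out.
        self._enhance = enhance_frame
        self._frame_bytes = frame_bytes
        self._pending = bytearray()

    def push(self, chunk: bytes) -> list:
        """Add a chunk; return the enhanced frames that became complete."""
        self._pending.extend(chunk)
        out = []
        while len(self._pending) >= self._frame_bytes:
            frame = bytes(self._pending[: self._frame_bytes])
            del self._pending[: self._frame_bytes]
            out.append(self._enhance(frame))
        return out

    @property
    def buffered_bytes(self) -> int:
        """Bytes waiting for the next full frame."""
        return len(self._pending)


if __name__ == "__main__":
    # Pass-through stand-in for the model, for demonstration only.
    streamer = FrameStreamer(lambda frame: frame)
    frames = streamer.push(b"\x00" * 500)
    print(len(frames), streamer.buffered_bytes)
```

Keeping one `FrameStreamer` per call keeps leftover bytes, and any model state you attach to it, from leaking between sessions.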

Hush FAQs

Hush is an open-source speech enhancement/noise suppression model built for Voice AI that removes background noise and suppresses competing background speakers from real-world call audio.

Latest AI Tools Similar to Hush

EchoWave
EchoWave is an online video and audio editing platform that enables creators to convert audio content into engaging videos with waveform visualizations, subtitles, and effects for social media sharing.
AIdeaflow Podcast
AIdeaflow Podcast is an AI-powered platform that transforms text into engaging podcast content with natural conversations across 120+ voices and multiple languages.
TranscribeToText.AI
TranscribeToText.AI is a powerful online transcription service that converts audio and video files to text in over 120 languages with 99.9% accuracy, offering unlimited transcription access and flexible output options.
Rift Podcast
Rift Podcast is an AI-powered application that transforms web content into personalized audio podcasts, offering exclusive insights curated from various tech platforms and delivered in just 15 minutes daily.