
Mercury
Mercury is the first commercial-scale diffusion-based large language model (dLLM) that can generate text up to 10x faster than traditional LLMs while maintaining high-quality output.
https://www.inceptionlabs.ai/

Product Information
Updated: Feb 28, 2026
What is Mercury
Mercury is a groundbreaking AI model developed by Inception Labs that represents a fundamental shift from traditional autoregressive language models to diffusion-based text generation. Launched in February 2025, Mercury and its code-specialized version, Mercury Coder, are available through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. The model family was created by a team of researchers from Stanford, UCLA, and Cornell who pioneered foundational work on diffusion models. Mercury is designed to handle tasks including code generation, reasoning, and real-time voice applications.
Key Features of Mercury
Mercury is a diffusion-based Large Language Model (dLLM) that fundamentally changes how language models generate text. Unlike traditional autoregressive models, which produce tokens one at a time, Mercury generates multiple tokens in parallel, achieving speeds of over 1,000 tokens per second on standard NVIDIA GPUs while maintaining high-quality outputs. It offers enterprise-grade capabilities, including a 128K-token context window, tool-calling support, and compatibility with major cloud platforms such as AWS Bedrock and Azure AI Foundry.
Parallel Token Generation: Uses diffusion-based architecture to generate multiple tokens simultaneously instead of sequential generation, enabling 5-10x faster processing than traditional LLMs
Cloud Platform Integration: Available through major cloud providers including AWS Bedrock and Azure AI Foundry with enterprise-grade reliability and 99.5%+ uptime
API Compatibility: Maintains OpenAI API compatibility and supports standard prompting methods (zero-shot, few-shot, CoT), making it a drop-in replacement for existing LLM workflows (see the sketch after this list)
Advanced Reasoning Capabilities: Features multi-step refinement process that catches errors and improves coherence during text generation, particularly strong in coding and mathematical reasoning tasks
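Because Mercury keeps OpenAI API compatibility, switching an existing workflow is mostly a matter of pointing the client at a different base URL. A minimal sketch in Python, assuming the https://api.inceptionlabs.ai/v1 endpoint and the 'mercury-2' model name quoted in the usage steps below:

# Minimal drop-in sketch: reuse the standard OpenAI client (v1+) with
# Inception's OpenAI-compatible endpoint. Endpoint and model name are
# taken from the usage steps below; substitute your own API key.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_INCEPTION_API_KEY",            # from the API Keys dashboard
    base_url="https://api.inceptionlabs.ai/v1",  # OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="mercury-2",
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)
print(response.choices[0].message.content)

Because only the key and base URL change, existing zero-shot, few-shot, and chain-of-thought prompts carry over unchanged.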
Use Cases of Mercury
Code Development: Powers real-time code completion, intelligent tab suggestions, and rapid code edits in development environments with ultra-low latency
Enterprise Search: Enables instant data retrieval and summarization across large organizational knowledge bases with minimal latency
Real-time Voice Applications: Supports responsive voice-powered workflows including customer support, translation services, and interactive voice agents
Automated Workflows: Handles complex routing, analytics, and decision processes in enterprise environments with ultra-responsive AI capabilities
Pros
Significantly faster processing speed (1000+ tokens per second)
Lower inference costs compared to traditional LLMs
Drop-in compatibility with existing LLM workflows
Cons
Limited track record as a new technology
Currently focused primarily on coding and enterprise applications
Requires specific GPU hardware for optimal performance
How to Use Mercury
Create an account: Visit platform.inceptionlabs.ai and create an Inception Platform account or sign in if you already have one
Get API key: Go to API Keys section in your account dashboard and create a new API key. New API keys come with 10 million free tokens
Choose deployment method: You can access Mercury through direct API integration, Amazon Bedrock Marketplace, Amazon SageMaker JumpStart, or Azure AI Foundry depending on your needs
Make API calls: Use the API key to make calls to Mercury API endpoints. The API is OpenAI-compatible and can be accessed through REST calls or existing OpenAI client libraries
Basic API usage example: Make a POST request to https://api.inceptionlabs.ai/v1/chat/completions with your API key in the Authorization header and a JSON payload containing the model (e.g. 'mercury-2') and messages; a request sketch covering this and the configuration step appears after this list
Configure settings: Optionally set parameters like max_tokens and enable streaming/diffusion visualization by setting the diffusing parameter to true
Integrate with tools: Mercury can be integrated with popular tools and frameworks including LangChain, AISuite, and LiteLLM for more complex applications (a LiteLLM sketch appears below)
Monitor usage: Track your token usage through the platform dashboard. Input tokens cost $0.25 per 1M tokens and output tokens cost $0.75 per 1M tokens; a quick cost calculation is sketched below
Get support: For issues or questions, contact [email protected] or join their Discord channel. Enterprise customers can reach out to [email protected]
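For the basic API usage and configuration steps above, here is a Python sketch of the raw REST call using the requests library. The max_tokens and diffusing fields come from the configuration step; treat the exact payload shape as an assumption to verify against Inception's current API reference.

# Raw REST sketch of the chat completions call described above.
# Field names ("diffusing", model "mercury-2") are taken from this guide
# and may differ from the live API reference.
import requests

API_KEY = "YOUR_INCEPTION_API_KEY"

resp = requests.post(
    "https://api.inceptionlabs.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "mercury-2",
        "messages": [{"role": "user", "content": "Summarize diffusion LLMs in two sentences."}],
        "max_tokens": 256,    # optional cap on output length
        "diffusing": False,   # set True to stream intermediate diffusion steps
                              # (the response then arrives as a stream, not one JSON body)
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])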
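For the integrations step, one hedged possibility is routing Mercury through LiteLLM's generic OpenAI-compatible provider (the "openai/" model prefix); this routing is an assumption based on LiteLLM's custom-endpoint pattern, not an Inception-documented integration:

# LiteLLM sketch: treat Mercury as a generic OpenAI-compatible backend.
# The "openai/" prefix and api_base routing are assumptions, not
# Inception-specific support.
from litellm import completion

response = completion(
    model="openai/mercury-2",
    api_base="https://api.inceptionlabs.ai/v1",
    api_key="YOUR_INCEPTION_API_KEY",
    messages=[{"role": "user", "content": "Explain tool calling in one paragraph."}],
)
print(response.choices[0].message.content)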
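Finally, a quick back-of-envelope check on the listed pricing; the token counts are made-up illustration values, not real usage data.

# Cost estimate at the listed rates: $0.25 per 1M input tokens,
# $0.75 per 1M output tokens. Workload numbers are hypothetical.
INPUT_RATE = 0.25 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.75 / 1_000_000  # dollars per output token

input_tokens = 40_000_000       # hypothetical monthly prompt volume
output_tokens = 12_000_000      # hypothetical monthly completion volume

cost = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
print(f"Estimated monthly cost: ${cost:.2f}")  # -> Estimated monthly cost: $19.00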
Mercury FAQs
What is Mercury?
Mercury is the first commercially available diffusion-based Large Language Model (dLLM), launched by Inception Labs in February 2025. It uses a breakthrough diffusion-based approach to language generation instead of traditional autoregressive generation.