Kuzco is a powerful Swift package that enables local Large Language Model (LLM) inference directly in iOS and macOS apps, built on llama.cpp with zero network dependency for privacy-focused AI integration.
https://github.com/jaredcassoutt/Kuzco
Kuzco

Product Information

Updated: Aug 19, 2025

What is Kuzco

Kuzco is a versatile Swift package designed to bring local Large Language Model capabilities to iOS, macOS, and Mac Catalyst applications. Built as a wrapper around the battle-tested llama.cpp engine, it serves as a bridge between Apple's development ecosystem and advanced AI functionality. The package supports multiple popular LLM architectures including LLaMA, Mistral, Phi, Gemma, Qwen, and others, making it a comprehensive solution for developers looking to implement AI features in their applications without relying on cloud services.

Key Features of Kuzco

Kuzco is a Swift package that enables on-device Large Language Model (LLM) inference for iOS, macOS, and Mac Catalyst applications. Built on llama.cpp, it provides local AI model execution with zero network dependency, ensuring privacy and reliability. The package supports multiple LLM architectures, offers customizable configurations, and features modern Swift concurrency with streaming responses.
On-Device LLM Processing: Runs AI models locally without internet connectivity using llama.cpp, supporting various architectures like LLaMA, Mistral, Phi, Gemma, and Qwen
Advanced Configuration Options: Provides fine-tuning of context length, batch size, GPU layers, and CPU threads to optimize performance for different devices (see the configuration sketch after this list)
Modern Swift Integration: Features async/await native support with streaming responses and comprehensive error handling for seamless integration into Swift applications
Automatic Architecture Detection: Smart detection of model architectures from filenames with fallback support for better compatibility and ease of use
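
The configuration options above correspond to the InstanceSettings and PredictionConfig types referenced later in the "How to Use Kuzco" steps. The sketch below shows how such a device-tuned setup might look; the parameter labels are taken from the names listed on this page, and the exact initializer signatures are assumptions rather than confirmed API.

```swift
import Kuzco

// Hypothetical configuration sketch: parameter labels follow the names
// mentioned on this page (contextLength, batchSize, gpuOffloadLayers,
// cpuThreads, temperature, topK, topP, repeatPenalty, maxTokens); the exact
// initializer signatures may differ from the real API.
let settings = InstanceSettings(
    contextLength: 4096,     // tokens of context kept in memory
    batchSize: 512,          // prompt tokens processed per batch
    gpuOffloadLayers: 32,    // layers offloaded to the GPU; lower this on older devices
    cpuThreads: 6            // worker threads for CPU-side inference
)

let config = PredictionConfig(
    temperature: 0.7,        // lower = more deterministic output
    topK: 40,
    topP: 0.95,
    repeatPenalty: 1.1,
    maxTokens: 512           // cap on generated tokens per response
)
```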

Use Cases of Kuzco

Private AI Chatbots: Build chat applications that process user conversations entirely on-device, ensuring user privacy and offline functionality (a minimal session sketch follows this list)
Enterprise Data Analysis: Process sensitive business data locally using AI models without exposing information to external servers
Mobile AI Applications: Create iOS apps with AI capabilities that work reliably regardless of internet connectivity
Educational Tools: Develop learning applications that can provide AI-powered tutoring and feedback while maintaining student privacy
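
For the private-chatbot use case, one natural pattern is to keep the running dialogue as an array of Turn values and pass the whole history back into predict() on each exchange. The sketch below is illustrative only: it assumes the Turn and predict(turns:systemPrompt:) API described in the "How to Use Kuzco" steps, "LlamaInstance" is a placeholder for whatever type Kuzco actually returns, and the .assistant role is an assumption.

```swift
import Kuzco

// Illustrative multi-turn chat helper for an on-device chatbot.
// "LlamaInstance" is a placeholder type name and the .assistant role is an
// assumption; only Turn and predict(turns:systemPrompt:) are taken from the
// usage steps on this page.
actor ChatSession {
    private let instance: LlamaInstance   // placeholder type name
    private var history: [Turn] = []

    init(instance: LlamaInstance) {
        self.instance = instance
    }

    func send(_ userMessage: String) async throws -> String {
        history.append(Turn(role: .user, text: userMessage))

        var reply = ""
        let stream = try await instance.predict(
            turns: history,
            systemPrompt: "You are a helpful assistant. Keep answers concise."
        )
        // Accumulate streamed chunks; nothing leaves the device.
        for try await (content, isComplete, _) in stream {
            reply += content
            if isComplete { break }
        }

        history.append(Turn(role: .assistant, text: reply))
        return reply
    }
}
```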

Pros

Complete privacy with on-device processing
No network dependency required
Performance optimized for Apple devices
Comprehensive developer-friendly API

Cons

Requires sufficient device resources to run models
Limited to iOS/macOS platforms only
May have slower performance compared to cloud-based solutions

How to Use Kuzco

Install Kuzco via Swift Package Manager: Add Kuzco to your project using the package URL 'https://github.com/jaredcassoutt/Kuzco.git', selecting 'Up to Next Major' with version 1.0.0+
Import and Initialize: Add 'import Kuzco' to your Swift file and get the shared instance with 'let kuzco = Kuzco.shared'
Create a Model Profile: Create a ModelProfile with your model's ID and path: let profile = ModelProfile(id: "my-model", sourcePath: "/path/to/your/model.gguf")
Load the Model: Load the model instance using: let (instance, loadStream) = await kuzco.instance(for: profile)
Monitor Loading Progress: Track loading progress through loadStream and wait for the .ready stage before proceeding
Create Conversation Turns: Build the dialogue as conversation turns: let turns = [Turn(role: .user, text: userMessage)]
Generate a Response: Start generation with predict() and your desired settings: let predictionStream = try await instance.predict(turns: turns, systemPrompt: "You are a helpful assistant.")
Process the Response: Iterate over the streamed tokens: for try await (content, isComplete, _) in predictionStream { print(content) }
Optional: Configure Advanced Settings: Tune performance with InstanceSettings (contextLength, batchSize, gpuOffloadLayers, cpuThreads) and PredictionConfig (temperature, topK, topP, repeatPenalty, maxTokens) if needed; a combined sketch of these steps follows below
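
Put together, the steps above amount to the short program sketched below. It sticks to the calls named in this list (Kuzco.shared, ModelProfile, instance(for:), Turn, predict(turns:systemPrompt:)); the progress-checking line and anything else not shown in the steps should be read as assumptions for illustration.

```swift
import Kuzco

// End-to-end sketch assembled from the steps above; names not shown in those
// steps (e.g. the progress value's `stage` property) are assumptions.
let kuzco = Kuzco.shared

// Describe the on-disk GGUF model.
let profile = ModelProfile(id: "my-model", sourcePath: "/path/to/your/model.gguf")

// Load the model and wait for the .ready stage.
let (instance, loadStream) = await kuzco.instance(for: profile)
for await progress in loadStream {
    if progress.stage == .ready { break }   // property name assumed for illustration
}

// Build the conversation and stream a response.
let turns = [Turn(role: .user, text: "Summarize what Kuzco does in one sentence.")]
let predictionStream = try await instance.predict(
    turns: turns,
    systemPrompt: "You are a helpful assistant."
)

for try await (content, isComplete, _) in predictionStream {
    print(content, terminator: "")          // print chunks as they arrive
    if isComplete { break }
}
```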

Kuzco FAQs

What is Kuzco?
Kuzco is a Swift package that runs Large Language Models (LLMs) directly in iOS, macOS, and Mac Catalyst apps. It's built on top of llama.cpp and enables on-device AI with no network dependency, ensuring privacy and speed.

Latest AI Tools Similar to Kuzco

Gait
Gait is a collaboration tool that integrates AI-assisted code generation with version control, enabling teams to track, understand, and share AI-generated code context efficiently.
invoices.dev
invoices.dev is an automated invoicing platform that generates invoices directly from developers' Git commits, with integration capabilities for GitHub, Slack, Linear, and Google services.
EasyRFP
EasyRFP is an AI-powered edge computing toolkit that streamlines RFP (Request for Proposal) responses and enables real-time field phenotyping through deep learning technology.
Cart.ai
Cart.ai is an AI-powered service platform that provides comprehensive business automation solutions including coding, customer relations management, video editing, e-commerce setup, and custom AI development with 24/7 support.