Local LLMs
Self-hosted and on-device model workflows for privacy and predictable usage costs.
Browse local LLM tools filtered by practical fit and workflow needs.
68 matching tools.
Tools in this category
Anarlog
Open-source on-device AI notepad for meetings, with local transcription, bring-your-own API keys, and notes saved as portable files. Formerly Hyprnote; the canonical brand is now Anarlog.
- Free
- meeting-notes
- transcription
- open-source
Best for: Privacy-conscious professionals (lawyers, healthcare staff, researchers, journalists), Operators in regulated industries where cloud notetakers are non-starters
AUTOMATIC1111
Feature-rich Stable Diffusion WebUI with extensive model, extension, and parameter control.
- Free
- image-generation
- stable-diffusion
- local-inference
Best for: Advanced local Stable Diffusion workflows
CogView 4
THUDM text-to-image model family for high-quality generation in open research and local workflows.
- Free
- image-generation
- text-to-image
- open-weights
Best for: Developer workflows, Faceless content production
ComfyUI TTS
Node-based text-to-speech and voice workflow stack inside ComfyUI using custom audio nodes.
- Free
- text-to-speech
- voiceover
- narration
Best for: Local custom voiceover pipelines, Experimental multi-model TTS workflows
Command R+
Large instruction-tuned model oriented to advanced assistant and retrieval-heavy workflows.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Advanced local assistant deployments, Complex retrieval and planning workflows
Coqui TTS
Open-source toolkit for local text-to-speech and voice cloning workflows.
- Free
- text-to-speech
- voiceover
- local-inference
Best for: Advanced local text-to-speech pipelines
DeepSeek-R1
Reasoning-focused open-weight family with MIT licensing for the core models and smaller distilled options.
- Free
- local-inference
- open-weights
- mit
Best for: Reasoning-heavy workflows on distilled checkpoints, Local experimentation with open model pipelines
DeepSeek-V4
Preview open-weight DeepSeek family with Pro and Flash MoE models, 1M context, and strong coding and agentic reasoning focus.
- Free
- cloud-llm
- local-inference
- open-weights
Best for: Coding-agent experiments with open-weight models, Long-context analysis over documents or repositories
DeepSeek-VL2
Mixture-of-experts local vision-language family for OCR, documents, charts, and grounded multimodal reasoning.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Private visual document analysis, Multimodal document understanding
EchoMimic
Open-source audio-driven portrait animation framework with editable landmark control and newer multimodal animation branches.
- Free
- avatar-video
- local-inference
- open-source
Best for: Open diffusion-based avatar animation experiments
FLUX
FLUX family for quality-first generation, fast local variants, and modern in-context image editing workflows.
- Free
- image-generation
- text-to-image
- image-editing
Best for: Thumbnail and visual concept generation, Fast style exploration for creator content
Fooocus
Beginner-friendly local Stable Diffusion UI focused on high-quality images with minimal setup.
- Free
- image-generation
- stable-diffusion
- local-inference
Best for: Fast local image generation with minimal setup
Forge
Performance-focused Stable Diffusion WebUI fork designed for practical local generation speed and compatibility.
- Free
- image-generation
- stable-diffusion
- local-inference
Best for: Faster local Stable Diffusion workflows in a linear WebUI
Gemma 2
Older Gemma family branch focused on efficient local text workloads in 2B, 9B, and 27B sizes.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Efficient local chat workloads, Summarization and long-form drafting
Gemma 3
Multimodal Gemma family with 128K context and broad local deployment options under Gemma terms.
- Free
- local-inference
- open-weights
- on-device
Best for: Local assistants with manageable compliance processes, Multimodal summarization and extraction
Gemma 3n
Device-first Gemma branch with multimodal support, long context, and efficient E2B/E4B variants.
- Free
- local-inference
- open-weights
- on-device
Best for: Multimodal local assistant workflows, Privacy-sensitive visual assistant tasks
Gemma 4
Newest Gemma family with Apache-2.0 licensing, multimodal input, 256K context, and sparse on-device variants.
- Free
- local-inference
- open-weights
- on-device
Best for: Multimodal local assistant workflows, Multimodal document understanding
GLM-4.5 Air
Open-weight GLM model variant for local reasoning, coding, and automation workflows.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Private local LLM workflows, Reasoning and coding support in automation tasks
GLM-4.7-Flash
Lightweight GLM 4.7 branch focused on fast coding, reasoning, and long-context generation.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Fast local coding assistants, Reasoning-heavy drafting with tighter latency budgets
gpt-oss-20b
Apache-2.0 open-weight text model with long context and practical local deployment targets.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Private drafting and extraction workflows, Batch automations with stable cost control
Hallo
Open-source portrait animation model for higher-fidelity talking-head generation from one image and driving audio.
- Free
- avatar-video
- local-inference
- open-source
Best for: High-fidelity portrait animation from one image and audio
HiDream
Open HiDream family for quality-focused generation and instruction-based image editing.
- Free
- image-generation
- text-to-image
- image-editing
Best for: Thumbnail and visual concept generation, Fast style exploration for creator content
HunyuanDiT
Tencent open-weights diffusion-transformer image model family for high-quality text-to-image generation.
- Free
- image-generation
- text-to-image
- open-weights
Best for: Developer workflows, Faceless content production
InternVL 3.5
Apache-2.0 multimodal family with many size options and a strong focus on reasoning, OCR, and agent-style visual tasks.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multimodal internal analysis workflows, Builders experimenting with vision-language tasks
InvokeAI
Polished local-first generative image platform with strong workflow UX for Stable Diffusion users.
- Free
- image-generation
- stable-diffusion
- local-inference
Best for: Professional local image workflows with cleaner UX
Kandinsky 3
Open-weights text-to-image model family oriented to prompt-following and stylistic generation.
- Free
- image-generation
- text-to-image
- open-weights
Best for: Thumbnail and visual concept generation, Faceless content production
Kimi K
Earlier open-weight Kimi branch for long-context reasoning and local LLM experimentation.
- Free
- local-inference
- open-weights
- reasoning
Best for: Local long-context drafting and analysis, Builders comparing open-weight LLM stacks
Kimi K2.6
Latest open-weight Kimi model for long-horizon coding, agent swarms, multimodal execution, and large-context local experimentation.
- Free
- local-inference
- open-weights
- reasoning
Best for: Local agentic coding workflows, Multimodal local assistant builds
Kokoro TTS
Compact open-weight TTS model for local voice synthesis and experimentation.
- Free
- text-to-speech
- voiceover
- local-inference
Best for: Lightweight local text-to-speech experiments
Kolors
Open-weights text-to-image model family from Kwai for high-quality image synthesis workflows.
- Free
- image-generation
- text-to-image
- open-weights
Best for: Thumbnail and visual concept generation, Faceless content production
LatentSync
Open-source lip-sync framework for generating talking portrait videos from audio and face inputs.
- Free
- avatar-video
- local-inference
- open-source
Best for: Free local talking-head generation
LivePortrait
Open-source local portrait animation tool that turns a single image into a talking video.
- Free
- avatar-video
- local-inference
- open-source
Best for: Local avatar animation workflows
Llama 3.1
Open model family often used as a balanced local default for general chat, writing, and coding.
- Free
- local-inference
- open-weights
- self-hosted
Best for: General local chat and assistant workflows, Summarization and drafting tasks
Llama 3.2 Vision
Vision-capable Llama model for local image-plus-text understanding tasks.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Local image-plus-text analysis workflows, Multimodal document understanding
Llama 3.3
Larger Llama generation aimed at high-quality local reasoning and assistant workflows.
- Free
- local-inference
- open-weights
- self-hosted
Best for: High-quality local assistant workflows, Reasoning-heavy long-form tasks
Llama 4
Open-weight multimodal family with massive context, but significant policy and license constraints.
- Free
- local-inference
- open-weights
- multimodal
Best for: Large multi-document summarization pipelines, Multimodal internal analysis workflows
LocalAI
Open-source local AI runtime with OpenAI-compatible APIs for self-hosted LLM and multimodal workloads.
- Free
- local-inference
- self-hosted
- open-source
Best for: Local model serving and testing, Private local LLM workflows
LocalForge
Open-source local app for running AI models and workflows on your own machine.
- Free
- local-inference
- open-source
- self-hosted
Best for: Local model serving and testing, Private local assistant workflows
MiniCPM-V 2.6
Efficient local VLM with strong OCR, multi-image, and video understanding in an 8B-class footprint.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Private visual document analysis, Multimodal local assistant workflows
Ministral 3 8B
Apache-2.0 open-weight 8B model tuned for efficient local use with very long context.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Long-document summarization and extraction, Private local assistant workflows
Mistral NeMo
Mid-size model line that balances general reasoning, coding support, and local deployability.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Balanced local assistant workloads, Coding and reasoning mixed tasks
Mistral Small 4
Open hybrid Mistral model that combines instruct, reasoning, coding, OCR, and transcription in one 256K-context family.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multimodal local assistant workflows, Multimodal document understanding
Mixtral 8x22B
Mixture-of-experts model family offering strong quality with favorable active-parameter efficiency.
- Free
- local-inference
- open-weights
- self-hosted
Best for: High-end local inference setups, Long-context reasoning workflows
Molmo
Open vision-language family from AI2 focused on strong multimodal quality with Apache-2.0 licensing.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multimodal document understanding, Private visual document analysis
MuseTalk
Open-source real-time lip-sync framework for talking avatar and portrait video workflows.
- Free
- avatar-video
- local-inference
- open-source
Best for: Free local talking-head generation
NVIDIA Nemotron
Open model family for agentic AI with reasoning-focused releases across edge, single-GPU, and multi-GPU tiers.
- Free
- open-weights
- reasoning
- agentic-ai
Best for: Agentic AI prototyping, Reasoning-heavy developer workflows
Ollama
Local LLM runtime for running open models on your own machine with simple CLI and API workflows.
- Free
- local-inference
- self-hosted
- offline
Best for: Local model serving and testing, Privacy-first AI workflows
Phi-3 Mini
Lightweight Phi model family for fast local inference on modest hardware.
- Free
- local-inference
- open-weights
- on-device
Best for: Low-latency local chat and coding help, Entry-level local LLM deployments
Phi-3.5 Mini Instruct
MIT-licensed small model with long context, optimized for practical local and on-device use.
- Free
- local-inference
- open-weights
- on-device
Best for: Private drafting and summarization on modest hardware, Lightweight offline content automation
Phi-3.5 Vision Instruct
Compact MIT-licensed multimodal model for local image, OCR, chart, and multi-image reasoning tasks.
- Free
- local-inference
- open-weights
- on-device
Best for: Multimodal document understanding, Private visual document analysis
Phi-4
Higher-capability Phi model for instruction-following and reasoning-heavy local tasks.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Reasoning-heavy local workflows, Structured instruction and planning tasks
Phi-4 Reasoning
Reasoning-tuned Phi-4 variant for complex chain-of-thought-style local workloads.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Complex reasoning and analytical tasks, Local private inference with explicit step logic
Piper TTS
Fast local neural text-to-speech engine for offline voice generation.
- Free
- text-to-speech
- voiceover
- local-inference
Best for: Local private text-to-speech pipelines
PixArt-Σ
Open-weights text-to-image model line focused on efficient high-resolution generation.
- Free
- image-generation
- text-to-image
- open-weights
Best for: Thumbnail and visual concept generation, Faceless content production
Qwen2.5
Versatile multilingual open model family with strong long-form writing and instruction-following behavior.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multilingual content generation, Long-form drafting and rewriting
Qwen2.5 VL
Multimodal Qwen model family for local vision-language workflows.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multimodal local assistant workflows, Private visual document analysis
Qwen3 8B
Apache-2.0 open-weight 8B model with 128K context, local-first deployment, and optional cloud API access.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Private local writing and rewriting, Multilingual content transformation
Qwen3.5
Native multimodal Qwen family with sparse MoE scaling, strong agent behavior, and a flagship 397B total / 17B active open model.
- Free
- local-inference
- open-weights
- self-hosted
Best for: Multimodal local assistant workflows, Private visual document analysis
Qwen3.6
Qwen3.6 family covering the hosted Qwen3.6-Plus flagship and the first open-weight Qwen3.6-35B-A3B release.
- Free
- cloud-llm
- local-inference
- open-weights
Best for: Teams choosing between hosted and local Qwen deployments, Agentic coding workflows
Qwen3.6-35B-A3B
First open-weight Qwen3.6 model: a 35B total / 3B active multimodal MoE focused on agentic coding and practical local use.
- Free
- local-inference
- open-weights
- apache-2-0
Best for: Local agentic coding workflows, Multimodal local assistant builds
Reflow Studio
Open-source local dubbing workstation combining RVC voice cloning, Wav2Lip lip sync, and face enhancement.
- Free
- lip-sync
- dubbing
- voice-cloning
Best for: Local avatar animation workflows, Developers building lip-sync and dubbing workflows
SadTalker
Open-source audio-driven talking-face generator for creating avatar-style clips from still portraits.
- Free
- avatar-video
- local-inference
- open-source
Best for: Free local talking-head generation
Stable Diffusion
Open model family for text-to-image generation, spanning v1.x, v2.x, SDXL, and SD3/SD3.5.
- Free
- image-generation
- design
- thumbnails
Best for: Faceless content production, Solopreneur operations
SwarmUI
Local-first Stable Diffusion UI focused on multi-model orchestration and scalable generation queues.
- Free
- image-generation
- stable-diffusion
- local-inference
Best for: Local Stable Diffusion workflows with larger job queues
VideoReTalking
Open-source talking-head editing stack for re-syncing, re-voicing, and expression-aware face video edits.
- Free
- avatar-video
- local-inference
- open-source
Best for: Talking-head repair and re-sync for existing video
Voicebox
Local-first open-source voice cloning studio powered by Qwen3-TTS.
- Free
- text-to-speech
- voice-cloning
- local-inference
Best for: Local custom voiceover pipelines, Advanced local text-to-speech pipelines
Wav2Lip
Open-source lip-sync model for syncing speech to an existing face video or portrait clip.
- Free
- avatar-video
- local-inference
- open-source
Best for: Fast local lip-sync for recorded face video
Z-Image
Z-Image text-to-image family for high-fidelity generation and fast iterative visual production.
- Free
- image-generation
- text-to-image
- image-editing
Best for: Thumbnail and visual concept generation, Fast style exploration for creator content