flash-1-mini
Personal AI, mobile, edge, first private AI
Free — forever
Tier 1 (laptop)
Specs
- Parameters: 4 billion
- Context length: 8K tokens
- Recommended quantization: Q4_K_M (~2.7 GB)
- Minimum hardware: Any laptop with 4+ GB RAM
- License: Open weights, no employee count limit
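As a rough sanity check on the figures above, the following sketch relates parameter count, bits per weight, and file size. It assumes "4 billion" is an exact parameter count and that GB means 10^9 bytes; neither is stated in the specs, and real GGUF files also carry metadata and mix precisions across tensors, so treat the results as estimates only.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes), ignoring file metadata."""
    return n_params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(n_params: float, size_gb: float) -> float:
    """Average bits per weight implied by a given file size."""
    return size_gb * 1e9 * 8 / n_params


N_PARAMS = 4e9  # assumption: "4 billion" taken as an exact count

# Average bits per weight implied by the ~2.7 GB Q4_K_M figure.
print(round(implied_bits_per_weight(N_PARAMS, 2.7), 2))

# Estimated size of unquantized fp16 weights (16 bits per weight).
print(gguf_size_gb(N_PARAMS, 16))
```

Under these assumptions the fp16 weights come to about 8 GB, which suggests the 4+ GB RAM minimum applies to the quantized builds rather than to full precision.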
Capabilities
- Bilingual English / French
- Citation-grounded responses
- Function calling
- RAG-optimized
- GGUF quantization levels from Q2_K to fp16
- Optimized for low-memory devices
- Sub-second response on Apple Silicon
Compatibility
- Ollama
- LM Studio
- llama.cpp
- Any GGUF-compliant runtime
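For illustration, here is a minimal sketch of querying the model through a locally running Ollama server using only the Python standard library. The model tag `flash-1-mini` is hypothetical (use whatever name the model was pulled or imported under), and `num_ctx=8192` assumes the listed 8K context means 8192 tokens. The endpoint is Ollama's default local address.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "flash-1-mini"  # hypothetical tag, not confirmed by this page


def build_request(prompt: str, model: str = MODEL_TAG, num_ctx: int = 8192):
    """Build a non-streaming generate request with an explicit context size."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # assumption: 8K context = 8192 tokens
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model available, `generate("Bonjour, peux-tu te présenter ?")` would return the reply as a string. LM Studio and llama.cpp expose their own interfaces, so this sketch applies to Ollama only.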