Small Language Models: The 2026 Landscape

Created by Adrian Dunkley | maestrosai.com | ceo@maestrosai.com | Fair Use

This is a practical map of the small-language-model ecosystem as it stands in April 2026. The field moves fast; what follows focuses on families that are stable, open-weight, and usable today by small businesses in Latin America and the Caribbean (LAC). All sizes below refer to the most practical quantized (usually Q4) build that runs on consumer hardware. Parameter counts are taken from each vendor's release notes and model cards.
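The Q4 framing above can be made concrete with a back-of-envelope calculation: 4-bit weights take roughly half a byte per parameter, plus runtime overhead for the KV cache and activation buffers. A minimal sketch of that rule of thumb (the 20% overhead factor is an assumption for illustration, not a vendor figure):

```python
def q4_footprint_gb(params_billions: float, overhead: float = 0.2) -> float:
    """Rough resident-memory estimate for a Q4-quantized model.

    4-bit weights ~= 0.5 bytes per parameter; `overhead` covers the
    KV cache, activations, and runtime buffers (assumed, not measured).
    """
    weight_bytes = params_billions * 1e9 * 0.5
    return weight_bytes * (1 + overhead) / 1e9  # decimal GB

# A 3.8B model lands around 2.3 GB, which is why it fits in 8 GB of RAM;
# a 14B model lands around 8.4 GB, hence the 16 GB machines below.
print(round(q4_footprint_gb(3.8), 1))  # 2.3
print(round(q4_footprint_gb(14), 1))   # 8.4
```

Real runtimes vary with context length and backend, so treat this as a sizing sanity check rather than a guarantee.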

The 2026 SLM short list

| Model family | Sizes | License | Strongest at | Languages |
|---|---|---|---|---|
| Microsoft Phi-4 | 3.8B, 14B | MIT (open weights) | Reasoning, math, code | English-first, decent Spanish/Portuguese |
| Microsoft Phi-3.5 Mini | 3.8B | MIT | Low-memory deployment | English-first |
| Google Gemma 4 | 2B, 9B, 27B | Gemma License (open weights) | Agentic workflows, multimodal | 140+ languages incl. strong ES/PT |
| Google Gemma 3 | 2B, 9B, 27B | Gemma License | General purpose, image understanding | Broad multilingual |
| Meta Llama 4 Scout | 17B active (109B total, MoE) | Llama 4 Community License | Long context (10M tokens), multimodal | 200+ languages |
| Meta Llama 3.3 | 8B, 70B | Llama Community License | Stable, well-supported, many fine-tunes | Good in ES/PT, OK in FR |
| Mistral 7B / NeMo / Small | 7B to 12B | Apache 2.0 (most variants) | Fine-tunability, European languages | Strong in FR/ES/IT |
| Mistral Ministral | 3B, 8B | Research license | Edge devices | Multilingual |
| Qwen 3 | 0.5B to 32B | Apache 2.0 | Strong reasoning, very fast inference | Chinese + English + multilingual |
| DeepSeek V3 / R1 distilled | 7B to 32B | MIT-style | Reasoning (R1 distills) | English + Chinese primarily |
| IBM Granite 3 | 2B, 8B | Apache 2.0 | Enterprise-grade, business documents | Professional English, decent ES |
| SmolLM 2 | 135M, 360M, 1.7B | Apache 2.0 | Tiny, on-phone use cases | English-focused |

Deep dive on the five most useful for LAC SMBs

Microsoft Phi-4 and Phi-4 Mini

  • Why it matters for LAC: Phi-4 punches well above its weight on reasoning and math. It’s excellent for invoice extraction, accounting reconciliation, and any structured-data task. Phi-4 Mini at 3.8B runs comfortably on an older laptop.
  • Weakness: English-first. For Spanish and Portuguese output it works, but you’ll want to fine-tune on a few hundred examples of your business’s style.
  • Licensing: MIT. You can run it commercially, modify it, ship it.
  • Hardware: Phi-4 Mini (3.8B) runs on 8 GB RAM. Phi-4 (14B) runs on 16-24 GB.
  • Good for: Back-office tasks, document processing, offline assistants.
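For the back-office and document-processing tasks above, the usual pattern is to ask the model for strict JSON and validate the reply before it touches your accounting system. A minimal sketch of that contract, with heavy hedging: the field names are illustrative, and the reply string would come from whatever local runtime you use (a llama.cpp server, Ollama, etc.), which this sketch does not call:

```python
import json

# Double braces survive .format(); only {text} is substituted.
INVOICE_PROMPT = """Extract the following fields from the invoice text and
reply with JSON only, no commentary:
{{"vendor": str, "date": "YYYY-MM-DD", "total": float, "currency": str}}

Invoice:
{text}
"""

REQUIRED_FIELDS = {"vendor", "date", "total", "currency"}

def parse_invoice_reply(reply: str) -> dict:
    """Validate a model reply; raise if it is not the JSON we asked for."""
    data = json.loads(reply.strip())
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"model omitted fields: {missing}")
    data["total"] = float(data["total"])  # small models sometimes return strings
    return data
```

In practice you would retry with a corrective prompt whenever `parse_invoice_reply` raises; small models drift out of strict JSON more often than frontier models do.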

Google Gemma 4

  • Why it matters for LAC: Gemma 4 (April 2026) is purpose-built for reasoning and agentic workflows. The 9B model hits a sweet spot: strong output, modest hardware, broad multilingual coverage, and it fine-tunes cleanly.
  • Multilingual quality: 140+ languages in training data, with Brazilian Portuguese, Mexican Spanish, Argentine Spanish, and French handled well out of the box. Kreyòl and Papiamento need fine-tuning or human review.
  • Hardware: 2B on 6 GB RAM, 9B on 16 GB, 27B on 32+ GB.
  • Good for: Customer-service agents, content drafting, summarisation, the brain of a privacy-first WhatsApp agent.

Meta Llama 4 Scout

  • Why it matters for LAC: At 17B active parameters (MoE with 109B total), Scout is the largest practical “small” model in 2026. It supports a 10-million-token context window, which means it can hold a whole year of business documents in memory. Multimodal, too.
  • Multilingual: pre-trained on 200 languages including over 100 with >1 billion tokens each. Top-tier Spanish and Portuguese. Good French and Dutch (helpful for Curaçao, Aruba).
  • Hardware: full-precision Scout needs a workstation GPU. Quantised Q4 runs on a 24 GB RTX 4090 or an Apple Silicon machine with 64 GB unified memory; an off-the-shelf Mac Studio is a realistic target for a small business.
  • Good for: Privacy-first agents that need long context (legal review, multi-week case histories, large knowledge bases).
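The "whole year of business documents" claim above is easy to sanity-check: at the conventional estimate of roughly 0.75 English words per token and about 400 words per page, 10 million tokens is on the order of a twenty-thousand-page archive. A sketch of the arithmetic (the ratios are common rules of thumb, not Meta figures):

```python
def pages_in_context(context_tokens: int,
                     words_per_token: float = 0.75,
                     words_per_page: int = 400) -> int:
    """Rough page capacity of a context window, for English prose."""
    return int(context_tokens * words_per_token / words_per_page)

print(pages_in_context(10_000_000))  # 18750 pages, give or take
```

Actual capacity depends on the tokenizer and language; Spanish and Portuguese typically tokenize slightly less efficiently than English.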

Mistral 7B / NeMo / Small

  • Why it matters for LAC: Apache 2.0 on most variants (no gotchas). Strong in Romance languages, including French Caribbean content. Easiest family to fine-tune with modest data.
  • Multilingual: French, Spanish, Italian, Portuguese, and English are all strong. NeMo in particular is trained with multilingual emphasis.
  • Hardware: 7B Q4 runs on 8 GB RAM. NeMo 12B wants 16 GB.
  • Good for: French Caribbean use cases (Martinique, Guadeloupe, Haiti), European Spanish/Portuguese, bilingual content.

Qwen 3

  • Why it matters for LAC: Extremely strong reasoning for its size, very fast inference, flexible size options from 0.5B to 32B. Apache 2.0 licensed. The 7B and 14B variants are standouts for mid-range hardware.
  • Multilingual: strongest in English and Chinese, competent in Spanish and Portuguese, weaker in French Caribbean and Kreyòl (review required).
  • Hardware: Qwen 3 7B runs on 8 GB, 14B on 16 GB, 32B on 24 GB (Q4).
  • Good for: High-throughput tasks where latency matters: voice assistants, POS integrations, real-time dashboards.
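The latency claim above has a useful yardstick: natural speech runs around 150 words per minute, so a voice assistant only needs a few tokens per second of sustained generation to keep up. A sketch of that target (the 1.33 tokens-per-word ratio is a conventional English estimate, not a Qwen figure):

```python
def tokens_per_second_needed(words_per_minute: int = 150,
                             tokens_per_word: float = 1.33) -> float:
    """Generation rate required to keep pace with natural speech."""
    return words_per_minute / 60 * tokens_per_word

# Roughly 3.3 tokens/s; a 7B Q4 model on consumer hardware typically
# generates well above this, which is why latency is rarely the bottleneck.
print(round(tokens_per_second_needed(), 1))
```

The harder budget in practice is time-to-first-token, which prompt length dominates; short system prompts matter more than raw throughput here.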

Scorecard: picking the right SLM for a task

Picks are qualitative, based on public benchmarks and practitioner reports; see rankings/global-benchmarks.md for numeric scores.
| Task | Best SLM | Second choice |
|---|---|---|
| Invoice / receipt extraction | Phi-4 | Gemma 4 9B |
| WhatsApp customer reply (ES/PT) | Gemma 4 9B | Llama 4 Scout (if hardware allows) |
| Long-context document review | Llama 4 Scout | Qwen 3 32B |
| Voice assistant (on-device) | Qwen 3 7B | Phi-4 Mini |
| Marketing copy in ES/PT | Gemma 4 9B | Mistral NeMo |
| Marketing copy in French Caribbean | Mistral Small | Gemma 4 9B |
| Business Q&A on company docs (RAG) | Phi-4 or Gemma 4 9B | Llama 3.3 8B |
| Offline clinical notes (Cuban clinics) | Mistral NeMo | Gemma 4 9B |
| Agricultural advice chatbot | Gemma 4 9B | Llama 4 Scout |
| Kreyòl / Papiamento output | Frontier cloud model, or fine-tuned Gemma 4 with review | N/A |

What to avoid

  • Models older than 12 months unless they have a clear niche. The field is moving fast; 2024-era SLMs are rarely worth setting up in 2026.
  • Models without open weights if your goal is offline deployment. You can’t run them locally.
  • Licenses with usage restrictions that your business model might trip. Read the license for Llama 4 and Gemma before shipping a product.
  • Unreviewed Kreyòl, Papiamento, or indigenous-language output. No 2026 SLM is reliable here without human review.

What's coming next

  • Multimodal SLMs are catching up to cloud models. Expect Gemma 4 and Phi-5 to handle images, charts, and simple video by year-end.
  • MoE (mixture-of-experts) SLMs like Llama 4 Scout give frontier-like behavior with consumer hardware. More vendors will ship MoE small models.
  • Tool-use SLMs: open-weight models fine-tuned specifically for agent tool use. Watch for Gemma 4 agentic variants and Qwen 3 function-calling releases.
  • LAC language fine-tunes: community fine-tunes for Brazilian Portuguese, Caribbean Spanish, and Kreyòl are beginning to emerge on Hugging Face. Track for your market.
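The MoE point above carries a practical memory caveat: only the active experts run per token, but all experts must stay resident, so RAM scales with total parameters while per-token speed tracks the active ones. A sketch using the Scout numbers quoted earlier (the 0.5 bytes-per-parameter Q4 figure is a rule of thumb, and overhead is ignored):

```python
def moe_profile(total_b: float, active_b: float) -> dict:
    """Contrast what an MoE model costs in memory vs per-token compute (Q4)."""
    return {
        "resident_gb": total_b * 0.5,  # every expert must be loaded in memory
        "compute_like_b": active_b,    # per-token FLOPs track active params only
    }

scout = moe_profile(total_b=109, active_b=17)
# ~54.5 GB resident at Q4 (hence the 64 GB unified-memory figure earlier),
# but per-token compute roughly comparable to a dense 17B model.
print(scout)
```

This is why MoE models feel "frontier-like" on consumer hardware: you pay for them in RAM, not in tokens per second.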

