

# Small Language Models: The 2026 Landscape

> **Created by Adrian Dunkley** | [maestrosai.com](https://maestrosai.com) | [ceo@maestrosai.com](mailto:ceo@maestrosai.com) | Fair Use

***

This is a practical map of the small-language-model ecosystem as it stands in April 2026. The field moves fast; what follows focuses on families that are stable, open-weight, and usable today by small businesses in Latin America and the Caribbean (LAC).

All sizes below refer to the most practical quantized (usually Q4) version that runs on consumer hardware. Parameter counts are from each vendor's release notes and model cards.
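As a back-of-envelope check on whether a given model fits your hardware, a Q4-quantised model takes roughly 4.5 bits per weight plus some fixed runtime overhead. The constants below are rules of thumb, not vendor specifications:

```python
def est_q4_ram_gb(params_billion: float, overhead_gb: float = 1.5) -> float:
    """Rough RAM estimate for a Q4_K_M-quantised model.

    Assumes ~4.5 bits per weight (Q4_K_M averages slightly above 4 bits)
    plus a fixed allowance for KV cache and runtime buffers.
    Both constants are rules of thumb, not vendor numbers.
    """
    bytes_per_param = 4.5 / 8  # ~0.56 bytes per weight
    return params_billion * bytes_per_param + overhead_gb

# Roughly matching the hardware notes below:
# 3.8B -> ~3.6 GB (fits in 8 GB RAM)
# 14B  -> ~9.4 GB (fits in 16 GB)
# 27B  -> ~16.7 GB (wants 24-32 GB)
```

Treat the result as a floor: long prompts grow the KV cache well beyond the fixed overhead assumed here.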

***

## The 2026 SLM short list

| Model family                   | Sizes                        | License                      | Strongest at                            | Languages                                |
| ------------------------------ | ---------------------------- | ---------------------------- | --------------------------------------- | ---------------------------------------- |
| **Microsoft Phi-4 / Phi-4 Mini** | 3.8B (Mini), 14B             | MIT (open weights)           | Reasoning, math, code                   | English-first, decent Spanish/Portuguese |
| **Microsoft Phi-3.5 Mini**     | 3.8B                         | MIT                          | Low-memory deployment                   | English-first                            |
| **Google Gemma 4**             | 2B, 9B, 27B                  | Gemma License (open weights) | Agentic workflows, multimodal           | 140+ languages incl. strong ES/PT        |
| **Google Gemma 3**             | 1B, 4B, 12B, 27B             | Gemma License                | General purpose, image understanding    | Broad multilingual                       |
| **Meta Llama 4 Scout**         | 17B active (109B total, MoE) | Llama 4 Community License    | Long context (10M tokens), multimodal   | 200+ languages                           |
| **Meta Llama 3.1 / 3.3**       | 8B (3.1), 70B (3.3)          | Llama Community              | Stable, well-supported, many fine-tunes | Good in ES/PT, ok in FR                  |
| **Mistral 7B / NeMo / Small**  | 7B to 12B                    | Apache 2.0 (most variants)   | Fine-tunability, European languages     | Strong in FR/ES/IT                       |
| **Mistral Ministral**          | 3B, 8B                       | Research license             | Edge devices                            | Multilingual                             |
| **Qwen 3**                     | 0.6B to 32B                  | Apache 2.0                   | Strong reasoning, very fast inference   | Chinese + English + multilingual         |
| **DeepSeek V3 / R1 distilled** | 7B to 32B                    | MIT-style                    | Reasoning (R1 distills)                 | English + Chinese primarily              |
| **IBM Granite 3**              | 2B, 8B                       | Apache 2.0                   | Enterprise-grade, business documents    | Professional English, decent ES          |
| **SmolLM 2**                   | 135M, 360M, 1.7B             | Apache 2.0                   | Tiny, on-phone use cases                | English-focused                          |

***

## Deep dive on the five most useful for LAC SMBs

### Microsoft Phi-4 and Phi-4 Mini

* **Why it matters for LAC**: Phi-4 punches well above its weight on reasoning and math. It's excellent for invoice extraction, accounting reconciliation, and any structured-data task. Phi-4 Mini at 3.8B runs comfortably on an older laptop.
* **Weakness**: English-first. For Spanish and Portuguese output it works, but you'll want to fine-tune on a few hundred examples of your business's style.
* **Licensing**: MIT. You can run it commercially, modify it, ship it.
* **Hardware**: Phi-4 Mini (3.8B) runs on 8 GB RAM. Phi-4 (14B) runs on 16-24 GB.
* **Good for**: Back-office tasks, document processing, offline assistants.
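For the invoice-extraction use case above, the usual pattern is to ask the model for JSON and parse its reply. A minimal sketch (the field names and prompt wording are illustrative, not from any vendor documentation):

```python
import json

# Hypothetical prompt template for structured invoice extraction.
INVOICE_PROMPT = """Extract the following fields from the invoice text below \
and answer with JSON only: vendor, date, currency, total.

Invoice:
{invoice_text}
"""

def build_invoice_prompt(invoice_text: str) -> str:
    """Fill the template with the raw invoice text (e.g. from OCR)."""
    return INVOICE_PROMPT.format(invoice_text=invoice_text)

def parse_model_reply(reply: str) -> dict:
    """Parse the model's JSON reply. Raises ValueError on malformed
    output; in practice, catch that and re-prompt the model."""
    return json.loads(reply)
```

The prompt goes to whatever local runtime you deploy (see [deployment.md](deployment.md)); small models like Phi-4 Mini follow "JSON only" instructions well but still need the retry path.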

### Google Gemma 4

* **Why it matters for LAC**: Gemma 4 (April 2026) is purpose-built for reasoning and agentic workflows. The 9B model hits a sweet spot: strong output, modest hardware, broad multilingual coverage, and it fine-tunes cleanly.
* **Multilingual quality**: 140+ languages in training data, with Brazilian Portuguese, Mexican Spanish, Argentine Spanish, and French handled well out of the box. Kreyòl and Papiamento need fine-tuning or human review.
* **Hardware**: 2B on 6 GB RAM, 9B on 16 GB, 27B on 32+ GB.
* **Good for**: Customer-service agents, content drafting, summarisation, the brain of a privacy-first WhatsApp agent.

### Meta Llama 4 Scout

* **Why it matters for LAC**: At 17B active parameters (MoE with 109B total), Scout is the largest practical "small" model in 2026. It supports a 10-million-token context window, which means it can hold a whole year of business documents in memory. Multimodal, too.
* **Multilingual**: pre-trained on 200 languages including over 100 with >1 billion tokens each. Top-tier Spanish and Portuguese. Good French and Dutch (helpful for Curaçao, Aruba).
* **Hardware**: full-precision Scout needs datacenter GPUs. Even at Q4, all 109B weights must sit in memory (MoE reduces compute per token, not the weight footprint), so budget roughly 60 GB: an Apple Silicon machine with 64 GB unified memory rather than a single 24 GB consumer GPU. An off-the-shelf Mac Studio is a realistic target.
* **Good for**: Privacy-first agents that need long context (legal review, multi-week case histories, large knowledge bases).
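To make the 10-million-token claim concrete, a back-of-envelope conversion to printed pages (the words-per-token and words-per-page ratios are rough English-text assumptions, not measured values):

```python
def pages_in_context(context_tokens: int,
                     words_per_token: float = 0.75,
                     words_per_page: int = 500) -> int:
    """Rough count of printed pages that fit in a context window.

    Assumes ~0.75 English words per token and ~500 words per page;
    both ratios vary with language and tokenizer.
    """
    return int(context_tokens * words_per_token / words_per_page)

# Scout's 10M-token window works out to roughly 15,000 pages.
print(pages_in_context(10_000_000))
```

Spanish and Portuguese tokenize slightly less efficiently than English, so expect a somewhat lower page count in practice.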

### Mistral 7B / NeMo / Small

* **Why it matters for LAC**: Apache 2.0 on most variants (no gotchas). Strong in Romance languages, including French Caribbean content. Easiest family to fine-tune with modest data.
* **Multilingual**: French, Spanish, Italian, Portuguese, and English are all strong. NeMo in particular is trained with multilingual emphasis.
* **Hardware**: 7B Q4 runs on 8 GB RAM. NeMo 12B wants 16 GB.
* **Good for**: French Caribbean use cases (Martinique, Guadeloupe, Haiti), European Spanish/Portuguese, bilingual content.

### Qwen 3

* **Why it matters for LAC**: Extremely strong reasoning for its size, very fast inference, flexible size options from 0.6B to 32B. Apache 2.0 licensed. The 8B and 14B variants are standouts for mid-range hardware.
* **Multilingual**: strongest in English and Chinese, competent in Spanish and Portuguese, weaker in French Caribbean and Kreyòl (review required).
* **Hardware**: Qwen 3 8B runs on 8 GB, 14B on 16 GB, 32B on 24 GB (Q4).
* **Good for**: High-throughput tasks where latency matters: voice assistants, POS integrations, real-time dashboards.

***

## Scorecard: picking the right SLM for a task

The picks below are qualitative judgements based on public benchmarks and practitioner reports; see [rankings/global-benchmarks.md](../rankings/global-benchmarks.md) for numeric scores.

| Task                                   | Best SLM                                                | Second choice                      |
| -------------------------------------- | ------------------------------------------------------- | ---------------------------------- |
| Invoice / receipt extraction           | Phi-4                                                   | Gemma 4 9B                         |
| WhatsApp customer reply (ES/PT)        | Gemma 4 9B                                              | Llama 4 Scout (if hardware allows) |
| Long-context document review           | Llama 4 Scout                                           | Qwen 3 32B                         |
| Voice assistant (on-device)            | Qwen 3 8B                                               | Phi-4 Mini                         |
| Marketing copy in ES/PT                | Gemma 4 9B                                              | Mistral NeMo                       |
| Marketing copy in French Caribbean     | Mistral Small                                           | Gemma 4 9B                         |
| Business Q\&A on company docs (RAG)    | Phi-4 or Gemma 4 9B                                     | Llama 3.1 8B                       |
| Offline clinical notes (Cuban clinics) | Mistral NeMo                                            | Gemma 4 9B                         |
| Agricultural advice chatbot            | Gemma 4 9B                                              | Llama 4 Scout                      |
| Kreyòl / Papiamento output             | Frontier cloud model, or fine-tuned Gemma 4 with review | N/A                                |
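If you script model selection (say, in a deployment pipeline), the scorecard reduces to a lookup table. A sketch, where the task keys are ad-hoc labels and only a few rows are shown:

```python
# Scorecard as data: task -> (best SLM, second choice).
# Extend with the remaining table rows as needed.
SCORECARD = {
    "invoice_extraction":   ("Phi-4",         "Gemma 4 9B"),
    "whatsapp_reply_es_pt": ("Gemma 4 9B",    "Llama 4 Scout"),
    "long_context_review":  ("Llama 4 Scout", "Qwen 3 32B"),
    "marketing_copy_fr":    ("Mistral Small", "Gemma 4 9B"),
}

def pick_model(task: str, prefer_second: bool = False) -> str:
    """Return the recommended model for a task.

    prefer_second lets callers fall back when the first choice
    exceeds the available hardware. Raises KeyError on unknown tasks.
    """
    best, second = SCORECARD[task]
    return second if prefer_second else best
```

Keeping the table as data (rather than if/else chains) makes it trivial to update when the next model generation lands.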

***

## What to avoid

* **Models older than 12 months** unless they have a clear niche. The field is moving fast; 2024-era SLMs are rarely worth setting up in 2026.
* **Models without open weights** if your goal is offline deployment. You can't run them locally.
* **Licenses with usage restrictions** that your business model might trip. Read the license for Llama 4 and Gemma before shipping a product.
* **Unreviewed Kreyòl, Papiamento, or indigenous-language output**. No 2026 SLM is reliable here without human review.

***

## Trends for the next 6 months

* **Multimodal SLMs** are catching up to cloud models. Expect Gemma 4 and Phi-5 to handle images, charts, and simple video by year-end.
* **MoE (mixture-of-experts)** SLMs like Llama 4 Scout give frontier-like behavior with consumer hardware. More vendors will ship MoE small models.
* **Tool-use SLMs**: open-weight models fine-tuned specifically for agent tool use. Watch for Gemma 4 agentic variants and Qwen 3 function-calling releases.
* **LAC language fine-tunes**: community fine-tunes for Brazilian Portuguese, Caribbean Spanish, and Kreyòl are beginning to emerge on Hugging Face. Track for your market.

***

## Related reading

* [use-cases.md](use-cases.md): concrete LAC scenarios using the models above.
* [deployment.md](deployment.md): how to actually install and run.
* [rankings/global-benchmarks.md](../rankings/global-benchmarks.md): numeric benchmark scores.
* [governance/README.md](../governance/README.md): SLMs as a data-residency solution.

***

*Fair Use, Educational Resource | April 2026*
