ML//model

2026-03-08

Concrete implementations of ML architectures: trained artifacts with specific weights, capabilities, and release contexts. Includes both proprietary (GPT-4, Claude, Gemini) and open-weight families (LLaMA, DeepSeek, Qwen, Mistral).

Distinguished from architectures (Transformer, MoE) and training techniques (Training). A model is architecture + data + training pipeline + weights.

The competitive landscape since 2024: US frontier labs (OpenAI, Anthropic, Google) vs Chinese labs (DeepSeek, Moonshot, Alibaba/Qwen) vs European open-weight (Mistral).