Open Weight Is Not Open Source
We installed Qwen3.6-35B-A3B on a €2,000 mini-PC. Then we opened its LICENSE file. Apache 2.0. That single line is the second thing that changed in the local-AI stack over the past twelve months, and the one that matters more than the benchmarks. The European Union AI Act draws a sharp boundary between open-weight and open-source. Twelve months ago, no flagship open-weight model from any major provider sat on the open-source side of that line. Today, Qwen 3, Mistral Large 3, and DeepSeek V4 do. Llama 4 and Gemma 4 do not.
Practitioner observation, not legal advice. Concrete deployment requires your own counsel. This post names the questions; it does not answer them.
How twelve months changed the picture
Until late 2024, the open-weight conversation was a story about availability: which models had downloadable weights at all. Llama led that story; Mistral, Mixtral, Qwen 2, and Gemma followed. The license attached to those weights was a secondary concern. Most readers conflated “downloadable” with “open-source” and moved on.
The EU AI Act has made the secondary concern primary. Article 53 sets out the obligations for providers of general-purpose AI models. Article 53(2) carves out a narrow exemption for models released under “free and open-source” licences. The exemption is not honorary. It removes specific documentation duties, removes the requirement to appoint an EU representative for third-country providers, and reduces what downstream deployers can be asked to inherit. Whether a given model qualifies for that exemption is a question with operational consequences.
The twelve months from May 2025 to May 2026 saw an asymmetric response across providers:
| Provider | Pre-2026 flagship | Licence | 2026 flagship | Licence |
|---|---|---|---|---|
| Mistral (frontier) | Large 2 (Jul 2024) | Mistral Research Licence (explicitly non-commercial) | Mistral Large 3 (Dec 2025), 675B / 41B active | Apache 2.0 |
| Alibaba (Qwen) | Qwen 2.5-72B (Sep 2024) | Qwen Licence with commercial restrictions on largest variant | Qwen 3 family (28 Apr 2025) onward | Apache 2.0 across all sizes |
| Meta (Llama) | Llama 3 | Llama Community Licence: 700M MAU cap, competitor restriction, ban on training competing AI | Llama 4 | Same Llama Community Licence |
| Google (Gemma) | Gemma 1 to 3 | Gemma Licence + Acceptable Use Policy | Gemma 4 | Same Gemma Licence + AUP |
| DeepSeek | V2 / V3 | MIT for weights | V4 | MIT |
Mistral’s flagship licence switch landed in December 2025, eight months before the AI Act’s enforcement deadline of 2 August 2026. Alibaba had already moved Qwen 3 to Apache 2.0 in April 2025. Meta and Google did not move. Whether the timing is causal is impossible to prove from outside, but the correlation is sharp enough that the burden of explanation now sits with the providers who held their ground.
What Article 53(2) actually says
The Hugging Face team has compiled the most readable practitioner walkthrough of the open-source criteria as they apply to general-purpose AI models. Three conditions, all of them required:
- The model is released under a free and open-source licence permitting access, usage, modification, and redistribution.
- The weights, model architecture, and usage information are publicly available.
- The model is not provided against a price or otherwise monetised.
The first condition is where the boundary lives. The Hugging Face guidance, reading from the official Code of Practice for general-purpose AI, states the test directly:
> Licenses with usage restrictions (e.g., research-only, acceptable use restrictions, commercial terms) do not qualify as a free and open-source licence.
Apache 2.0 qualifies. MIT qualifies. OpenMDW qualifies. The Llama Community Licence does not. Its 700M monthly-active-user cap is a usage restriction tied to commercial scale, and its prohibition on using Llama to train competing models is a usage restriction tied to purpose. The Gemma Licence does not either. Its Acceptable Use Policy attaches deployment-time restrictions that the open-source definition does not contemplate. Mistral Large 3 under Apache 2.0 sits cleanly inside the exemption. Qwen 3 under Apache 2.0 sits cleanly inside the exemption.
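Encoded as data, the boundary the Code of Practice draws is a simple predicate over licence properties. A minimal sketch for illustration only — the field names and the reduction of each licence to two booleans are our own simplification, not an official schema:

```python
from dataclasses import dataclass

# Illustrative model of the first Article 53(2) condition. The boolean
# fields are our own simplification, not an official or legal schema.
@dataclass(frozen=True)
class Licence:
    name: str
    permits_use_modification_redistribution: bool
    has_usage_restrictions: bool  # research-only terms, AUPs, MAU caps, etc.

def is_foss_for_art53(lic: Licence) -> bool:
    """Free and open-source in the Code of Practice sense: full permissions,
    no usage restrictions of any kind."""
    return lic.permits_use_modification_redistribution and not lic.has_usage_restrictions

APACHE_2 = Licence("Apache-2.0", True, False)
MIT = Licence("MIT", True, False)
LLAMA_COMMUNITY = Licence("Llama Community Licence", True, True)  # 700M MAU cap, competitor clause
GEMMA = Licence("Gemma Licence + AUP", True, True)                # deployment-time AUP

assert is_foss_for_art53(APACHE_2)
assert is_foss_for_art53(MIT)
assert not is_foss_for_art53(LLAMA_COMMUNITY)
assert not is_foss_for_art53(GEMMA)
```

The point of the toy encoding is that the test is binary: a single usage restriction, whatever its motivation, flips the result.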
This is not the Open Source Initiative’s definition. It is the European Commission’s working interpretation, encoded into Article 53 of the AI Act and made operational through the Code of Practice. Different boundary, different consequences.
License meets hardware
The licence dimension is one axis. Hardware deployability is the other. The conventional industry framing collapses both into a single “open-source LLM” bucket. The honest matrix is two-dimensional:
| | FOSS-clean (Apache 2.0 / MIT) | Other terms |
|---|---|---|
| Fits a €2k 128 GB UMA box | Qwen3.6-35B-A3B (35B / 3B active MoE) | Gemma 4 (smaller variants) |
| Needs workstation or dual-H100+ | Mistral Large 3 · Mistral Medium 3.5* · DeepSeek V4 | Llama 4 Maverick |
*Mistral Medium 3.5 ships under modified MIT. See Mistral’s licence overview for the per-tier terms. Memory footprint at Q4 quantisation lands near the 128 GB UMA ceiling but is bandwidth-bound on integrated graphics; in practice a workstation is the right deployment target for either Mistral tier.
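The footprint figures above follow from back-of-envelope arithmetic: raw weight memory is parameters times bits per parameter, and KV cache plus runtime overhead come on top. A sketch — the bits-per-parameter values are assumptions, since real quantisation formats mix precisions and carry per-block scales:

```python
def weight_footprint_gb(n_params: float, bits_per_param: float) -> float:
    """Raw weight memory in GB. KV cache, activations, and runtime
    overhead add on top of this, especially at long context."""
    return n_params * bits_per_param / 8 / 1e9

# Illustrative only: parameter counts from the post, ~4-bit assumed.
print(round(weight_footprint_gb(675e9, 4), 1))  # Mistral Large 3: 337.5 GB weights alone
print(round(weight_footprint_gb(128e9, 4), 1))  # Mistral Medium 3.5: 64.0 GB weights alone
print(round(weight_footprint_gb(35e9, 4), 1))   # Qwen3.6-35B: 17.5 GB weights alone
```

Weights alone put Large 3 well beyond any 128 GB box, put Medium 3.5 inside it only before KV cache and overhead are counted, and leave Qwen3.6 with comfortable headroom.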
The matrix produces one intersection where a German Mittelstand can deploy today with both clean licence posture and commodity hardware: Qwen3.6 on a €2,000 unified-memory mini-PC. Every other quadrant requires either a workstation budget, a per-use-case licence review, or both. That single intersection is what made Post 1 of this series possible. It did not exist twelve months ago.
The Mistral picture is more layered than the headline
Mistral is the only European provider holding ground at the frontier model tier after Cohere acquired Aleph Alpha in April 2026. The relevant facts about Mistral’s licence posture deserve to be stated precisely, and Mistral’s own documentation is the right source for them.
Mistral Large 3, the 675-billion-parameter mixture-of-experts model released in December 2025, ships under Apache 2.0. The licence is clean. The deployment is not casual: at FP8 precision it asks for four H100-class GPUs, and at 4-bit quantisation it asks for a high-end workstation with 256 GB of unified memory or comparable hardware. Mario covered the inference side of that picture in Sovereign Inference Is Not Sovereign Memory.
Mistral Medium 3.5, the 128-billion-parameter dense model released in April 2026, ships under modified MIT, a permissive licence with terms that apply to high-revenue commercial use. The hardware footprint is more accessible than Large 3 but still sits above what a €2,000 mini-PC delivers comfortably. The per-use-case licence terms are documented on Mistral’s own site.
The honest reading: Mistral's flagship moved into clean open-source territory, while Mistral's more deployable mid-tier is permissive but not unrestricted. For both current open-weight tiers, the practical path involves more compute than a €2,000 mini-PC and a per-use-case licence review against Mistral's licence overview. The differentiation is Mistral's, not ours; we describe what they ship.
The open-source label, claimed and contested
Meta has consistently described Llama as “open-source AI” in announcements, marketing, and developer outreach. The Open Source Initiative, the organisation that maintains the actual Open Source Definition the term has referred to since 1998, has published a direct rejection of that label for both Llama 3 and Llama 4. Their characterisation: “open washing”, an attempt to redefine the term for corporate benefit.
The disagreement is not academic. Three Llama Licence clauses do operational work that no actual open-source licence permits. The 700-million-monthly-active-user cap converts free use into a discretionary licence at scale. The competitor restriction blocks entire product categories regardless of user count. The prohibition on using Llama outputs to train other AI models limits a downstream researcher’s freedom in a way that Apache 2.0 explicitly does not.
Each clause individually fails the unrestricted-use criterion that both the OSI definition and the EU AI Act's working interpretation require. Calling Llama open-source under either framework is a category claim the licence text does not support. This post is not an argument with Meta's choice to ship a restrictive licence; that is their commercial decision. It is an argument with the labelling of that decision.
The architectural consequence
For a knowledge platform that runs against multiple model engines, licence posture becomes a routing variable, not an architecture commitment. A workspace can declare which licence tiers it accepts and the engine layer enforces the rule. Inference for a regulated workload routes only to engines whose underlying weights ship under Apache 2.0, MIT, or OpenMDW. Inference for a low-stakes workload can route to a permissive but non-FOSS engine when that fits. The audit log records which model under which licence processed which request. That produces evidence the deployer can put in front of an auditor without reconstructing it after the fact.
This is the same architectural principle that handles the inference / memory split, the engine swappability, and the operation-level audit trail that earlier posts in this series have laid out. Licence-aware routing is a fourth dimension on the same proxy interface, not a separate system. The platform does not interpret the regulation. It records what was used, when, under which terms, and it makes the routing rule changeable as the regulatory weather changes.
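Licence-aware routing on that proxy interface can be sketched in a few lines. Everything here is an assumption for illustration — the engine names, the tier labels, and the audit-record shape are ours, not a real system's API:

```python
import datetime

# Hypothetical engine registry: each engine carries the licence tier of its weights.
ENGINES = {
    "qwen3.6-35b-a3b":  {"licence": "Apache-2.0", "tier": "foss"},
    "mistral-large-3":  {"licence": "Apache-2.0", "tier": "foss"},
    "llama-4-maverick": {"licence": "Llama Community", "tier": "restricted"},
}

AUDIT_LOG = []  # in a real deployment: an append-only store

def route(workspace_tiers: set, engine: str, request_id: str) -> dict:
    """Enforce the workspace's accepted licence tiers and record the decision,
    whether or not the request is allowed through."""
    meta = ENGINES[engine]
    allowed = meta["tier"] in workspace_tiers
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "request": request_id,
        "engine": engine,
        "licence": meta["licence"],
        "allowed": allowed,
    }
    AUDIT_LOG.append(record)  # the denial is evidence too
    if not allowed:
        raise PermissionError(f"{engine} ({meta['licence']}) not permitted for this workspace")
    return record

# A regulated workspace accepts only FOSS-clean engines:
route({"foss"}, "qwen3.6-35b-a3b", "req-001")       # allowed, logged
try:
    route({"foss"}, "llama-4-maverick", "req-002")  # blocked, still logged
except PermissionError:
    pass
```

The design choice worth noting is that the blocked request is logged before the exception is raised: the audit trail records what was attempted under which terms, not only what succeeded.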
What the platform does not do, and what no platform can do for the deployer, is decide whether the deployer’s specific use case qualifies for an exemption. That decision sits one layer up.
Five questions that remain open
Before this kind of setup goes into production, the following questions belong on the desk of any data-protection counsel advising on EU AI Act compliance. They are named here because they remain open. We have not resolved them.
- The monetisation criterion for providers with parallel commercial offerings. Mistral releases Mistral Large 3 weights under Apache 2.0 while operating a paid hosting service for the same model. The Code of Practice text suggests the unmonetised release qualifies for the exemption regardless of separate hosted services. No enforcement decision has tested this against a provider with a parallel revenue stream.
- Systemic-risk classification of FOSS frontier models. Mistral Large 3, with reported training on roughly 3,000 H200 GPUs, plausibly crosses the 10²⁵-FLOP threshold that triggers systemic-risk obligations under the AI Act. If so, the FOSS exemption covers the documentation duties of Article 53(1)(a) and (b) and the EU representative requirement of Article 54, but the systemic-risk obligations remain in force regardless.
- What “substantial fine-tuning” means for downstream provider status. A deployer who fine-tunes a base model materially can be classified as a provider in their own right under Article 25, with the upstream exemption no longer cascading. The Act does not quantify “substantial”. For a Mittelstand fine-tuning Qwen 3 on internal documents, this question is load-bearing for their compliance posture.
- Modified-MIT classification under the open-source criterion. Mistral Medium 3.5’s modified MIT licence carries terms for high-revenue commercial use. Whether that pattern triggers the same usage-restriction disqualification as the Llama Community Licence’s MAU cap is a question the Code of Practice has not yet tested in a published opinion.
- Multi-tenant SaaS cascading. A platform operator that hosts a FOSS-exempt upstream model for many tenants: what is the operator’s status under Articles 25 and 26? Whether upstream provider exemption affects downstream multi-tenant operator obligations, or whether the two are orthogonal, has not been settled in published guidance.
These five are open. The answers turn on facts specific to a deployment that no general post can supply. The questions themselves are stable enough to put on a meeting agenda.
We picked Qwen3.6-35B-A3B because the licence cleared the bar. We deployed it on a €2,000 mini-PC because the model also cleared the capability bar (local tool-use, multi-step MCP routing, recovery from tool errors that would have derailed a model a year ago) and because the mixture-of-experts architecture turned the “too much memory, too little bandwidth” Strix-Halo trade-off into the right shape of hardware. That is the next post.
Sources
- blog What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models
- docs EU AI Act, Article 53: Obligations for providers of general-purpose AI models
- blog Meta's LLaMa license is still not Open Source (Open Source Initiative)
- blog Introducing Mistral 3 (Mistral AI)
- blog Qwen3: Think Deeper, Act Faster
- blog Mistral: Under which license are Mistral's open models available?
- blog mistralai/Mistral-Medium-3.5-128B (Hugging Face): modified MIT
- blog Sovereign Inference Is Not Sovereign Memory (Mario Scheliga)