
Copyleft, Copyright, and Why Free Software Licenses Matter More Than Ever in the Age of AI

  • #freesoftware
  • #GNU
  • #AI
  • #copyright
  • #copyleft
Srinivas Gowda

6 min read

It’s 2026, and I’m writing about a talk I attended nearly four years ago. That probably needs an explanation.

In October 2022, I sat in an auditorium at Reva University in Bangalore and listened to Richard Stallman speak about the Free Software Movement and GNU. At the time, ChatGPT hadn’t even launched yet. “AI” was still something most people associated with research labs and self-driving car demos, not with their daily workflow.

Stallman spent the afternoon talking about copyleft, copyright, and the four freedoms—ideas he’s been articulating since the 1980s. I found it interesting. I took some mental notes. And then, like most people after a good talk, I went on with my life.

But over the past couple of years, the licensing questions he kept returning to have become impossible to ignore.

Major projects like Redis, Terraform, and Elasticsearch have moved away from traditional open source licenses to more restrictive “source-available” models. Companies routinely label AI systems “open source” while shipping usage restrictions the Open Source Initiative explicitly rejects. The Linux Foundation has proposed new licensing frameworks like OpenMDW, purpose-built for machine learning, because existing software licenses weren’t designed for models. The EU AI Act now references “free and open-source AI” but can’t cleanly define what that means. Meanwhile, debates about whether training on GPL-licensed code makes a model a derivative work remain legally unresolved—even as billions of dollars ride on the answer.

Suddenly, that Thursday afternoon in Bangalore doesn’t feel like a memory from another era. It feels like a warning I should have taken more seriously.

I also walked away with a small souvenir: a signed copy of Free Software, Free Society.

Signed title page of Free Software, Free Society (Third Edition)
Richard Stallman speaking at Reva University (Bangalore, October 2022)

TL;DR

  • “Free software” is about freedom, not price: the right to run, study, modify, and share.
  • Copyleft uses copyright law to make those freedoms propagate forward.
  • AI breaks our old vocabulary (“open source”, “open weights”, “source available”) because different layers are licensed differently.
  • If you can run it but not study or modify it, it isn’t free. If you can download weights but can’t reproduce or audit the system, it isn’t meaningfully open.

What Stallman Actually Means by “Free”

Before getting into licenses, it’s worth restating something Stallman repeated during the talk: free software is about freedom, not price.

Think of “free” as in free speech, not as in free beer.

The four freedoms are deceptively simple:

  • Run the program as you wish, for any purpose.
  • Study how it works, and change it.
  • Redistribute copies.
  • Distribute modified versions.

If software grants you all four, it’s free software. If it withholds any of them, it isn’t.

That sounds philosophical—and it is. But it has very concrete consequences once you start looking at how licenses work. In 2026, those consequences show up everywhere.

Copyright vs. Copyleft

Copyright, in the conventional sense, is a mechanism of restriction. When you write software and hold the copyright, you control how others can use, modify, and distribute it. It’s the legal scaffolding that makes proprietary software possible.

Copyleft takes that same mechanism and flips it on its head.

Instead of using copyright to restrict, copyleft uses it to guarantee freedom. The GNU General Public License (GPL) effectively says:

  • You can use this software.
  • You can study it.
  • You can modify it.
  • You can share it.

But any derivative work you distribute must carry the same freedoms.

You cannot take free software, modify it, and lock it down. Freedom, once granted, must propagate.

Permissive licenses like MIT and BSD say:

Here’s the code. Do whatever you want.

Copyleft licenses like the GPL say:

Here’s the code. Do whatever you want—but keep it free.

In 2022, that tension felt like a decades-old argument between pragmatists and idealists. In 2026, it’s a fault line running straight through AI.

AI Breaks the Old Vocabulary

When people argue about whether an AI system is “open”, they’re often talking past each other because “the system” isn’t one thing. At a minimum, you’re dealing with:

  • Model architecture (code)
  • Training data (text, images, audio, code)
  • Model weights
  • Inference + deployment code (the thing you actually run)

Each layer can be licensed differently. Some layers don’t even have clean legal frameworks for licensing in the first place.

Meta releases LLaMA and calls it “open source,” but with usage restrictions that don’t meet the Open Source Initiative’s definition. OpenAI started as a non-profit committed to openness and now keeps its most powerful models entirely proprietary.

The word “open” is being stretched, bent, and—sometimes—hollowed out.

Stallman’s framework cuts through this: don’t ask “can I see something?” Ask what freedoms you actually have.

The Training Data Problem

One of the most contentious issues in AI is still training data.

Large models are trained on vast corpora that include copyrighted books, articles, forum posts, and code—often without explicit consent from creators.

If a model is trained on GPL-licensed code:

  • Is the model a derivative work?
  • Does copyleft propagate through training?

No court has provided definitive answers yet.

But the philosophical question matters as much as the legal one. Copyleft isn’t only a legal instrument; it’s a moral argument:

If something is built from a commons, it should remain in the commons.

In a world where AI models are built on the collective output of millions and then locked behind API paywalls, that argument lands differently than it did in 2022.

The “Open Weights” Trap

A common pattern now looks like this: companies release model weights but retain control over:

  • Training data
  • Training process
  • Usage rights

This creates the illusion of openness.

You can download the weights. You can run the model. But:

  • You can’t reproduce it.
  • You can’t fully study how it was built.
  • You may not be able to use it commercially.

Partial freedom is not freedom.

A binary you can run but can’t modify isn’t free software. A model you can use but can’t understand isn’t open AI.

A Practical Checklist for “Open” AI

If a company calls a model “open source”, I now try to reduce the discussion to a checklist:

  • Weights: Are they available? Under what license?
  • Inference: Can I run the model locally without remote dependency?
  • Training code: Is it available and buildable?
  • Data: Is the training data available, licensed, and documented?
  • Reproducibility: Can an independent team reproduce something equivalent?
  • Modification: Can I fine-tune, merge, or adapt it, then redistribute my version?
  • Restrictions: Are there behavioral clauses (“you may not use for…”) that make it source-available rather than open?

You don’t need everyone to agree on a single word. You do need clarity about the freedoms and constraints.
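The checklist above is concrete enough to sketch in code. Here’s a minimal version — the field names and verdict strings are my own shorthand, not any standard vocabulary:

```python
from dataclasses import dataclass, fields

@dataclass
class ModelOpenness:
    """One boolean per checklist item. Names are my own shorthand."""
    weights_available: bool        # weights downloadable under a clear license
    local_inference: bool          # runs locally, no remote dependency
    training_code: bool            # training code available and buildable
    training_data: bool            # data available, licensed, documented
    reproducible: bool             # an independent team could rebuild it
    modifiable: bool               # fine-tune/merge/redistribute allowed
    no_usage_restrictions: bool    # no behavioral "you may not use for..." clauses

def classify(m: ModelOpenness) -> str:
    """Rough verdict: fully open, merely 'open weights', or neither."""
    if all(getattr(m, f.name) for f in fields(m)):
        return "open in the full sense"
    if m.weights_available and m.local_inference:
        return "open weights, not open source"
    return "not meaningfully open"

# A typical 2026 "open source" release: weights yes, everything else no.
release = ModelOpenness(True, True, False, False, False, True, False)
print(classify(release))  # → open weights, not open source
```

The point isn’t the code; it’s that every one of these flags is a yes/no question you can actually ask a vendor.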

The Rise of New Licensing Frameworks

Since 2022, we’ve seen serious attempts to build AI-specific licenses:

  • OpenMDW (Linux Foundation initiative)
  • Responsible AI Licenses (RAIL)
  • AI-focused source-available licenses

Some embed behavioral restrictions. Some attempt to cover patents, database rights, and trade secrets. Some explicitly reject copyleft.

Enterprise licensing decisions now tend to prioritize:

  • Scalability
  • Commercial safety
  • Compatibility with proprietary AI layers

Permissive licenses dominate because they let companies build proprietary intelligence on top of open foundations without triggering copyleft.

Stallman would probably have a lot to say about that.

Licenses as Infrastructure

Almost every AI pipeline today depends on open source software:

  • PyTorch
  • TensorFlow
  • Hugging Face Transformers
  • scikit-learn
  • NumPy

These licenses are not trivia. They’re legal and ethical infrastructure.

If you’re building an AI product and you don’t understand the licenses in your dependency tree, you’re building on a foundation you don’t understand. That’s not just legal risk. It’s intellectual negligence.
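A first step is simply enumerating what you depend on. Here’s a minimal sketch using only the Python standard library — the classifier parsing is deliberately naive, and real audits use dedicated tooling:

```python
# List every installed package and its declared license, preferring the
# License metadata field and falling back to trove classifiers.
from importlib.metadata import distributions

def license_of(dist) -> str:
    """Best-effort license string for one installed distribution."""
    meta = dist.metadata
    lic = meta.get("License")
    if lic and lic != "UNKNOWN":
        return lic
    for c in meta.get_all("Classifier") or []:
        if c.startswith("License ::"):
            return c.split("::")[-1].strip()
    return "unspecified"

for dist in sorted(distributions(), key=lambda d: d.metadata["Name"] or ""):
    print(f"{dist.metadata['Name']}: {license_of(dist)}")
```

Running this against a typical AI project’s virtualenv is usually sobering: dozens of licenses you’ve never read, some of them “unspecified”.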

What I Think We Should Do

Four years after that talk in Bangalore, here’s what I’ve come to believe.

1. Learn the Licenses

At least get comfortable with:

  • GPL, LGPL, AGPL
  • MIT, BSD, Apache 2.0
  • Creative Commons variants (especially for data)

If you work in AI, learn how these interact with model weights, training data, and inference code. This is no longer optional knowledge.
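To make “how these interact” concrete, here’s a toy sketch of one-directional compatibility: can code under license `dep` be combined into a project released under license `project`? The table is a rough simplification of the FSF’s compatibility guidance using SPDX-style names — an illustration, not legal advice:

```python
# For each dependency license, the set of project licenses it can flow into.
# Simplified: ignores GPLv2, dual licensing, linking exceptions, etc.
COMPATIBLE_INTO = {
    "MIT":          {"MIT", "BSD-3-Clause", "Apache-2.0",
                     "LGPL-3.0", "GPL-3.0", "AGPL-3.0"},
    "BSD-3-Clause": {"MIT", "BSD-3-Clause", "Apache-2.0",
                     "LGPL-3.0", "GPL-3.0", "AGPL-3.0"},
    "Apache-2.0":   {"Apache-2.0", "LGPL-3.0", "GPL-3.0", "AGPL-3.0"},
    "GPL-3.0":      {"GPL-3.0", "AGPL-3.0"},  # copyleft propagates forward
    "AGPL-3.0":     {"AGPL-3.0"},
}

def can_use(dep: str, project: str) -> bool:
    """Can a `dep`-licensed component go into a `project`-licensed work?"""
    return project in COMPATIBLE_INTO.get(dep, set())

print(can_use("MIT", "GPL-3.0"))   # True: permissive code can become copyleft
print(can_use("GPL-3.0", "MIT"))   # False: copyleft can't be relicensed away
```

The asymmetry in that table is the whole point of copyleft: freedom flows forward, never backward into restriction.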

2. Be Precise With Language

When someone says “open”, ask: open in what sense?

  • Weights?
  • Training code?
  • Data?
  • Rights to modify and redistribute?

Vagueness benefits whoever profits from confusion.

3. Advocate for Genuine Openness

Support projects committed to openness across code, data, and methodology. Push for licensing frameworks that protect both creators and users without laundering restrictions through “open” branding.

4. Respect the Commons

If you’ve built something using open source tools and public data, think seriously about what you owe in return.

Copyleft is not just a license. It’s a social contract.

The AI boom was built on decades of shared work. Locking down the outputs of that collaboration should make us uncomfortable.

Final Thought

Richard Stallman is polarizing. He’s uncompromising—sometimes to a fault.

But in 2026, watching the AI industry wrestle with openness, ownership, and freedom at global scale, his ideas feel less like relics and more like a diagnostic framework.

The question is no longer:

Is the code open?

The question is:

Who has the freedom—and who doesn’t?

If we lose sight of that, we risk ending up in a world where the most powerful technology ever created is controlled by a handful of companies, built on the backs of millions who never agreed to those terms.

That’s not the future I want.

And after sitting through that talk in Bangalore, I don’t think it’s the future Stallman wants either.


Have thoughts on AI licensing, copyleft, or the future of free software? I’d love to hear from you. Let’s keep this conversation going.





Written by Srinivas Gowda

Full Stack Developer, JavaScript, and Golang Fanboy, Pythonista, Linux addict, IoT enthusiast


I craft cloud canvases where art meets architecture, blending creativity with technical expertise to design innovative digital masterpieces.

All rights reserved © Srinivas Gowda 2026