DALL-E/Every illustration.

Copy Rights and Wrongs

Who owns the output of AI—the machines or the creators?


ICYMI: I’m teaching a new course called How to Write With AI. You’ll learn how to do the best writing of your life and how AI can help you achieve your goals, faster. It’s a four-week course running from September 19 through October 24 that includes hands-on workshops with tools like ChatGPT, Claude, Spiral, and Lex. Discounted early bird registration ends next week. Check out the course website for complete details.


To risk sounding like a zitty 14-year-old who just discovered Atlas Shrugged rather than a 32-year-old man gleefully planning to vote blue this fall, sometimes I can’t help but think that all taxation is theft.

It is a sad fact of the world that governments are constantly coming up with novel ways to carve off their pound of asset flesh from us. What is particularly galling is that all the taxes I pay over my lifetime will probably fund, like, one quarter of a Raytheon missile that blows up some cave in another country that is supposedly filled with terrorists but is more likely occupied by goats that are about to be judiciously kabobified by my tax dollars. If the government had just left me and my (hilariously unlucrative) newsletter earnings alone, I would’ve spent said monies on happy things like puppy adoption fees, pepperoni pizza on which the spicy meat cups are crunchy, and footbaths.

You may or may not agree with me on taxes, but I’m using them to make a very, very roundabout point about copyright protection laws, and what happens when large-scale government interventions, awash with noble purpose, sometimes end up failing to address the problems they set out to solve. 

Like taxes, copyright law has a worthy goal. Taxes are for investments in common goods like roads, libraries, and Boston Celtics championship parades. Yet in practice, American tax policy has often proved itself to be a boondoggle, captured by corporate interests and riddled with waste. Copyright, similarly, is for protecting the people in our society who make art by granting them the exclusive ability to monetize their work. Too often, though, copyright law has functioned to shore up the interests of mega-corporations while failing to facilitate meaningful revenue capture for actual artists. In both cases, the contrast between the noble ideals and the bureaucratic malarky is stark.

In the generative AI space, the difference matters, because copyright is shaping up to be the de facto legal framework for regulating this technology. Just this week, a group of authors filed a class-action lawsuit against Anthropic alleging that the company included their books in the Claude model’s training data. Last week, the journalists at 404 Media revealed that Nvidia had scraped YouTube and Netflix to form its own video models, in obvious violation of both platforms’ terms of service. 

There will be more stories like these in the years to come. Every AI company I can think of is scraping the internet’s content. And while many of the major startups are striking partnerships with media and entertainment companies—such as the deal Condé Nast and OpenAI announced yesterday—these are the exception, not the rule. The vast majority of training data is scraped without consent. 

Is this yet another case of powerful tech startups taking advantage of creatives, or are these technologies simply tools that enable creative people to make wonderful things? Even if we believe the latter to be true, are these innovations only possible because they bend the rules we have around intellectual property?

These questions matter both for individual creators and for the entire technology sector. Silicon Valley has bet the farm on this bubble, and copyright law could be the needle that pops it. Before we go any further down this road, however, it’s important to get clear on what we’re talking about when we talk about copyright and AI. When someone says that “generative AI violates IP” or “OpenAI stole my content without compensation,” they are asserting bigger ideas about the origins of creativity. 

Where does an idea come from?

Large language models (LLMs) get their smarts by training on vast datasets. They don't learn complete pieces of content, but rather focus on smaller units called "tokens," which are often word fragments. Then, when you give a model a prompt like, “Write me an essay,” it uses statistical calculations to predict the most likely next token in a sequence. 
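To make “predict the most likely next token” concrete, here is a deliberately tiny sketch: a bigram model that counts which token follows which in a toy corpus, then generates text by always picking the statistically most likely successor. This is a drastic simplification (real LLMs use transformers, subword tokens, and billions of parameters), and the corpus here is made up for illustration, but the core loop of generation, predict-append-repeat, is the same shape.

```python
from collections import Counter, defaultdict

# Toy corpus; real LLMs train on vast scraped datasets and split
# text into subword tokens rather than whole words.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each token follows each other token -- a bigram
# model, standing in for a transformer's learned statistics.
next_counts = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    next_counts[cur][nxt] += 1

def predict_next(token):
    """Return the statistically most likely next token."""
    return next_counts[token].most_common(1)[0][0]

# Generate by repeatedly appending the most likely next token.
tokens = ["the"]
for _ in range(4):
    tokens.append(predict_next(tokens[-1]))
print(" ".join(tokens))
```

Note that the model stores no complete sentence from the corpus, only follow-frequencies between fragments, which is the crux of the copyright puzzle: the training data is in there, but not as retrievable copies.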

To me, this is analogous to the human process of writing. You ingest a large amount of copyrighted data into your subconscious, then split those ideas up in a million little ways to produce something new. The issue is that asking a machine to engage in a similar process on our behalf is not something that copyright law was ever designed to adjudicate.

There’s another reason why conversations about AI and copyright tend to be so fraught.  When we ask whether a generative AI company is violating copyright law, we can be referring to a number of different things:

  1. Training data: Was copyrighted material included in the training data?
  2. Machine: Once a model is trained, is the software that is produced through that training run owned by the AI company, or by the owners of any copyrighted materials in the data set?
  3. Output: When you prompt the model with a question, who owns the copyright of the answer it spits out? Who is liable if the content that is generated violates another corporation’s copyright?
  4. Published work: If a publisher distributes a piece of AI-generated content, are they responsible for compensating any of the people in the previous buckets? What does copyright ownership look like if generative AI is used to create part of a distributed work, rather than all of it? 

No one knows the answer to any of these questions. There are multiple lawsuits ongoing against the AI companies, but they are all in various lower courts in the U.S. None of them have reached the Supreme Court, so it’ll be many years before we have a definitive answer. In the meantime, I would expect these companies to build as fast as possible. Rather than wait around for the courts to figure it out, it makes more strategic sense to build world-changing technology now and apologize later. Ex-Google CEO Eric Schmidt recently voiced this sentiment in a talk with Stanford MBAs, suggesting that startups should steal IP and have lawyers “clean up the mess” for them after the fact. 

Comments

Georgia Patrick over 1 year ago

Evan... Keep on top of this and keep writing. Keep digging until you remove the layers of detritus produced by technology companies so you get to truth. Copyright rules intend to keep humans creating. AI is not a human. Full stop. If humans were like chickens or geese, think about what happens when you kill them all and have no more eggs. AI cannot give us eggs. Ever. It's about money with great disregard for the long game. Nobody is paying attention to the throughline on this. The human lives, creates, gets compensation for creation, continues to create, and then dies. At death, you get no more eggs. No more writing, art, music, or moral courage. At death, is AI at the funeral? Is AI taking care of the survivors and the estate? Let's start with intentions on all of this: What does the human intend when creating? What does the technology company intend in the manufacturing process that takes what a human creates and turns it into profits for them?

@raokrishna1 over 1 year ago

“Still, I think the real reason creators are upset about generative AI is much more basic: money.”

So, what exactly are the tech companies and the VCs funding them all about? All the big tech companies (and small) will fight tooth and nail and file patents. They will spend billions to acquire them.

But when it comes to ponying up money to creators they claim the “larger good.”

I am not saying the system is perfect. But this will only squeeze creators even further.

@jerry_delacruz.arcnow over 1 year ago

Beautifully written and a joy to read, Evan. Your points are persuasive and I agree that the pie needs to grow larger to make room for what is inevitably coming. The shape and size of what is coming is blurred like a race car whizzing by. When the dust settles, perhaps we can ask our AI LLMs how to adjudicate that which each of us creates. I love the commenter’s point about how AI can never make eggs regardless of how intelligent it gets. However, I see a future where eggs will be treasured by society just as virtual eggs will be treasured.