The three big unanswered questions about Sora
AI & Machine Learning · 4 min read

Sora’s Infinite Scroll: The technical, legal, and climate bill of AI-generated video

By Alexander Cole

OpenAI’s Sora—a TikTok‑like app that serves an endless feed of exclusively AI‑generated videos—shot to the top of Apple’s App Store within days of its October release. Its promise is simple and strange: ten‑second cinematic hallucinations on demand, populated by hyperreal cameos of real people and copyrighted characters.

Why this matters now: Sora is not just another consumer app. It crystallizes the urgent trade‑offs of mainstreaming generative video—enormous compute demand, unsettled copyright law, and new vectors for deepfake harm—at the moment when regulators, rights holders, and researchers are scrambling to catch up. If Sora scales, its costs—financial, environmental, and legal—will be distributed across infrastructure providers, advertisers, and everyday users who may find their likenesses weaponized or monetized without clear consent. That makes Sora an early test case for how we govern attention‑scale generative media.

How Sora generates an endless stream

Technically, Sora stitches together recent advances in text‑to‑video, lip‑syncing, and neural rendering into a single product. Videos are short—up to ten seconds—but the app layers multiple models: a motion‑and‑frame generator that supplies the visual backbone, an audio model for voices and music, and a “cameo” pipeline that maps a small personal sample into a controllable avatar. OpenAI’s public notes say the app is invite‑only as it ramps capacity, and the company has acknowledged it will need revenue streams to make video generation sustainable.
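
OpenAI has not published Sora’s architecture, so any concrete rendering of that pipeline is guesswork. Below is a minimal Python sketch of how the three layers described above might compose; every function, class, and field name here is hypothetical, not OpenAI’s:

```python
from dataclasses import dataclass

# Illustrative stubs only -- OpenAI has not published Sora's architecture,
# and none of these names are real.
def motion_frame_model(prompt, duration_s, fps=24):
    return [f"frame_{i}" for i in range(int(duration_s * fps))]  # visual backbone

def audio_model(prompt, duration_s):
    return b"\x00" * int(duration_s * 16000)  # placeholder audio track

def cameo_pipeline(frames, identity_embedding):
    return [(f, identity_embedding) for f in frames]  # map identity onto frames

@dataclass
class Clip:
    frames: list
    audio: bytes
    duration_s: float

def generate_clip(prompt, cameo_embedding=None, duration_s=10.0):
    frames = motion_frame_model(prompt, duration_s)
    if cameo_embedding is not None:
        frames = cameo_pipeline(frames, cameo_embedding)  # identity-controlled avatar
    return Clip(frames, audio_model(prompt, duration_s), duration_s)
```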

The design choices matter. Ten seconds at 30 frames per second means 300 frames; even at lower frame rates, each frame requires far more compute than a single image. Sora therefore trades duration for frequency: short clips that can be produced and consumed at scale. That raises the stakes on algorithmic design decisions: frame interpolation and temporal coherence modules must run quickly or users will experience lag; compression‑aware generation reduces bandwidth but can worsen artifacts; and the cameo system must reconcile identity fidelity with constraints that prevent misuse.
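
To make the arithmetic concrete, here is a back‑of‑envelope sketch of per‑clip frame counts and relative compute; the 1.5× per‑frame premium is an assumption for illustration, not a measured figure:

```python
# Back-of-envelope per-clip compute, with illustrative numbers only.
DURATION_S = 10
FPS = 30
COST_PER_FRAME_VS_IMAGE = 1.5  # assumed premium for temporal coherence

frames = DURATION_S * FPS  # 300 frames per clip
relative_cost = frames * COST_PER_FRAME_VS_IMAGE
print(f"{frames} frames -> ~{relative_cost:.0f}x the compute of one still image")
```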

These are not theoretical challenges. Bill Peebles, Sora’s head, posted on October 5 that users can now limit how their cameos are used, for example, banning political contexts or certain words. That functionality hints at the underlying access‑control logic—permission tokens tied to identity embeddings—that OpenAI has had to bake into the model stack after early tests showed cameo misuse was easy and rapid.
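
A minimal sketch of what a permission check tied to a cameo might look like; the schema and field names are hypothetical, not OpenAI’s actual access‑control logic:

```python
from dataclasses import dataclass, field

# Hypothetical consent record for a cameo; the schema is illustrative.
@dataclass
class CameoPermissions:
    owner_id: str
    allow_political: bool = False
    banned_words: set = field(default_factory=set)

def generation_allowed(perms, prompt, is_political):
    """Gate a generation request against the cameo owner's settings."""
    if is_political and not perms.allow_political:
        return False
    return set(prompt.lower().split()).isdisjoint(perms.banned_words)

perms = CameoPermissions(owner_id="u123", banned_words={"endorses"})
print(generation_allowed(perms, "My cameo endorses a candidate", is_political=True))  # False
```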

The economics and the energy bill

Generating video is expensive in two distinct ways: capital and variable cost. On the capital side, OpenAI has undertaken massive data center investments and power contracts to secure GPU capacity and electricity. The company acknowledges that video generation demands orders of magnitude more GPU‑hours than text or image tasks, and CEO Sam Altman wrote on October 3 that “we are going to have to somehow make money for video generation.”

Variable cost is the compute and energy per generation. Video models use larger parameter counts and longer activation paths than chat models; they stress memory and interconnects far more intensively. While precise watt‑hour figures depend on model architecture and hardware (NVIDIA H100‑class inference hardware, for example, draws hundreds of watts per socket under load), the practical effect is clear: unconstrained free generation scales into substantial monthly bills. For OpenAI, a runaway hit like Sora could turn usage spikes into multi‑million‑dollar operating costs in weeks.
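
A back‑of‑envelope electricity estimate makes the point; every figure below (board power, GPU‑seconds per clip, electricity price) is an assumption for illustration, not a published number:

```python
# Illustrative per-clip electricity estimate; every figure is an assumption.
GPU_POWER_W = 700            # assumed H100-class board power under load
GPU_SECONDS_PER_CLIP = 120   # assumed: two GPU-minutes per 10-second clip
USD_PER_KWH = 0.10           # assumed industrial electricity price

kwh_per_clip = GPU_POWER_W * GPU_SECONDS_PER_CLIP / 3_600_000
usd_per_million_clips = kwh_per_clip * USD_PER_KWH * 1_000_000
print(f"~{kwh_per_clip:.4f} kWh per clip; electricity alone ~${usd_per_million_clips:,.0f} per million clips")
```

Even under these assumptions the electricity line item is dwarfed by hardware amortization, which is why the capital side of the ledger dominates at scale.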

That financial exposure maps directly to emissions. The company’s investment in new data centers and power deals is a hedge—but it also underscores a structural problem: popular generative‑video services shift demand to electricity markets and, unless paired with clean‑power procurement, can increase carbon footprints. OpenAI has not published per‑video carbon figures; researchers and climate‑focused reporters are already pressing for such transparency as the next necessary metric.
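
Extending the same hypothetical numbers to emissions shows why per‑video carbon figures would matter; the grid‑intensity value here is a rough illustrative constant, and real intensity varies widely by region and power contract:

```python
# Same hypothetical energy figure, translated into emissions.
KWH_PER_CLIP = 0.0233          # carried over from the sketch above
KG_CO2E_PER_KWH = 0.4          # rough illustrative grid intensity; varies by region

kg_per_clip = KWH_PER_CLIP * KG_CO2E_PER_KWH
tonnes_per_billion = kg_per_clip * 1e9 / 1000
print(f"~{kg_per_clip * 1000:.1f} g CO2e per clip; ~{tonnes_per_billion:,.0f} tonnes per billion clips")
```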

Copyright, consent, and a likely litigation wave

Sora’s permissive training and generation affordances have immediate legal consequences. Within days of Sora’s debut, publications reported that OpenAI had notified some rights holders about opt‑outs for their characters, an approach that inverts traditional copyright practice (where creators must affirmatively license their content). That choice raises two legal fights: one over whether training on copyrighted media constitutes fair use, and another over whether offering synthetic reproductions of trademarked characters or deceased public figures crosses a separate tort line.

The cameo feature complicates matters further. Even when users must opt in to supply facial and voice samples, the app allows creators to import others’ cameos subject to permission settings. That system will be tested by edge cases: unauthorized political uses, eroticized deepfakes, or defamatory reconstructions. Lawsuits are likely to target platforms and creators alike; precedent around image‑based deepfakes is thin, and courts will soon parse how platform policies and technical access controls map to culpability.

OpenAI’s incremental fixes—granular cameo permissions, filters, and take‑down pathways—are necessary but insufficient. Technical mitigations such as cryptographic provenance stamps or watermarks, auditable consent receipts, and provenance‑first generation pipelines can reduce downstream risk. But those features impose latency and complexity, and they chip away at the seamlessness that makes Sora addictive.
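
A minimal sketch of what a signed provenance stamp plus consent receipt could look like, assuming a platform‑held HMAC key; real provenance systems (for example C2PA‑style manifests) are considerably richer:

```python
import hashlib, hmac, json, time

SIGNING_KEY = b"platform-held-secret"  # hypothetical key

def provenance_stamp(video_bytes, cameo_owner, consent_scope):
    """Build a signed record binding a video hash to its consent terms."""
    record = {
        "video_sha256": hashlib.sha256(video_bytes).hexdigest(),
        "cameo_owner": cameo_owner,
        "consent_scope": consent_scope,   # e.g. "non-political only"
        "issued_at": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_stamp(record):
    """Recompute the signature over the record body and compare."""
    body = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record.get("signature", ""), expected)
```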

Bias, personalization, and the problem of scale
