Script to Video AI in 2026: The Model, the Price, the Deprecation

OpenAI is shutting down Sora 2 on September 24, 2026, with no recommended replacement listed on the deprecation page (Wikipedia on Sora text-to-video). Seedance 2.0 Fast generates a five-second 720p clip for about fifty cents through third-party providers. Sora 2 Pro at 1024p costs about five times that for the same five seconds, roughly $2.50 (AI Free API on Sora 2 pricing). Veo 3.1 sits in the middle band on cost and continues past the Sora deprecation. Choosing a script-to-video model in mid-2026 isn’t only a quality decision; it’s a sunset-timeline decision and a per-second-cost decision, and the typical “best AI video generator” listicle prices in neither.

This is the practical guide. It covers six use cases (hook clips, B-roll insertion, narrative shorts, animated transitions, asset chaining, multilingual scale-out), the model that fits each one, verified per-second pricing as of May 21, 2026, and the honest scope of where script-to-video belongs in a real production pipeline. The audience here is the producer or operator running actual budget against actual clips, not the early adopter watching model demos on Twitter.

Five models, one deprecation deadline

The 2026 model lineup in script-to-video: OpenAI’s Sora 2 and Sora 2 Pro (deprecating September 24), Google’s Veo 3.1, ByteDance’s Seedance 2.0 and Seedance 2.0 Fast, Kuaishou’s latest Kling release, and Alibaba’s Wan 2.7. Multiple aggregator platforms (Higgsfield, Filmora, Veo3 AI, Truescho) bundle access across these models into unified workspaces, so the underlying-model choice is increasingly the variable that matters, not the platform.

The deprecation reshapes everything downstream. Teams building pipelines on Sora 2 in May 2026 have roughly four months before the API stops serving requests. Production work that needs to ship past September has to migrate to an alternative model now, not in August. OpenAI hasn’t named a replacement, which means migration is to a competitor’s model rather than to a next-generation Sora. This is the load-bearing fact most listicles bury.

The right framing for May 2026 tool selection: Sora 2 is appropriate for one-off marquee shots where the cost and quality are justified and the project ships before the deprecation date. Sora 2 is inappropriate for any pipeline that needs to keep running past Q3, because the migration cost will exceed any quality benefit. Veo 3.1 and Seedance 2.0 are the practical primary choices for ongoing pipelines.
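As a rough sketch of that ship-window rule, the check below compares a project's ship date against the September 24, 2026 shutdown. The function names, the `buffer_days` safety margin, and its 30-day default are illustrative assumptions, not part of any OpenAI tooling.

```python
from datetime import date

# Shutdown date cited in this article for Sora 2 and Sora 2 Pro.
SORA_SHUTDOWN = date(2026, 9, 24)


def days_until_shutdown(today: date) -> int:
    """Days remaining before the API stops serving requests."""
    return (SORA_SHUTDOWN - today).days


def sora_is_viable(ship_date: date, buffer_days: int = 30) -> bool:
    """Return True if the project ships at least `buffer_days` before shutdown.

    `buffer_days` is an assumed margin for revisions and regenerations,
    not a figure from the article.
    """
    return (SORA_SHUTDOWN - ship_date).days >= buffer_days


if __name__ == "__main__":
    print(days_until_shutdown(date(2026, 5, 21)))  # 126, roughly four months
    print(sora_is_viable(date(2026, 7, 1)))        # True: ships well before shutdown
    print(sora_is_viable(date(2026, 10, 15)))      # False: ships after shutdown
```

A stricter team might raise `buffer_days`; the point is only that the ship date, not the model's quality, decides whether Sora 2 is on the table.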

The per-second pricing reality

Per-second pricing varies by as much as a factor of five across the rates quoted here. Sora 2 Pro runs $0.30 per second at 720p and $0.50 per second at 1024p. The base Sora 2 model runs $0.10 per second at 720p. Seedance 2.0 through BytePlus’s official rate comes in at roughly $0.15 per second; Seedance 2.0 Fast at roughly $0.12 per second; third-party providers offer Seedance Fast as low as $0.10 per second or about $0.50 for a five-second 720p clip (Atlas Cloud on Seedance 2.0 pricing).

The math at typical project lengths makes the spread concrete. A 30-second hero shot generated by Sora 2 Pro at 1024p costs $15. The same shot from Seedance costs $3 at the third-party Fast rate and $4.50 at the official Seedance 2.0 rate. A 90-second AI video sequence (rare for finished narrative content but common for B-roll stitched into a longer cut) costs $45 on Sora 2 Pro and around $9 on Seedance Fast through a third-party provider. The roughly three-to-five-times spread compounds quickly for any team running more than a handful of generations per week.
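The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the rate table copies the per-second figures quoted in this section (as of May 21, 2026), stored as integer cents to avoid floating-point rounding, and the dictionary keys and function name are illustrative labels rather than official product identifiers.

```python
# Per-second rates in US cents, copied from the figures quoted above.
# Keys are illustrative labels, not official API model names.
RATE_CENTS_PER_SECOND = {
    "sora-2-pro-1024p": 50,
    "sora-2-pro-720p": 30,
    "sora-2-720p": 10,
    "seedance-2.0-official": 15,
    "seedance-2.0-fast-official": 12,
    "seedance-2.0-fast-third-party": 10,
}


def clip_cost_usd(model: str, seconds: int) -> float:
    """Cost in dollars of one generated clip of the given length."""
    return RATE_CENTS_PER_SECOND[model] * seconds / 100


if __name__ == "__main__":
    # Print the cost of a 30-second and a 90-second clip for every rate.
    for seconds in (30, 90):
        for model in RATE_CENTS_PER_SECOND:
            cost = clip_cost_usd(model, seconds)
            print(f"{seconds:>3}s  {model:<32} ${cost:.2f}")
```

Veo 3.1, Kling, and Wan 2.7 are left out because this article quotes no per-second figure for them; add a provider's actual rate before comparing.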

Veo 3.1’s pricing sits in the middle band on most aggregator platforms; specifics vary by provider. Kling’s latest release and Wan 2.7 typically land in the lower-cost tier. The point of the pricing comparison isn’t a single answer; it’s that ignoring per-second economics is now an expensive choice, whereas in 2024 costs were uniformly experimental.

Six use cases, six matches

Hook clips (the first three to five seconds of a video) are where cinematic quality justifies cost. Sora 2 Pro and Veo 3.1 are the appropriate choices here, with Veo getting the long-term recommendation given Sora’s September deprecation. A 5-second hook at Sora 2 Pro 1024p costs $2.50; the production value at that length tier is high enough that the budget makes sense for marquee placements.

B-roll insertion in longer cuts is where Seedance 2.0 Fast earns its slot. A 10-minute video typically wants 15 to 20 short B-roll inserts of 3 to 5 seconds each. At $0.50 per 5-second clip, 20 inserts cost $10. At Sora 2 Pro rates, the same 20 inserts cost $30 to $50. The cost differential at volume swings the math toward Seedance for nearly all B-roll work. The technical hurdle to watch is continuity-of-style across multiple Seedance calls; varying the prompt phrasing too much produces stylistically inconsistent inserts that read as patched together rather than cohesive.
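One way to limit that style drift is to hold the style description fixed and vary only the subject of each insert. The sketch below illustrates the idea under that assumption; the helper names, the example style string, and the subjects are hypothetical, and no particular provider's prompt format is implied.

```python
def build_broll_prompts(style: str, subjects: list[str]) -> list[str]:
    """Append one fixed style description to every subject.

    Only the subject changes between calls, so each generation request
    carries identical style wording.
    """
    return [f"{subject.strip()}. {style.strip()}" for subject in subjects]


def batch_cost_usd(clips: int, seconds_per_clip: int, cents_per_second: int) -> float:
    """Total cost in dollars for a batch of equal-length clips."""
    return clips * seconds_per_clip * cents_per_second / 100


if __name__ == "__main__":
    style = "Handheld 35mm look, warm tungsten light, shallow depth of field"
    subjects = ["Barista steaming milk", "Rain on a cafe window"]
    for prompt in build_broll_prompts(style, subjects):
        print(prompt)

    # 20 inserts of 5 seconds each, using rates quoted in this article.
    print(batch_cost_usd(20, 5, 10))  # Seedance 2.0 Fast, third-party: 10.0
    print(batch_cost_usd(20, 5, 30))  # Sora 2 Pro at 720p: 30.0
    print(batch_cost_usd(20, 5, 50))  # Sora 2 Pro at 1024p: 50.0
```

A fixed style string will not guarantee matching output from any model; it only removes prompt wording as a source of variation.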

Narrative shorts in the 15-to-60-second range are AI-filmmaking territory, and multi-shot continuity matters. Veo 3.1’s sequencing capabilities lead the category in May 2026; Seedance 2.0 is the cost-conscious alternative. Both models still struggle with character consistency across cuts; a recurring character introduced in shot one often drifts in shot four. Narrative projects that need consistent characters across many shots typically still combine generated footage with live-action references rather than going fully synthetic.

Animated scene transitions (wipes, morphs, stylized interstitials) sit in a different sub-category. Kling’s latest release and Wan 2.7 excel here because the transitional motion doesn’t require strict realism. Lower cost, faster generation, and the visual style fits the transition role.

Asset chaining (reference image into animated clip) is the workflow that ChatCut wires up natively. Step one uses GPT Image 2 to generate a reference frame from a prompt; step two passes that reference into Seedance 2.0 to animate the frame into a 5-second clip; step three drops the result onto the timeline. The chained workflow produces the most coherent script-to-video output in 2026 for stylized hook clips and custom B-roll because the reference image controls composition before the video model interprets it.
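The three steps can be sketched as a small pipeline. This is an illustration only: ChatCut's internals and the GPT Image 2 and Seedance 2.0 APIs are not documented in this article, so the image and video generators are passed in as plain callables, and every name below (`chain_asset`, `Clip`, the stand-in functions) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Clip:
    source_image: bytes  # reference frame the clip was animated from
    video: bytes         # encoded video payload
    seconds: int         # clip duration


def chain_asset(
    prompt: str,
    generate_image: Callable[[str], bytes],
    animate_image: Callable[[bytes, str, int], bytes],
    timeline: list[Clip],
    seconds: int = 5,
) -> Clip:
    """Reference image first, animation second, then place on the timeline."""
    frame = generate_image(prompt)                 # step 1: fix composition
    video = animate_image(frame, prompt, seconds)  # step 2: animate that frame
    clip = Clip(source_image=frame, video=video, seconds=seconds)
    timeline.append(clip)                          # step 3: drop onto timeline
    return clip


# Stand-in generators so the sketch runs without any network access.
def fake_image(prompt: str) -> bytes:
    return f"IMG:{prompt}".encode()


def fake_video(frame: bytes, prompt: str, seconds: int) -> bytes:
    return b"VID:" + frame + f":{seconds}s".encode()


if __name__ == "__main__":
    timeline: list[Clip] = []
    clip = chain_asset("Neon city at dusk", fake_image, fake_video, timeline)
    print(clip.seconds, len(timeline))
```

Passing the generators in as arguments keeps the chaining logic separate from any one vendor, which matters when a model in the chain is retired.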

Multilingual scale-out belongs on aggregator platforms that bundle video generation with dubbing layers. Filmora and Truescho integrate multiple video models with downstream voice and caption translation. The cost adds dubbing per language on top of the per-second video generation. The fit case is finished narrative content destined for 12-plus markets; the fail case is generating the multilingual version before identifying the winning creative in a single market.

Where script-to-video belongs in a production pipeline

The September 2026 Sora deprecation, the per-second pricing spread, and the model leaderboard rotation all point at the same question: what is script-to-video actually for?

The honest answer in May 2026 is that script-to-video is excellent for hook clips, custom B-roll, animated transitions, and illustrative shots that can’t reasonably be captured with a camera. It is not a substitute for shot-by-shot filmmaking. Multi-shot character consistency, dialogue with lip-sync, and narrative coherence across a 90-second story arc all still break down at the edges of what the 2026 models reliably produce. Treating script-to-video as a way to generate the whole video from a prompt tends to produce content audiences read as uncanny, and the platform-level consequences are documented. YouTube’s January 2026 enforcement wave wiped 4.7 billion views from sixteen channels that built their content economy on full-AI narratives (ScaleLab on YouTube’s 2026 AI-content crackdown). The pattern was specifically narrative-generation-at-scale; the platform reads it and audiences read it.

Mixed workflows are where script-to-video earns its keep. The mix looks like this: real footage shot with a real camera, AI-generated hook clips in the first three to five seconds, AI-generated B-roll inserted to illustrate a specific point, and AI-generated animated transitions between sections. The viewer sees real humans, the production economics get the AI cost savings on the supporting assets, and the platform algorithms don’t flag the content as generated-at-scale.

This framing also affects which model the project budgets for. A pipeline running ten AI hook clips a month has a different cost profile from a pipeline running ten 100-second narrative shorts. The first sits comfortably on Sora 2 Pro or Veo 3.1; the second has to lean on Seedance to be financially viable.
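A rough monthly comparison makes the gap concrete. The sketch below uses per-second rates quoted earlier in this article; the workload sizes follow the example above, with a 5-second hook length taken as an assumption, and the function and constant names are illustrative.

```python
def monthly_cost_usd(clips: int, seconds_per_clip: int, cents_per_second: int) -> float:
    """Monthly generation spend in dollars, counting accepted clips only."""
    return clips * seconds_per_clip * cents_per_second / 100


SORA_2_PRO_1024P = 50  # cents per second, as quoted in this article
SEEDANCE_FAST_3P = 10  # cents per second, third-party rate quoted in this article

if __name__ == "__main__":
    hooks_on_sora = monthly_cost_usd(10, 5, SORA_2_PRO_1024P)
    shorts_on_sora = monthly_cost_usd(10, 100, SORA_2_PRO_1024P)
    shorts_on_seedance = monthly_cost_usd(10, 100, SEEDANCE_FAST_3P)
    print(hooks_on_sora)       # 25.0
    print(shorts_on_sora)      # 500.0
    print(shorts_on_seedance)  # 100.0
```

Real spend is likely higher wherever a shot needs several takes, since the function counts only the clips that make the cut.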

The ChatCut workflow inside this market

The chained workflow described above (reference frame to animated clip to timeline) is what ChatCut wires up natively. From inside the editor, the AI image generator produces a reference frame from a prompt; the AI video generator passes that reference into Seedance 2.0 for a five-second animation; the result lands on the timeline alongside the rest of the project’s real footage. The user doesn’t switch tabs, doesn’t export-and-reimport, doesn’t manage a separate generative-AI workflow. Other editors make you hunt for buttons. ChatCut lets you type a sentence.

The lane is asset-tier generation: hook clips, custom B-roll, stylized intro animations within a real editing project. The principle is that the chained workflow (reference image first, animation second) produces more coherent output than text-to-video alone, and ChatCut integrates the two steps inside the timeline rather than across separate tools.

The boundary stays explicit. The editor isn’t trying to compete head-to-head with the standalone cinematic-narrative video models on full-shot generation. Those models’ 2026 strength is multi-shot sequencing and longer-form generation, which is a different production category. ChatCut’s job is integrating generated assets into a working timeline alongside real footage; the cinematic-from-prompt category belongs to the standalone video models. Output ships at 1080p on both Free and Pro plans; Ultra HD finishing isn’t on the feature list. Free includes 20 one-time credits to test the chained workflow; Pro starts at $25 a month. For the broader AI filmmaking and short films use case, the text-based AI video editing guide covers the deep dive on the transcript layer, and the text-based editing feature page documents the editing model the script-to-video assets land inside.

Five questions worth a direct answer

Sora 2 or Seedance for hero hooks in 2026? Veo 3.1 for new pipelines. Sora’s September deprecation makes it a poor bet for any project that ships past Q3; Seedance is the right cost-conscious primary; Veo is the middle path that lives past Sora’s sunset.

What does a 30-second AI hero shot cost? Sora 2 Pro 1024p: $15. Seedance: $3 (Fast at 720p through third-party providers) to $4.50 (Seedance 2.0 at BytePlus’s official rate). The roughly three-to-five-times spread is the most actionable number in the category.

Can AI replace my whole video production? No, and the platforms are now actively penalizing teams that try. YouTube’s January 2026 enforcement wiped 4.7 billion views from accounts running full-AI narrative content. Mix AI-generated assets into real-footage workflows; don’t replace the camera with a prompt.

What’s the September 24, 2026 Sora deprecation? OpenAI is shutting down Sora 2 and Sora 2 Pro models plus the Videos API on that date, with no replacement listed. Teams need to migrate to Veo 3.1, Seedance, Kling, or Wan before then.

Worst script-to-video mistake in 2026? Generating full narrative videos instead of supporting assets. The viewer trust collapse is documented, the platform penalties are real, and the workflow that actually performs in 2026 is real footage with generated assets layered in.

Wiring GPT Image 2 and Seedance 2.0 into one editing project? Try ChatCut Free. 20 one-time credits, chained generation into the timeline, 1080p output, Chrome-only.