How to Write a Script for a Video in 2026: A Working Guide

Writing a video script in 2026 is mostly a three-decision problem. You pick a structure. You match length to your format. You pick a tool. Most first drafts feel broken because people pick the tool first and the structure last, and never check whether 600 words fit a 60-second Short. (They don’t. At 150 words per minute, 600 words is 4 minutes of talking.)

This guide is built so you can ship a usable draft tonight. Five structures with one canonical use case each. A format-by-format word-count table. The 150 words-per-minute rule with a stopwatch trick. And an honest read on whether you actually need Final Draft (you almost certainly don’t).

The fastest way to write a video script

Three decisions, in this order, before you write a word.

Decision 1: structure. Pick one of five from the next section. Don’t blend them. A how-to is not a 3-Act story. A 30-second ad is not a brand film.

Decision 2: length. Look up your format in the table further down. A 60-second YouTube Short at 150 words per minute is roughly 150 words. That is your hard ceiling.

Decision 3: tool. For talking-head and tutorial work, ChatGPT or Claude plus a Google Doc beats every paid screenwriting app. Final Draft, Fade In, and Celtx are for film and TV. If you’re not shooting a narrative short, skip them.

Most “I’m stuck on my video script” problems are actually “I haven’t picked a structure yet” problems. Scripting for video content stops being scary the moment you have a frame.

The five video script structures that actually work

You don’t need ten frameworks. You need one of these.

Hook → Promise → Content → Recap → CTA: the YouTube long-form default

The default shape behind most of what gets watched on YouTube in 2026. Hook in 5 to 30 seconds. State the payoff before the intro so people don’t bounce. Deliver in five to seven setup-tension-payoff loops in ascending value order. Recap three to five takeaways. End with a specific next-action call. Most YouTube education and how-to channels run on this skeleton with minor variations.

Problem → Agitate → Solve (PAS): the explainer workhorse

The default for explainers, organic ads, and short marketing video. Name the problem. Twist the knife so the viewer feels it. Show the solution. Dan Kennedy called PAS the most reliable copywriting formula ever invented, which sounds like marketing puffery until you try writing a 60-second SaaS explainer without it.

AIDA (Attention, Interest, Desire, Action): the paid-ad default

The default for paid-ad creative. E. St. Elmo Lewis theorized AIDA in 1898 and direct-response shops still use it. PAS pairs with AIDA inside a single ad: Problem and Agitation do Attention and Interest, Solution does Desire and Action.

3-Act (Setup, Confrontation, Resolution): brand films and narrative

Brand films, narrative ads, documentary pieces. Robert McKee’s framing in Story is that a feature needs at minimum three extreme changes that turn the protagonist’s world upside down. Use this only if you are actually telling a story. Don’t bolt 3-Act onto a how-to.

StoryBrand SB7 (Donald Miller): brand-story video

Seven beats for brand video: a Character (your customer) with a Problem meets a Guide (you), who gives them a Plan and Calls them to Action, leading to Success while helping them Avoid Failure. Best for homepage hero spots and brand-story videos.

Beginner variants worth knowing: H.E.A.R. and HIVES

Two creator-named variants worth knowing in passing. Roberto Blake (Awesome Creator Academy, 600,000+ subscribers) teaches H.E.A.R. (Hook, Engage, Action, Retention) as a beginner-friendly scaffold. Ali Abdaal teaches a framework called HIVES inside Part-Time YouTuber Academy. Both are simplified versions of Hook → Promise → Content → Recap. If you’ve never written a script, H.E.A.R. is fine training wheels. Once you’re comfortable, drop back to the full five-beat version.

Match the structure to your video format

This is the bookmark-worthy table. Numbers verified May 2026, calculated at roughly 150 words per minute.

| Format | Target length | Word count (~150 wpm) | Hook window | Structure |
|---|---|---|---|---|
| TikTok / Reels | 15–30s | 35–75 words | first 0.5–2s | PAS or single-loop hook |
| YouTube Shorts | 30–60s | 75–150 words | first 2–3s | Hook to Reveal to Loop, or PAS |
| Paid social ad | 15–30s | 35–75 words | first 1s | AIDA |
| SaaS / B2B explainer | 60–90s | 125–225 words | first 10s | PAS or StoryBrand |
| Tutorial | 3–10 min | ~450–1,500 words | first 30s | Hook → Promise → Content → Recap |
| YouTube long-form | 8–20 min | ~1,200–3,000 words | first 5–30s | Hook → Promise → Content → Recap (5–7 loops) |
| Video Sales Letter | 8–15 min | ~1,200–2,250 words | first 5–8s | PAS expanded → Proof → CTA |
| Brand film | 60–120s | 150–300 words | first 5s | StoryBrand SB7 or 3-Act |

Two ways to use this table. Working forward: I have a 90-second explainer, so I write 125 to 225 words, and I use PAS or StoryBrand. Working backward: I wrote 600 words and I want it to be a Short, but a Short is at most 150 words, so I cut three-quarters of the draft. The arithmetic answers the question. Almost every “my script is too long” problem is solved here, before you hit record.
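If you’d rather let a few lines of Python do that arithmetic, here is a minimal sketch of both directions. It assumes the flat 150 wpm rate and the upper length limits from the table; the names MAX_SECONDS, word_budget, and cut_fraction are made up for this example and don’t belong to any tool mentioned in this guide.

```python
# Hypothetical helper: word ceilings per format at a flat 150 wpm.
WPM = 150

# Upper length limit in seconds for each format, copied from the table.
MAX_SECONDS = {
    "TikTok / Reels": 30,
    "YouTube Shorts": 60,
    "Paid social ad": 30,
    "SaaS / B2B explainer": 90,
    "Tutorial": 600,
    "YouTube long-form": 1200,
    "Video Sales Letter": 900,
    "Brand film": 120,
}


def word_budget(format_name: str, wpm: int = WPM) -> int:
    """Working forward: the most words that fit the format's longest length."""
    return round(MAX_SECONDS[format_name] * wpm / 60)


def cut_fraction(draft_words: int, format_name: str, wpm: int = WPM) -> float:
    """Working backward: share of the draft to cut (0.0 if it already fits)."""
    budget = word_budget(format_name, wpm)
    if draft_words <= budget:
        return 0.0
    return 1 - budget / draft_words


if __name__ == "__main__":
    print(word_budget("SaaS / B2B explainer"))   # ceiling for a 90-second explainer
    print(cut_fraction(600, "YouTube Shorts"))   # share of a 600-word draft to cut
```

The sketch only models the ceiling of each range. The floor is a judgment call about how short the format can go, and a stopwatch read still beats any estimate.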

The 150 words-per-minute rule

Memorize one number: 150. Average spoken pace runs 130 to 160 words per minute, so use 150 wpm as the default for video.

The variations matter less than you’d think:

  • Presentations and explainers: roughly 120–130 wpm (slower for clarity).
  • YouTube long-form: 150–160 wpm.
  • Shorts, TikTok, ads: closer to 180 wpm, because creators deliver short-form about 20% faster than conversational pace.

Two equations every scriptwriter should keep in their head: 1,200 words divided by 150 is 8 minutes. 75 words divided by 150 is 30 seconds. Read every script aloud with a stopwatch before the shoot. This catches half of all length problems before they become re-record problems.
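For a rough estimate before you reach for the stopwatch, the same division can be sketched in Python. The pace values below are assumptions taken from the ranges in this section (125 is the midpoint of 120–130, 150 is the default, 180 is the short-form figure), and estimate_runtime and mmss are hypothetical names for this example.

```python
# Hypothetical runtime estimator based on words-per-minute pace.
PACE_WPM = {
    "presentation": 125,  # assumed midpoint of the 120-130 range
    "long_form": 150,     # the default pace used throughout this guide
    "short_form": 180,    # Shorts, TikTok, ads
}


def estimate_runtime(script: str, style: str = "long_form") -> float:
    """Estimated speaking time in seconds for the given script text."""
    word_count = len(script.split())
    return word_count / PACE_WPM[style] * 60


def mmss(seconds: float) -> str:
    """Format seconds as m:ss, rounded to the nearest second."""
    total = round(seconds)
    return f"{total // 60}:{total % 60:02d}"


if __name__ == "__main__":
    draft = " ".join(["word"] * 1200)       # stand-in for a 1,200-word script
    print(mmss(estimate_runtime(draft)))    # 8:00
```

Counting whitespace-separated words is crude (numbers, URLs, and stage directions all count as one word each), so treat the result as a first check and the read-aloud as the real one.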

How to write the hook (and why you write it last)

You can’t write a compelling hook until you know what the video delivers. Draft the body first. Then go back to the top.

A hook does one job in 2 to 5 seconds: it tells the viewer why staying is worth it. Animoto’s research on the first three seconds breaks down the math: 50–60% of drop-offs happen in the first 3 seconds, and 65% of viewers who survive that first 3 seconds make it to 10. The hook is most of the retention battle. Three patterns that work in 2026:

Pattern 1: promise the result

“By the end of this video you’ll have a 60-second explainer ready to record.” This works when the payoff is concrete and the viewer can imagine themselves with it before they decide to keep watching.

Pattern 2: open with the highest-value moment

Show the punchline of the build before you build to it. The cold-open structure from prestige television, scaled down. Works especially well for tutorials and how-to videos, where the finished result is the strongest possible argument for staying.

Pattern 3: ask the question the viewer is already asking

“Why does every YouTube script feel like a TED Talk?” This works because you’re naming the problem they came in with. It only fails when the question is too generic; specificity is what makes it land.

How to write the body in setup-tension-payoff loops

Long-form video doesn’t reward chronological information. It rewards loops. A loop is three beats: setup (here’s what we’re about to look at), tension (here’s what’s interesting or counterintuitive), payoff (here’s the answer or the next twist). Stack five to seven loops in ascending value, save the best loop for the back half, and keep each loop tight enough that nothing drifts past 60 seconds without a beat change.
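One way to sanity-check a loop outline before recording is a small script like the sketch below. It assumes 150 wpm, reads the “60 seconds without a beat change” guideline as a per-beat limit, and uses hypothetical names (Loop, beat_seconds, check_loops) that exist only in this example.

```python
from dataclasses import dataclass

WPM = 150  # default speaking pace assumed in this guide


@dataclass
class Loop:
    """One setup-tension-payoff loop, each beat stored as script text."""
    setup: str
    tension: str
    payoff: str


def beat_seconds(text: str, wpm: int = WPM) -> float:
    """Estimated speaking time of one beat, in seconds."""
    return len(text.split()) / wpm * 60


def check_loops(loops: list[Loop], max_beat_seconds: float = 60) -> list[str]:
    """Return warnings for a loop count outside 5-7 or any overlong beat."""
    warnings = []
    if not 5 <= len(loops) <= 7:
        warnings.append(f"expected 5-7 loops, got {len(loops)}")
    for i, loop in enumerate(loops, start=1):
        for name in ("setup", "tension", "payoff"):
            secs = beat_seconds(getattr(loop, name))
            if secs > max_beat_seconds:
                warnings.append(
                    f"loop {i} {name} runs ~{secs:.0f}s without a beat change"
                )
    return warnings
```

An empty list means the outline passes both checks. The script can’t tell you whether the loops ascend in value; that part is still editorial judgment.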

The MrBeast production memo (a leaked 36-page internal onboarding document, widely circulated since September 2024) frames this as “crazy progression”: escalate the stakes faster than the viewer expects. A 10-minute video about a guy surviving in the woods covers multiple days in the first 3 minutes, not just day one, because the viewer’s investment compounds. The principle generalizes well past stunt content. Even an explainer benefits from skipping past the obvious first move.

How to write the recap and CTA

The recap is three to five concrete takeaways, in the same order you delivered them. Not a vague “thanks for watching.” Specific phrases the viewer can repeat tomorrow.

The CTA is one specific next action. “Click the link in the description for the script template.” “Subscribe if you’re writing your second video this month.” Set the CTA up mid-video so the close lands instead of feeling tacked on. Buried CTAs are the most common reason a good video converts at 1% instead of 4%.

Talking-head specifics: the 20-second B-roll rule

Talking-head video has one structural trap. The MrBeast memo (and any creator who has watched their own retention curves) flags it: a single talking-head shot longer than 20 seconds without B-roll, cutaways, or visual change tanks retention. Mark every 15–20 seconds in the script with a B-roll cue, an on-screen text card, or a cut to a graphic. Two-column AV scripts make this easy: left column for visuals, right column for what you’re saying.
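A quick way to mark those cues in a draft is to count words: at 150 wpm, 20 seconds is about 50 words. The sketch below is a minimal illustration under that assumption; add_broll_cues and the “[B-ROLL]” marker are invented for this example.

```python
def add_broll_cues(script: str, max_seconds: int = 20, wpm: int = 150,
                   cue: str = "[B-ROLL]") -> str:
    """Insert a cue marker after every `max_seconds` worth of spoken words.

    Note: this collapses line breaks and ignores sentence boundaries,
    so treat the markers as reminders and move each one to the nearest
    natural pause.
    """
    words_per_cue = max(1, int(max_seconds * wpm / 60))
    words = script.split()
    out = []
    for i, word in enumerate(words, start=1):
        out.append(word)
        # Skip the marker after the final word; the video ends there anyway.
        if i % words_per_cue == 0 and i < len(words):
            out.append(cue)
    return " ".join(out)
```

Pass max_seconds=15 to work from the tighter end of the 15–20 second range.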

Once your talking head is shot, the hard work is matching what you said to what you wrote across every “um,” every retake, every line you re-recorded after lunch. Most editors hand-scrub the timeline for that. There’s a faster way (we’ll get to it in the ChatCut section below), but the script is what makes the cleanup possible at all.

Short-form specifics (TikTok, Reels, Shorts in 30 seconds)

Short-form rewrites the rules. Pat Flynn (Smart Passive Income, 2.5B+ Shorts views across his network) talks about recording short-form by playing a rough cut and speaking over it; sometimes one take is enough because the visuals carry pacing the script doesn’t. Less rigid scripting works because Shorts don’t reward setup. They reward the punchline first.

Three rules that hold up:

  • Hook in the first 1 second, not the first 3.
  • One idea, one loop. No B-plot.
  • Write the last line first. If the close lands, the rest of the script writes itself backward.

For paid social, the same rules apply, but with AIDA instead of a single loop, and a hard CTA in the last 2 seconds.

The eight beginner mistakes that kill retention

  1. No hook in the first 2–3 seconds. Half the audience leaves before the intro card.
  2. Writing the hook first. You can’t sell a payoff you haven’t written.
  3. Essay voice instead of spoken voice. Use contractions. Keep sentences under 15 words. Read aloud.
  4. Buried CTA. Set it up mid-video and deliver it at the end.
  5. No B-roll plan. No talking-head shot should run past 20 seconds without a visual change.
  6. Overwriting for the platform. Calculate the word count first, write second.
  7. Stripping emotion. Mark stress and emphasis in the script so the read isn’t flat.
  8. Telling instead of showing. Video is visual. Pair every line with what’s on screen.
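Mistake 3 is the easiest one to check mechanically. The sketch below flags sentences longer than 15 words; it splits on ., !, and ? only, so abbreviations will trip it up, and long_sentences is a hypothetical name for this example.

```python
import re


def long_sentences(script: str, max_words: int = 15) -> list[tuple[int, str]]:
    """Return (word_count, sentence) pairs for sentences over `max_words`."""
    # Naive split: break after ., !, or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged
```

It won’t catch essay voice on its own, but a script with no flagged sentences is usually much easier to read aloud.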

How scripts integrate into your editing workflow

The biggest unlock in 2026 isn’t the script itself. It’s that the script and the edit are now the same conversation. Treating them as separate jobs (write the script, set it aside, open a separate NLE, scrub the timeline against your notes) is the slowest workflow you can pick. The strongest workflows in 2026 collapse those two jobs into one tool.

Transcript-aware editors: edit by clicking on text

For talking-head, podcast, and tutorial video, “edit by the script” tools have matured into a real category. Descript pioneered the pattern: upload your recording, get a transcript, cut the video by deleting words from the transcript. Riverside’s editor does the same for podcast and remote-interview recordings. ScreenFlow’s transcript pane offers it for tutorial footage. The script and the cut converge.

These tools are great once you’ve already recorded. They become awkward when you’re still drafting, because the written script lives somewhere else (a Google Doc, a Notion page) and the editor only sees the recorded transcript. The drafting tool and the editing tool stay separate, and you become the manual reconciler between them.

ChatCut: where script and edit are one conversation

ChatCut is the next move in that direction. It runs in any Chrome tab on Mac, Windows, Chromebook, or Linux, and the written script never leaves the editor.

Three things change when the script lives inside the AI editor:

  • Draft and revise the script with the Agent. Paste an outline or a partial draft. Ask for hook variants, tighten the second act, or swap a 1,500-word essay structure for a five-to-seven-loop YouTube long-form. The script becomes a conversation with the AI, not a static file you import later. The structure decisions earlier in this article (Hook → Promise → Content → Recap → CTA, the 150 words-per-minute rule, the five formula choices) all map directly to prompts you give the Agent.
  • Match recorded takes to the script automatically. Once you record, upload the takes. ChatCut transcribes, aligns to the written script, and surfaces the strongest version of each line across every take. The “every ‘um,’ every retake, every line you re-recorded after lunch” cleanup work that takes an hour of scrubbing in a traditional NLE happens in a single pass.
  • Edit by rewriting the script. Change a sentence in the script, the cut updates. Add a new line, the Agent finds the take that matches (or flags that you need to re-record). Cut a paragraph, the corresponding video disappears.

Three concrete scenarios where this fusion pays back fast:

  • The retake reconcile. You recorded the opening hook five times across two sessions. The Agent shows you all five takes inline with the script, you pick the best one, the rest are auto-cut.
  • The filler cleanup. Your 12-minute final read has 90 “ums” and 30 long pauses. The Agent removes them in a single prompt; you sanity-check the result against the transcript.
  • The Shorts repurpose. Your 12-minute YouTube long-form has three Shorts inside it. Ask the Agent to surface the strongest 60 seconds, format as 9:16, and drop in AI captions with the TikTok preset. Draft Shorts cut.

Don’t click through menus. Just tell ChatCut what you want. It’s a text-based editing workflow on top of a browser-based AI video editor.

When you still need standalone tools

For specific narrow cases the fusion model doesn’t cover:

  • Narrative film or TV requiring .fdx output. Final Draft 13 ($249.99, on sale at $199.99) or Fade In ($79.95). 95% of film and TV pros use Final Draft because the studio submission portals expect the .fdx file format. WriterDuet (free plan or $9.99 a month) is the writers’-room pick for real-time collaboration on screenplay format.
  • Long-form drafting before you start recording. ChatGPT Plus ($20 a month) or Claude Pro ($20 a month) for the outline-to-first-draft pass. Many creators do their first three drafts in an LLM tab, then bring the polished script into ChatCut once they’re ready to record. Both paths are valid.
  • End-to-end AI-generated talking-head. InVideo AI ($28 a month Plus, $50 a month Max) bundles script, voiceover, and stock footage. Useful if you’re publishing AI-generated video without any human-recorded takes. Less useful once you start shooting your own footage.

For 99% of YouTubers, marketers, course creators, and explainer-video work, the standalone scriptwriting tools above are optional. The script-edit fusion in a single tool is the workflow change. For the free-AI-script side specifically, our free AI movie script maker roundup covers the seven that hold up in 2026. For the editing side, our YouTube editing walkthrough and best video editing apps roundup go deeper. If you’re specifically repurposing long-form into shorts, our guide on how to turn long videos into shorts is the natural next read.

Frequently asked questions

How long should a video script be?

Multiply your target video length in minutes by 150 words. A 60-second Short is about 150 words. A 10-minute YouTube essay is about 1,500. Read aloud with a stopwatch to confirm; some speakers run faster, especially on Shorts.

Do I need to write a script for a YouTube video?

For long-form (8 minutes plus), yes. The retention math doesn’t work without one. For Shorts, a hook line and a closing line written down, with the middle improvised, is often enough. Pat Flynn records short-form by speaking to the rough cut, not from a full script.

What’s the best free video script template?

The two-column AV format. Left column for visuals and on-screen text, right column for audio and dialogue. Boords, StudioBinder, and the Celtx blog all publish free versions. For screenplays, KIT Scenarist exports a Final Draft–compatible template at $0.
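If you want a plain-text version you can paste into any doc, a two-column layout is simple to generate. The sketch below is one possible rendering, not a standard: render_av_script, the 36-character column width, and the sample rows are all assumptions made for this example.

```python
import textwrap
from itertools import zip_longest


def render_av_script(rows, width=36):
    """Render (visual, audio) pairs as a plain-text two-column AV script."""
    header = f"{'VISUAL':<{width}} | AUDIO"
    rule = "-" * width + "-+-" + "-" * width
    lines = [header, rule]
    for visual, audio in rows:
        # Wrap each cell to the column width, then pair the wrapped lines.
        left = textwrap.wrap(visual, width) or [""]
        right = textwrap.wrap(audio, width) or [""]
        for l, r in zip_longest(left, right, fillvalue=""):
            lines.append(f"{l:<{width}} | {r}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [
        ("Close-up of laptop screen", "Here's the problem."),
        ("Text card: 150 words = 60 seconds", "Count your words before you record."),
    ]
    print(render_av_script(rows))
```

Each row is one beat: what the viewer sees on the left, what they hear on the right.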

Can ChatGPT write video scripts?

Yes, and it’s now the default tool for most talking-head and tutorial work. Paste your outline, your brand voice notes, your target word count, and the structure you’ve picked from the five above. Iterate the hook last. For a deeper comparison against Claude, Squibler, and InVideo AI, see our free AI movie script maker guide.

How do I write a script for a 60-second video?

Cap it at 150 words. Pick PAS or a single-loop hook. Open with the punchline. End with one specific CTA. Read aloud with a stopwatch before you record.

What’s the difference between a screenplay and a video script?

A screenplay is industry-formatted (sluglines, character names centered in caps, dialogue indented) and built for the .fdx workflow that runs film and TV. A video script for talking-head, explainer, or social is usually a two-column AV doc focused on what’s said and what’s seen, not on submission formatting. Use a screenplay tool only when you’re shooting narrative film or TV.

How fast can I draft a 10-minute YouTube script with AI?

About 45 minutes start to finish in 2026. Outline in ChatGPT or Claude (10 minutes), expand each beat to draft prose (20 minutes), revise the hook last and read aloud (15 minutes). The variable that breaks the estimate isn’t the AI; it’s how clearly you’ve defined the payoff before you start. Spend an extra 5 minutes on “what does the viewer get out of this video” and the draft writes itself.

Try ChatCut, where the script and the cut live in one place

The bigger unlock in 2026 isn’t the script. It’s that you stop treating drafting and editing as separate jobs. Discuss the script with the AI, record once you’ve nailed the structure, edit by rewriting the script. One tool, one conversation.

Try ChatCut Free on your next talking-head video. The Free Plan includes 20 one-time credits, and no credit card is required.

The bottom line for 2026: writing a video script isn’t a craft secret. Pick a structure, hit your word count, draft the body, write the hook last, and stop paying for software you don’t need. Most stuck scriptwriters aren’t stuck on words. They’re stuck on a missing frame. Pick the frame, and the words come.