The company asked me to test several AI tools. It was a routine evaluation, nothing special. But after spending a few weeks with Runway Gen-4.5, I genuinely got excited. This isn’t another incremental update. It is the first AI video model I have tested that keeps a character consistent from shot to shot, which is how movies are actually built.
Let me walk you through my testing process, what I found, and where this thing still falls short.

My First Step: Understanding What Gen-4.5 Actually Is
Runway Gen-4.5 launched in December 2025, and the headline feature is finally here: image-to-video. For two years, Runway users could only generate from text prompts. Now you can upload a still frame and animate it.
Why does this matter? Because professional creators overwhelmingly prefer image-to-video workflows:
- Stronger character consistency — no more flickering faces between shots
- Locked-in backgrounds — your environment stays stable
- Precise art direction — you control the starting frame exactly
- Easier scene chaining — building a cohesive film shot-by-shot actually works
Runway claims Gen-4.5 is the #1 ranked video model on the Artificial Analysis Text-to-Video benchmark with 1,247 Elo points. I wanted to verify that myself.

My Second Step: The Hardware Reality Check
Before getting into results, let me set expectations about what you need to run this.
Gen-4.5 was developed entirely on NVIDIA GPUs — Hopper and Blackwell series for inference. That’s enterprise-grade hardware. You’re not running this on your local machine. The good news: Runway handles everything on their servers. You just need a browser.
Cost breakdown (as of April 2026):
| Plan | Monthly Price | What You Get |
|---|---|---|
| Free | $0 | Limited generations, watermarked output |
| Pro | $15/month | 625 generations, up to 10-second videos |
| Unlimited | $35/month | Unlimited generations, priority access |
| Enterprise | Custom | Custom quotas, dedicated support |
Per-generation cost on Pro works out to roughly $0.024 per video ($15 divided by 625 generations), assuming you use the full monthly quota. That is a small fraction of the price of a cup of coffee.
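The per-generation figure is simple division, and it is worth being able to redo it if the pricing changes. A minimal sketch, assuming the Pro numbers from the table above and that every generation in the quota gets used (the function and constant names are mine, not Runway’s):

```python
# Pro plan figures taken from the pricing table above.
PRO_MONTHLY_PRICE_USD = 15.00
PRO_GENERATIONS_PER_MONTH = 625


def cost_per_generation(monthly_price: float, generations: int) -> float:
    """Return the average price of one generation on a flat monthly plan."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return monthly_price / generations


per_clip = cost_per_generation(PRO_MONTHLY_PRICE_USD, PRO_GENERATIONS_PER_MONTH)
print(f"${per_clip:.3f} per generation")  # $0.024 per generation
```

If you only use part of the quota, the effective cost per clip rises accordingly, since the monthly price is fixed.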
My Third Step: Testing Core Performance
I ran Gen-4.5 through multiple real-world scenarios. Here’s what I found across six different tests.
Test 1: Action and Water
Prompt: A man sprinting through a stream with handheld camera tracking.
Runway’s Result: The motion felt energetic initially, but cracks appeared quickly:
- Water wake behaved unnaturally — more like a speedboat than a person running
- The runner’s stride skipped oddly, losing the weight of the motion
- Trees in the background flickered noticeably
- A subtle diffusion pattern shimmered across fine details
Comparison: Using the same image and prompt in Kling produced more stable backgrounds, cleaner foliage, and better temporal consistency.
Winner: Kling
Test 2: VFX (Glowing Particles)
Prompt: Glowing particles and purple smoke swirling around a character.
Runway’s Result: This was a noticeable improvement over the first test. Smoke billowed more naturally, and the motion felt smoother. I’d rate this shot around 6/10 — usable with some caveats.
Comparison: Kling edged ahead with slightly better smoke dynamics and more cohesive lighting.
Winner: Kling (close to a tie)
Test 3: Fire
Prompt: A woman running toward a burning barn, then grabbing her head in panic.
Runway’s Result: The model followed the prompt, but execution was rough:
- Character skipped unnaturally across the frame
- Scale felt mismatched — the barn looked miniature
- Sparks looked mapped onto the building rather than coming from it
Comparison: Kling’s version delivered better flame behavior and a more cinematic focus pull. Still imperfect, but noticeably more film-ready.
Winner: Kling
Test 4: Dialogue and Facial Expression
Prompt: Two people talking while the camera slowly pushes in.
Runway’s Result: The faces looked mostly realistic, but there was a slight uncanny valley in the expressions. Crucially, the camera didn’t actually push in as requested, and there’s no built-in lip-syncing yet.
Comparison: Veo 3.1 impressed with a proper camera move and more believable facial motion, plus native lip-sync prompting support.
Winner: Google Veo 3.1
Test 5: 2D Animation
Prompt: A 2D character standing at a bus stop.
Runway’s Result: The character morphed mid-shot, and the face collapsed in the final frames. This was genuinely disappointing — animation styles seemed to break the model.
Comparison: Kling’s generation was still quirky (it added an unprompted bird), but motion stayed more consistent, and the character held together longer.
Winner: Kling (neither did great)
Test 6: 3D Animation
Prompt: An octopus interacting with an object.
Runway’s Result: The facial expression worked well, and the octopus correctly interacted with the object. However, textures crawled unnaturally, and the coral in the background flickered frame-to-frame.
Comparison: Kling delivered a sharper environment and better texture stability.
Winner: Kling
My Fourth Step: The Consistency Breakthrough (What Runway Actually Fixed)
Despite losing all six of those direct comparisons, Gen-4.5 has one standout feature that the head-to-head tests don’t capture: character consistency across multiple shots.
Here’s a quick comparison of what changed from Gen-4 to Gen-4.5:
| Feature | Gen-4 | Gen-4.5 |
|---|---|---|
| Character consistency across shots | Unstable — faces changed constantly | Stable — same face, same clothes across scenes |
| Camera control | Basic | Precise — zoom, dolly, whip pan all work |
| Temporal coherence | Frame-to-frame flicker | Much improved, though fine details still shimmer |
| Maximum length | 5 seconds | 10 seconds |
Why this matters: Before Gen-4.5, AI video had a fundamental problem. You’d generate a character in one shot, then try to generate them from a different angle or in a different scene, and they’d look like a completely different person. Different face, different clothes, different everything.
Gen-4.5 addresses this through image-to-video with a reference image. Upload one image of your character, and the model locks onto their face, clothing, and proportions. Then you can move them through different scenes and angles without losing identity.
What this looks like in practice: In one official demo, a 5-second clip transitions through three shots (close-up, mid-shot, wide shot) while a character sits on a flying octopus. Her face never breaks. Her clothes stay consistent. The physics hold up.
In another, a giant fuzzy gorilla walks through New York City streets. The perspective logic, the lighting consistency between subject and background, and the scale all work.
Even more impressive: Runway’s CEO generated a two-minute narrative video with multiple scene transitions inside a subway car. No jump cuts. No spatial confusion. No continuity errors.
My Fifth Step: The “Can You Tell It’s AI?” Test
Runway ran a blind test with 1,000 participants. They showed real footage and AI-generated clips at the same resolution and length. Participants had 10 seconds to decide which was which.
Only 57% of people correctly identified the AI-generated videos.
That means 43% of viewers were fooled. Think about that for a second. Almost half of regular people couldn’t tell the difference between a real camera and an AI model.
By that measure, Gen-4.5 comes close to passing an “average person” Turing test: 57% is only seven points above the 50% that pure guessing would score. Professional filmmakers can still spot the flaws, and so could I, but the gap is closing fast.
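To put the 57% figure in context, it helps to compare it with the 50% a coin flip would score. A rough sketch, assuming each of the 1,000 participants made one independent real-vs-AI judgment (the test design is not described in that detail, so this is my assumption), using a normal approximation:

```python
import math


def z_score_vs_chance(correct: int, trials: int, chance: float = 0.5) -> float:
    """Normal-approximation z-score of an observed hit rate against a chance rate."""
    observed = correct / trials
    standard_error = math.sqrt(chance * (1 - chance) / trials)
    return (observed - chance) / standard_error


# Assumption: 1,000 independent judgments, 570 of them correct (57%).
z = z_score_vs_chance(correct=570, trials=1000)
print(f"z = {z:.2f}")
```

Under those assumptions, viewers did measurably better than guessing, but only by seven percentage points. That is the sense in which the gap between real and generated footage is narrow rather than closed.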
My Sixth Step: Where Runway Still Falls Short
I need to be honest about the limitations. The model has three known weaknesses that Runway itself acknowledges:
| Weakness | What It Means | Real-World Example |
|---|---|---|
| Causal reasoning | Effects happen before causes | A door opens before the handle is pressed |
| Object permanence | Objects disappear/reappear randomly | A cup vanishes after being temporarily blocked from view |
| Success bias | Actions always succeed | A badly aimed kick still scores a goal |
Beyond these technical limitations, here’s what I experienced:
Prompt adherence is inconsistent. When I asked for “camera slowly pushes in” during the dialogue test, Gen-4.5 simply ignored that instruction.
720p output requires upscaling. Default clips render in 720p. There’s a built-in 4K upscale button, but upscaling doesn’t fix deeper visual problems.
No native audio or lip-sync. Veo 3.1 already supports direct lip-sync prompting. Runway doesn’t have this yet.
Action physics still break. The man sprinting through water created a speedboat-like wake, not human splashing. Complex motion still confuses the model.
Background flicker persists. Frame-to-frame stability has improved dramatically, but fine details like trees and coral still shimmer.
My Seventh Step: Benchmark Scores vs Competitors
Based on Artificial Analysis data and community testing:
| Model | Elo Score | Rank |
|---|---|---|
| Runway Gen-4.5 | 1,247 | #1 |
| Google Veo 3.1 | ~1,180 | #2-3 |
| Kling 2.6 | ~1,170 | #2-3 |
| Pika 2.5 | ~1,100 | #4 |
| Previous Gen-4 | ~1,050 | #5+ |
Runway Gen-4.5 currently holds the top position in the Text-to-Video benchmark.
But here’s my honest take on that ranking: Elo scores measure controlled test scenarios. In real-world creative workflows, the “best” model depends entirely on what you’re making.
- Need consistent characters across multiple scenes? Runway Gen-4.5 is your best bet
- Need complex action with physics? Kling handles motion better
- Need dialogue and lip-sync? Google Veo 3.1 wins
- Need quick iterations and testing? Both Runway and Kling work fine
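The Elo gaps in the table are easier to judge as head-to-head preference rates. A minimal sketch, assuming the benchmark follows the standard Elo model with a 400-point scale (the exact rating method is not stated here, so treat that as my assumption) and using the approximate scores from the table:

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head wins for A over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Ratings from the table above; the "~" values are approximate.
ratings = {
    "Runway Gen-4.5": 1247,
    "Google Veo 3.1": 1180,
    "Kling 2.6": 1170,
    "Pika 2.5": 1100,
}

for rival in ("Google Veo 3.1", "Kling 2.6", "Pika 2.5"):
    p = elo_expected_score(ratings["Runway Gen-4.5"], ratings[rival])
    print(f"Gen-4.5 vs {rival}: {p:.0%} expected preference rate")
```

Under that model, a lead of 67 points over Veo 3.1 means voters would prefer Gen-4.5 in roughly six out of ten comparisons. That is a real edge, but it leaves plenty of room for a rival to win on a particular kind of shot.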
My Eighth Step: Practical Workflow for Filmmakers
After all this testing, here’s the workflow I’ve settled on:
Step 1: Generate your anchor frame
Use a high-quality image model (Midjourney, Flux, NanoBanana Pro) to create your character and scene. Quality in, quality out: this step matters more than any other.
Step 2: Animate with Gen-4.5
Upload that image to Runway. Add a motion-focused prompt. Select 5-10 seconds. Generate.
Pro tip: Keep your prompt simple and action-oriented. “A woman walks toward camera” works better than “A woman wearing a blue dress walks toward the camera while the camera slowly dollies left and right.”
Step 3: Chain multiple shots
Create your second anchor frame with the same character in a different angle or scene. Generate. The model will maintain consistency automatically.
Step 4: Upscale and edit
Use Runway’s built-in upscale button for 4K output. Import into your editing software. Add sound, music, transitions.
My cost estimate for a 3-minute short film:
- Anchor frames: ~20 images ($0.50-1.00)
- Video generations: ~30 clips at 10 seconds each ($0.75-1.50 on Pro plan)
- Total: $1.25-2.50 for raw footage
Compare that to renting a camera, hiring actors, scouting locations, and shooting for three days. The math is not close.
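The video line of that estimate can be checked with a few lines of arithmetic. A rough sketch, assuming the amortized Pro rate of $15 for 625 generations and 10-second clips (the helper names are mine, for illustration only):

```python
import math

PRO_COST_PER_CLIP_USD = 15.00 / 625  # Pro plan price divided by its generation quota
CLIP_SECONDS = 10                    # maximum clip length cited above


def clips_needed(film_seconds: int, clip_seconds: int = CLIP_SECONDS) -> int:
    """Minimum number of clips to cover the runtime, with no retakes."""
    return math.ceil(film_seconds / clip_seconds)


def footage_cost(num_clips: int, cost_per_clip: float = PRO_COST_PER_CLIP_USD) -> float:
    """Amortized cost of a number of generations on the Pro plan."""
    return num_clips * cost_per_clip


minimum = clips_needed(3 * 60)  # clips required to cover 180 seconds
budgeted = 30                   # the estimate above, which leaves room for retakes
print(minimum, f"${footage_cost(budgeted):.2f}")
```

A 3-minute film needs at least 18 ten-second clips, so budgeting 30 leaves about a dozen retakes. At the amortized rate, 30 generations land near the low end of the range above; the high end corresponds to roughly twice as many generations.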
My Final Step: The Verdict
Is Runway Gen-4.5 the new gold standard for AI cinematography?
For character consistency and multi-shot narrative coherence: yes, absolutely. In my testing, no other model kept the same face across different scenes as reliably as Gen-4.5.
For action physics, prompt adherence, and raw visual fidelity: not yet. Kling and Veo both beat it in specific scenarios.
Here’s my decision guide:
| You should use Gen-4.5 if… | You should use something else if… |
|---|---|
| You’re making a narrative with the same characters across scenes | You need complex action or water physics |
| You want to maintain visual consistency shot-to-shot | Dialogue and lip-sync are critical |
| You’re comfortable with 720p base + upscaling | You need native 4K output |
| You’re working within 10-second clips | You need longer continuous sequences |
The bottom line: Runway Gen-4.5 doesn’t win every benchmark, but it solves the hardest problem in AI video — keeping characters consistent across multiple shots. If you’re making anything with recurring characters (short films, ads, explainer videos, animated series), this is currently the best tool for the job.
For everything else, keep Kling and Veo in your back pocket. The best AI filmmaker uses all three, depending on the shot.