I've spent the last few months talking to customers — small startups and large platforms alike, but mostly folks building UGC applications. The conversations kept circling back to the same challenge: they all wanted to use foundation models with strong vision support to understand and process their video content, but getting it working reliably at scale was a nightmare.
These are perfect use cases for models like ChatGPT, Claude, or Gemini. But the reality? You need to extract frames at the right intervals, pull transcripts, clean up VTT formatting, write prompts that work consistently, handle rate limiting, parse structured responses, and do it all without blowing your token budget. The worst part? Every customer is solving these exact same problems independently with fragile, expensive code.
Using LLMs for video inference unlocks automated moderation, multilingual captions, semantic search, and content understanding. But getting all the pieces glued together is repetitive, tedious, and error-prone. So we decided to make it easier.
Introducing @mux/ai
Today we're launching the public beta of @mux/ai — an open-source TypeScript toolkit that makes it dead simple to build LLM-powered video features. It's a collection of battle-tested workflows for common video AI tasks, plus the low-level primitives you need to build custom features we haven't built ourselves yet.
Here's what chapter generation looks like with @mux/ai:
import { generateChapters } from "@mux/ai/workflows";
const result = await generateChapters("your-mux-asset-id", "en");
// That's it. You now have chapters.
console.log(result.chapters);
// [
// { startTime: 0, title: "Introduction and Setup" },
// { startTime: 45, title: "Main Content Discussion" },
// ...
// ]
No dealing with transcript APIs, no prompt engineering, no retry logic. Just working chapters that you can drop straight into Mux Player.
Seven workflows, ready to use
We're shipping @mux/ai with seven pre-built workflows for the most common video AI tasks:
Content Understanding:
- Video Summarization: Generate titles, descriptions, and tags, with several tone options
- Chapter Generation: Smart segmentation with descriptive chapter titles, ready to drop into Mux Player
- Content Moderation: Detect inappropriate content using specialized models from OpenAI or Hive
- Video Embeddings: Generate embeddings for semantic search and recommendation experiences
Accessibility & Localization:
- Caption Translation: Multi-language support in minutes
- Audio Dubbing: AI-generated voice tracks in dozens of languages with ElevenLabs
- Burned-in Caption Detection: Identify hardcoded subtitles already in the picture so you don't double-caption your videos
Each workflow handles the full complexity — fetching video data from Mux, formatting it for AI providers, making API calls with error handling, and returning typed results you can actually use.
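They all share the same calling shape (a Mux asset ID plus a small options object), which makes them easy to string together. Here's a quick sketch that does exactly that; it only uses calls and fields shown later in this post, and the threshold values are just examples:
import { getSummaryAndTags, getModerationScores, translateCaptions } from "@mux/ai/workflows";

const assetId = "your-mux-asset-id";

// Summarization and moderation are independent, so run them in parallel
const [summary, moderation] = await Promise.all([
  getSummaryAndTags(assetId, { provider: "openai" }),
  getModerationScores(assetId, { provider: "openai", thresholds: { sexual: 0.7, violence: 0.8 } }),
]);

// Only translate captions for content that's safe to publish
if (!moderation.exceedsThreshold) {
  await translateCaptions(assetId, "en", "es", { provider: "anthropic" });
}

console.log(summary);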
Tested across providers with repeatable validation
Here's something that sets @mux/ai apart: every workflow ships with comprehensive evaluation coverage. We don't just vibe code the features and hope they work — we measure them systematically.
If you're not familiar with evals, they're automated tests that measure AI system quality across real-world scenarios. Instead of just checking if code runs without errors, evals validate that your LLM actually produces good results — accurate chapter titles, appropriate moderation scores, high-quality translations. They're essential for catching regressions when you change prompts or switch providers.
Using Evalite, we test every workflow against what we lovingly call the "3 E's framework":
Efficacy: Does it work correctly? We measure accuracy, output quality, schema compliance, and compare results across providers.
Efficiency: How fast is it? We track token consumption, latency, and scalability characteristics.
Expense: What does it cost? We calculate exact costs per request across OpenAI, Anthropic, and Google so you can make informed decisions about how you use LLMs, and which models you select.
We also host our latest eval results publicly, and they are regenerated on every CI run.
This isn't just documentation — it's our quality bar. We don't add new workflows without eval coverage. When you use @mux/ai, you're getting workflows that have been measured for quality, speed, and cost across every provider we support, and we use that data to pick the best default platforms and models for each workflow.
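If you haven't written an eval before, here's roughly what one looks like. This is an illustrative sketch of an Evalite eval, not one of the evals that ships in the repo; the real ones cover many more assets and score efficiency and expense alongside efficacy:
import { evalite, createScorer } from "evalite";
import { generateChapters } from "@mux/ai/workflows";

type ChapterResult = Awaited<ReturnType<typeof generateChapters>>;

// Naive efficacy scorer: at least one chapter, and every chapter has a non-empty title
const hasUsefulTitles = createScorer<string, ChapterResult>({
  name: "Has useful titles",
  scorer: ({ output }) =>
    output.chapters.length > 0 && output.chapters.every((c) => c.title.trim().length > 0) ? 1 : 0,
});

evalite("Chapter generation: efficacy", {
  // A handful of assets whose ideal chapters we already know well enough to judge
  data: async () => [{ input: "known-test-asset-id" }],
  task: (assetId) => generateChapters(assetId, "en"),
  scorers: [hasUsefulTitles],
});
Because evals run like ordinary tests, it's cheap to re-run them whenever a prompt or provider changes, which is how regressions get caught before they ship.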
Choose your AI provider
We built @mux/ai to be flexible about how you build your infrastructure today, in both the models and the providers you're using. Today @mux/ai is a "BYO LLM" toolkit, meaning you bring your own API keys for the providers you want to use. That keeps you in complete control of which provider you pick, in line with your own data protection practices. At launch we support the hosted LLMs from OpenAI, Anthropic, and Google, and we're working on support for platforms like Vertex AI and Bedrock so you can run licensed models inside your own cloud infrastructure.
Different models are better at different things, and prices change constantly. @mux/ai gives you a consistent interface across multiple providers:
- OpenAI: Industry-leading moderation, cost-effective tools with GPT-5.1
- Anthropic: Advanced reasoning with Claude 4.5 Sonnet
- Google: Multimodal processing with Gemini 2.5 Flash
- ElevenLabs: Premium voice synthesis for dubbing
- Hive: Premium, detailed content moderation
@mux/ai defaults to the most cost-effective models, but our evals show you exactly what you're getting with each provider. Want to know if Claude produces better chapter titles than GPT? Check the evals. Wondering if Google's lower cost is worth any quality tradeoff? The data's right there.
You can also A/B test providers in parallel:
const [openaiResult, anthropicResult] = await Promise.all([
generateChapters("your-mux-asset-id", "en", { provider: "openai" }),
generateChapters("your-mux-asset-id", "en", { provider: "anthropic" })
]);
If you're a data-driven decision maker like me, you can also use Mux Data's experimentation features to A/B test the output from different providers. Segment users by the version of chapters (or titles or descriptions) that you generate to see which are most effective. Just keep the "law of small numbers" in mind, especially for videos that don't get a lot of views.
Workflows for common tasks, primitives for everything else
@mux/ai is built around two complementary abstractions:
Workflows
Workflows are functions that handle complete video AI tasks end-to-end. Each workflow orchestrates the entire process: fetching video data from Mux (transcripts, thumbnails, storyboards), formatting it for AI providers, and returning structured results.
import { getSummaryAndTags } from "@mux/ai/workflows";
const result = await getSummaryAndTags("your-mux-asset-id", { provider: "openai" });
Workflows are great for common tasks, but what about custom features? That's where primitives come in.
Primitives
Primitives are low-level building blocks that give you direct access to Mux Video components. They provide functions for fetching transcripts, storyboards, and thumbnails, for processing text, and utilities for preparing those components so an LLM can analyze them effectively. Primitives are perfect for building your own custom workflows.
Say you want to analyze cinematography style. There's no pre-built workflow, but you can compose primitives using the same components that our pre-built workflows use.
Every workflow in @mux/ai is built from these same primitives. Start with workflows, drop down to primitives when you're building something bespoke.
import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
const transcript = await fetchTranscriptForAsset("your-mux-asset-id", "en");
const storyboardUrl = getStoryboardUrl("your-mux-playback-id", { width: 640 });
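To make that concrete, here's a rough sketch of the cinematography analysis mentioned above, composed from those two primitives plus the Vercel AI SDK. The prompt, the model choice, and the assumption that the transcript comes back as text are all ours; treat it as a starting point rather than a recipe:
import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

async function describeCinematography(assetId: string, playbackId: string) {
  // Dialogue for narrative context, plus a storyboard (a grid of frames) for the visuals
  const transcript = await fetchTranscriptForAsset(assetId, "en");
  const storyboardUrl = getStoryboardUrl(playbackId, { width: 640 });

  const { text } = await generateText({
    model: openai("gpt-4o"),
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: `Describe this video's cinematography: framing, lighting, color, and pacing. Transcript for context:\n${transcript}` },
          { type: "image", image: new URL(storyboardUrl) },
        ],
      },
    ],
  });

  return text;
}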
Real-world example: Content moderation
Let's take a look at how @mux/ai handles production complexity. Say you're building a UGC platform and need to moderate each upload as soon as the asset becomes ready, but before publishing it:
import { getModerationScores } from "@mux/ai/workflows";
export async function handleWebhook(req, res) {
const event = req.body;
if (event.type === 'video.asset.ready') {
const assetId = event.data.id;
const result = await getModerationScores(assetId, { provider: "openai", thresholds: { sexual: 0.7, violence: 0.8 } });
if (result.exceedsThreshold) {
await flagForReview(assetId);
} else {
await publishVideo(assetId);
}
}
}
Under the hood, this generates thumbnails at optimal intervals, submits them to OpenAI's moderation API, handles rate limiting and retries automatically, and returns structured scores.
Want to try Hive's visual moderation instead for more granular analysis? Just swap the provider:
const result = await getModerationScores(assetId, { provider: "hive" });
Translation workflows that actually work
Translating captions isn't just running text through an API — you need to preserve timing, maintain VTT formatting, handle text expansion, and upload results back to Mux. We've done all of that:
import { translateCaptions } from "@mux/ai/workflows";
const result = await translateCaptions(assetId, "en", "es", {
provider: "anthropic"
});
console.log(result.uploadedTrackId); // New Spanish subtitle track
This fetches your English captions, translates them with Claude, uploads them to S3, and adds a new track to your Mux asset. It works with 30+ languages out of the box.
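Because the workflow takes source and target language codes, fanning out to several languages is just a Promise.all away. A small sketch (mind your provider's rate limits before running a big batch):
import { translateCaptions } from "@mux/ai/workflows";

const targets = ["es", "fr", "de", "pt"];

// Translate the English captions into each target language in parallel
const tracks = await Promise.all(
  targets.map((lang) =>
    translateCaptions("your-mux-asset-id", "en", lang, { provider: "anthropic" })
  )
);

console.log(tracks.map((t) => t.uploadedTrackId));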
Want audio dubbing too? We've integrated ElevenLabs:
import { translateAudio } from "@mux/ai/workflows";
const result = await translateAudio(assetId, "es", {
provider: "elevenlabs",
numSpeakers: 0 // Auto-detect
});
The workflow downloads audio, sends it to ElevenLabs for dubbing, waits for processing, and adds the dubbed track to your asset. Days of implementation work are now a few lines of code.
Built to be durable, using Workflow DevKit's "use workflow"
AI workflows (video or otherwise) have a frustrating habit of failing halfway through. Your moderation check passes, you're generating chapters, and then — network timeout, rate limit, random 500 from a provider that's having a bad day. Now you're stuck. Do you restart from scratch and pay for that moderation check again? Or do you write a bunch of state management code to remember where you left off?
This is where durable execution changes everything. @mux/ai is platform-agnostic, standalone, and works anywhere Node.js runs, but it's especially powerful with Workflow DevKit from Vercel for reliable, long-running pipelines. Workflow DevKit gives you automatic retries, state persistence across failures, and observability—without writing any of the orchestration code yourself.
All @mux/ai workflows are compatible with Workflow DevKit right out of the box. They're exported with "use workflow" directives, so you can call them directly as workflow steps:
import { start } from 'workflow/api';
import { getSummaryAndTags } from '@mux/ai/workflows';
const assetId = 'YOUR_ASSET_ID';
const run = await start(getSummaryAndTags, [assetId]);
// Optionally, wait for the workflow run return value:
// const result = await run.returnValue
Or compose them into larger workflows:
import { start } from "workflow/api";
import { getSummaryAndTags } from '@mux/ai/workflows';
async function processVideoSummary(assetId: string) {
'use workflow'
const summary = await getSummaryAndTags(assetId);
const emailResp = await emailSummaryToAdmins(summary);
return { assetId, summary, emailResp }
}
async function emailSummaryToAdmins(summary: Awaited<ReturnType<typeof getSummaryAndTags>>) {
'use step';
return { sent: true }
}
// This starts the processVideoSummary workflow defined above,
// which in turn calls the getSummaryAndTags() workflow as one of its steps
const run = await start(processVideoSummary, ['YOUR_ASSET_ID']);
Each step runs reliably with automatic retries. If something fails, the workflow resumes from where it left off. You don't lose the work you've already paid for. The execution is distributed across multiple serverless function invocations, so long-running AI operations never hit timeout limits.
Want to learn more about building durable video AI pipelines? Check out our deep dive: Building Reliable Video AI Workflows with @mux/ai and Vercel Workflow
Try it out, and get contributing!
@mux/ai is fully open source under Apache 2.0 — use it, modify it, fork it, whatever you need. We're building this in the open with live eval results, public CI, and comprehensive test coverage. Found a bug or built a workflow we haven't? We want your PRs. The eval framework makes it easy to validate your changes work as expected.
@mux/ai is available now from NPM in public beta:
npm install @mux/ai
Check out the documentation on GitHub for complete guides and examples. If you have ideas for new workflows or run into issues, open an issue or drop a message to the wonderful Mux support team.
This is just the first step in our investment in making Mux a great place to build at the intersection of video and AI. @mux/ai gives developers maximum flexibility today, and we're actively working on more primitives and features, as well as bringing the best that AI has to offer to Mux Video natively. Keep an eye on the blog to see what we're launching in early 2026. I'm sure you'll love it.
Video + AI shouldn't be hard. With @mux/ai, we're taking a big step in making it easier for everyone.



