Ahead of the full launch of Tenstorrent’s new generation of cluster-scale systems next week, EE Times got a first look at the company’s video generation demo. Based on Wan2.2-14B, the demo generated a five-second, 720p video (81 frames, 40 denoising steps, using an optimized version of the model) from a text prompt in three seconds when EE Times tried it out. The company said its record of 2.4 seconds is around 10× faster than the same production-grade model running on other leading hardware.
The demo runs an optimized version of Wan2.2-14B created by partner Prodia, an AI cloud company specializing in fast image and video generation. It runs on four of Tenstorrent’s new Blackhole-generation Galaxy servers (128 Blackhole chips in total).


Tenstorrent’s AI hardware architecture unifies compute, memory, and networking so that very large systems can run a single software program, without proprietary interconnects, reconfiguration, or workload-specific design.
While there’s a lot of focus on code generation today, video generation will take off imminently and drive entire markets, Tenstorrent senior fellow Jasmina Vasiljevic told EE Times.
“We’re pretty excited about video,” Vasiljevic said. “We see that with more Galaxies, we can tackle bigger resolutions and longer videos.”
One of the reasons video applications have stalled is that they require too much compute, she said.
“It used to take many minutes to generate a single video; now we can produce five seconds of video in two and a half seconds, which is better than real time,” she said. “Breaking the real-time barrier will unlock new verticals.”
A big bottleneck for video generation inference today is iterative denoising: the work within each step can be parallelized across frames, but the steps themselves must run in sequence. New research is exploring autoregressive or predictive models that predict future tokens or frames much as text LLMs do, an approach that is less sequential in nature. Tenstorrent sees the future of video generation as a combination of both approaches.
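To make the distinction concrete, here is a minimal Python sketch of the two generation loops. It is illustrative only: denoise_step and predict_next_frame are hypothetical placeholders standing in for model forward passes, not Tenstorrent’s or Prodia’s code, and the frame and step counts simply mirror the demo’s settings.

    # Diffusion-style loop: all 81 frames are refined together, but the
    # 40 denoising steps form a dependency chain; step i needs step i-1.
    def denoise_step(latents, step):
        return [x * 0.9 for x in latents]   # placeholder for a model forward pass

    latents = [1.0] * 81                    # one latent per frame (parallel within a step)
    for step in range(40):                  # sequential across steps
        latents = denoise_step(latents, step)

    # Autoregressive loop: frames emerge one at a time, like tokens from
    # an LLM, so output can stream before the whole clip is finished.
    def predict_next_frame(frames):
        return frames[-1] + 1.0             # placeholder for a predictive model

    frames = [0.0]                          # seed frame
    for _ in range(80):
        frames.append(predict_next_frame(frames))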
“The likelihood of these algorithms evolving aggressively to combine is very high,” Vasiljevic said. “We’re excited because [Tenstorrent] is good at both.”
Video customers
Tenstorrent CEO Jim Keller told EE Times that the company is working with some video-focused customers, including some of the big frontier AI labs.
“We’re pretty sure we’re the only ones who’ve been able to do [generation of video] at this kind of speed, and at these cost numbers,” he said. “Our mission right now is just to keep making it easier to do and easier to deploy, but the actual models run pretty solidly and are stable.”
Many video customers are running minor tweaks or iterations of core models, he said, so they can build on the work Tenstorrent has already done on models such as Wan2.2-14B.
“The engine underneath is pretty common, which is cool because it makes porting relatively straightforward,” he said. “The big diffusion models have been around for a while, and everybody knows how they work, but running them across 128 chips in parallel at high utilization is new.”
Common ground across today’s video generation models means customers can get up and running quickly, especially given Tenstorrent’s fully open-source software stack. But the real aim is for the architecture to be general-purpose enough to handle the next generation of models, too, Keller said.
“Our mission has always been general-purpose AI computing… we’re anti-specialized,” he said. “With a lot of chips, we have a lot of SRAM, but all the chips also have DRAM, and they are highly networked together, so our platform is more general-purpose. The world is talking about specialized, specialized, specialized, and that should be terrifying [everyone] because as models change, that specialized hardware is not going to work.”
Benchmarks for other workloads are expected at Tenstorrent’s at-scale compute launch next week.
- We updated the article on 22/4/26 after Tenstorrent contacted us to clarify the number of chips needed to run the video generation demo.
See also:
Layoffs At Tenstorrent As Startup Pivots Towards Developer Sales
Sally Ward-Foxton covers AI for EETimes.com and EETimes Europe magazine. Sally has spent the last 18 years writing about the electronics industry from London. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more news publications. She holds a master’s degree in Electrical and Electronic Engineering from the University of Cambridge. Follow Sally on LinkedIn