Lumai Productizes Lens-Based Optical Computer


British startup Lumai is productizing its lens-based optical computer for matrix-multiply acceleration in AI inference. This is the first time an optical computing system has successfully run billion-parameter models and goes a long way towards demonstrating the technology’s commercial viability, Phil Burr, head of product at Lumai, told EE Times.

The company has engineered out the reasons existing photonic compute solutions have failed, Burr said.

“We get scalability because we compute in 3D volume, so we can have massive parallelism,” he said. “We can use industry standard components and materials, albeit customized, but we’re not having to go through a whole design cycle for a new material.”

Lumai’s optical accelerator does not use integrated photonics. Input vectors are encoded into 1,024 laser light sources and duplicated by lenses. The encoded data streams then pass through an electronic display whose darker and lighter pixels encode the weights; passing the light through the display performs the multiplication, and a final lens combines the results as a form of addition.
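The optical path described above (encode, duplicate, attenuate, focus) is mathematically a matrix-vector multiply. A minimal numerical sketch of that equivalence, with illustrative sizes and names (not Lumai's actual design parameters):

```python
import numpy as np

# Illustrative sketch of a free-space optical matrix-vector multiply.
# Sizes and variable names are assumptions for demonstration only.
N = 1024                       # laser light sources (input vector length)
M = 1024                       # detectors (output vector length)

rng = np.random.default_rng(0)
x = rng.random(N)              # input vector encoded as laser intensities
W = rng.random((M, N))         # weights encoded as pixel transmittance (0..1)

# Lenses duplicate the input vector across every row of the display;
# each pixel attenuates (multiplies) the light passing through it.
product = W * x                # elementwise multiplication, shape (M, N)

# A final lens focuses each row onto one detector, summing the light.
y = product.sum(axis=1)       # detector readout

# The optical pipeline computes exactly a matrix-vector product.
assert np.allclose(y, W @ x)
```

The multiply and the sum both happen "for free" in the optics; only encoding `x` and reading out `y` require electronics, which is why conversion dominates the power budget, as discussed below.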


The optical computation itself consumes practically zero energy; power is needed to convert between the electrical and optical domains and to drive the lasers and detectors. Lumai said its technology can offer 50× the performance of today’s GPUs with a 90% reduction in power.

“Power is the limit for data centers at the moment,” Burr said. “Our solution is addressing that limit by being a lot more efficient… We are delivering more compute and more tokens within the same power budgets.”

Inside a Lumai Iris server. (Source: Lumai)

Lumai’s efficiency comes in part from handling large matrices, 2,048 × 2,048, in a single operation.

“In optics, the dominant power [consumption] is from the conversion between electrical and optical, and back again,” Burr said. “But the good thing is that power is linear with the size of the vector in the matmul, whereas the performance is proportional to the square. So as you increase the matrix size, the efficiency goes up.”
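Burr's scaling argument can be checked with simple arithmetic: for an n × n matrix-vector multiply, conversion energy grows with n (n elements in, n out) while useful operations grow with n². A back-of-envelope illustration (the energy constant is purely illustrative, not a Lumai measurement):

```python
# Back-of-envelope check of the scaling Burr describes: conversion
# energy is linear in vector size, compute is quadratic, so efficiency
# (ops per unit of conversion energy) grows linearly with matrix size.
def ops_per_conversion_energy(n, e_convert_per_element=1.0):
    """Ops per unit of conversion energy for an n x n matrix-vector multiply."""
    ops = 2 * n * n                          # multiply-accumulates
    energy = 2 * n * e_convert_per_element   # n inputs in, n outputs out
    return ops / energy

# Doubling the matrix size doubles the ops delivered per unit of
# conversion energy, so bigger matrices mean better efficiency.
assert ops_per_conversion_energy(2048) == 2 * ops_per_conversion_energy(1024)
```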

Free-space optical computing is more energy efficient for large matrix operations: compute grows with the square of matrix size, while conversion energy grows only linearly. (Source: Lumai)

Lumai’s setup has a digital processor (a CPU) that handles some non-linear operations and offloads matrix multiplication to the optical system via a hardware-aware orchestration layer. This layer decides which parts of the workload run on the CPU and which are converted to optical. In some cases, parts of the algorithm that are particularly sensitive to accuracy could run on the CPU rather than suffer conversion to the analog domain, Burr said. For Llama, 90% of the workload runs in the optical domain, he added.
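The orchestration decision Burr describes could be sketched as a simple placement rule: matrix multiplies go optical unless flagged as accuracy-sensitive, and non-linear operations stay on the CPU. This is a hypothetical illustration; the op names, fields, and policy are assumptions, not Lumai's actual orchestration layer.

```python
# Hypothetical sketch of a hardware-aware orchestration policy.
# Op names, fields, and the rule itself are illustrative assumptions.
def place(op):
    """Route a matmul to the optical engine unless it is accuracy-sensitive."""
    if op["type"] == "matmul" and not op.get("accuracy_sensitive", False):
        return "optical"
    return "cpu"   # non-linear ops and sensitive matmuls stay digital

workload = [
    {"name": "attention_qk", "type": "matmul"},
    {"name": "softmax", "type": "nonlinear"},
    {"name": "logits", "type": "matmul", "accuracy_sensitive": True},
]
placement = {op["name"]: place(op) for op in workload}
```

Under a rule like this, the bulk of a transformer workload (the matmuls) lands on the optical engine, consistent with the 90% figure Burr cites for Llama.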

In between the CPU and the optical engine, an FPGA handles conversion between the electrical and optical domains. A design for an ASIC to replace these two elements is underway and will appear in Lumai’s next generation of products.

Lumai’s Iris Nova inference server, which contains a single first-gen optical engine, will be offered to hyperscale customers as a way of evaluating the technology. Today, it runs Llama as a demo.

Lumai’s Iris Nova server (Source: Lumai)

“We have very much focused on Llama, which is driven by customers,” Burr said. “Llama is open-source; it’s a benchmark customers are using to assess performance. But we’re continuing to expand the workloads we can run.”

Burr expects Iris Nova servers to be stood up in test clusters by the end of 2026. The next iteration, Iris Aura, will combine multiple optical engines in a rack. After that, Iris Tetra will enable cluster-scale deployments. Tetra, due in 2029, will be able to reach 100 TOPS/W (INT8), providing 1 exaOPS within a 10 kW power budget, according to the company.
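The company's Tetra figures are internally consistent: 100 TOPS/W sustained across a 10 kW budget works out to exactly 1 exaOPS.

```python
# Sanity check on Lumai's stated Iris Tetra targets.
tops_per_watt = 100           # 100 TOPS/W (INT8), per the company
power_watts = 10_000          # 10 kW power budget

ops_per_second = tops_per_watt * 1e12 * power_watts
assert ops_per_second == 1e18  # 1 exaOPS, matching the stated claim
```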

“Part of the reason for that fairly fast iteration is that we want to get systems out there,” Burr said. “We want customers to be able to evaluate them, run their software on them. We want to be able to create scale-up clusters.”

Customer feedback so far has been that the gradual introduction of new technologies helps avoid issues integrating systems further down the line, Burr said.

Lumai’s evaluation server Iris Nova is set up for full Llama inference, but in general, the technology lends itself to efficient prefill in a disaggregated data center, Burr said, since prefill is generally compute-bound. This is particularly efficient in scenarios with many users and for longer input context lengths in applications like agentic AI and enterprise AI.

“At the moment we are doing both prefill and decode, and Iris Nova could be deployed as a prefill and decode solution, but given the move towards disaggregation and the fact that we really excel at compute-bound problems, then clearly prefill is the place to position Iris Nova,” Burr said. “It partly depends on the model; there are some models which are much more compute-heavy, and in principle, we could create a version that had much more memory and then position that for decode. But the optics really shines at the compute part.”


See also:

The Evolution of Optical Computing: Part 1

AI Demand Reshapes Optical Connectivity and Photonics Roadmaps

Sally Ward-Foxton covers AI for EETimes.com and EETimes Europe magazine. Sally has spent the last 18 years writing about the electronics industry from London. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more news publications. She holds a master's degree in Electrical and Electronic Engineering from the University of Cambridge.