Niv-AI Raises $12 Million to Solve AI Power Inefficiencies

Simon Glairy stands at the forefront of the rapidly evolving intersection between high-performance computing and energy infrastructure. As a recognized expert in risk management and AI-driven assessment within the Insurtech and energy sectors, he has dedicated his career to understanding how emerging technologies strain traditional physical systems. With the rise of massive “AI factories,” the challenge is no longer just about raw processing power, but about the volatile electrical demands that these frontier models place on our aging power grids. Glairy provides a deep dive into how precise measurement and intelligent synchronization can recover billions in lost revenue while stabilizing the infrastructure that powers our digital future.

This conversation explores the technical bottlenecks of modern GPU clusters, the shift toward millisecond-scale power sensing, and the development of an “intelligence layer” designed to harmonize data center consumption with grid capacity.

Data centers often throttle GPU usage by nearly 30% to manage millisecond-scale power surges. How do these rapid spikes specifically disrupt the relationship between processors and the electrical grid, and what are the primary ways this lost capacity impacts the return on investment for expensive chips?

When you are running thousands of GPUs in concert, the transition between intensive computation and inter-chip communication happens in the blink of an eye, creating violent power demand surges at the millisecond scale. These spikes are so sudden and unpredictable that data center operators often have to keep a massive buffer of “just-in-case” electricity or invest in costly temporary energy storage to prevent a total shutdown. To play it safe, many facilities throttle their processors by as much as 30%, which means nearly a third of that incredibly expensive hardware is essentially sitting idle just to avoid tripping the circuit. This creates a massive financial leak because, as the industry saying goes, every unused watt is revenue lost, directly degrading the return on investment for chips that cost tens of thousands of dollars each. It’s a frustrating scenario where the hardware is capable of remarkable speed, but the electrical “leash” keeps it from ever reaching its full potential.
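To put that “electrical leash” in concrete terms, here is a minimal back-of-the-envelope sketch in Python; the cluster size, per-chip cost, and revenue figure are hypothetical placeholders chosen for illustration, not numbers from Niv-AI or Glairy.

```python
# Back-of-the-envelope sketch of what a safety throttle costs.
# All numbers below are hypothetical and purely illustrative.

num_gpus = 10_000            # GPUs in the cluster (assumed)
cost_per_gpu = 30_000        # USD per accelerator (assumed)
throttle_fraction = 0.30     # capacity held back to absorb power spikes

capital_deployed = num_gpus * cost_per_gpu
idle_capital = capital_deployed * throttle_fraction

# If revenue scales roughly with delivered compute, the same fraction
# of potential revenue is forfeited to the electrical "leash".
potential_revenue_per_year = 200_000_000   # USD, assumed
forfeited_revenue = potential_revenue_per_year * throttle_fraction

print(f"Capital effectively idle: ${idle_capital:,.0f}")
print(f"Revenue forfeited per year: ${forfeited_revenue:,.0f}")
```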

Deploying rack-level sensors provides millisecond-level granularity on power consumption during deep learning tasks. What unique power profiles are you identifying across different types of computation, and how does this data allow for more precise mitigation compared to traditional data center management tools?

Traditional management tools are far too slow, often looking at averages over seconds or minutes, which completely misses the “heartbeat” of a deep learning model. By using rack-level sensors that capture data at the millisecond level, we can see the unique electrical signatures of specific tasks, such as the massive draw during backpropagation versus the lighter load of simple inference. This granularity allows us to see exactly when and why a GPU cluster is hungry for power, moving away from guesswork and toward a surgical understanding of energy flow. With this data, we can identify “slack” in the system where power is being wasted and develop mitigation techniques that unlock existing capacity without needing a single new power line. It turns the data center from a blunt instrument into a finely tuned orchestra where every millisecond of energy is accounted for and utilized.
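As a rough illustration of what millisecond-level telemetry exposes that a slower, averaged meter cannot, the small Python sketch below profiles a single rack's power trace; the 1 kHz sample stream, the 30 kW rack limit, the thresholds, and the synthetic trace are all assumptions made for the example.

```python
# Minimal sketch of mining millisecond-resolution rack telemetry for
# spikes and "slack". Sample rate, rack limit, and thresholds are assumed.
import numpy as np

def profile_rack(samples_watts: np.ndarray, rack_limit_w: float = 30_000.0):
    """samples_watts: 1D array of per-millisecond power readings for one rack."""
    mean_draw = samples_watts.mean()
    peak_draw = samples_watts.max()

    # A coarse, minute-level meter would only ever see something like the mean;
    # the millisecond view exposes the true peak the breaker has to survive.
    spike_mask = samples_watts > 0.9 * rack_limit_w
    slack_w = rack_limit_w - peak_draw        # headroom never used, even at peak

    return {
        "mean_w": mean_draw,
        "peak_w": peak_draw,
        "spike_ms": int(spike_mask.sum()),    # milliseconds spent near the limit
        "slack_w": slack_w,                   # capacity recoverable without new power lines
        "peak_to_mean": peak_draw / mean_draw,
    }

# Example: a synthetic trace where heavy compute bursts alternate with
# lighter communication phases (purely illustrative).
rng = np.random.default_rng(0)
trace = np.where(rng.random(10_000) < 0.2, 27_000, 14_000) + rng.normal(0, 500, 10_000)
print(profile_rack(trace))
```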

Using an AI model to synchronize power loads essentially creates an “intelligence layer” for data center engineers. How do you train such a system to predict surges before they happen, and what specific steps are required to ensure the system doesn’t introduce its own latency?

The training process involves feeding the AI model vast streams of millisecond-level data from various deep learning workloads so it can learn the precursors to a surge. This “copilot” for engineers is designed to recognize patterns in how GPUs communicate, effectively predicting a spike just before it hits the grid. To ensure we don’t introduce latency, the intelligence layer acts as a predictive orchestrator rather than a reactive gatekeeper, making adjustments in parallel with the computation tasks. We are building this specifically to handle the “two sides of the rope”—on one side helping the center utilize more of the power they are already paying for, and on the other, creating a more responsible and predictable profile for the utility provider. The goal is a seamless synchronization where the power supply and the computational demand move in lockstep, eliminating the jerky, inefficient cycles we see today.
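One hedged way to picture that training setup is to slice the millisecond power stream into history windows and label each window by whether a surge follows within a short horizon; the window sizes, the surge threshold, and the logistic-regression classifier below are illustrative assumptions, not Niv-AI's actual model.

```python
# Minimal sketch of turning millisecond power telemetry into a
# surge-prediction training set. Window sizes, the surge threshold, and
# the classifier choice are assumptions, not the production system.
import numpy as np
from sklearn.linear_model import LogisticRegression

WINDOW_MS = 50       # history the model sees
HORIZON_MS = 10      # how far ahead we want the warning
SURGE_W = 26_000     # draw that counts as a surge (assumed)

def make_dataset(trace: np.ndarray):
    X, y = [], []
    for t in range(WINDOW_MS, len(trace) - HORIZON_MS):
        X.append(trace[t - WINDOW_MS:t])                        # recent power history
        y.append(int(trace[t:t + HORIZON_MS].max() > SURGE_W))  # surge within horizon?
    return np.array(X), np.array(y)

# Synthetic trace standing in for real rack telemetry (illustrative only).
rng = np.random.default_rng(1)
trace = np.where(rng.random(20_000) < 0.15, 27_000, 15_000) + rng.normal(0, 400, 20_000)

X, y = make_dataset(trace)
model = LogisticRegression(max_iter=1000).fit(X, y)

# In deployment the predictor runs alongside the workload: it raises a flag a
# few milliseconds early so the orchestrator can act, adding no latency to the
# computation path itself.
latest_window = trace[-WINDOW_MS:].reshape(1, -1)
print("surge probability:", model.predict_proba(latest_window)[0, 1])
```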

Hyperscalers currently face significant land-use and supply chain obstacles when trying to expand infrastructure. In what ways can optimizing existing power draw reduce the need for new construction, and how does better synchronization help ease the grid’s “fear” of massive data center consumption?

We simply cannot continue to build data centers at the current pace due to the sheer lack of available land and the heavy delays in the supply chain for transformers and other high-voltage equipment. By optimizing the power draw within existing footprints, we can effectively “find” extra capacity that was previously hidden by inefficient throttling or safety margins. This is vital because the electrical grid is actually afraid of the data center; utilities worry that a sudden, massive surge from a frontier lab could destabilize local power distribution for everyone else. Better synchronization creates a smoothed-out, reliable power profile that makes data centers look like stable, predictable loads rather than volatile energy hogs. When we prove that a facility can operate at peak efficiency without risking a blackout, it reduces the friction between hyperscalers and local regulators, making the existing infrastructure far more valuable.
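A toy Python sketch makes the smoothing argument concrete: the same amount of compute presents a very different grid-facing peak depending on whether racks burst in lockstep or are deliberately staggered. The rack count, burst length, and wattages below are invented for illustration.

```python
# Sketch of why synchronization calms the grid-facing profile: identical total
# work, very different facility peak. All figures are illustrative only.
import numpy as np

racks, duration_ms, period_ms = 100, 1_000, 200
burst_w, idle_w, burst_len = 27_000, 14_000, 50

def facility_profile(stagger: bool) -> np.ndarray:
    """Aggregate facility power, one sample per millisecond."""
    profile = np.full(duration_ms, idle_w * racks, dtype=float)
    for r in range(racks):
        offset = (r * burst_len) % period_ms if stagger else 0
        for t in range(offset, duration_ms, period_ms):   # periodic compute bursts
            profile[t:t + burst_len] += burst_w - idle_w
    return profile

for label, stagger in [("synchronized bursts", False), ("staggered bursts", True)]:
    p = facility_profile(stagger)
    print(f"{label}: peak {p.max() / 1e6:.2f} MW, mean {p.mean() / 1e6:.2f} MW, "
          f"peak-to-mean {p.max() / p.mean():.2f}")
```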

With operational systems heading to U.S. data centers in the coming months, what specific performance benchmarks are you targeting? How will these initial deployments influence the way frontier labs manage the communication between thousands of GPUs working in concert?

Our primary target for the next six to eight months is to prove that we can significantly bridge that 30% gap in squandered power while maintaining 100% system stability in real-world U.S. data centers. We are looking for a measurable increase in total computational throughput per watt, essentially allowing frontier labs to squeeze more “intelligence” out of the same electrical bill. These initial deployments will serve as a blueprint for how massive GPU clusters communicate, shifting the focus from just raw speed to coordinated energy efficiency. If we can successfully manage the communication overhead between thousands of processors without triggering safety throttles, it will change the fundamental architecture of how AI models are trained at scale. It transforms the “intelligence layer” from a luxury into a mandatory component of any high-performance computing environment.
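For a sense of how that headline benchmark might be expressed, the short sketch below computes throughput per watt before and after narrowing the throttle gap; every figure in it is a hypothetical placeholder rather than a stated Niv-AI target.

```python
# Sketch of the benchmark: useful work per watt of average draw.
# All figures are hypothetical placeholders.

def throughput_per_watt(tokens_per_s: float, avg_power_w: float) -> float:
    """Useful work delivered per watt of average draw (tokens per joule)."""
    return tokens_per_s / avg_power_w

peak_tokens_per_s = 1_000_000        # cluster's unthrottled throughput (assumed)
provisioned_power_w = 18_000_000     # facility power budget in watts (assumed)

# Baseline: the safety throttle holds the cluster to ~70% of its peak throughput.
baseline = throughput_per_watt(0.70 * peak_tokens_per_s, provisioned_power_w)

# Target: recover most of the throttled headroom on the same power budget.
optimized = throughput_per_watt(0.95 * peak_tokens_per_s, provisioned_power_w)

print(f"tokens per joule, baseline:  {baseline:.4f}")
print(f"tokens per joule, optimized: {optimized:.4f}")
print(f"relative improvement: {optimized / baseline - 1:.0%}")
```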

What is your forecast for GPU power management?

I believe we are entering an era where power management will be treated as a primary computational constraint, just as important as memory bandwidth or clock speed. Within the next three to five years, I forecast that “energy-aware” scheduling will become the industry standard, with AI models and the physical grid linked by a real-time software bridge that dynamically adjusts workloads based on grid health and thermal limits. We will move away from static power caps and toward a fluid, predictive system that allows data centers to operate much closer to their physical limits without the fear of failure. Ultimately, the winners in the AI race won’t just be the ones with the most chips, but the ones who can most effectively choreograph the massive amounts of electricity those chips demand.
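As a hedged illustration of what “energy-aware” scheduling could look like in code, the sketch below admits a job only when live grid headroom and thermal margin cover its expected draw, and defers it otherwise; the class, its fields, and the thresholds are invented for this example rather than drawn from any real scheduler.

```python
# Toy energy-aware scheduler: run a job only if grid headroom and thermal
# margin cover its expected draw. Interfaces and numbers are assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Job:
    name: str
    expected_draw_kw: float

@dataclass
class EnergyAwareScheduler:
    facility_cap_kw: float
    grid_headroom_kw: float      # live signal from the utility (assumed interface)
    thermal_margin_kw: float     # derate from cooling limits (assumed)
    running_kw: float = 0.0
    deferred: List[Job] = field(default_factory=list)

    def available_kw(self) -> float:
        return (min(self.facility_cap_kw, self.grid_headroom_kw)
                - self.thermal_margin_kw - self.running_kw)

    def submit(self, job: Job) -> bool:
        if job.expected_draw_kw <= self.available_kw():
            self.running_kw += job.expected_draw_kw
            return True                   # run now
        self.deferred.append(job)         # wait for headroom instead of tripping a cap
        return False

sched = EnergyAwareScheduler(facility_cap_kw=5_000, grid_headroom_kw=4_500,
                             thermal_margin_kw=300)
for job in [Job("pretrain-shard", 2_500), Job("finetune", 1_200), Job("eval-sweep", 900)]:
    print(job.name, "run" if sched.submit(job) else "deferred")
```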
