Fractile's Next-Gen AI Inference Hardware

329,559 followers

3w Edited

Inference is one of AI’s most important bottlenecks. As models move from answering questions to completing complex work, the ability to run them faster, cheaper, and at greater scale will define what the next generation of AI applications can do. Fractile is building next-generation inference hardware designed for the workloads now emerging across AI, from agentic coding to scientific discovery and enterprise automation. Read more from the Accel team on why we believe Fractile can become one of the world’s most important AI infrastructure companies, and why we're co-leading the company’s Series B with Factorial Funds and Founders Fund, here: https://lnkd.in/eMDAq_AE

Fractile: The future of AI inference accel.com

3 Comments


Dr. Sanket Ranjanlal Dhurandhar 2w

Strong thesis. The AI race is no longer just about building smarter models — it’s about running intelligence efficiently at scale. Inference optimization will likely define the economics of next-generation AI platforms. Investments like this recognize that the next wave of AI winners may not be model creators, but those enabling continuous, production-grade AI execution. Optimizing inference is effectively optimizing the business model of AI itself — making this a strategically compelling bet.

1 Reaction

RaikaLabs 3w

The next major AI breakthroughs may come less from larger models and more from infrastructure that can run complex workloads faster, cheaper, and sustainably at scale.


More Relevant Posts

Michael Intrator
3w
Proud of the CoreWeave engineering teams behind this one. AI is increasingly being defined by inference economics, and this result reflects the investments we've made across the full stack and the exceptional work our teams are doing every day to build for that future.
Brian Venturo

Chief Strategy Officer at CoreWeave, Inc.
3w

At CoreWeave, we believe AI infrastructure needs to be built differently: optimized for real workloads, not just theoretical performance. Our results in Artificial Analysis's inference benchmark are a strong validation of that approach. CoreWeave delivered the top combination of inference speed and price-performance. This ranking represents the years we’ve spent investing in the full stack: from hardware and interconnects to runtime and model optimization. Because in production AI, none of these layers work in isolation, and customers feel every inefficiency. Seeing that work translate into real-world performance and economics is incredibly rewarding. Proud of the team, and even more excited about what this unlocks for customers creating the next generation of AI products. Read more on our blog: https://lnkd.in/eeCVHKPM
5 Comments
Jack Bailin
3w
CoreWeave just topped Artificial Analysis' inference benchmark for speed AND price-performance. Not by accident. This is the result of years of obsession over every layer of the AI stack showing up directly in the numbers. The results speak for themselves. Couldn't be more proud of the team. 🚀
Mohmmad Hafeez
1w
Just finished reading “Ray: A Distributed Framework for Emerging AI Applications” and it completely changed the way I look at scalable AI systems. What makes this interesting is how it tackles one of the biggest problems in modern AI: handling dynamic workloads efficiently at massive scale. Instead of separating training, simulation, and serving into different systems, Ray unifies them into a single distributed framework using dynamic task scheduling, actor-based computation, and distributed object storage. The result? AI systems capable of executing millions of tasks per second with low latency while still being fault tolerant and scalable. A really insightful read for understanding where the infrastructure behind next-generation AI is heading.
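As a rough illustration of the two programming patterns the post mentions (stateless tasks that return futures, and stateful actors), here is a minimal sketch using only Python's standard library. It does not use Ray's API and nothing in it is distributed; `square`, `Counter`, and `run_demo` are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
import queue
import threading


def square(x: int) -> int:
    """A stateless task: no shared state, so it can run on any worker."""
    return x * x


class Counter:
    """A toy actor: private state, mutated only by its own worker thread."""

    def __init__(self) -> None:
        self._total = 0
        self._inbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Messages are handled one at a time, so _total needs no lock.
        while True:
            amount, reply = self._inbox.get()
            if amount is None:
                reply.put(self._total)
                return
            self._total += amount

    def add(self, amount: int) -> None:
        """Send an 'add' message; returns immediately."""
        self._inbox.put((amount, None))

    def stop(self) -> int:
        """Ask the actor to finish and return its final total."""
        reply: queue.Queue = queue.Queue()
        self._inbox.put((None, reply))
        return reply.get()


def run_demo() -> int:
    counter = Counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Submitting a task returns a future immediately.
        futures = [pool.submit(square, i) for i in range(5)]
        for future in futures:
            counter.add(future.result())
    return counter.stop()
```

Tasks suit work that can be retried or moved freely because they hold no state; the actor keeps state in one place and serialises access to it through its inbox.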
OpenClaw Development

50 followers
1mo
AI just proved you don't need massive models to crush reasoning tasks anymore. A tiny 7-million-parameter model is now outpacing giants a thousand times its size on benchmarks like the ARC Prize. That's recursive reasoning in action: not bigger brains, but smarter loops at inference time.[1] This comes from fresh papers on Hierarchical Reasoning Models (HRMs) and Tiny Recursive Models (TRMs). They smash state-of-the-art on ARC Prize 1 and 2 with just 27 million (HRM) and 7 million (TRM) parameters, trained lean. Standard LLMs hit a wall on deep reasoning because they lack that recursive depth. But add inference-time recursion, and small models get the compute power to punch way above their weight.[1] For businesses, this is massive. We're shifting from brute-force scaling to efficient smarts. OpenAI's already racing ahead, blasting past their 10GW infrastructure goal for Stargate: they've added over 3GW in the last 90 days alone to feed exploding AI demand.[2] Demand's skyrocketing because tools like these make automation viable everywhere, not just for tech giants. Think about it. Routine tasks that bog down teams? Compressed overnight. Leaders wasting hours on commodity work? Now twice as productive, but only if they redirect that time right. This signals the real automation wave: not replacing jobs wholesale, but freeing humans for durable work, the stuff that reframes questions, not just answers them.[3] Meanwhile, compute builds like Stargate show the infrastructure's catching up fast. Smaller, recursive models mean any company can deploy serious AI without million-dollar GPU farms. We're heading to an era where inefficiency gets obliterated, and smart automation runs the backend quietly. This is exactly the kind of breakthrough Katy at Gitwix handles daily, streamlining chaos into smooth ops. How's your team auditing tasks for AI compression? What's the one routine you'd kill off first? #AI #AIAutomation #FutureOfWork
William Elliott
3w
Recursive exits stealth mode with $650M in funding, building AI that doesn't just learn but improves its own improvement algorithms. The company claims its system can rewrite its own training procedures to achieve compound performance gains over time, bypassing the scaling bottlenecks that have plagued large model development. This is a bet that the next frontier isn't bigger models or more data, but recursive self-optimization. If Recursive's approach works, it could redefine how we think about model capability curves and chip away at the "human in the loop" assumption for research itself. For AI engineers and investors watching the compute efficiency race, this signals a shift from brute force to synthetic creativity in architecture design. Full analysis is available here: https://lnkd.in/gcXU7rCg Is trusting an AI to improve its own training loop a breakthrough or a control problem waiting to happen?
Pratik Dhanave
5d
Came across a thoughtful piece from QuantumBlack (AI by McKinsey) on why so many GenAI projects stall between a great demo and a dependable production system. Sharing a few points that stuck with me.

Their core argument: when GenAI projects struggle in production, the problem is usually not the model but the structure around it. Once agentic logic gets woven into a real pipeline, prompts, tool calls, and routing logic end up scattered through code that assumes deterministic execution. The result is failures that are hard to explain and runs that are hard to reproduce.

The point I found most interesting is that we've seen this pattern before. They draw a parallel to early machine learning pipelines (experimentation-first, fragile, hard to reproduce) and note that the fix wasn't better algorithms but engineering discipline: structure, reproducibility, and observability. Their take is that GenAI has hit the same inflection point, just much faster.

A few of the practices they highlight:
* Keep a stable, deterministic backbone that controls what runs and in what order, while letting agents handle reasoning inside well-defined steps.
* Make agent configuration explicit, declaring the model, prompt version, and tools up front, so you can understand why an agent behaved a certain way.
* Treat prompts and evaluation sets as versioned data you can inspect and compare.
* Build observability and evaluation into the system itself rather than bolting them on later.
* Keep framework boundaries clean, so new tools can be adopted without rewriting the whole system.

Their closing thought, which I thought was a nice way to put it: a prototype is something one person can run, while a production system is something a whole team can understand, observe, evaluate, and evolve safely. Worth a read if you're working on getting agentic systems past the demo stage.

Reference: "Generative AI workflows need engineering discipline to scale beyond the demo" (QuantumBlack, AI by McKinsey) https://lnkd.in/dXj7QUEX #GenAI #AgenticAI #MLOps #AIEngineering #MachineLearning
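The practices listed in the post can be made a little more concrete with a small sketch. This is not code from the QuantumBlack article: `AgentConfig`, `Step`, `run_pipeline`, and `stub_agent` are hypothetical names, the model and prompt-version strings are made up, and the "agent" is a plain function standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AgentConfig:
    """Declared up front, so a run can be explained after the fact."""
    model: str
    prompt_version: str
    tools: tuple[str, ...] = ()


@dataclass(frozen=True)
class Step:
    name: str
    config: AgentConfig
    run: Callable[[AgentConfig, str], str]


def stub_agent(config: AgentConfig, text: str) -> str:
    # Placeholder for a model call; deterministic so the sketch is reproducible.
    return f"{text}|{config.prompt_version}"


def run_pipeline(steps: list[Step], text: str) -> tuple[str, list[dict]]:
    """Deterministic backbone: fixed step order, one trace record per step."""
    trace = []
    for step in steps:
        text = step.run(step.config, text)
        trace.append({
            "step": step.name,
            "model": step.config.model,
            "prompt_version": step.config.prompt_version,
            "output": text,
        })
    return text, trace


steps = [
    Step("summarise", AgentConfig("example-model", "summary-v3"), stub_agent),
    Step("classify", AgentConfig("example-model", "label-v1", ("lookup",)), stub_agent),
]
result, trace = run_pipeline(steps, "ticket-42")
```

The backbone decides what runs and in what order; each agent only works inside its step. Because the trace records the prompt version for every step, two runs can be compared without guessing which prompt was live.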
John Michaels [Open Networker]
3w
Great article by Julien Kervizic on the emergence of “Agentic FinOps” and why AI engineers will increasingly need to think beyond model capability alone. As agentic systems grow in complexity, every retry loop, orchestration step, and reasoning chain becomes a direct operational cost. The challenge is no longer just building intelligent systems, but building systems that balance:
* quality
* latency
* reliability
* economic sustainability
A strong perspective on how AI engineering is evolving from pure capability optimisation toward cost-aware system design. https://lnkd.in/evgZfM9Z

Agentic FinOps: Why AI Engineers Must Learn Cost Discipline medium.com
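To make the cost point concrete, here is a back-of-the-envelope sketch of how retries multiply spend in an agentic run. The token counts, prices, success rate, and the function names `expected_attempts` and `expected_run_cost` are illustrative assumptions, not figures or code from the article.

```python
def expected_attempts(success_rate: float, max_attempts: int) -> float:
    """Expected number of calls when retrying until success or the cap.

    Attempt k+1 only happens if the first k attempts all failed.
    """
    fail = 1.0 - success_rate
    return sum(fail ** k for k in range(max_attempts))


def expected_run_cost(
    steps: int,
    in_tokens: int,
    out_tokens: int,
    usd_per_m_in: float,
    usd_per_m_out: float,
    success_rate: float,
    max_attempts: int,
) -> float:
    """Expected USD cost of one run, assuming every step has the same shape."""
    per_call = (in_tokens * usd_per_m_in + out_tokens * usd_per_m_out) / 1_000_000
    return steps * per_call * expected_attempts(success_rate, max_attempts)


# Hypothetical workload: 10 steps, 2,000 input and 500 output tokens per call,
# priced at $1 and $4 per million tokens.
no_retries = expected_run_cost(10, 2000, 500, 1.0, 4.0, 1.0, 3)
flaky = expected_run_cost(10, 2000, 500, 1.0, 4.0, 0.5, 3)
```

Under these assumed numbers, a step that succeeds only half the time and is retried up to three times costs 1.75 times as much as one that always succeeds, before any change in model or prompt.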
Utpal D.
6d
This WSJ piece by Andy Kessler makes a really insightful point that complements the ideas in my recent post on the Pricing Conundrum about the 2026 AI renewal cliff. It’s worth a read (unlocked link below). Its Global Crossing analogy to today’s AI pricing setup is exactly what I was talking about: demand for AI can and will grow, even to the levels predicted, but not at current prices. And AI pricing strategy is going to have to adjust quickly as input costs drop substantially. I argued that renewed AI contracts this year (and next) may not necessarily mean the original value story is holding up; analysts should pay more attention to the details behind the renewal contracts, such as whether they expanded, reduced, converted to usage-based pricing, were tied to narrower outcomes, or pushed through by inertia. When (not if, as Kessler points out) token costs fall significantly, CFOs and procurement teams will have greater leverage. AI adoption and use will increase, but today’s pricing math and the math a year or 18 months from now will be very different. My post: https://lnkd.in/g86pE7jW WSJ piece: https://lnkd.in/gniN6T3H

Opinion | The Hallucinatory AI Math wsj.com

1 Comment
Alex M.
3w
Transformers by Hugging Face isn't just a library - it's a revolution in machine learning infrastructure. By providing an intuitive framework that spans multiple domains, it's democratizing advanced AI development for researchers and practitioners worldwide. The library's ability to abstract complex model architectures while maintaining performance is genuinely transformative. How is your team leveraging cutting-edge ML frameworks to accelerate innovation?