
Inferact Launches with $150M to Commercialize vLLM Inference Engine

Creators of popular open source project spin out VC-backed startup at $800M valuation to optimize AI model deployment.

TechDrop Editorial

Inferact, a startup founded by the creators of the popular open source vLLM project, launched on January 22 with $150 million in seed funding at an $800 million valuation.

Backing

Andreessen Horowitz and Lightspeed led the round, with participation from Databricks' venture capital arm and the UC Berkeley Chancellor's Fund.

Mission

Inferact aims to optimize the inference phase of AI models—a critical bottleneck in deploying large language models efficiently. The company will continue supporting the open source vLLM project while layering proprietary services including enterprise support, managed deployments, and specialized extensions.

Technical Capabilities

vLLM reduces wasted GPU memory, notably through its PagedAttention technique for managing the attention key-value cache, and boosts throughput by batching requests together so that each model step generates tokens for many requests at once rather than serving them one at a time. These optimizations can significantly reduce response latency for users.
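As a rough illustration of why batching matters, the toy sketch below (plain Python, not vLLM code; the function names and step-counting model are invented for this example) compares the number of model forward passes needed to serve several requests one at a time versus in a single batch:

```python
# Toy model: each "step" is one forward pass of the model, and each
# forward pass can produce one token per sequence it contains.

def serve_sequential(num_requests, tokens_per_reply):
    """Decode each request fully before starting the next one."""
    steps = 0
    for _ in range(num_requests):
        steps += tokens_per_reply  # one forward pass per token, per request
    return steps

def serve_batched(num_requests, tokens_per_reply):
    """Decode one token for every active request in each forward pass."""
    steps = 0
    for _ in range(tokens_per_reply):
        steps += 1  # a single batched pass advances all requests together
    return steps

# 8 requests of 16 tokens each: 128 passes sequentially vs. 16 batched
# (assuming the hardware has room to batch all 8 sequences).
print(serve_sequential(8, 16), serve_batched(8, 16))  # → 128 16
```

Real serving engines are far more involved (requests arrive and finish at different times, and memory limits cap the batch size, which is where efficient KV-cache management comes in), but the arithmetic above is the core of the latency and throughput win.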

Industry Context

Co-founder Woosuk Kwon wrote: "We see a future where serving AI becomes effortless." The company represents the AI industry's shift from training bottlenecks to inference bottlenecks, as deployed models need to serve millions of users efficiently.

Inferact plans to release new performance optimizations and support for emerging AI architectures, enabling vLLM to run on more types of data center hardware.
