Inferact Launches with $150M to Commercialize vLLM Inference Engine
Creators of popular open-source project spin out VC-backed startup at $800M valuation to optimize AI model deployment.
Inferact, a startup founded by the creators of the open-source vLLM project, launched on January 22 with $150 million in seed funding at an $800 million valuation.
Backing
Andreessen Horowitz and Lightspeed led the round, with participation from Databricks' venture capital arm and the UC Berkeley Chancellor's Fund.
Mission
Inferact aims to optimize the inference phase of AI models, a critical bottleneck in deploying large language models efficiently. The company will continue supporting the open-source vLLM project while layering proprietary services on top, including enterprise support, managed deployments, and specialized extensions.
Technical Capabilities
vLLM reduces the memory a model consumes while serving requests and can speed up inference by generating multiple tokens per step rather than one at a time. Together, these optimizations can significantly reduce the response latency users experience.
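
The latency effect of emitting several tokens per step can be illustrated with back-of-the-envelope arithmetic. The Python sketch below is not vLLM code. The function names, the 20 ms step time, and the 256-token response are hypothetical values chosen for illustration, and it assumes, optimistically, that every step costs the same no matter how many tokens it emits.

```python
import math


def decode_steps(num_tokens: int, tokens_per_step: int) -> int:
    """Sequential decoding steps needed to emit num_tokens."""
    if num_tokens < 0 or tokens_per_step < 1:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step >= 1")
    return math.ceil(num_tokens / tokens_per_step)


def estimated_latency_ms(num_tokens: int, tokens_per_step: int, step_ms: float) -> float:
    """Rough response latency under a fixed per-step cost.

    Assumption (idealized): a step takes step_ms regardless of how many
    tokens it produces. Real systems pay extra per step and do not always
    keep every token they attempt, so actual gains are smaller.
    """
    return decode_steps(num_tokens, tokens_per_step) * step_ms


if __name__ == "__main__":
    # Hypothetical workload: a 256-token response at 20 ms per step.
    one_at_a_time = estimated_latency_ms(256, 1, 20.0)
    four_per_step = estimated_latency_ms(256, 4, 20.0)
    print(f"1 token/step:  {one_at_a_time:.0f} ms")
    print(f"4 tokens/step: {four_per_step:.0f} ms")
```

Under these assumptions, the same response needs a quarter as many sequential steps when four tokens are produced per step, which is where the latency reduction comes from.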
Industry Context
Co-founder Woosuk Kwon wrote: "We see a future where serving AI becomes effortless." The company's launch reflects the AI industry's shift from training bottlenecks to inference bottlenecks, as deployed models need to serve millions of users efficiently.
Inferact plans to release new performance optimizations and support for emerging AI architectures, enabling vLLM to run on more types of data center hardware.