Groq

AI Infrastructure · AI Chips

Headquarters United States

Headcount 201-500 employees

Updated Jul 2026

What Groq Does

Groq is an AI inference company headquartered in Mountain View, California, founded in 2016 by Jonathan Ross, the inventor of the Google TPU, to deliver the world's fastest inference for large language models. Groq raised $1.75 billion in total funding at a $6.9 billion valuation before NVIDIA agreed in December 2025 to pay $20 billion to license Groq's Language Processing Unit (LPU) technology, the largest technology licensing deal in semiconductor history.

Groq continues to operate independently under CEO Simon Edwards, expanding GroqCloud as a public inference API platform. The LPU is a deterministic streaming dataflow architecture purpose-built for transformer model inference. Unlike GPUs, which handle diverse workloads with shared memory and unpredictable scheduling, the LPU executes attention and feed-forward layers as fixed-function hardware pipelines that avoid memory bandwidth bottlenecks, delivering token generation 5 to 10 times faster than GPU-based alternatives.

GroqCloud benchmarks show 1,345 tokens per second on Llama-3 8B and 662 tokens per second on Qwen-3 32B, with first-token latency under 100 ms at scale. Pricing starts at $0.05 per million input tokens, so Groq pairs its speed advantage with some of the lowest prices among inference APIs.
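To make the figures above concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the quoted throughput holds steady for a whole response, treats the 100 ms first-token latency as an upper bound, and applies the $0.05 starting price to every input token. The function names are hypothetical helpers, not part of any Groq SDK, and output-token pricing is not covered because the profile does not state it.

```python
# Figures quoted in the profile above; real-world values vary by model and load.
LLAMA3_8B_TOKENS_PER_SECOND = 1345.0
QWEN3_32B_TOKENS_PER_SECOND = 662.0
FIRST_TOKEN_LATENCY_SECONDS = 0.1      # "sub-100ms", used here as an upper bound
INPUT_PRICE_USD_PER_MILLION = 0.05     # "starts at" price; assumed to apply to all input


def generation_seconds(output_tokens: int,
                       tokens_per_second: float,
                       first_token_latency_s: float = FIRST_TOKEN_LATENCY_SECONDS) -> float:
    """Estimate wall-clock time to stream a response of `output_tokens` tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return first_token_latency_s + output_tokens / tokens_per_second


def input_cost_usd(input_tokens: int,
                   usd_per_million: float = INPUT_PRICE_USD_PER_MILLION) -> float:
    """Estimate the input-token cost of a request or batch of requests."""
    return input_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    # A 500-token reply on each benchmarked model.
    for name, tps in [("Llama-3 8B", LLAMA3_8B_TOKENS_PER_SECOND),
                      ("Qwen-3 32B", QWEN3_32B_TOKENS_PER_SECOND)]:
        print(f"{name}: ~{generation_seconds(500, tps):.2f} s for 500 tokens")
    # Two million input tokens at the starting price.
    print(f"Input cost: ${input_cost_usd(2_000_000):.2f}")
```

Under these assumptions, a 500-token reply finishes in well under a second on either benchmarked model, which is the kind of latency budget that voice and agentic applications depend on.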

The platform supports leading open-source models including Meta Llama, Mistral, Gemma, and Qwen. Groq serves AI developers building real-time applications where inference speed is the critical bottleneck: voice AI systems requiring natural conversation cadence, code generation tools needing sub-second completions, agentic AI systems executing multi-step reasoning quickly, and customer-facing applications where latency directly affects user experience.

NVIDIA's integration of the LPU architecture into its Groq 3 LPX inference accelerator validates the streaming dataflow approach for inference workloads and positions GroqCloud as a reference benchmark for AI inference performance.

Company Snapshot

Profile updated Jul 5, 2026

Categories 2

Related categories

AI Infrastructure: Companies that build the compute foundation of artificial intelligence: the AI accelerators (GPUs and custom silicon), GPU cloud platforms, …

AI Chips: Companies designing the semiconductors that train and run artificial intelligence: GPUs, wafer-scale processors, LPUs, and other AI accelerators. These …
