AI inference · x402 native

Inference without the invoice.

beamr serves frontier models on demand and settles every request instantly with x402. No subscriptions, no minimums — just pay for the tokens you use.

Trusted by builders at 200+ AI startups

Inference, evolved.

Open weights · Frontier · Vision · 100+ models

Model garden

Call any frontier or open-weight model through a single endpoint — swap GPT-class, Llama, Qwen, or your own fine-tune in one line.

Pay per call · USDC native · No minimums · Instant settle

x402 billing

Every request settles the instant it completes. No subscriptions, no invoices — just metered tokens paid in USDC over x402.

Sub-50ms · Global edge · Autoscale · No cold start

Edge inference

Models run at the edge, close to your users. Warm pools and autoscaling keep latency low under any load.

  • Unified API
  • x402 settlement
  • Elastic GPUs
  • Observability

Unified API

Call any model — frontier or open-weight — through one OpenAI-compatible endpoint. Switch models with a single parameter.
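
To make the "single parameter" idea concrete, here is a minimal sketch of an OpenAI-style chat completion request in which only the `model` field changes. The base URL and the `buildChatRequest` helper are hypothetical and are not taken from beamr's documentation; the `/chat/completions` path and the `model` + `messages` body follow the usual OpenAI convention.

```typescript
// Hypothetical base URL; the real endpoint is not given on this page.
const BASE_URL = "https://api.beamr.example/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build an OpenAI-style chat completion request (illustrative helper).
// The URL, headers, and message format are identical for every model;
// only the `model` string in the body differs.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): HttpRequest {
  return {
    url: BASE_URL + "/chat/completions",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + apiKey,
    },
    body: JSON.stringify({ model, messages }),
  };
}

const demoMessages: ChatMessage[] = [{ role: "user", content: "Hello" }];

// Same call shape, two different models.
const llamaRequest = buildChatRequest("test-key", "llama-3.3-70b", demoMessages);
const qwenRequest = buildChatRequest("test-key", "qwen-2.5-72b", demoMessages);
```

Because the helper only builds the request, you can inspect or log it before sending it with whichever HTTP client you prefer.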

x402 settlement

Every request is metered and settled the moment it completes — paid in USDC. No invoices, no subscriptions, no surprises.
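
For readers new to x402, the flow can be sketched roughly as: call the API, receive `402 Payment Required` with the payment requirements, sign a payment, and retry with the payment attached. The sketch below is an assumption-laden illustration, not beamr's client. `SendFn`, `SignFn`, `callWithX402`, and the in-memory demo server and wallet are all hypothetical, and the `X-PAYMENT` / `X-PAYMENT-RESPONSE` header names follow the x402 v1 convention, so they may differ in other protocol versions.

```typescript
// Minimal HTTP response shape so the sketch runs without a network stack.
interface PaidResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Performs the API call with the given extra headers (hypothetical).
type SendFn = (extraHeaders: Record<string, string>) => Promise<PaidResponse>;

// Turns the server's payment requirements into a signed payload.
// In practice a wallet library would do this (hypothetical).
type SignFn = (requirementsJson: string) => Promise<string>;

async function callWithX402(send: SendFn, signPayment: SignFn): Promise<PaidResponse> {
  const first = await send({});
  // Anything other than 402 Payment Required needs no payment step.
  if (first.status !== 402) return first;

  // The 402 body describes what the server accepts; sign it and retry once.
  const payment = await signPayment(first.body);
  return send({ "X-PAYMENT": payment });
}

// In-memory stand-ins for a server and a wallet, for illustration only.
const demoServer: SendFn = async (extraHeaders) =>
  extraHeaders["X-PAYMENT"]
    ? { status: 200, headers: { "X-PAYMENT-RESPONSE": "demo-receipt" }, body: "ok" }
    : { status: 402, headers: {}, body: JSON.stringify({ accepts: [{ asset: "USDC" }] }) };

const demoWallet: SignFn = async (requirementsJson) =>
  "signed(" + requirementsJson.length + ")";
```

Calling `callWithX402(demoServer, demoWallet)` walks through both legs of the exchange: the unpaid attempt that returns 402, then the paid retry.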

Elastic GPUs

Capacity scales with demand. Warm pools eliminate cold starts, and traffic spikes are routed automatically to the fastest available region.
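
As a rough illustration of demand-based scaling, the sketch below computes a replica count from request rate and clamps it to policy bounds. This is a generic autoscaling calculation, not beamr's scheduler. `ScalingPolicy`, `desiredReplicas`, and the per-replica throughput figure are hypothetical; the bounds of 0 and 240 match the `min_replicas` and `max_replicas` values in the scaling.policy panel on this page.

```typescript
interface ScalingPolicy {
  minReplicas: number;
  maxReplicas: number;
}

// Replicas needed to serve `requestsPerSecond` when each replica handles
// `perReplicaRps`, clamped to the policy bounds.
function desiredReplicas(
  requestsPerSecond: number,
  perReplicaRps: number,
  scalingPolicy: ScalingPolicy
): number {
  if (perReplicaRps <= 0) throw new Error("perReplicaRps must be positive");
  const needed = Math.ceil(requestsPerSecond / perReplicaRps);
  return Math.min(scalingPolicy.maxReplicas, Math.max(scalingPolicy.minReplicas, needed));
}

// Bounds taken from the scaling.policy panel; 50 requests per second per
// replica is an assumed figure for the example.
const demoPolicy: ScalingPolicy = { minReplicas: 0, maxReplicas: 240 };

const idleReplicas = desiredReplicas(0, 50, demoPolicy);       // scales to zero
const busyReplicas = desiredReplicas(1200, 50, demoPolicy);    // 1200 / 50 = 24
const spikeReplicas = desiredReplicas(50000, 50, demoPolicy);  // 1000 needed, capped at 240
```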

Full observability

Trace every call — latency, tokens, spend and model version — streamed live to your dashboard and exportable on demand.
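
If you export traces, a daily summary like the usage panel on this page could be derived along these lines. The `CallTrace` record shape and the `summarizeUsage` helper are assumptions made for illustration; the actual export format is not described here.

```typescript
// Hypothetical shape of one exported trace record.
interface CallTrace {
  model: string;
  modelVersion: string;
  latencyMs: number;
  tokens: number;
  costUsd: number;
  ok: boolean;
}

interface UsageSummary {
  tokens: number;
  spendUsd: number;
  avgLatencyMs: number;
  topModel: string; // model with the most tokens; "" when there are no traces
  errorRate: number; // fraction of failed calls, 0..1
}

function summarizeUsage(traces: CallTrace[]): UsageSummary {
  const tokensByModel: Record<string, number> = {};
  let tokens = 0;
  let spendUsd = 0;
  let latencyTotal = 0;
  let errors = 0;

  for (const t of traces) {
    tokens += t.tokens;
    spendUsd += t.costUsd;
    latencyTotal += t.latencyMs;
    if (!t.ok) errors += 1;
    tokensByModel[t.model] = (tokensByModel[t.model] || 0) + t.tokens;
  }

  // Pick the model that consumed the most tokens.
  let topModel = "";
  let topTokens = -1;
  for (const model of Object.keys(tokensByModel)) {
    const count = tokensByModel[model] || 0;
    if (count > topTokens) {
      topModel = model;
      topTokens = count;
    }
  }

  const n = traces.length;
  return {
    tokens,
    spendUsd,
    avgLatencyMs: n > 0 ? latencyTotal / n : 0,
    topModel,
    errorRate: n > 0 ? errors / n : 0,
  };
}

// Made-up sample data, for illustration only.
const demoTraces: CallTrace[] = [
  { model: "qwen-2.5-72b", modelVersion: "v1", latencyMs: 41, tokens: 900, costUsd: 0.0015, ok: true },
  { model: "llama-3.3-70b", modelVersion: "v1", latencyMs: 47, tokens: 1284, costUsd: 0.0021, ok: true },
  { model: "qwen-2.5-72b", modelVersion: "v1", latencyMs: 53, tokens: 600, costUsd: 0.001, ok: false },
];

const demoSummary = summarizeUsage(demoTraces);
```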

chat.ts
// Create a client from your API key.
const beam = new Beamr(apiKey)

// Request a chat completion and settle it over x402.
const res = await beam.chat({
  model: "llama-3.3-70b",
  messages,
  settle: "x402"
})
receipt
Request: chat.completion
Tokens: 1,284
Cost: $0.0021 USDC
Settled: block #19,402,118
Status: ✓ confirmed
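
As a quick sanity check on the receipt figures, the arithmetic below derives the per-token rate they imply. It assumes a single flat price per token, which is a simplification: real pricing may differ by model or between input and output tokens, and the receipt's cost is probably rounded. The helper names are illustrative.

```typescript
// Cost in USD for a call, given a flat price per million tokens.
function costUsd(tokens: number, pricePerMillionUsd: number): number {
  return (tokens / 1e6) * pricePerMillionUsd;
}

// Price per million tokens implied by an observed receipt.
function impliedPricePerMillion(tokens: number, cost: number): number {
  return (cost / tokens) * 1e6;
}

// Receipt above: 1,284 tokens for $0.0021 USDC.
const receiptRate = impliedPricePerMillion(1284, 0.0021);
// receiptRate is roughly 1.64 (USD per million tokens)
```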
scaling.policy
region: auto
min_replicas: 0
max_replicas: 240
cold_start: none
p50_latency: 41 ms
usage · today
Tokens: 4.2M
Spend: $8.74
Avg latency: 47 ms
Top model: qwen-2.5-72b
Errors: 0.00%

Built for builders.

Join the early-access network and start shipping pay-per-call AI in minutes. No credit card — your first requests are on us.
