Model Inference API - Search News

19h

OrcaRouter Launches the Open LLM API Router -- Zero Markup, MIT-Licensed, 100+ Models

Today, Continuum AI released OrcaRouter and OrcaRouter Lite — a unified inference layer that routes across 200+ frontier and open-source language models, with zero markup on BYOK traffic.

The Next Web

OpenAI launches GPT-Realtime-2 and two new voice API models

The three are GPT-Realtime-2, a successor to the company’s existing realtime voice model with what OpenAI describes as GPT-5-class reasoning; GPT-Realtime-Translate, a live translation model with more ...

Firefox maker torches Google for building Prompt API into browser

The Prompt API, as Google describes it, "gives web pages the ability to directly prompt a browser-provided language model." ...

Security Boulevard

Open vs. Closed Weight Models and Why You Need Confidential Inference Either Way

The open vs. closed AI model debate misses the bigger issue. Confidential inference secures model weights and data during runtime.

10d

DigitalOcean Launches Inference Engine with New Capabilities for Production AI, Including Inference Router for Efficient Scaling of Agentic Workloads

DigitalOcean (NYSE: DOCN) today announced the launch of its Inference Engine, a set of new production capabilities that give AI builders exceptional performance and unified control over how they run, ...

11d

Grath Launches Topa, a Managed Inference Platform for Financial Reconciliation

The company behind the Grath reconciliation platform introduces AI infrastructure for financial services teams building their ...

i-SCOOP

Nebius AI cloud for training and inference at scale

Explore Nebius, the AI cloud built for GPU intensive training, scalable inference, managed ML tools and real world AI ...

The Next Web

Nebius paid $643 million for 20 people because inference is where the money is

Nebius pays $643M for Eigen AI, a 20-person MIT spinout that maximises tokens per GPU. In the neocloud race, inference optimisation is the competitive edge.

Meet ZAYA1-8B, a super efficient, open reasoning model trained on AMD Instinct MI300 GPUs

The real headline is what ZAYA1-8B was trained on: a full stack of AMD Instinct MI300 graphics processing units (GPUs), the ...

Ventureburn

DeepInfra Raises $107M To Scale Global Inference Infrastructure

DeepInfra raises $107M to expand global inference capacity, support new AI models, and enhance developer tooling across its ...

FintechNewsSG

Agnes AI Opens Zenmux API Access, Launches Token Plans for Developers

Agnes AI expands Zenmux API access and launches low-cost token plans, advancing its integrated multimodal AI platform.

Vietnam Investment Review on MSN

Phancy Group scales computing resources to boost API business

HONG KONG SAR - Media OutReach Newswire - 30 April 2026 - Phancy Group Co., Ltd. (Stock Code: 6682.HK) announced today that it proposes to purchase GPU servers and related accessories at a total ...
