Mantle
Inference API
Private model access, behind one endpoint.
Use an issued bearer token with OpenAI-compatible routes. Requests are routed by model name to the right inference service.
-
Models
/v1/models -
Gemma 4 31B
/gemma-4-31b/v1/chat/completions -
Gemma 4 26B A4B
/gemma-4-26b-a4b/v1/chat/completions -
Pattern
/{model-name}/v1/...