Staxly
ai-api

Together AI

Open-source LLM infra — inference + fine-tuning + dedicated GPUs + image/video/audio

Full-stack infra for open-source AI: serverless inference (200+ models), dedicated GPU endpoints, fine-tuning, image/video/audio models, Code Sandbox.

Together AI websiteDocs ↗

Pricing

TierPriceNotes
Pay-as-you-goFreePer-token pricing for serverless inference. No minimum.
Dedicated EndpointsFreeSingle-tenant GPU endpoints billed hourly.
Batch API (50% off)Free50% discount for async batch processing on most serverless models.
Reserved GPU ClustersFree6+ day commitments with discounted reserved rates.
EnterpriseCustomCustom. Private deployments, VPC, SLAs, dedicated support.

Limits

TierMetricValueNotes
audio whisper$0.0015/min (Whisper Large v3)Whisper
code interpreter$0.03 per 60-min sessionCode Interpreter
deepseek r1$3/M input + $7/M outputDeepSeek-R1
fine tune 70 100b$2.90-$8.00 per 1M tokensFine-tune large models
fine tune up to 16b$0.48-$1.35 per 1M tokensFine-tune small models
gemma 3n e4b$0.06/M input + $0.12/M outputGemma 3n E4B (cheapest)
gemma 4 31b$0.20/M input + $0.50/M outputGemma 4 31B
glm 5 1$1.40/M input + $4.40/M outputGLM-5.1
gpu b200 single$9.95/hr (1x B200 180GB)B200 dedicated
gpu h100 single$3.99/hr (1x H100 80GB)H100 dedicated
image flux pro$0.03/image (FLUX.2 pro)FLUX.2 pro
image flux schnell$0.0027/imageFLUX.1 schnell
llama 3 3 70b$0.88/M I/OLlama 3.3 70B
qwen3 5 397b$0.60/M input + $3.60/M outputQwen3.5 397B
qwen3 5 9b$0.10/M input + $0.15/M outputQwen3.5 9B (budget)
storage rate$0.16/GiB/monthStorage
tts cartesia$65/1M characters (Cartesia Sonic-3)TTS
video veo3$1.60/video (Google Veo 3.0)Veo 3.0

Features

Developer interfaces

SlugNameKindVersion
code-sandboxCode Sandbox / Interpreterrestv1
dedicated-endpointsDedicated Endpointsrestv1
cliTogether CLIcli1.x
sdk-nodetogether-jssdk0.x
sdk-pythontogether-pythonsdk1.x
rest-apiTogether REST API (OpenAI-compat)restv1

Compare Together AI with

Staxly is an independent catalog of developer platforms. Outbound links to Together AI are plain references to their official pages. Pricing is verified at publication time — reconfirm on the vendor site before buying.