Inference Models

Name	Model ID	Input Price ($/1M tokens)	Output Price ($/1M tokens)	Description
Mistral Nemo 12B Instruct	mistral-nemo-12b-instruct	0.04	0.10	Provider: Inference, Context: 16000, Output Limit: 4096
Google Gemma 3	gemma-3	0.15	0.30	Provider: Inference, Context: 125000, Output Limit: 4096
Osmosis Structure 0.6B	osmosis-structure-0.6b	0.10	0.50	Provider: Inference, Context: 4000, Output Limit: 2048
Qwen 3 Embedding 4B	qwen3-embedding-4b	0.01	0.00	Provider: Inference, Context: 32000, Output Limit: 2048
Qwen 2.5 7B Vision Instruct	qwen-2.5-7b-vision-instruct	0.20	0.20	Provider: Inference, Context: 125000, Output Limit: 4096
Llama 3.2 11B Vision Instruct	llama-3.2-11b-vision-instruct	0.06	0.06	Provider: Inference, Context: 16000, Output Limit: 4096
Llama 3.1 8B Instruct	llama-3.1-8b-instruct	0.03	0.03	Provider: Inference, Context: 16000, Output Limit: 4096
Llama 3.2 3B Instruct	llama-3.2-3b-instruct	0.02	0.02	Provider: Inference, Context: 16000, Output Limit: 4096
Llama 3.2 1B Instruct	llama-3.2-1b-instruct	0.01	0.01	Provider: Inference, Context: 16000, Output Limit: 4096
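
As a rough illustration of how the prices above translate into a per-request cost, here is a minimal sketch. It assumes the listed prices are in USD per one million tokens and that input and output tokens are billed separately. The `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced for this example, not part of any provider API, and only a few rows of the table are copied in.

```python
from decimal import Decimal

# (input price, output price) copied from the table above.
# Assumption: prices are USD per 1,000,000 tokens.
PRICES = {
    "mistral-nemo-12b-instruct": (Decimal("0.04"), Decimal("0.10")),
    "gemma-3": (Decimal("0.15"), Decimal("0.30")),
    "llama-3.1-8b-instruct": (Decimal("0.03"), Decimal("0.03")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for a single request (hypothetical helper)."""
    input_price, output_price = PRICES[model_id]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens on Mistral Nemo 12B Instruct.
    cost = estimate_cost("mistral-nemo-12b-instruct", 10_000, 2_000)
    print(f"${cost:.6f}")
```

Decimal is used instead of float so that very small per-request amounts do not pick up binary rounding noise when summed across many requests.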