# deepseek-v3.1
DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases: the 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
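The UE8M0 scale format mentioned above is an exponent-only encoding from the microscaling family: 8 unsigned exponent bits, 0 mantissa bits, so each block scale is a power of two. As a rough illustration (a minimal sketch assuming the standard bias-127 convention for E8M0 block scales; the helper names are illustrative, not DeepSeek's code):

```python
import math

def to_ue8m0(scale: float) -> int:
    """Encode a positive scale as a UE8M0 byte: an 8-bit unsigned,
    biased exponent representing the nearest power of two."""
    assert scale > 0, "UE8M0 encodes positive scales only"
    e = round(math.log2(scale))     # nearest power-of-two exponent
    e = max(-127, min(127, e))      # clamp to the representable range
    return e + 127                  # apply bias 127

def from_ue8m0(byte: int) -> float:
    """Decode a UE8M0 byte back to its power-of-two scale."""
    return 2.0 ** (byte - 127)

# A per-block scale of 0.04 rounds to the nearest power of two, 2**-5
b = to_ue8m0(0.04)
print(b, from_ue8m0(b))  # 122 0.03125
```

Restricting scales to powers of two keeps FP8 quantization hardware-friendly: rescaling becomes an exponent add rather than a multiply.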
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| vercel | vercel | Input: $0.30 | Output: $1.00 | DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. We have expanded our dataset by collecting additional long documents and substantially extending both training phases: the 32K extension phase has been increased 10-fold to 630B tokens, while the 128K extension phase has been extended 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats. | |
| poe | poe | Input: $7,800.00 | Output: - | Latest update (Terminus): addresses key user-reported issues while maintaining all original capabilities, namely language consistency (reduced instances of mixed Chinese-English text and abnormal characters) and enhanced agent capabilities (optimized Code Agent and Search Agent performance). Core capabilities: DeepSeek-V3.1 is a hybrid model supporting both thinking and non-thinking modes, built upon the original V3 base checkpoint through a two-phase long-context extension approach. Technical specifications: context window of 128k tokens; accepts PDF, DOC, and XLSX files; does not accept audio or video files. | |
| nvidia | models-dev | Input: $0.00 | Output: $0.00 | Provider: Nvidia, Context: 128000, Output Limit: 8192 | |
| siliconflowcn | models-dev | Input: $0.27 | Output: $1.00 | Provider: SiliconFlow (China), Context: 164000, Output Limit: 164000 | |
| chutes | models-dev | Input: $0.20 | Output: $0.80 | Provider: Chutes, Context: 163840, Output Limit: 65536 | |
| azure | models-dev | Input: $0.56 | Output: $1.68 | Provider: Azure, Context: 131072, Output Limit: 131072 | |
| siliconflow | models-dev | Input: $0.27 | Output: $1.00 | Provider: SiliconFlow, Context: 164000, Output Limit: 164000 | |
| iflowcn | models-dev | Input: $0.00 | Output: $0.00 | Provider: iFlow, Context: 128000, Output Limit: 64000 | |
| synthetic | models-dev | Input: $0.56 | Output: $1.68 | Provider: Synthetic, Context: 128000, Output Limit: 128000 | |
| submodel | models-dev | Input: $0.20 | Output: $0.80 | Provider: submodel, Context: 75000, Output Limit: 163840 | |
| azurecognitiveservices | models-dev | Input: $0.56 | Output: $1.68 | Provider: Azure Cognitive Services, Context: 131072, Output Limit: 131072 | |
| deepinfra | litellm | Input: $0.27 | Output: $1.00 | Source: deepinfra, Context: 163840 | |
| sambanova | litellm | Input: $3.00 | Output: $4.50 | Source: sambanova, Context: 32768 | |
| wandb | litellm | Input: $55,000.00 | Output: $165,000.00 | Source: wandb, Context: 128000 |
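Since all prices above are quoted in dollars per million tokens, the cost of a single request follows directly from its token counts. A minimal sketch (prices taken from the table; the function name is illustrative):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request, given $/1M-token input and output prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: 50K input + 2K output tokens on deepinfra ($0.27 in, $1.00 out)
cost = request_cost(50_000, 2_000, 0.27, 1.00)
print(f"${cost:.4f}")  # $0.0155
```

The same arithmetic makes outliers easy to sanity-check: at wandb's listed rates, the identical request would come to over $3,000, suggesting those figures are quoted in different units.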