
DeepSeek V3.1

deepseek-v3.1

DeepSeek-V3.1 is post-trained on top of DeepSeek-V3.1-Base, which is built upon the original V3 base checkpoint through a two-phase long-context extension approach, following the methodology outlined in the original DeepSeek-V3 report. The dataset was expanded with additional long documents, and both training phases were substantially extended: the 32K extension phase was increased 10-fold to 630B tokens, and the 128K extension phase was extended 3.3x to 209B tokens. Additionally, DeepSeek-V3.1 is trained using the UE8M0 FP8 scale data format to ensure compatibility with microscaling data formats.
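For readers unfamiliar with the scale format mentioned above: in the OCP Microscaling (MX) formats, UE8M0 is an unsigned 8-bit encoding with 8 exponent bits and no mantissa, used as a power-of-two scale factor for a block of low-precision elements. The sketch below decodes such a byte under the usual MX convention (bias 127, all-ones reserved as NaN); it is an illustration based on that spec, not code from DeepSeek.

```python
def decode_ue8m0(byte: int) -> float:
    """Decode a UE8M0 scale byte to its power-of-two value.

    UE8M0: unsigned, 8 exponent bits, 0 mantissa bits, bias 127.
    The all-ones pattern (0xFF) is reserved as NaN per the MX convention.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("UE8M0 is a single byte")
    if byte == 0xFF:
        return float("nan")
    return 2.0 ** (byte - 127)

# A byte of 127 encodes a scale of exactly 1.0; 128 encodes 2.0.
one = decode_ue8m0(127)   # 1.0
two = decode_ue8m0(128)   # 2.0
```

Because every representable scale is an exact power of two, rescaling a block is a pure exponent shift with no rounding of the mantissa, which is what makes the format attractive for blockwise FP8 quantization.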

Available at 14 Providers

Provider Source Input Price ($/1M) Output Price ($/1M) Description Free
vercel vercel Input: $0.30 Output: $1.00 Post-trained on DeepSeek-V3.1-Base with two-phase long-context extension (32K phase: 630B tokens; 128K phase: 209B tokens); trained in the UE8M0 FP8 scale data format
poe poe Input: $7,800.00 Output: - Latest update (Terminus): reduced instances of mixed Chinese-English text and abnormal characters; optimized Code Agent and Search Agent performance. Hybrid model supporting both thinking and non-thinking modes, built on the original V3 base checkpoint via two-phase long-context extension. Context window: 128k tokens; accepts PDF, DOC, and XLSX files; does not accept audio or video files
nvidia models-dev Input: $0.00 Output: $0.00 Provider: Nvidia, Context: 128000, Output Limit: 8192
siliconflowcn models-dev Input: $0.27 Output: $1.00 Provider: SiliconFlow (China), Context: 164000, Output Limit: 164000
chutes models-dev Input: $0.20 Output: $0.80 Provider: Chutes, Context: 163840, Output Limit: 65536
azure models-dev Input: $0.56 Output: $1.68 Provider: Azure, Context: 131072, Output Limit: 131072
siliconflow models-dev Input: $0.27 Output: $1.00 Provider: SiliconFlow, Context: 164000, Output Limit: 164000
iflowcn models-dev Input: $0.00 Output: $0.00 Provider: iFlow, Context: 128000, Output Limit: 64000
synthetic models-dev Input: $0.56 Output: $1.68 Provider: Synthetic, Context: 128000, Output Limit: 128000
submodel models-dev Input: $0.20 Output: $0.80 Provider: submodel, Context: 75000, Output Limit: 163840
azurecognitiveservices models-dev Input: $0.56 Output: $1.68 Provider: Azure Cognitive Services, Context: 131072, Output Limit: 131072
deepinfra litellm Input: $0.27 Output: $1.00 Source: deepinfra, Context: 163840
sambanova litellm Input: $3.00 Output: $4.50 Source: sambanova, Context: 32768
wandb litellm Input: $55,000.00 Output: $165,000.00 Source: wandb, Context: 128000
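All prices in the table are quoted per million tokens, so the dollar cost of a single request is a straightforward weighted sum. The sketch below shows the arithmetic, using vercel's listed rates ($0.30 input / $1.00 output) as the example; the function name is ours, not part of any provider API.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# 100k input + 20k output tokens at $0.30 / $1.00 per million:
cost = request_cost(100_000, 20_000, 0.30, 1.00)  # 0.05
```

Note that providers also differ in context and output limits (e.g. sambanova caps context at 32,768 tokens), so the cheapest per-token rate is not always usable for long-context workloads.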