Chinese AI developer DeepSeek has released its latest "experimental" model, which it said was more efficient to train and better at processing long sequences of text than previous iterations of its large language models.
The Hangzhou-based company called DeepSeek-V3.2-Exp an “intermediate step toward our next-generation architecture” in a post on developer forum Hugging Face.
That next-generation architecture is likely to be DeepSeek's most significant product release since V3 and R1 shocked Silicon Valley and tech investors outside China.
The V3.2-Exp model introduces a mechanism called DeepSeek Sparse Attention, which the Chinese firm says can cut computing costs and boost some types of model performance. In a post on X on Monday, DeepSeek said it is cutting API prices by "50%+".
While DeepSeek’s nex