Remember when DeepSeek briefly shook up the entire artificial intelligence industry by launching its large language model, R1, which was trained for a fraction of the money that OpenAI and other big players were pouring into their models? Thanks to a new paper published by the DeepSeek AI team in the journal Nature, we finally know what it took to train R1: $294,000 and 512 Nvidia H800 chips. The lower cost appears to stem from the team's use of trial-and-error-based reinforcement learning techniques.

Most AI models built to perform reasoning tasks need to be trained on human-annotated data and demonstrations to "learn" how to solve certain problems, which is both expensive and time-consuming to scale as models are given more challenging tasks.
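The contrast with trial-and-error reinforcement learning can be illustrated with a toy sketch. Everything here is hypothetical and greatly simplified: instead of imitating human-written solutions, a "policy" proposes answers to simple arithmetic problems and receives a reward only when an automatic checker confirms the answer, gradually unlearning its systematic error from that signal alone.

```python
import random

def reward(problem, answer):
    """Automatic verifier: 1.0 if the proposed answer is correct, else 0.0.

    No human-annotated solutions are needed, only a check of the outcome.
    """
    a, b = problem
    return 1.0 if answer == a + b else 0.0

def policy(problem, bias):
    """A trivially parameterized 'policy': the true sum plus a noisy offset."""
    a, b = problem
    return a + b + random.choice([0, bias])

def train(steps=200, seed=0):
    """Trial and error: shrink the policy's error whenever reward is zero."""
    random.seed(seed)
    bias = 3  # start with a systematic error the policy can unlearn
    for _ in range(steps):
        problem = (random.randint(0, 9), random.randint(0, 9))
        answer = policy(problem, bias)
        if reward(problem, answer) == 0.0 and bias > 0:
            bias -= 1  # punished for a wrong answer: reduce the error
    return bias

print(train())  # the bias is driven to 0 by the reward signal alone
```

Real reinforcement learning for language models updates billions of parameters with gradient-based methods rather than a single integer, but the core idea is the same: a cheap automatic reward replaces expensive human demonstrations.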
