To put it simply:
The second reasoning model, R2, is expected between April and May, built on the previous base model V3.
The next base model, V4, is expected sometime in the second half of 2025.
R2 is developed simply by conducting more reinforcement learning (RL). This is exactly what the DeepSeek team mentioned on Twitter in February: "RL is still in its early stages, and we will see 'significant progress' this year."
As the amount of RL data increases, the model's ability to solve complex reasoning tasks will continue to improve steadily, and some complex behaviors will emerge spontaneously, such as "reflection" and "exploring different methods."
So what can we expect from the next base model, V4? And what are the implications for the Capex story?