DeepSeek has gone viral.
Chinese AI lab DeepSeek broke into mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts, and technologists, to question whether the United States can maintain its lead in the AI race and whether demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly?
DeepSeek’s trader origins
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019 as a hedge fund focused on developing and deploying AI algorithms.
In 2023, High-Flyer launched DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.
From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to U.S. companies.
DeepSeek’s technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its technology better understand a wide range of subjects, according to The New York Times.
DeepSeek’s strong models
DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM and DeepSeek Chat) in November 2023. But it wasn’t until last spring, when the startup released its next-generation DeepSeek-V2 family of models, that the AI industry started to take notice.
DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks and was far cheaper to run than comparable models at the time. This forced DeepSeek’s domestic competitors, including ByteDance and Alibaba, to cut the usage prices of some of their models and to make others completely free.
DeepSeek-V3, launched in December 2024, only added to DeepSeek’s notoriety.
According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models like Meta’s Llama and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.
Equally impressive is DeepSeek’s R1 reasoning model. DeepSeek claims that R1, released in January, performs as well as OpenAI’s o1 model on key benchmarks.
Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and mathematics.
There is a downside to R1, DeepSeek V3 and DeepSeek’s other models, however. Because they are Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan’s autonomy.
A disruptive approach
If DeepSeek has a business model, it is not clear what that model is, exactly. The company prices its products and services well below market value, and gives others away for free.
The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.
In any case, developers have taken to DeepSeek’s models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created more than 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.
DeepSeek’s success against larger and more established rivals has been described as “upending AI” and ushering in “a new era of AI brinkmanship.” The company’s success was at least partly responsible for the drop in Nvidia’s stock price on Monday, and for eliciting a public response from OpenAI CEO Sam Altman.
As for what DeepSeek’s future might hold, it is not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.