DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek's AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the U.S. can maintain its lead in the AI race and whether the demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek's origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100, a chip available to U.S. companies.

DeepSeek's technical team is said to skew young. The company reportedly recruits doctorate-level AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek's models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM and DeepSeek Chat) in November 2023. But it wasn't until last spring, when the startup released its next-generation DeepSeek-V2 family of models, that the AI industry started to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well in various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek's domestic competition, including ByteDance and Alibaba, to cut the usage prices for some of their models and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek's fame.

According to DeepSeek's internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta's Llama and "closed" models that can only be accessed through an API, such as OpenAI's GPT-4o.

Equally impressive is DeepSeek's R1 "reasoning" model. DeepSeek released R1 in January and claims that it performs as well as OpenAI's o1 model on key benchmarks.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and math.

There is a downside to R1, DeepSeek V3 and DeepSeek's other models, however. As Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their responses "embody core socialist values." In DeepSeek's chatbot app, for example, R1 won't answer questions about Tiananmen Square or Taiwan's autonomy.

A disruptive approach

If DeepSeek has a business model, it's not clear what that model is. The company prices its products and services well below market value, and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to remain extremely cost-competitive. Some experts, however, dispute the figures the company has supplied.

Whatever the case, developers have taken to DeepSeek's models, which aren't open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek's models, developers on Hugging Face have created more than 500 "derivative" models of R1 that have racked up 2.5 million downloads combined.

DeepSeek's success against larger and more established rivals has been described as "upending AI" and ushering in "a new era of AI brinkmanship." The company's success was at least partly responsible for Nvidia's stock price dropping by 18% on Monday, and for eliciting a public response from OpenAI CEO Sam Altman.

As for DeepSeek's future, it's not clear what it holds. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.
