Tag: DeepSeek

  • DeepSeek Hit By Cyberattack As Users Flock To Chinese AI Startup

    Chinese startup DeepSeek said on Monday it will temporarily limit registrations due to a cyberattack after the company’s AI assistant amassed sudden popularity.

    The startup earlier in the day was also hit by outages on its website after its AI assistant became the top-rated free application available on Apple’s App Store in the United States.

    The company resolved issues relating to its application programming interface and users’ inability to log in to the website, according to its status page. The outages on Monday were the company’s longest in around 90 days and coincided with its skyrocketing popularity.

    DeepSeek last week launched a free assistant it says uses less data at a fraction of the cost of incumbent players’ models, possibly marking a turning point in the level of investment needed for AI.

    Powered by the DeepSeek-V3 model, which its creators say “tops the leaderboard among open-source models and rivals the most advanced closed-source models globally”, the artificial intelligence application has surged in popularity among U.S. users since it was released on Jan. 10, according to app data research firm Sensor Tower.

    The milestone highlights how DeepSeek has left a deep impression on Silicon Valley, upending widely held views about U.S. primacy in AI and the effectiveness of Washington’s export controls targeting China’s advanced chip and AI capabilities.

    Technology stocks were hammered on Monday, sending the shares of Nvidia and Oracle plummeting.

    AI models from ChatGPT to DeepSeek require advanced chips to power their training. The Biden administration has since 2021 widened the scope of bans designed to stop these chips from being exported to China and used to train Chinese firms’ AI models.

    However, DeepSeek researchers wrote in a paper last month that the DeepSeek-V3 used Nvidia’s H800 chips for training, spending less than $6 million.

    Although this detail has since been disputed, the claim that the chips used were less powerful than the most advanced Nvidia products Washington has sought to keep out of China, as well as the relatively cheap training costs, has prompted U.S. tech executives to question the effectiveness of tech export controls.

    Little is known about the company behind DeepSeek, a small Hangzhou-based startup founded in 2023, when search engine giant Baidu released the first Chinese AI large-language model.

    Since then, dozens of Chinese tech companies large and small have released their own AI models, but DeepSeek is the first to be praised by the U.S. tech industry as matching or even surpassing the performance of cutting-edge U.S. models.

  • China’s DeepSeek Threatens ChatGPT’s Dominance Of AI Sector

    Chinese startup DeepSeek’s launch of its latest AI models, which it says are on a par with or better than industry-leading models in the United States at a fraction of the cost, is threatening to upset the technology world order.

    The company has attracted attention in global AI circles after writing in a paper last month that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips.

    DeepSeek’s AI Assistant, powered by DeepSeek-V3, has overtaken rival ChatGPT to become the top-rated free application available on Apple’s App Store in the United States.

    This has raised doubts about the reasoning behind some U.S. tech companies’ decisions to pledge billions of dollars in AI investment, and shares of several big tech players, including Nvidia, have been hit.

    Below are some facts about the company shaking up the AI sector worldwide.

    Why is DeepSeek causing a stir? 

    The release of OpenAI’s ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence.

    But after the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu, there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms.

    The quality and cost efficiency of DeepSeek’s models have flipped this narrative on its head. The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI’s and Meta’s most advanced models, the Chinese startup has said.

    They are also cheaper to use. DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than OpenAI’s o1 model, depending on the task, according to a post on DeepSeek’s official WeChat account.

    But some have publicly expressed scepticism about DeepSeek’s success story.

    Scale AI CEO Alexandr Wang said during an interview with CNBC on Thursday, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington’s export controls that ban such advanced AI chips from being sold to Chinese companies. DeepSeek did not immediately respond to a request for comment on the allegation.

    Bernstein analysts on Monday highlighted in a research note that DeepSeek’s total training costs for its V3 model were unknown but were much higher than the $5.58 million the startup said was used for computing power. The analysts also said the training costs of the equally-acclaimed R1 model were not disclosed.

    Who is behind DeepSeek? 

    DeepSeek is a Hangzhou-based startup whose controlling shareholder is Liang Wenfeng, co-founder of quantitative hedge fund High-Flyer, according to Chinese corporate records.

    Liang’s fund announced in March 2023 on its official WeChat account that it was “starting again”, going beyond trading to concentrate resources on creating a “new and independent research group, to explore the essence of AGI” (Artificial General Intelligence). DeepSeek was created later that year.

    ChatGPT maker OpenAI defines AGI as autonomous systems that surpass humans in most economically valuable tasks.

    It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office located in the same building as DeepSeek, and it also owns patents related to chip clusters used to train AI models, according to Chinese corporate records.

    High-Flyer’s AI unit said on its official WeChat account in July 2022 that it owns and operates a cluster of 10,000 A100 chips.

    How does Beijing view DeepSeek?

    DeepSeek’s success has already been noticed in China’s top political circles. On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese premier Li Qiang, according to state news agency Xinhua.

    Liang’s presence at the gathering is potentially a sign that DeepSeek’s success could be important to Beijing’s policy goal of overcoming Washington’s export controls and achieving self-sufficiency in strategic industries like AI.

    A similar symposium last year was attended by Baidu CEO Robin Li.

    (Reuters) 

  • AI Smackdown: China’s DeepSeek Tops ChatGPT, Dominates US Apple Store Free Download Charts

    Chinese startup DeepSeek’s AI Assistant on Monday overtook rival ChatGPT to become the top-rated free application available on Apple’s App Store in the United States.

    Powered by the DeepSeek-V3 model, which its creators say “tops the leaderboard among open-source models and rivals the most advanced closed-source models globally”, the artificial intelligence application has surged in popularity among U.S. users since it was released on Jan. 10, according to app data research firm Sensor Tower.

    The milestone highlights how DeepSeek has left a deep impression on Silicon Valley, upending widely held views about U.S. primacy in AI and the effectiveness of Washington’s export controls targeting China’s advanced chip and AI capabilities.

    AI models from ChatGPT to DeepSeek require advanced chips to power their training. The Biden administration has since 2021 widened the scope of bans designed to stop these chips from being exported to China and used to train Chinese firms’ AI models.

    However, DeepSeek researchers wrote in a paper last month that the DeepSeek-V3 used Nvidia’s H800 chips for training, spending less than $6 million.

    Although this detail has since been disputed, the claim that the chips used were less powerful than the most advanced Nvidia products Washington has sought to keep out of China, as well as the relatively cheap training costs, has prompted U.S. tech executives to question the effectiveness of tech export controls.

    Little is known about the company behind DeepSeek, a small Hangzhou-based startup founded in 2023, when search engine giant Baidu released the first Chinese AI large-language model.

    Since then, dozens of Chinese tech companies large and small have released their own AI models, but DeepSeek is the first to be praised by the U.S. tech industry as matching or even surpassing the performance of cutting-edge U.S. models.

    It offers a “PhD-level” AI at an economical rate of $2.19 per million output tokens compared to OpenAI’s $60 for similar usage. Some industry professionals have highlighted that this affordability has not been widely discussed, particularly in sell-side analyses. This omission could lead to uncertainties regarding the wider adoption and perception of DeepSeek’s capabilities.
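
    The size of that price gap can be worked out with simple arithmetic. The short Python sketch below uses only the two per-million-token prices quoted above; the function names and the 5-million-token workload are hypothetical, chosen purely for illustration, and actual bills would also depend on input-token pricing, which is not given here.

```python
# Prices in USD per million output tokens, as quoted in the article.
DEEPSEEK_R1_PRICE = 2.19
OPENAI_PRICE = 60.00


def output_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


def price_ratio(expensive: float, cheap: float) -> float:
    """Return how many times more the expensive price is than the cheap one."""
    return expensive / cheap


ratio = price_ratio(OPENAI_PRICE, DEEPSEEK_R1_PRICE)

# Hypothetical workload: 5 million output tokens (an assumption, not from the article).
workload_tokens = 5_000_000
deepseek_bill = output_cost(workload_tokens, DEEPSEEK_R1_PRICE)
openai_bill = output_cost(workload_tokens, OPENAI_PRICE)

print(f"Price ratio: {ratio:.1f}x")
print(f"DeepSeek-R1 bill: ${deepseek_bill:.2f}")
print(f"OpenAI bill: ${openai_bill:.2f}")
```

    On these quoted output prices alone, the ratio works out to roughly 27 times.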

    The company has been commended for its use of reinforcement learning techniques, which reportedly optimize training costs and reduce complexity. DeepSeek claims its 1.5 billion-parameter R1 model outperforms industry standards, including GPT-4 and Claude 3.5, in select tasks, with minimal hardware requirements. For instance, the R1 model can reportedly operate on devices as simple as an iPhone 16.

    Despite these breakthroughs, questions have been raised about DeepSeek’s originality. Critics have likened its approach to copying, citing statements from prominent figures like Sam Altman, who emphasized the risks of derivative models in achieving long-term success. Additionally, DeepSeek’s $6 million training budget has sparked skepticism regarding its scalability compared to competitors with larger investments.