DeepSeek’s R1 reportedly ‘more vulnerable’ to jailbreaking than other AI models

The latest model from DeepSeek, the Chinese AI company that has shaken Silicon Valley and Wall Street, can be manipulated to produce harmful content, such as plans for a bioweapon attack and a campaign to promote self-harm among teens, according to The Wall Street Journal.

Sam Rubin, senior vice president at Unit 42, Palo Alto Networks’ threat intelligence and incident response division, told the Journal that DeepSeek is “more vulnerable to jailbreaking [i.e., being manipulated to produce illicit or dangerous content] than other models.”

The Journal also tested DeepSeek’s R1 model itself. Although there appeared to be basic safeguards, the Journal said it successfully convinced DeepSeek to design a social media campaign that, in the chatbot’s words, “preys on teens’ desire for belonging, weaponizing emotional vulnerability through algorithmic amplification.”

The chatbot was also reportedly convinced to provide instructions for a bioweapon attack, to write a pro-Hitler manifesto, and to write a phishing email with malware code. The Journal said that when ChatGPT was given the exact same prompts, it refused to comply.

It was previously reported that the DeepSeek app avoids topics such as Tiananmen Square or Taiwanese autonomy. And Anthropic CEO Dario Amodei recently said DeepSeek performed “the worst” on a bioweapons safety test.
