
Experts ask for stronger protective measures

Jailbreaks trick chatbots into bypassing their safety rules, raising both security and ethical concerns.

Hacked AI-powered chatbots pose serious security risks by revealing illegal knowledge that the models absorbed during training, say researchers at Ben Gurion University.

Their study shows how large language models (LLMs) can be "jailbroken" and manipulated into producing dangerous instructions, such as how to hack networks, make drugs, or carry out other illegal activities.

Chatbots, including those powered by models from companies such as OpenAI, Google, and Anthropic, are trained on vast amounts of internet data. While efforts are made to exclude harmful material, AI systems can still internalize sensitive information.

Safety controls are supposed to block the release of this knowledge, but the researchers showed how they can be bypassed with specially crafted prompts.

The researchers developed a "universal jailbreak" capable of compromising several leading LLMs. Once compromised, the chatbots consistently responded to queries that should have triggered their safeguards.

They also found AI models openly advertised online as "dark LLMs": built without ethical guardrails and willing to generate answers that support fraud or cybercrime.

Professor Lior Rokach and Dr. Michael Fire, who led the research, said the growing accessibility of this technology lowers the barrier to malicious use. They warned that dangerous knowledge could soon be accessible to anyone with a laptop or phone.

Although the AI providers were informed about the jailbreak method, the researchers describe the response as underwhelming. Some companies dismissed the concerns as falling outside the scope of their bug bounty programs, while others did not respond at all.

The report urges tech companies to improve model safety by screening training data, deploying advanced firewalls, and developing "machine unlearning" methods to remove illegal content. Experts also called for clearer security standards and independent oversight.

OpenAI said its latest models have improved resistance to jailbreaks, and Microsoft pointed to its recent safety initiatives. Other companies have not yet commented.

Would you like to learn more about AI, technology, and digital diplomacy? If so, ask our Diplo chatbot!
