
One of Google's newest Gemini AI models scores worse on safety

A recently released Google AI model scores worse on certain safety tests than its predecessor, according to the company's internal benchmarking.

In a technical report published this week, Google discloses that its Gemini 2.5 Flash model is more likely to generate text that violates its safety guidelines than Gemini 2.0 Flash. On two metrics, “text-to-text safety” and “image-to-text safety,” Gemini 2.5 Flash regresses by 4.1% and 9.6%, respectively.

Text-to-text safety measures how often a model violates Google's guidelines given a prompt, while image-to-text safety evaluates how closely the model adheres to those boundaries when prompted with an image. Both tests are automated, not human-supervised.

In an emailed statement, a Google spokesperson confirmed that Gemini 2.5 Flash “performs worse on text-to-text and image-to-text safety.”

These surprising benchmark results come as AI companies move to make their models more permissive — in other words, less likely to refuse to respond to controversial or sensitive subjects. For its latest crop of Llama models, Meta said it tuned the models not to endorse “some views over others” and to reply to more “debated” political prompts. OpenAI said earlier this year that it would tweak future models so they don't take an editorial stance and instead offer multiple perspectives on controversial topics.

Sometimes these permissiveness efforts have backfired. TechCrunch reported on Monday that the default model powering OpenAI's ChatGPT allowed minors to generate erotic conversations. OpenAI blamed the behavior on a “bug.”

According to Google's technical report, Gemini 2.5 Flash, which is still in preview, follows instructions more faithfully than Gemini 2.0 Flash, including instructions that cross problematic lines. The company claims the regressions can be attributed partly to false positives, but it also admits that Gemini 2.5 Flash sometimes generates “violative content” when explicitly asked.

“Naturally, there is tension between [instruction following] on sensitive topics and safety policy violations, which is reflected across our evaluations,” the report reads.

Scores from SpeechMap, a benchmark that probes how models respond to sensitive and controversial prompts, also suggest Gemini 2.5 Flash is far less likely to refuse to answer contentious questions than Gemini 2.0 Flash. TechCrunch's testing of the model via the AI platform OpenRouter found that it will uncomplainingly write essays in support of replacing human judges with AI, weakening due process protections in the U.S., and implementing widespread government surveillance programs.

Thomas Woodside, co-founder of the Secure AI Project, said the limited details Google gave in its technical report demonstrate the need for more transparency in model testing.

“There's a trade-off between instruction-following and policy-following, because some users may ask for content that would violate policies,” Woodside told TechCrunch. “In this case, Google's latest Flash model complies with instructions more while also violating policies more. Google doesn't provide much detail on the specific cases where policies were violated, although they say the violations are not severe. Without knowing more, it's hard for independent analysts to know whether there's a problem.”

Google has come under fire for its model safety reporting practices before.

The company took weeks to publish a technical report for its most capable model, Gemini 2.5 Pro. When the report was eventually published, it initially omitted key safety testing details.

On Monday, Google released a more detailed report with additional safety information.
