Summary
- It’s relatively simple to bypass safeguards built into AI photo editing tools like Sketch to Image and Reimagine.
- A Cornell Tech class focused on red teaming AI models demonstrated how easily alarming, dangerous fabricated images can be created.
- The use of AI in photo editing — with zero expertise and little effort required — blurs the line between fiction and reality, posing potential risks right now.
Ultra-simple photo editing and image generation are among AI’s most novel and constantly refined abilities. Galaxy AI’s Sketch to Image opened the door to all sorts of ridiculous scenarios, and the Reimagine feature available on the Pixel 9 family through Magic Editor apparently works at least as well.
As The Verge discovered shortly after Reimagine launched, though, the feature’s protections against generating potentially dangerous, realistic images weren’t very robust. As students in Cornell Tech’s Red Teaming 101 course just proved, those guardrails haven’t gotten any better: with relatively little effort, the class generated images of obliterated public transit, trash-covered parks, and an M1 Abrams tank rolling down the streets of New York (Source: Alexios Mantzarlis via Bluesky).
Bending the fabric of reality
AI is now basically better at photos than humans
The most eye-catching parts of these images are all AI-generated, i.e., fake.
The Cornell Tech course focuses on red teaming: deliberately probing a service for weaknesses in order to study them. The students’ work showed that keyword restrictions didn’t go very far in preventing the creation of alarming imagery. In one example, requesting an “M1 Abrams” by name easily sidestepped a block on inserting a generic tank into a picture. Other red-teamed results included destroyed ferries and trams, a trash-ridden public park, and a makeshift tent encampment outside NYC’s Roosevelt Island subway station.
It doesn’t take a particularly creative mind to see how these images could be used to spread fear and controversy… The onus remains on responsible tech platforms to build tools that can’t be abused in a couple of hours by a few creative minds. — Alexios Mantzarlis, Cornell
The Verge’s 2024 testing shed light on even more worrisome results, like fake pictures of bombs and seemingly toxic substances leaking out of schools. As Alexios Mantzarlis, Cornell Tech professor and director of its Security, Trust, and Safety Initiative, recently explained, “Google’s safety guardrails struggled in particular when the harmful use case was not the content of the prompt per se but its interplay with the context of the image.”
‘AI is just another tool’ — until it’s not
Obviously, doctored images and Photoshop aren’t anything new. But, as The Verge pointed out last year, “This isn’t some piece of specialized software we went out of our way to use — it’s all built into a phone that my dad could walk into Verizon and buy.” For its part, Google is working to apply SynthID watermarking to Reimagine-altered images. But in AI’s endless arms race, researchers have already demonstrated that purpose-built models can strip such AI watermarks and even forge fake ones.
A year or two ago, adding a convincing car crash to an image would have taken time, expertise, an understanding of Photoshop layers, and access to expensive software. Those barriers are gone. — Allison Johnson, The Verge
For any unworried tech enthusiasts who insist these tools won’t be put to nefarious uses: the line between reality and fiction isn’t just blurred across social media and publishing, it’s nearly erased. As 404 Media reports, a high-resolution image of Turkey’s Pikachu-clad protester is already making the rounds, with countless viewers taking it at face value — except it’s not real. Today, a beloved children’s character is being implicated in violent protests thanks to AI. Tomorrow, it could be you.