A graphic image of someone holding an AI nucleus.

Is This Your AI? Researchers Crack AI Black Box

Artificial intelligence (AI) systems power everything from chatbots to security cameras, yet many of the most advanced models operate as “black boxes.” Companies can use them, but outsiders can’t see how they were built, where they came from, or whether they contain hidden flaws.

This lack of transparency creates real risks. A model could contain security vulnerabilities or hidden backdoors. It could also be a lightly modified version of an open-source system — repackaged in violation of its license — with no easy way to prove it.

Researchers at the Georgia Institute of Technology have developed a new framework, ZEN, to help solve this problem. The tool can recover a model’s unique “fingerprint” directly from its memory, allowing experts to trace its origins and reconstruct how it was assembled.

“Analyzing a proprietary AI model without identifying where it came from and how it is constructed is like trying to fix a car engine with the hood welded shut,” said David Oygenblik, a Ph.D. student at Georgia Tech and the study’s lead author.

“ZEN not only X-rays the engine but also provides the complete wiring diagram.”

ZEN works by taking a snapshot of a running AI system and extracting information about both its mathematical structure and the code that defines it. It compares that fingerprint against a database of known open-source models to determine the system’s origin.
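The article does not publish ZEN's internals, but the idea of fingerprinting a model's structure and looking it up in a database of known open-source models can be illustrated with a toy sketch. Everything below (the `fingerprint` and `attribute` functions, the layer names and shapes) is hypothetical, not the paper's actual method:

```python
import hashlib

def fingerprint(layers):
    """Hash an ordered list of (layer_name, shape) pairs into a stable ID.

    A real system would derive this from memory-resident tensors and code,
    not from a hand-written layer list.
    """
    digest = hashlib.sha256()
    for name, shape in layers:
        digest.update(f"{name}:{shape}".encode())
    return digest.hexdigest()

# Toy "database" mapping fingerprints of known open-source models to names.
KNOWN_MODELS = {
    fingerprint([("conv1", (64, 3, 7, 7)), ("fc", (1000, 512))]): "example-resnet",
}

def attribute(layers):
    """Return the matching known model's name, or None if no match."""
    return KNOWN_MODELS.get(fingerprint(layers))
```

A matching structure resolves to its open-source origin; an unknown structure simply returns no match, which is the cue for deeper analysis.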

If it finds a match, ZEN identifies the exact changes and generates software patches that allow investigators to recreate a working replica of the proprietary model for testing.
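The patch-generation step can be pictured as a diff between the matched open-source base and the proprietary model, recording only what changed. This is a simplified sketch with scalar weights standing in for tensors; the function names and tolerance are illustrative assumptions, not ZEN's implementation:

```python
def diff_weights(base, suspect, tol=1e-6):
    """Return a 'patch': only the parameters that differ from the base model.

    base and suspect map parameter names to values (scalars here, for brevity;
    a real tool would compare tensors).
    """
    return {
        name: value
        for name, value in suspect.items()
        if name not in base or abs(value - base[name]) > tol
    }

def apply_patch(base, patch):
    """Rebuild a working replica by applying the patch to the base model."""
    rebuilt = dict(base)
    rebuilt.update(patch)
    return rebuilt
```

Because the patch captures every deviation from the known base, applying it back onto that base yields a replica that investigators can probe for backdoors without ever touching the original black-box deployment.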

That capability has major implications for both security and intellectual property protection.

“With ZEN, a security analyst can finally test a black-box model for hidden backdoors, and a company can gather concrete evidence to prove its software license was infringed,” Oygenblik said.

To evaluate the system, the research team tested ZEN on 21 state-of-the-art AI models, including Llama 3, YOLOv10, and other well-known systems.

ZEN correctly traced every customized model back to its original open-source foundation — achieving 100% attribution accuracy. Even when models had been heavily modified — differing by more than 83% from their original versions — ZEN successfully identified the changes and enabled full reconstruction for security testing.

The researchers will present their findings at the 2026 Network and Distributed System Security (NDSS) Symposium. The paper, "Achieving Zen: Combining Mathematical and Programmatic Deep Learning Model Representations for Attribution and Reuse," was authored by Oygenblik; master's student Dinko Dermendzhiev; Ph.D. students Filippos Sofias, Mingxuan Yao, Haichuan Xu, and Runze Zhang; postdoctoral scholars Jeman Park and Amit Kumar Sikder; and Associate Professor Brendan Saltaformaggio.