📝 Selected Publications

(* indicates equal contribution. Full list of publications)

Preprint.
Accepted.
ICML 2025

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

Kristina Nikolić, Luze Sun, Jie Zhang, Florian Tramèr

[ICML 2025, spotlight]

SaTML 2025

Position: Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

Jie Zhang, Debeshee Das, Gautam Kamath, Florian Tramèr

[IEEE SaTML 2025]

CCS 2024

Evaluations of Machine Learning Privacy Defenses are Misleading

Michael Aerni*, Jie Zhang*, Florian Tramèr

[ACM CCS 2024]

IEEE SP 2025, DLSP workshop

Position: Adversarial ML Problems Are Getting Harder to Solve and to Evaluate

Javier Rando*, Jie Zhang*, Nicholas Carlini, Florian Tramèr

[IEEE SP 2025, DLSP workshop]

ICLR 2025
IEEE SP 2025, DLSP workshop

Blind Baselines Beat Membership Inference Attacks for Foundation Models

Debeshee Das, Jie Zhang, Florian Tramèr

[IEEE SP 2025, DLSP workshop]

NeurIPS 2024

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr

[NeurIPS 2024 Dataset & Benchmark Track]